Simultaneous prediction of multiple outcomes using revised stacking algorithms

Bioinformatics. 2020 Jan 1;36(1):65-72. doi: 10.1093/bioinformatics/btz531.

Abstract

Motivation: HIV is difficult to treat because the virus mutates at a high rate and mutated viruses readily develop resistance to existing drugs. If the relationships between mutations and drug resistance can be determined from historical data, patients can be given personalized treatment based on their own mutation information. The HIV Drug Resistance Database was built to investigate these relationships. Our goal is to use the data in this database to build a model that, for any new patient, simultaneously predicts resistance to multiple drugs from the mutation information in the patient's viral sequences.

Results: We propose two variations of a stacking algorithm that borrow information across multiple prediction tasks to improve multivariate prediction performance. The most attractive feature of the proposed methods is their flexibility: complex multivariate prediction models can be constructed from any univariate prediction model. In cross-validation studies, the proposed methods outperform other popular multivariate prediction methods.
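The abstract does not spell out the two revised variants, so the following is only a minimal sketch of generic multi-outcome stacking, not the authors' algorithm. It assumes ridge regression as the univariate learner and uses out-of-fold predictions as second-stage inputs; the function names (ridge_fit, stacked_fit, stacked_predict) and the parameters lam and n_folds are hypothetical and chosen for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean(axis=0)
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

def ridge_predict(model, X):
    coef, intercept = model
    return X @ coef + intercept

def stacked_fit(X, Y, n_folds=5, lam=1.0, seed=0):
    """Generic two-stage stacking for q outcomes (illustrative assumption)."""
    n, q = Y.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)

    # Stage 1: out-of-fold predictions, one univariate model per outcome.
    Z = np.zeros_like(Y, dtype=float)
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        for j in range(q):
            model = ridge_fit(X[train], Y[train, j], lam)
            Z[held_out, j] = ridge_predict(model, X[held_out])

    # Refit the stage-1 models on all data for use at prediction time.
    base = [ridge_fit(X, Y[:, j], lam) for j in range(q)]

    # Stage 2: each outcome is regressed on the stage-1 predictions of
    # ALL outcomes, which is where information is shared across tasks.
    meta = [ridge_fit(Z, Y[:, j], lam) for j in range(q)]
    return base, meta

def stacked_predict(models, X):
    base, meta = models
    Z = np.column_stack([ridge_predict(m, X) for m in base])
    return np.column_stack([ridge_predict(m, Z) for m in meta])
```

In this sketch any univariate learner could replace ridge_fit and ridge_predict, which mirrors the flexibility the abstract emphasizes. The second stage is trained on out-of-fold predictions so that it does not see first-stage fits to the same observations it is trained on.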

Availability and implementation: An R package is being developed. In the meantime, R code can be requested by email.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology / methods
  • Drug Resistance, Viral* / genetics
  • HIV Infections* / virology
  • HIV-1* / drug effects
  • HIV-1* / genetics
  • Humans
  • Mutation
  • Software