Application of transfer learning for cancer drug sensitivity prediction

BMC Bioinformatics. 2018 Dec 28;19(Suppl 17):497. doi: 10.1186/s12859-018-2465-y.

Abstract

Background: In precision medicine, scarcity of suitable biological data often hinders the design of an appropriate predictive model. Large-scale pharmacogenomic studies such as CCLE and GDSC hold the promise of mitigating this issue. However, data from multiple sources cannot be combined directly because of the distribution shift between them. One way to address this problem is to use transfer learning methodologies tailored to this specific context.

Results: In this paper, we present two novel approaches for incorporating information from a secondary database to improve prediction in a target database. The first approach is based on latent variable cost optimization, and the second considers a polynomial mapping between the two databases. Using the CCLE and GDSC databases, we illustrate that the proposed approaches predict drug sensitivities more accurately than existing approaches in different scenarios.
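The idea of a polynomial mapping between two databases can be illustrated with a minimal sketch. The abstract does not specify how the mapping is built, so everything here is an assumption: a univariate least-squares polynomial is fitted between sensitivity values of cell lines shared by a secondary and a target database, the data are synthetic, and the names `fit_polynomial_map` and `apply_polynomial_map` are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def fit_polynomial_map(secondary, target, degree=2):
    """Least-squares polynomial taking secondary-database values to
    target-database values (hypothetical helper, not from the paper)."""
    return np.polyfit(secondary, target, deg=degree)


def apply_polynomial_map(coeffs, secondary):
    """Evaluate the fitted polynomial on secondary-database values."""
    return np.polyval(coeffs, secondary)


# Synthetic stand-in for sensitivities of cell lines present in both
# databases; real CCLE/GDSC values are NOT used here.
rng = np.random.default_rng(0)
secondary_auc = rng.uniform(0.0, 1.0, size=200)
target_auc = (
    0.1
    + 0.6 * secondary_auc
    + 0.25 * secondary_auc**2
    + rng.normal(0.0, 0.01, size=200)
)

# Fit on the first 150 shared cell lines, evaluate on the remaining 50.
coeffs = fit_polynomial_map(secondary_auc[:150], target_auc[:150], degree=2)
mapped = apply_polynomial_map(coeffs, secondary_auc[150:])
rmse = float(np.sqrt(np.mean((mapped - target_auc[150:]) ** 2)))
print(f"held-out RMSE of mapped values: {rmse:.4f}")
```

In this toy setting the mapped secondary values could then serve as additional training responses for a target-database model. How the mapping is actually combined with the predictive model is described in the full paper, not in this abstract.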

Conclusion: We have compared the performance of the proposed predictive models with database-specific individual models as well as existing transfer learning approaches. Our proposed approaches outperform these alternatives in predicting sensitivity to different anti-cancer compounds, with the nonlinear mapping model showing the best overall performance.

Keywords: CCLE; Cost optimization; Drug sensitivity prediction; GDSC; Latent variable; Nonlinear mapping; Pharmacogenomic studies; Transfer learning.

MeSH terms

  • Algorithms*
  • Antineoplastic Agents / therapeutic use*
  • Area Under Curve
  • Databases, Factual
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Neoplasms / drug therapy*
  • Neoplasms / genetics

Substances

  • Antineoplastic Agents