Application of transfer learning for cancer drug sensitivity prediction

Saugato Rahman Dhruba; Raziur Rahman; Kevin Matlock; Souparno Ghosh; Ranadip Pal

doi:10.1186/s12859-018-2465-y

BMC Bioinformatics. 2018 Dec 28;19(Suppl 17):497. doi: 10.1186/s12859-018-2465-y.

Authors

Saugato Rahman Dhruba¹, Raziur Rahman¹, Kevin Matlock¹, Souparno Ghosh², Ranadip Pal³

Affiliations

¹ Department of Electrical and Computer Engineering, Texas Tech University, 1012 Boston Ave, Lubbock, 79409, TX, USA.
² Department of Mathematics and Statistics, Texas Tech University, 1108 Memorial Circle, Lubbock, 79409, TX, USA.
³ Department of Electrical and Computer Engineering, Texas Tech University, 1012 Boston Ave, Lubbock, 79409, TX, USA. ranadip.pal@ttu.edu.

Abstract

Background: In precision medicine, the scarcity of suitable biological data often hinders the design of an appropriate predictive model. Large-scale pharmacogenomic studies such as CCLE and GDSC hold promise for mitigating this issue. However, data from multiple sources cannot be pooled directly because of the distribution shift between them. One way to address this problem is to use transfer learning methodologies tailored to this specific context.

Results: In this paper, we present two novel approaches for incorporating information from a secondary database to improve prediction in a target database. The first approach is based on latent variable cost optimization, and the second considers a polynomial mapping between the two databases. Using the CCLE and GDSC databases, we illustrate that the proposed approaches predict drug sensitivities better than existing approaches across different scenarios.
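
The abstract does not specify the form of the polynomial mapping, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that some samples (for example, cell lines) are measured in both databases, and that each feature in the secondary database can be mapped to the target database by an independent least-squares polynomial. The function names `fit_polynomial_map` and `apply_polynomial_map`, the polynomial degree, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_polynomial_map(secondary, target, degree=2):
    """Fit one polynomial per feature mapping secondary values to target values.

    secondary, target: arrays of shape (n_common_samples, n_features) whose
    rows are paired (the same sample measured in each database).
    Returns coefficients of shape (degree + 1, n_features), highest power first.
    """
    secondary = np.asarray(secondary, dtype=float)
    target = np.asarray(target, dtype=float)
    coeffs = np.empty((degree + 1, secondary.shape[1]))
    for j in range(secondary.shape[1]):
        # Ordinary least-squares polynomial fit for feature j.
        coeffs[:, j] = np.polyfit(secondary[:, j], target[:, j], degree)
    return coeffs


def apply_polynomial_map(coeffs, secondary):
    """Map secondary-database values into the target database's scale."""
    secondary = np.asarray(secondary, dtype=float)
    mapped = np.empty_like(secondary)
    for j in range(secondary.shape[1]):
        mapped[:, j] = np.polyval(coeffs[:, j], secondary[:, j])
    return mapped


# Synthetic illustration (not real CCLE/GDSC data): the "target" values are
# a known quadratic distortion of the "secondary" values.
rng = np.random.default_rng(0)
secondary_common = rng.uniform(-1.0, 1.0, size=(30, 2))
target_common = 0.5 * secondary_common**2 + 2.0 * secondary_common + 1.0

coeffs = fit_polynomial_map(secondary_common, target_common, degree=2)

# Secondary-only samples can now be moved into the target scale and used
# as additional training data for a target-database predictor.
secondary_only = rng.uniform(-1.0, 1.0, size=(5, 2))
mapped = apply_polynomial_map(coeffs, secondary_only)
```

In a sketch like this, the mapped secondary samples would be appended to the target database's training set before fitting any sensitivity predictor; whether the mapping is applied to features, responses, or predictions is a design choice the abstract leaves open.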

Conclusion: We have compared the performance of the proposed predictive models with database-specific individual models as well as existing transfer learning approaches. Our proposed approaches outperform these alternatives in predicting sensitivity for different anti-cancer compounds; in particular, the nonlinear mapping model shows the best overall performance.

Keywords: CCLE; Cost optimization; Drug sensitivity prediction; GDSC; Latent variable; Nonlinear mapping; Pharmacogenomic studies; Transfer learning.

MeSH terms

Algorithms*
Antineoplastic Agents / therapeutic use*
Area Under Curve
Databases, Factual
Gene Expression Regulation, Neoplastic
Humans
Neoplasms / drug therapy*
Neoplasms / genetics

Substances

Antineoplastic Agents

Grants and funding

R01 GM122084/GM/NIGMS NIH HHS/United States