Modeling gene-wise dependencies improves the identification of drug response biomarkers in cancer studies

Bioinformatics. 2017 May 1;33(9):1362-1369. doi: 10.1093/bioinformatics/btw836.

Abstract

Motivation: In recent years, vast advances in biomedical technologies and comprehensive sequencing have revealed the genomic landscape of common forms of human cancer in unprecedented detail. The broad heterogeneity of the disease calls for rapid development of personalized therapies. Translating the readily available genomic data into useful knowledge that can be applied in the clinic remains a challenge. Computational methods are needed to aid these efforts by robustly analyzing genome-scale data from distinct experimental platforms for prioritization of targets and treatments.

Results: We propose a novel, biologically motivated, Bayesian multitask approach, which explicitly models gene-centric dependencies across multiple and distinct genomic platforms. We introduce a gene-wise prior and present a fully Bayesian formulation of a group factor analysis model. In supervised prediction applications, our multitask approach leverages similarities in response profiles of groups of drugs that are more likely to be related to true biological signal, which leads to more robust performance and improved generalization ability. We evaluate the performance of our method on molecularly characterized collections of cell lines profiled against two compound panels, namely the Cancer Cell Line Encyclopedia and the Cancer Therapeutics Response Portal. We demonstrate that accounting for the gene-centric dependencies enables leveraging information from multi-omic input data and improves prediction and feature selection performance. We further demonstrate the applicability of our method in an unsupervised dimensionality reduction application by inferring genes essential to tumorigenesis in the pancreatic ductal adenocarcinoma and lung adenocarcinoma patient cohorts from The Cancer Genome Atlas.

Availability and implementation: The code for this work is available at https://github.com/olganikolova/gbgfa.

Contact: nikolova@ohsu.edu or margolin@ohsu.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Adenocarcinoma / drug therapy
  • Adenocarcinoma / genetics
  • Adenocarcinoma / metabolism
  • Antineoplastic Agents / therapeutic use
  • Bayes Theorem
  • Biomarkers, Pharmacological*
  • Cell Line
  • Cell Transformation, Neoplastic
  • Genes, Neoplasm*
  • Genomics / methods*
  • Humans
  • Lung Neoplasms / drug therapy
  • Lung Neoplasms / genetics
  • Lung Neoplasms / metabolism
  • Models, Genetic*
  • Neoplasms / drug therapy
  • Neoplasms / genetics
  • Neoplasms / metabolism*
  • Pancreatic Neoplasms / drug therapy
  • Pancreatic Neoplasms / genetics
  • Pancreatic Neoplasms / metabolism
  • Precision Medicine / methods*
  • Unsupervised Machine Learning

Substances

  • Antineoplastic Agents
  • Biomarkers, Pharmacological