Identification of candidate cancer drivers by integrative Epi-DNA and Gene Expression (iEDGE) data analysis

Sci Rep. 2019 Nov 15;9(1):16904. doi: 10.1038/s41598-019-52886-z.

Abstract

The emergence of large-scale multi-omics data warrants method development for data integration. Genomic studies of cancer patients have identified epigenetic and genetic regulators, including methylation marks, somatic mutations, and somatic copy number alterations (SCNAs), as predictive features of cancer outcome. However, identification of "driver genes" associated with a given alteration remains a challenge. To this end, we developed a computational tool, iEDGE, to model cis and trans effects of (epi-)DNA alterations and identify potential cis driver genes, where cis and trans genes denote those genes falling within and outside the genomic boundaries of a given (epi-)genetic alteration, respectively. iEDGE first identifies the cis and trans gene expression signatures associated with the presence or absence of a particular epi-DNA alteration across samples. It then applies tests of statistical mediation to determine which cis genes are predictive of trans gene expression. Finally, cis and trans effects are annotated by pathway enrichment analysis to gain insights into the underlying regulatory networks. We used iEDGE to perform integrative analysis of SCNAs and gene expression data from breast cancer and 18 additional cancer types included in The Cancer Genome Atlas (TCGA). Notably, cis gene drivers identified by iEDGE were significantly enriched for known driver genes from multiple compendia of validated oncogenes and tumor suppressors, suggesting that the remaining candidates may also be important. Furthermore, predicted drivers were enriched for functionally relevant cancer genes with amplification-driven dependencies, which are of potential prognostic and therapeutic value. All analysis results are accessible at https://montilab.bu.edu/iEDGE. In summary, integrative analysis of SCNAs and gene expression using iEDGE successfully identified known cancer driver genes and putative cancer therapeutic targets across 19 cancer types in TCGA. The proposed method can easily be applied to the integration of gene expression profiles with other epi-DNA assays in a variety of disease contexts.
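The mediation step described above can be illustrated with a minimal sketch. The abstract does not say which mediation test iEDGE applies, so the Sobel test used here is an assumption, as are the function names (`simple_ols`, `sobel_test`) and the simulated data, none of which come from the paper or the iEDGE software. The sketch asks whether the association between an alteration (coded 0/1 per sample) and a trans gene's expression is carried through a cis gene's expression.

```python
import math
import random


def simple_ols(x, y):
    """Fit y ~ 1 + x; return slope, residuals, and the sum of squares of x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return slope, resid, sxx


def sobel_test(alteration, cis_expr, trans_expr):
    """Sobel z statistic and two-sided p-value for alteration -> cis -> trans."""
    n = len(alteration)

    # Path a: effect of the alteration on cis gene expression.
    a, resid_cis, sxx = simple_ols(alteration, cis_expr)
    se_a = math.sqrt(sum(r * r for r in resid_cis) / (n - 2) / sxx)

    # Path b: effect of the cis gene on the trans gene, adjusting for the
    # alteration. Regressing the alteration-adjusted residuals on each other
    # gives the same coefficient as the full model trans ~ alteration + cis.
    _, resid_trans, _ = simple_ols(alteration, trans_expr)
    s_cc = sum(r * r for r in resid_cis)
    b = sum(rc * rt for rc, rt in zip(resid_cis, resid_trans)) / s_cc
    rss = sum((rt - b * rc) ** 2 for rc, rt in zip(resid_cis, resid_trans))
    se_b = math.sqrt(rss / (n - 3) / s_cc)

    # Sobel statistic for the indirect effect a * b.
    z = (a * b) / math.sqrt(b * b * se_a * se_a + a * a * se_b * se_b)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p


# Simulated example (hypothetical data): half the samples carry the alteration.
random.seed(0)
alteration = [1 if i < 50 else 0 for i in range(100)]
cis = [2.0 * x + random.gauss(0, 1) for x in alteration]
trans_mediated = [1.5 * m + random.gauss(0, 1) for m in cis]
trans_unrelated = [random.gauss(0, 1) for _ in alteration]

z_med, p_med = sobel_test(alteration, cis, trans_mediated)
z_null, p_null = sobel_test(alteration, cis, trans_unrelated)
print(f"mediated trans gene:  z={z_med:.2f}, p={p_med:.3g}")
print(f"unrelated trans gene: z={z_null:.2f}, p={p_null:.3g}")
```

In a full analysis, a test of this kind would be run for each cis and trans gene pair linked to an alteration, with multiple-testing correction applied to the resulting p-values. Those details are not given in the abstract and are left out of the sketch.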

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Breast Neoplasms / genetics
  • Breast Neoplasms / pathology
  • Cell Transformation, Neoplastic / genetics*
  • Cell Transformation, Neoplastic / pathology
  • DNA Copy Number Variations
  • Epigenesis, Genetic / physiology
  • Epigenome*
  • Female
  • Gene Dosage
  • Gene Expression Regulation, Neoplastic
  • Gene Regulatory Networks
  • Genes, BRCA1
  • Genes, BRCA2
  • Genetic Association Studies / methods*
  • Genomics / methods*
  • Humans
  • Mutation
  • Oncogenes*
  • Transcriptome*