A Dual Model for Prioritizing Cancer Mutations in the Non-coding Genome Based on Germline and Somatic Events

PLoS Comput Biol. 2015 Nov 20;11(11):e1004583. doi: 10.1371/journal.pcbi.1004583. eCollection 2015 Nov.

Abstract

We address here the issue of prioritizing non-coding mutations in the tumor genome. To this end, we created two independent computational models. The first (germline) model estimates purifying selection based on population SNP data. The second (somatic) model estimates tumor mutation density based on whole-genome tumor sequencing. We show that each model reflects a different set of constraints acting on either the normal or the tumor genome, and we identify the specific genome features that contribute most to these constraints. Importantly, we show that the somatic mutation model carries independent functional information that can be used to narrow down the non-coding regions that may be relevant to cancer progression. On this basis, we identify positions in non-coding RNAs and the non-coding parts of mRNAs that are both under purifying selection in the germline and protected from mutation in tumors, thus introducing a new strategy for future detection of cancer driver elements in the expressed non-coding genome.
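
The final selection step, keeping regions that score low on both the germline and the somatic signal, can be illustrated with a minimal sketch. This is not the authors' method: their two models are not specified in this abstract, so raw per-base SNP density and tumor mutation density stand in for the model outputs here. The `Region` fields, the `doubly_constrained` function, and the quantile cutoff `fraction` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A genomic region with hypothetical, pre-computed mutation counts."""
    name: str
    length: int            # region length in bases
    snp_count: int         # germline SNPs seen in the population (assumed input)
    tumor_mut_count: int   # somatic mutations seen across tumors (assumed input)


def density(count, length):
    """Mutations per base; a crude stand-in for a model score."""
    return count / length


def low_threshold(values, fraction):
    """Return the value sitting at the given lower fraction of the sorted list."""
    ordered = sorted(values)
    index = max(0, int(len(ordered) * fraction) - 1)
    return ordered[index]


def doubly_constrained(regions, fraction=0.25):
    """Names of regions that are low in BOTH germline and somatic density."""
    snp = [density(r.snp_count, r.length) for r in regions]
    som = [density(r.tumor_mut_count, r.length) for r in regions]
    snp_cut = low_threshold(snp, fraction)
    som_cut = low_threshold(som, fraction)
    return [
        r.name
        for r, s, m in zip(regions, snp, som)
        if s <= snp_cut and m <= som_cut
    ]


# Toy data, invented purely for illustration.
regions = [
    Region("A", 1000, 1, 1),
    Region("B", 1000, 10, 2),
    Region("C", 1000, 2, 20),
    Region("D", 1000, 20, 30),
]
selected = doubly_constrained(regions, fraction=0.5)
```

With this toy data only region "A" falls in the lower half of both densities, so it is the only one retained. A real analysis would replace the raw densities with the outputs of fitted germline and somatic models that account for genomic features.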

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Genome, Human / genetics*
  • Humans
  • Models, Genetic*
  • Mutation / genetics*
  • Neoplasms / genetics*
  • RNA, Untranslated / genetics*
  • Sequence Analysis, DNA

Substances

  • RNA, Untranslated

Grants and funding

This project was funded in part by "Plan Cancer – Systems Biology" grant #bio2014-04 to DG, SM and Amor. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.