Personalized disease signatures through information-theoretic compaction of big cancer data

Proc Natl Acad Sci U S A. 2018 Jul 24;115(30):7694-7699. doi: 10.1073/pnas.1804214115. Epub 2018 Jul 5.

Abstract

Every individual cancer develops and grows in its own specific way, giving rise to a recognized need for the development of personalized cancer diagnostics. This has suggested that identifying patient-specific oncogene markers would be an effective diagnostic approach. However, tumors that are classified as similar according to the expression levels of certain oncogenes can eventually demonstrate divergent responses to treatment. This implies that the information gained from the identification of tumor-specific biomarkers is still not sufficient. We present a method to quantitatively transform heterogeneous big cancer data into patient-specific transcription networks. These networks characterize the unbalanced molecular processes that drive the tissue away from the normal state. We study a number of datasets spanning five different cancer types, aiming to capture the extensive interpatient heterogeneity that exists within a specific cancer type as well as between cancers of different origins. We show that a relatively small number of altered molecular processes suffices to accurately characterize over 500 tumors, showing extreme compaction of the data. Every patient is characterized by a small specific subset of unbalanced processes. We validate the result by verifying that the identified processes also characterize other cancer patients. We show that different patients may display similar oncogene expression levels yet carry biologically distinct tumors that harbor different sets of unbalanced molecular processes. Thus, tumors may be inaccurately classified and addressed as similar. These findings highlight the need to expand the notion of tumor-specific oncogenic biomarkers to patient-specific, comprehensive transcriptional networks for improved patient-tailored diagnostics.
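
The abstract does not spell out the computation, but the keywords name surprisal analysis, which is commonly implemented as a singular value decomposition of log-transformed expression data. The following is a minimal sketch of that general idea, not the authors' pipeline. The function name `surprisal_sketch`, the parameter `n_processes`, and the convention of treating the leading component as the balanced (reference) state are all illustrative assumptions, and the sign convention of the usual surprisal expansion is absorbed into the returned factors.

```python
import numpy as np

def surprisal_sketch(expr, n_processes):
    """Approximate log-expression with a few components (hypothetical helper).

    expr: genes x patients array of strictly positive expression levels.
    Returns (G, lam) such that log(expr) is approximated by G @ lam, where
    G holds per-gene weights and lam holds per-patient amplitudes.
    Assumption: component 0 plays the role of the balanced reference state,
    and the remaining n_processes components stand in for unbalanced processes.
    """
    log_x = np.log(expr)
    # Thin SVD: log_x = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(log_x, full_matrices=False)
    k = n_processes + 1  # reference component plus retained processes
    G = u[:, :k]                      # gene weights, one column per component
    lam = s[:k, None] * vt[:k]        # patient amplitudes, one row per component
    return G, lam
```

In this sketch, the row `lam[a]` gives the amplitude of component `a` in each patient, so a patient-specific signature would be the subset of components whose amplitudes differ appreciably from zero for that patient. How such a threshold is chosen is not stated in the abstract and is left open here.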

Keywords: cancer diagnostics; information theory; intertumor heterogeneity; patient-specific gene expression signatures; surprisal analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Genetic*
  • Gene Expression Regulation, Neoplastic*
  • Gene Regulatory Networks*
  • Humans
  • Neoplasms* / classification
  • Neoplasms* / genetics
  • Neoplasms* / metabolism
  • Patient-Specific Modeling*
  • Transcriptome*