Assessment of intratumoral heterogeneity with mutations and gene expression profiles

PLoS One. 2019 Jul 16;14(7):e0219682. doi: 10.1371/journal.pone.0219682. eCollection 2019.

Abstract

Intratumoral heterogeneity (ITH) refers to the presence of distinct tumor cell populations within a tumor. It provides vital information for the clinical prognosis, drug responsiveness, and personalized treatment of cancer patients. Because genomic ITH in various cancers affects gene expression patterns, the expression profile could be used to determine the level of ITH. Herein, we present a novel machine learning approach to detect high ITH, defined as a larger number of subclones, directly from the gene expression pattern. We examined associations between gene expression profile and ITH in 12 cancer types from The Cancer Genome Atlas (TCGA) database. Using stomach adenocarcinoma (STAD), which showed a strong association, we evaluated the performance of our method in predicting ITH by applying three machine learning algorithms to gene expression profile data. We classified tumors into high and low heterogeneity groups using models trained on features selected by LASSO. The results showed that support vector machines (SVMs) outperformed the other algorithms (AUC = 0.84 for SVMs and 0.82 for Naïve Bayes), and we were able to improve predictive power by combining mutation and expression data. Furthermore, we evaluated the predictive ability of each model using simulated data generated by mixing cell lines from the Cancer Cell Line Encyclopedia (CCLE), and obtained results consistent with those from the real dataset. Our approach could be used to discriminate tumors with heterogeneous cell populations and thereby characterize ITH.
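The simulation and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how cell lines were mixed, so the linear weighted average below is an assumption, and the function names (`mix_profiles`, `simulate_sample`, `roc_auc`) and the cell-line labels are hypothetical. The AUC is computed as the fraction of (high ITH, low ITH) pairs that a score ranks correctly, which is the standard rank-based definition.

```python
import random


def mix_profiles(profiles, proportions):
    """Return the weighted average of several expression profiles.

    profiles: list of equal-length lists (one expression value per gene).
    proportions: non-negative mixing weights; they are normalized to sum to 1.
    ASSUMPTION: mixing is linear in expression space.
    """
    if len(profiles) != len(proportions):
        raise ValueError("need one proportion per profile")
    total = sum(proportions)
    if total <= 0:
        raise ValueError("proportions must sum to a positive value")
    weights = [p / total for p in proportions]
    n_genes = len(profiles[0])
    return [
        sum(w * profile[g] for w, profile in zip(weights, profiles))
        for g in range(n_genes)
    ]


def simulate_sample(cell_lines, n_components, rng):
    """Mix n_components randomly chosen cell lines with random proportions.

    cell_lines: dict mapping a (hypothetical) cell-line name to its profile.
    Returns the mixed profile and the number of components, which stands in
    for the subclone count of a simulated tumor.
    """
    names = rng.sample(sorted(cell_lines), n_components)
    raw = [rng.random() + 0.05 for _ in names]  # avoid near-zero weights
    mixed = mix_profiles([cell_lines[name] for name in names], raw)
    return mixed, n_components


def roc_auc(labels, scores):
    """Rank-based AUC: P(score of a positive > score of a negative).

    labels: 1 for high ITH, 0 for low ITH. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy profiles for three hypothetical cell lines over four genes.
    toy_lines = {
        "line_A": [1.0, 0.0, 2.0, 5.0],
        "line_B": [0.0, 3.0, 2.0, 1.0],
        "line_C": [4.0, 1.0, 0.0, 2.0],
    }
    mixed, n_subclones = simulate_sample(toy_lines, 2, rng)
    print(n_subclones, [round(v, 3) for v in mixed])
```

In a fuller pipeline, the simulated profiles would be passed to a trained classifier and the resulting scores compared against the known component counts with `roc_auc`; the threshold separating high from low ITH would have to be chosen by the analyst.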

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenocarcinoma / genetics*
  • Algorithms
  • Area Under Curve
  • Bayes Theorem
  • Cell Line, Tumor
  • Computer Simulation
  • Databases, Factual
  • Gene Expression Profiling*
  • Gene Expression Regulation, Neoplastic
  • Genetic Heterogeneity
  • Genome, Human
  • Genomics
  • Humans
  • Mutation*
  • Prognosis
  • ROC Curve
  • Stomach Neoplasms / genetics*
  • Support Vector Machine
  • Transcriptome

Grants and funding

This research was supported by the Samsung Medical Center and by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future Planning (JGJ: 2017R1A2B1007347 and JGJ: 2018R1D1A1B07048531). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.