Identification of relevant genetic alterations in cancer using topological data analysis

Nat Commun. 2020 Jul 30;11(1):3808. doi: 10.1038/s41467-020-17659-7.

Abstract

Large-scale cancer genomic studies enable the systematic identification of mutations that drive the genesis and progression of tumors, uncovering the underlying molecular mechanisms and potential therapies. While some of these mutations recur across many tumors, many others occur in only a few samples, which precludes their detection by conventional recurrence-based statistical approaches. Integrated analysis of somatic mutations and RNA expression data across 12 tumor types reveals that mutations of cancer genes are usually accompanied by substantial changes in expression. We use topological data analysis to leverage this observation and uncover 38 elusive candidate cancer-associated genes, including inactivating mutations of the metalloproteinase ADAMTS12 in lung adenocarcinoma. We show that ADAMTS12-/- mice have a five-fold increase in susceptibility to developing lung tumors, confirming the role of ADAMTS12 as a tumor suppressor gene. Our results demonstrate that data integration through topological techniques can increase our ability to identify previously unreported cancer-related alterations.
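The core observation (mutations of cancer genes tend to coincide with expression changes) can be illustrated with a minimal sketch. This is not the authors' pipeline: it swaps the topological representation for a plain k-nearest-neighbour graph over expression profiles, and tests whether the tumors carrying a given mutation sit close together on that graph using a permutation test. All function names (knn_graph, localization, permutation_p_value) and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def knn_graph(points, k):
    """Undirected edges (i, j), i < j, linking each point to its k nearest neighbours."""
    edges = set()
    for i, p in enumerate(points):
        # Sort the other points by distance; ties are broken by index.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def localization(edges, mutated):
    """Number of graph edges whose two endpoints are both mutated samples."""
    return sum(1 for i, j in edges if i in mutated and j in mutated)


def permutation_p_value(edges, mutated, n_samples, n_perm=2000, seed=0):
    """Share of random label assignments that are at least as localized as the observed one."""
    rng = random.Random(seed)
    observed = localization(edges, mutated)
    hits = 0
    for _ in range(n_perm):
        shuffled = set(rng.sample(range(n_samples), len(mutated)))
        if localization(edges, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy "expression space" (invented data): 30 background tumors on a grid,
# plus 8 tumors whose expression is shifted far away from the rest.
background = [(float(x), float(y)) for x in range(6) for y in range(5)]
shifted = [(100.0 + i, 100.0) for i in range(8)]
expression = background + shifted
edges = knn_graph(expression, k=3)

# Case 1: the mutation is carried exactly by the shifted tumors.
clustered = set(range(30, 38))
# Case 2: the mutation is carried by background tumors that are not neighbours.
scattered = {0, 2, 4, 10, 12, 14, 20, 22}

p_clustered = permutation_p_value(edges, clustered, len(expression))
p_scattered = permutation_p_value(edges, scattered, len(expression))
print(p_clustered, p_scattered)
```

In this toy setting, a mutation whose carriers share a distinct expression profile receives a small p-value even though it affects only 8 of 38 samples, whereas a mutation scattered across unrelated profiles does not. A recurrence-based test would see the same mutation count in both cases.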

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • ADAMTS Proteins / genetics*
  • Adenocarcinoma of Lung / genetics*
  • Animals
  • Cell Line, Tumor
  • Computational Biology / methods
  • Data Analysis
  • Genetic Predisposition to Disease / genetics*
  • Lung Neoplasms / genetics*
  • Mice
  • Mice, Inbred C57BL
  • Mice, Knockout
  • Mutation / genetics
  • Neoplasm Recurrence, Local / genetics
  • Oncogenes / genetics

Substances

  • ADAMTS Proteins
  • Adamts12 protein, mouse