Shared Cancer Dataset Analysis Identifies and Predicts the Quantitative Effects of Pan-Cancer Somatic Driver Variants

Cancer Res. 2023 Jan 4;83(1):74-88. doi: 10.1158/0008-5472.CAN-22-1038.

Abstract

Driver mutations endow tumors with selective advantages and produce an array of pathogenic effects. Determining the function of somatic variants is important for understanding cancer biology and identifying optimal therapies. Here, we compiled a shared dataset from several cancer genomic databases. Two measures were applied to 535 cancer genes based on observed and expected frequencies of driver variants as derived from cancer-specific rates of somatic mutagenesis. The first measure comprised a binary classifier based on a binomial test; the second was tumor variant amplitude (TVA), a continuous measure representing the selective advantage of individual variants. TVA outperformed all other computational tools in terms of its correlation with experimentally derived functional scores of cancer mutations. TVA was also highly correlated with drug response, overall survival, and other clinical implications in relevant cancer genes. This study demonstrates how a selective advantage measure based on a large cancer dataset significantly impacts our understanding of the spectral effect of driver variants in cancer. The impact of this information will increase as cancer treatment becomes more precise and personalized to tumor-specific mutations.

Significance: A new selective advantage estimation assists in oncogenic driver identification and relative effect measurements, enabling better prognostication, therapy selection, and prioritization.
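
The binary classifier described in the abstract compares observed and expected variant frequencies with a binomial test. The sketch below illustrates that general idea only: a one-sided binomial test asking whether a variant is seen in more tumors than a background mutation rate would predict. It is not the authors' implementation. The function names, the 0.05 threshold, and the example counts and rate are all hypothetical, and the abstract does not specify how expected frequencies or the test itself were set up.

```python
from math import exp, lgamma, log, log1p


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Each term is computed in log space to avoid overflow for large n.
    """
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p)
            + (n - i) * log1p(-p)
        )
        total += exp(log_pmf)
    return min(total, 1.0)


def is_candidate_driver(observed, n_tumors, expected_rate, alpha=0.05):
    """Flag a variant whose observed count exceeds the background expectation.

    observed      -- tumors carrying the variant (hypothetical input)
    n_tumors      -- tumors sequenced for this cancer type
    expected_rate -- assumed per-tumor probability of the variant arising
                     from background mutagenesis alone
    alpha         -- illustrative significance threshold, not from the source
    """
    p_value = binomial_upper_tail(observed, n_tumors, expected_rate)
    return p_value < alpha, p_value


if __name__ == "__main__":
    # Made-up numbers: 12 carriers among 500 tumors, where background
    # mutagenesis alone would predict about 2 (500 * 0.004).
    flagged, p_value = is_candidate_driver(12, 500, 0.004)
    print(f"flagged={flagged}, p={p_value:.3g}")
```

In practice, testing many variants across 535 genes would call for a multiple-testing correction on the resulting p-values, which this sketch leaves out.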

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology*
  • Humans
  • Mutation
  • Neoplasms* / genetics
  • Oncogenes