Machine Learning Classification and Structure-Functional Analysis of Cancer Mutations Reveal Unique Dynamic and Network Signatures of Driver Sites in Oncogenes and Tumor Suppressor Genes

J Chem Inf Model. 2018 Oct 22;58(10):2131-2150. doi: 10.1021/acs.jcim.8b00414. Epub 2018 Oct 3.

Abstract

In this study, we developed two cancer-specific machine learning classifiers for predicting driver mutations in cancer-associated genes; the classifiers were validated on canonical data sets of functionally validated mutations and applied to a large cancer genomics data set. By examining sequence, structure, and ensemble-based integrated features, we have shown that evolutionary conservation scores play a critical role in classification of cancer drivers and provide the strongest signal in the machine learning prediction. Through extensive comparative analysis with structure-functional experiments and multicenter mutational calling data from Pan Cancer Atlas studies, we have demonstrated the robustness of our models and addressed the validity of computational predictions. To address the interpretability of cancer-specific classification models and obtain novel insights about molecular signatures of driver mutations, we have complemented machine learning predictions with structure-functional analysis of cancer driver mutations in several important oncogenes and tumor suppressor genes. By examining structural and dynamic signatures of known mutational hotspots and the predicted driver mutations, we have shown that the greater flexibility of specific functional regions targeted by driver mutations in oncogenes may facilitate activating conformational changes, whereas loss-of-function driver mutations in tumor suppressor genes can preferentially target structurally rigid positions that mediate allosteric communications in residue interaction networks and modulate protein binding interfaces. By revealing molecular signatures of cancer driver mutations, our results highlighted limitations of the binary driver/passenger classification, suggesting that functionally relevant cancer mutations may span a continuum of driver-like effects. Based on this analysis, we propose for experimental testing a group of novel potential driver mutations that can act by altering structure, global dynamics, and allosteric interaction networks in important cancer genes.

MeSH terms

  • Gene Expression Regulation, Neoplastic
  • Genes, Tumor Suppressor*
  • Humans
  • Machine Learning*
  • Mutation
  • Neoplasms / genetics*
  • Oncogenes / genetics*