Predicting the Effect of Single and Multiple Mutations on Protein Structural Stability

Molecules. 2018 Jan 27;23(2):251. doi: 10.3390/molecules23020251.

Abstract

Predicting how a point mutation alters a protein's stability can guide drug design efforts that aim to counter the effects of serious diseases. Mutagenesis studies on physical proteins give insight into the effects of amino acid substitutions, but such wet-lab work is often prohibitive because of the time and financial resources needed to assess even a single substitution. Computational methods for predicting the effect of a mutation on a protein structure can complement wet-lab work, and a variety of approaches report promising accuracy. In this work we compare several machine learning methods and assess their ability to predict the effects of single and double mutations. We generate mutant protein structures in silico and compute several rigidity metrics for each. These metrics serve as features for our Support Vector Regression (SVR), Random Forest (RF), and Deep Neural Network (DNN) models. We validate the predictions for our in silico mutations against experimental ΔΔG stability data and attain Pearson correlation values upwards of 0.71 for single mutations and 0.81 for double mutations. We also perform ablation studies to assess which features contribute most to a model's success, and we introduce a voting scheme that synthesizes a single prediction from the individual predictions of the three models.
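The validation metric and the voting step described above can be illustrated with a minimal sketch. The abstract does not say how the three models' outputs are combined, so the median rule used here is an assumption, not the authors' scheme. The function names `pearson` and `vote` are hypothetical, and the ΔΔG numbers are made-up placeholders rather than data from the paper.

```python
from math import sqrt
from statistics import mean, median


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def vote(svr_pred, rf_pred, dnn_pred):
    """Combine per-mutation predictions from three models.

    Assumption: the combined value is the median of the three predictions.
    The source only states that a voting scheme exists, not its rule.
    """
    return [median(triple) for triple in zip(svr_pred, rf_pred, dnn_pred)]


# Placeholder values for five hypothetical mutations (kcal/mol), not real data.
experimental_ddg = [-1.2, 0.3, -2.5, 0.8, -0.4]
svr_pred = [-1.0, 0.1, -2.0, 0.5, -0.9]
rf_pred = [-1.5, 0.4, -2.2, 1.0, 0.0]
dnn_pred = [-0.8, 0.6, -3.0, 0.7, -0.5]

combined = vote(svr_pred, rf_pred, dnn_pred)
print("combined predictions:", combined)
print("Pearson r vs. experiment:", pearson(combined, experimental_ddg))
```

A median is used in this sketch because it is insensitive to a single outlying model; a mean or a weighted average would be an equally plausible reading of "voting scheme".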

Keywords: DNN; RF; SVR; machine learning; protein mutational study; rigidity analysis.

MeSH terms

  • Amino Acid Substitution
  • Computer Simulation
  • Decision Trees*
  • Mutation*
  • Neural Networks, Computer*
  • Protein Conformation
  • Protein Stability
  • Proteins / chemistry*
  • Support Vector Machine*
  • Thermodynamics

Substances

  • Proteins