Convolution Neural Network-Based Prediction of Protein Thermostability

J Chem Inf Model. 2019 Nov 25;59(11):4833-4843. doi: 10.1021/acs.jcim.9b00220. Epub 2019 Oct 28.

Abstract

Most natural proteins exhibit poor thermostability, which limits their industrial application. Computer-aided rational design is an efficient, purpose-oriented method for improving protein thermostability. Numerous machine-learning-based methods have been designed to predict the changes in protein thermostability induced by mutations. However, all of these methods are limited because existing mutation coding methods overlook protein sequence features. Here we propose a method to predict protein thermostability using convolutional neural networks, based on an in-depth study of thermostability-related protein properties. The method comprises a three-dimensional coding algorithm that includes protein mutation information, and a strategy to extract neighboring features at protein mutation sites based on multiscale convolution. The accuracies on the S1615 and S388 data sets, which are widely used for protein thermostability prediction, reached 86.4% and 87%, respectively. The Matthews correlation coefficient was nearly double those produced using other methods. Furthermore, a model was constructed to predict the thermostability of Rhizomucor miehei lipase mutants based on the S3661 data set, a single amino acid mutation data set screened from the ProTherm protein thermodynamics database. Compared with the RIF strategy, which consists of three algorithms (Rosetta ddg_monomer, I-Mutant 3.0, and FoldX), the proposed method achieved higher accuracy (75.0% vs 66.7%) and simultaneously improved the resolution of negative samples. These results indicate that our prediction method more effectively assessed protein thermostability and distinguished its features, making it a powerful tool for devising mutations that enhance the thermostability of proteins, particularly enzymes.
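The abstract reports both accuracy and the Matthews correlation coefficient (MCC). As a rough illustration of why the two can diverge on imbalanced mutation data, the sketch below computes both from binary labels using the standard definitions. The label convention (1 = stabilizing, 0 = destabilizing), the function names, and the toy labels are assumptions made for illustration; they are not taken from the paper or its data sets.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels.

    Assumed convention (not from the paper): 1 = stabilizing, 0 = destabilizing.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that match the true label."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy labels: 2 stabilizing and 6 destabilizing mutations.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(tp, tn, fp, fn))
print("MCC:", round(mcc(tp, tn, fp, fn), 3))
```

On this toy example the accuracy is 0.75 while the MCC is only about 0.33, because the MCC penalizes errors on the minority class more heavily. This is one reason an MCC comparison can be more informative than accuracy alone when stabilizing mutations are rare.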

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Humans
  • Models, Chemical
  • Models, Molecular
  • Neural Networks, Computer
  • Point Mutation
  • Protein Stability
  • Proteins / chemistry*
  • Proteins / genetics
  • Temperature
  • Thermodynamics

Substances

  • Proteins