Computational prediction of protein folding rate using structural parameters and network centrality measures

Comput Biol Med. 2023 Mar:155:106436. doi: 10.1016/j.compbiomed.2022.106436. Epub 2023 Feb 15.

Abstract

Protein folding is a complex physicochemical process whereby a polymer of amino acids samples numerous conformations in its unfolded state before settling on an essentially unique native three-dimensional (3D) structure. To understand this process, several theoretical studies have used a set of 3D structures, identified different structural parameters, and analyzed their relationships using the natural logarithmic protein folding rate (ln(kf)). Unfortunately, these structural parameters are specific to a small set of proteins that are not capable of accurately predicting ln(kf) for both two-state (TS) and non-two-state (NTS) proteins. To overcome the limitations of the statistical approach, a few machine learning (ML)-based models have been proposed using limited training data. However, none of these methods can explain plausible folding mechanisms. In this study, we evaluated the predictive capabilities of ten different ML algorithms using eight different structural parameters and five different network centrality measures based on newly constructed datasets. In comparison to the other nine regressors, support vector machine was found to be the most appropriate for predicting ln(kf) with mean absolute differences of 1.856, 1.55, and 1.745 for the TS, NTS, and combined datasets, respectively. Furthermore, combining structural parameters and network centrality measures improves the prediction performance compared to individual parameters, indicating that multiple factors are involved in the folding process.

Keywords: Folding type; Logarithmic folding rate; Machine learning algorithms; Network centrality measures; Non-two-state; Structural parameters; Support vector machine; Two-state.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acids / chemistry
  • Models, Theoretical
  • Protein Folding*
  • Proteins* / chemistry

Substances

  • Proteins
  • Amino Acids