Predicting Structural Susceptibility of Proteins to Proteolytic Processing

Int J Mol Sci. 2023 Jun 28;24(13):10761. doi: 10.3390/ijms241310761.

Abstract

The importance of 3D protein structure in proteolytic processing is well known. However, although many methods exist for predicting proteolytic sites, only a few use the structural features of potential substrates as predictors. Moreover, to our knowledge, no method is currently available for predicting the structural susceptibility of protein regions to proteolysis. We developed such a method using data from CutDB, a database of experimentally verified proteolytic events. As predictors, we used structural features shown in earlier studies to influence proteolysis, such as solvent accessibility, secondary structure, and temperature factor. Additionally, we introduced new structural features, including the length of protruded loops and the flexibility of protein termini. To maximize prediction quality, we carefully curated the training set, selected an appropriate machine learning method, and sampled negative examples to determine the optimal positive-to-negative class size ratio. We demonstrated that combining our method with models of protease primary specificity can outperform existing bioinformatics methods for predicting proteolytic sites. We also discussed the possibility of using this method for the bioinformatics prediction of other post-translational modifications.
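The negative-sampling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sample_negatives`, the site labels, and the candidate ratios are hypothetical, and the abstract does not state how the ratios were evaluated. The sketch only shows how a training set with a chosen positive-to-negative class size ratio might be assembled, so that a classifier could then be trained and compared at each ratio.

```python
import random


def sample_negatives(positives, negatives, neg_per_pos, seed=0):
    """Build a labeled training set with about `neg_per_pos` negatives per positive.

    Hypothetical helper: positives are cleaved sites, negatives are sites with
    no recorded cleavage. Labels are 1 (positive) and 0 (negative).
    """
    rng = random.Random(seed)
    # Never request more negatives than are available.
    n_neg = min(len(negatives), round(len(positives) * neg_per_pos))
    sampled = rng.sample(negatives, n_neg)
    return [(site, 1) for site in positives] + [(site, 0) for site in sampled]


# Toy data standing in for real sites (assumed identifiers, not CutDB records).
positives = [f"cleaved_{i}" for i in range(10)]
negatives = [f"uncleaved_{i}" for i in range(200)]

# Assemble one training set per candidate ratio; in practice each set would be
# used to train a model whose validation performance decides the ratio.
for ratio in (1, 2, 5):
    train = sample_negatives(positives, negatives, ratio)
    n_neg = sum(1 for _, label in train if label == 0)
    print(f"1:{ratio} -> {len(positives)} positives, {n_neg} negatives")
```

Fixing the random seed keeps the sampled negatives reproducible across runs, which makes comparisons between ratios easier to interpret.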

Keywords: protease substrates; proteases; regulatory proteolysis; substrate identification.

MeSH terms

  • Peptide Hydrolases* / metabolism
  • Protein Processing, Post-Translational
  • Protein Structure, Secondary
  • Proteins* / chemistry
  • Proteolysis
  • Substrate Specificity

Substances

  • Proteins
  • Peptide Hydrolases