Predicting Proteolysis in Complex Proteomes Using Deep Learning

Int J Mol Sci. 2021 Mar 17;22(6):3071. doi: 10.3390/ijms22063071.

Abstract

Both protease- and reactive oxygen species (ROS)-mediated proteolysis are thought to be key effectors of tissue remodeling. We have previously shown that comparison of amino acid composition can predict the differential susceptibilities of proteins to photo-oxidation. However, predicting protein susceptibility to endogenous proteases remains challenging. Here, we aim to develop bioinformatics tools to (i) predict cleavage site locations (and hence putative protein susceptibilities) and (ii) compare the predicted vulnerabilities of skin proteins to protease- and ROS-mediated proteolysis. The first goal of this study was to evaluate the ability of existing protease cleavage site prediction models (PROSPER and DeepCleave) to identify experimentally determined MMP9 cleavage sites in two purified proteins and in a complex human dermal fibroblast-derived extracellular matrix (ECM) proteome. We subsequently developed deep bidirectional recurrent neural network (BRNN) models to predict cleavage sites for 14 tissue proteases. The predictions of the new models were tested against experimental datasets and combined with amino acid composition analysis (to predict ultraviolet radiation (UVR)/ROS susceptibility) in a new web app: the Manchester proteome susceptibility calculator (MPSC). The BRNN models performed better in predicting cleavage sites in native dermal ECM proteins than existing models (DeepCleave and PROSPER), and application of MPSC to the skin proteome suggests that, compared with the elastic fiber network, fibrillar collagens may be susceptible primarily to protease-mediated proteolysis. We also identify additional putative targets of oxidative damage (dermatopontin, fibulins and defensins) and protease action (laminins and nidogen). MPSC has the potential to identify targets of proteolysis in disparate tissues and disease states.
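
The two kinds of sequence-derived input mentioned in the abstract, amino acid composition and candidate cleavage sites, can be illustrated with a minimal Python sketch. It is not the MPSC implementation. The function names, the residue set in OXIDATION_PRONE, the window width of four residues on each side of a bond, and the "-" padding character are all assumptions chosen for illustration; the abstract does not specify any of them.

```python
from collections import Counter
from typing import Dict, Iterator, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: an illustrative residue set only, not the set used by MPSC.
OXIDATION_PRONE = frozenset("CHMWY")


def composition(sequence: str) -> Dict[str, float]:
    """Return the fraction of each of the 20 standard amino acids."""
    seq = sequence.upper()
    counts = Counter(residue for residue in seq if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def oxidation_prone_fraction(sequence: str) -> float:
    """Summed fraction of the (assumed) oxidation-prone residues."""
    comp = composition(sequence)
    return sum(comp[aa] for aa in OXIDATION_PRONE)


def cleavage_windows(
    sequence: str, flank: int = 4, pad: str = "-"
) -> Iterator[Tuple[int, str]]:
    """Yield (bond, window) for every peptide bond in the sequence.

    Bond i lies between residues i and i + 1 (1-based). Each window holds
    `flank` residues on either side of the bond, padded at the termini, so
    a sequence model could score every bond from a fixed-width input.
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    for bond in range(1, len(seq)):
        # seq[bond - flank : bond + flank], shifted by the left padding
        yield bond, padded[bond : bond + 2 * flank]


if __name__ == "__main__":
    toy = "ACDEFGHIKL"  # toy sequence, not a real protein
    print(round(oxidation_prone_fraction(toy), 3))
    for bond, window in cleavage_windows(toy):
        print(bond, window)
```

In a pipeline of the kind the abstract describes, fixed-width windows like these would be encoded and passed to a trained classifier, while the composition values would feed a separate susceptibility comparison. Both downstream steps are outside the scope of this sketch.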

Keywords: aging; biomarkers; deep-learning; degradomics; extracellular matrix; machine learning; protease; skin.

MeSH terms

  • Amino Acids / metabolism
  • Deep Learning*
  • Extracellular Matrix Proteins / metabolism
  • Humans
  • Neural Networks, Computer
  • Peptide Hydrolases / metabolism
  • Proteolysis* / radiation effects
  • Proteome / metabolism*
  • Reactive Oxygen Species / metabolism
  • Reproducibility of Results
  • Software
  • Ultraviolet Rays

Substances

  • Amino Acids
  • Extracellular Matrix Proteins
  • Proteome
  • Reactive Oxygen Species
  • Peptide Hydrolases

Grants and funding