PROSPER: an integrated feature-based tool for predicting protease substrate cleavage sites

Jiangning Song; Hao Tan; Andrew J Perry; Tatsuya Akutsu; Geoffrey I Webb; James C Whisstock; Robert N Pike

doi:10.1371/journal.pone.0050300

PLoS One. 2012;7(11):e50300. doi: 10.1371/journal.pone.0050300. Epub 2012 Nov 29.

Authors

Jiangning Song¹, Hao Tan, Andrew J Perry, Tatsuya Akutsu, Geoffrey I Webb, James C Whisstock, Robert N Pike

Affiliation

¹ Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia. Jiangning.Song@monash.edu

Abstract

The ability to catalytically cleave protein substrates after synthesis is fundamental for all forms of life. Accordingly, site-specific proteolysis is one of the most important post-translational modifications. The key to understanding the physiological role of a protease is to identify its natural substrate(s). Knowledge of the substrate specificity of a protease can dramatically improve our ability to predict its target protein substrates, but this information must be used effectively if protein substrates are to be identified efficiently by in silico approaches. To address this problem, we present PROSPER, an integrated feature-based server for in silico identification of protease substrates and their cleavage sites for twenty-four different proteases. PROSPER combines established specificity information for these proteases (derived from the MEROPS database) with a machine learning approach to predict protease cleavage sites from different but complementary sequence and structure characteristics. Features used by PROSPER include local amino acid sequence profile, predicted secondary structure, solvent accessibility and predicted native disorder. Thus, for proteases with known amino acid specificity, PROSPER provides a convenient, ready-made tool for identifying their protein substrates. Systematic prediction analysis for the twenty-four proteases included in the database so far showed that these features substantially improve cleavage site prediction, as measured by the identification of known cleavage sites in substrates of these enzymes. In comparison with two state-of-the-art prediction tools, PoPS and SitePrediction, PROSPER achieves greater accuracy and coverage. To our knowledge, PROSPER is the first comprehensive server capable of predicting cleavage sites of multiple proteases within a single substrate sequence using machine learning techniques. It is freely available at http://lightning.med.monash.edu.au/PROSPER/.
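As an illustration of what a "local amino acid sequence" feature can look like, the following minimal Python sketch enumerates every peptide bond in a substrate sequence as a candidate cleavage site and encodes the residues around it. It is not PROSPER's implementation: the eight-residue P4-P4' window, the `X` padding character, the one-hot encoding and the function names `candidate_windows` and `one_hot` are all assumptions made for this example. The abstract does not specify the window size or encoding, and the structural features (predicted secondary structure, solvent accessibility, native disorder) are not modelled here.

```python
# Hypothetical sketch: sliding-window sequence features for candidate cleavage sites.
# Window size, padding symbol and encoding are assumptions, not PROSPER's settings.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def candidate_windows(sequence, flank=4, pad="X"):
    """Yield (p1_position, window) for every peptide bond in the sequence.

    p1_position is the 1-based index of the residue immediately before the
    bond (P1). The window holds `flank` residues on each side of the bond,
    padded with `pad` where it runs past either end of the sequence.
    """
    padded = pad * flank + sequence + pad * flank
    for p1 in range(1, len(sequence)):
        # In the padded string, the window for the bond after residue p1
        # starts at index p1 and spans 2 * flank characters.
        yield p1, padded[p1:p1 + 2 * flank]


def one_hot(window):
    """Encode a window as a flat 0/1 vector with 20 entries per position.

    Padding or non-standard residues become an all-zero block.
    """
    vector = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


if __name__ == "__main__":
    example = "ACDEFGHIKL"  # arbitrary toy sequence, not a real substrate
    for p1, window in candidate_windows(example):
        print(p1, window, sum(one_hot(window)))
```

Vectors of this kind could be passed to any supervised classifier trained on known cleaved and non-cleaved sites. Which learning method and which features to use for a given protease is a design choice the abstract leaves to the full paper.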

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Animals
Artificial Intelligence
Catalysis
Cattle
Computational Biology / methods
Granzymes / chemistry
Humans
Hydrolysis
Mice
Models, Statistical
Peptide Hydrolases / chemistry*
Peptides / chemistry
Protein Binding
Protein Conformation
Protein Processing, Post-Translational
Proteins / chemistry*
ROC Curve
Software
Solvents / chemistry
Substrate Specificity

Substances

Peptides
Proteins
Solvents
Peptide Hydrolases
Granzymes

Grants and funding

This work was supported by grants from the National Health and Medical Research Council of Australia (NHMRC) (490989), the Australian Research Council (ARC) (LP110200333), the Chinese Academy of Sciences (CAS), the Japan Society for the Promotion of Science (S11156), the Knowledge Innovation Program of CAS (KSCX2-EW-G-8) and Tianjin Municipal Science & Technology Commission (10ZCKFSY05600). JS is an NHMRC Peter Doherty Fellow and a Recipient of the Hundred Talents Program of CAS. AJP is an NHMRC Peter Doherty Fellow. JCW is an ARC Federation Fellow and an honorary NHMRC Principal Research Fellow. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.