Prediction of enzyme classes from 3D structure: a general model and examples of experimental-theoretic scoring of peptide mass fingerprints of Leishmania proteins

J Proteome Res. 2009 Sep;8(9):4372-82. doi: 10.1021/pr9003163.

Abstract

The number of protein and peptide structures deposited in the Protein Data Bank (PDB) and GenBank without functional annotation has increased. Consequently, there is a high demand for theoretical models to predict these functions. Here, we trained and validated, with an external set, a Markov Chain Model (MCM) that classifies proteins by their possible mechanism of action according to their Enzyme Commission (EC) number. The proposed methodology is essentially new and enables prediction of all EC classes with a single equation, without the need for a separate equation for each class or for nonlinear models with multiple outputs. In addition, the model may be used to predict whether a given peptide makes a positive or negative contribution to the activity of a given EC class. The model correctly predicts the first EC number for 106 out of 151 (70.2%) oxidoreductases, 178/178 (100%) transferases, 223/223 (100%) hydrolases, 64/85 (75.3%) lyases, 74/74 (100%) isomerases, and 100/100 (100%) ligases, as well as 745/811 (91.9%) nonenzymes. This method may help predict new enzyme proteins or select peptide candidates that improve enzyme activity, which may be of interest for the prediction of new drugs or drug targets. To illustrate the model's application, we report the two-dimensional electrophoresis (2DE) isolation of a new protein from Leishmania infantum, as well as its MALDI-TOF mass spectrometry characterization and a theoretical study of the Peptide Mass Fingerprint (PMF) of the new protein sequence. The theoretical study focused on MASCOT, BLAST alignment, and alignment-free QSAR prediction of the contribution of 29 peptides found in the PMF of the new protein to specific enzyme action. This combined strategy may be used to identify and predict peptides of prokaryotic and eukaryotic parasites and their hosts, as well as of other higher organisms, which may be of interest in drug development or target identification.
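The per-class figures quoted in the abstract can be checked with simple arithmetic. The sketch below is not part of the published work; it only recomputes the percentages from the correct/total counts stated above. The dictionary layout and the names `counts`, `accuracy`, and `per_class` are illustrative assumptions, and the pooled figures are derived here rather than reported in the abstract.

```python
# Correct / total counts exactly as reported in the abstract.
# The data structure and all names are illustrative assumptions.
counts = {
    "oxidoreductases": (106, 151),
    "transferases": (178, 178),
    "hydrolases": (223, 223),
    "lyases": (64, 85),
    "isomerases": (74, 74),
    "ligases": (100, 100),
    "nonenzymes": (745, 811),
}


def accuracy(correct: int, total: int) -> float:
    """Return the percentage of correctly classified cases."""
    return 100.0 * correct / total


# Per-class accuracy, rounded to one decimal as in the abstract.
per_class = {name: round(accuracy(c, t), 1) for name, (c, t) in counts.items()}

# Pooled counts over the six enzyme classes (derived, not reported).
enzyme_correct = sum(c for name, (c, _) in counts.items() if name != "nonenzymes")
enzyme_total = sum(t for name, (_, t) in counts.items() if name != "nonenzymes")

# Pooled counts over all cases, enzymes and nonenzymes together (derived).
overall_correct = sum(c for c, _ in counts.values())
overall_total = sum(t for _, t in counts.values())

if __name__ == "__main__":
    for name, pct in per_class.items():
        print(f"{name:16s} {pct:5.1f}%")
    print(f"enzymes pooled   {accuracy(enzyme_correct, enzyme_total):5.1f}%")
    print(f"all cases pooled {accuracy(overall_correct, overall_total):5.1f}%")
```

The recomputed per-class values agree with the percentages given in the abstract. The pooled values are a by-product of the same counts and should be read as a consistency check rather than as a result claimed by the authors.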

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Databases, Protein
  • Discriminant Analysis
  • Electrophoresis, Gel, Two-Dimensional
  • Enzymes / chemistry
  • Enzymes / classification*
  • Leishmania infantum / enzymology*
  • Models, Molecular
  • Peptide Mapping
  • Protein Conformation
  • Protozoan Proteins / chemistry
  • Protozoan Proteins / classification*
  • Quantitative Structure-Activity Relationship
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization

Substances

  • Enzymes
  • Protozoan Proteins