Intelligent Consensus Modeling for Proline Cis-Trans Isomerization Prediction

IEEE/ACM Trans Comput Biol Bioinform. 2014 Jan-Feb;11(1):26-32. doi: 10.1109/TCBB.2013.132.

Abstract

Proline cis-trans isomerization (CTI) plays a key role in the rate-determining steps of protein folding. Accurate prediction of proline CTI is therefore important for understanding protein folding, splicing, cell signaling, and transmembrane active transport in both humans and animals. Our goal is to develop a state-of-the-art proline CTI predictor based on biophysically motivated intelligent consensus modeling that uses sequence information only (i.e., position-specific scores generated by PSI-BLAST). Current computational proline CTI predictors reach about 70-73 percent Q2 accuracy and a Matthews correlation coefficient (Mcc) of about 0.40, using sequence-based evolutionary information together with predicted protein secondary structure information. In contrast, our approach, which uses a novel decision tree-based consensus model with a powerful randomized meta-learning technique, achieves 86.58 percent Q2 accuracy and 0.74 Mcc on the same proline CTI data set, a better result than that of any existing computational proline CTI predictor reported in the literature.
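
For readers unfamiliar with the two reported metrics, the following is a minimal Python sketch of how two-state accuracy (Q2) and the Matthews correlation coefficient are commonly computed from a binary confusion matrix. It is not the authors' evaluation code: the function names, the choice of cis as the positive class, and the example counts are all hypothetical and serve only to illustrate the formulas.

```python
import math


def q2_accuracy(tp, tn, fp, fn):
    """Two-state accuracy: fraction of residues assigned to the correct class."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal sum is zero, a common convention for the
    otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts (assumption: cis = positive class, trans = negative class).
tp, tn, fp, fn = 40, 45, 5, 10
q2 = q2_accuracy(tp, tn, fp, fn)
mcc = matthews_cc(tp, tn, fp, fn)
print(f"Q2 = {q2:.4f}, Mcc = {mcc:.4f}")
```

Unlike Q2, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it stays informative when one class is much rarer than the other, which is presumably why the abstract reports both values.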

MeSH terms

  • Computational Biology / methods*
  • Decision Trees
  • Isomerism
  • Machine Learning*
  • Proline / chemistry*
  • Proline / metabolism
  • Protein Folding
  • Reproducibility of Results
  • Sequence Analysis, Protein / methods*

Substances

  • Proline