A two-layer classification framework for protein fold recognition

J Theor Biol. 2015 Jan 21:365:32-9. doi: 10.1016/j.jtbi.2014.09.032. Epub 2014 Sep 30.

Abstract

Protein fold recognition is an interesting problem in bioinformatics concerned with predicting the tertiary structure of proteins. In this paper, an individual method and a fusion method are proposed for protein fold recognition. A Two-Layer Classification Framework (TLCF) is proposed as the individual method. The framework comprises two layers. In the first layer, a classifier predicts the structural class of the protein, assigning each instance to one of four structural classes: all alpha, all beta, alpha/beta, and alpha+beta. The classification results are then added as a new feature to the training and testing datasets. In the second layer, another classifier uses the first-layer results to predict 27 folding classes. The results indicate that the proposed approach is effective at improving prediction accuracy, and the measured values of the Matthews correlation coefficient (MCC), specificity, and sensitivity are promising. TLCF* is proposed as a fusion method that uses TLCF as its base model. The experimental results indicate that the proposed methods improve prediction accuracy by 2-10% on a benchmark dataset.
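
The two-layer idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifiers, the feature set, or how the first-layer result is encoded. The sketch assumes a toy nearest-centroid classifier in both layers, a one-hot encoding of the predicted structural class, and first-layer predictions made on the training set itself. All class, function, and label names (NearestCentroid, TwoLayerClassifier, augment, "fold-1" and so on) are hypothetical.

```python
import math
from collections import defaultdict

# The four structural classes named in the abstract.
STRUCTURAL_CLASSES = ["all-alpha", "all-beta", "alpha/beta", "alpha+beta"]


class NearestCentroid:
    """Toy stand-in classifier (assumption): predicts the label of the closest class mean."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for row, label in zip(X, y):
            if label not in sums:
                sums[label] = [0.0] * len(row)
            sums[label] = [s + v for s, v in zip(sums[label], row)]
            counts[label] += 1
        self.centroids = {
            label: [s / counts[label] for s in total] for label, total in sums.items()
        }
        return self

    def predict(self, X):
        return [
            min(self.centroids, key=lambda label: math.dist(row, self.centroids[label]))
            for row in X
        ]


def augment(X, predicted_classes):
    """Append the predicted structural class to each feature vector (one-hot; an assumption)."""
    return [
        list(row) + [1.0 if cls == name else 0.0 for name in STRUCTURAL_CLASSES]
        for row, cls in zip(X, predicted_classes)
    ]


class TwoLayerClassifier:
    """Layer 1 predicts the structural class; layer 2 predicts the fold from augmented features."""

    def __init__(self, make_classifier=NearestCentroid):
        self.layer1 = make_classifier()
        self.layer2 = make_classifier()

    def fit(self, X, structural_classes, folds):
        self.layer1.fit(X, structural_classes)
        # The layer-1 *predictions* (not the true classes) become the new feature,
        # so training and testing data are augmented in the same way.
        predicted = self.layer1.predict(X)
        self.layer2.fit(augment(X, predicted), folds)
        return self

    def predict(self, X):
        predicted = self.layer1.predict(X)
        return self.layer2.predict(augment(X, predicted))


# Toy data: two made-up features per protein, four classes, eight hypothetical folds.
X_train = [[0, 0], [0, 1], [10, 0], [10, 1], [0, 10], [0, 11], [10, 10], [10, 11]]
class_train = ["all-alpha", "all-alpha", "all-beta", "all-beta",
               "alpha/beta", "alpha/beta", "alpha+beta", "alpha+beta"]
fold_train = ["fold-1", "fold-2", "fold-3", "fold-4",
              "fold-5", "fold-6", "fold-7", "fold-8"]

model = TwoLayerClassifier().fit(X_train, class_train, fold_train)
print(model.predict([[1, 0.2], [9, 10.8]]))
```

In a real setting the 27 fold labels and sequence-derived features would replace the toy data, and the first-layer feature for the training set might be produced by cross-validation rather than in-sample prediction to limit overfitting; the abstract does not say which variant was used.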

Keywords: Ensemble classifiers; Fusion system; Supervised learning.

MeSH terms

  • Databases, Protein*
  • Protein Folding*
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Proteins / genetics
  • Sequence Analysis, Protein / methods*

Substances

  • Proteins