A two-layer classification framework for protein fold recognition

Reza Zohouri Aram; Nasrollah Moghadam Charkari

doi:10.1016/j.jtbi.2014.09.032

J Theor Biol. 2015 Jan 21:365:32-9. doi: 10.1016/j.jtbi.2014.09.032. Epub 2014 Sep 30.

Authors

Reza Zohouri Aram¹, Nasrollah Moghadam Charkari²

Affiliations

¹ Faculty of Electrical & Computer Engineering, University of Tarbiat Modares, Tehran, Iran. Electronic address: r.zohori@modares.ac.ir.
² Faculty of Electrical & Computer Engineering, University of Tarbiat Modares, Tehran, Iran. Electronic address: charkari@modares.ac.ir.

PMID: 25277719

Abstract

Protein fold recognition is one of the interesting problems in bioinformatics, aimed at predicting the tertiary structure of proteins. In this paper, an individual method and a fusion method are proposed for protein fold recognition. A Two-Layer Classification Framework (TLCF) is proposed as the individual method. The framework comprises two layers. In the first layer, a classifier predicts the structural class of the protein, assigning each instance to one of four structural classes: all alpha, all beta, alpha/beta, and alpha+beta. The classification results are then added as a new feature to the training and testing datasets. Using these first-layer results, a second classifier predicts the 27 folding classes in the second layer. The results indicate that the proposed approach is effective in improving prediction accuracy, and the measured values of MCC, specificity, and sensitivity are promising. TLCF* is proposed as a fusion method that uses TLCF as its base model. The experimental results indicate that the proposed methods improve prediction accuracy by 2-10% on a benchmark dataset.
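
The two-layer idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the classifiers, the features, or how the first-layer result is encoded, so the nearest-centroid classifier, the one-hot encoding, the class names TwoLayerClassifier and NearestCentroid, the fold labels, and the toy 2-D data are all assumptions made for the example.

```python
from collections import defaultdict
from math import dist

# The four structural classes named in the abstract.
STRUCTURAL_CLASSES = ["all-alpha", "all-beta", "alpha/beta", "alpha+beta"]


class NearestCentroid:
    """Tiny stand-in classifier (assumption: the paper's classifiers are not given here)."""

    def fit(self, X, y):
        sums = {}
        counts = defaultdict(int)
        for row, label in zip(X, y):
            if label not in sums:
                sums[label] = [0.0] * len(row)
            sums[label] = [s + v for s, v in zip(sums[label], row)]
            counts[label] += 1
        # One mean vector per label.
        self.centroids_ = {
            label: [s / counts[label] for s in vec] for label, vec in sums.items()
        }
        return self

    def predict(self, X):
        # Assign each row to the label whose centroid is closest (Euclidean distance).
        return [
            min(self.centroids_, key=lambda label: dist(row, self.centroids_[label]))
            for row in X
        ]


def one_hot(structural_class):
    """Encode a structural class as four 0/1 features (the encoding is an assumption)."""
    return [1.0 if structural_class == c else 0.0 for c in STRUCTURAL_CLASSES]


class TwoLayerClassifier:
    """Layer 1 predicts the structural class; layer 2 predicts the fold
    from the original features plus the layer-1 prediction."""

    def fit(self, X, y_class, y_fold):
        self.layer1_ = NearestCentroid().fit(X, y_class)
        X_aug = self._augment(X)
        self.layer2_ = NearestCentroid().fit(X_aug, y_fold)
        return self

    def predict(self, X):
        return self.layer2_.predict(self._augment(X))

    def _augment(self, X):
        # Append the predicted structural class as new features.
        predicted = self.layer1_.predict(X)
        return [list(row) + one_hot(c) for row, c in zip(X, predicted)]


# Toy data (hypothetical): two samples per structural class, one made-up fold per class.
X_train = [
    [0.0, 0.0], [0.1, 0.0],
    [0.0, 1.0], [0.1, 1.0],
    [1.0, 0.0], [1.0, 0.1],
    [1.0, 1.0], [1.0, 0.9],
]
y_class = ["all-alpha"] * 2 + ["all-beta"] * 2 + ["alpha/beta"] * 2 + ["alpha+beta"] * 2
y_fold = ["fold_a"] * 2 + ["fold_b"] * 2 + ["fold_c"] * 2 + ["fold_d"] * 2

model = TwoLayerClassifier().fit(X_train, y_class, y_fold)
print(model.predict([[0.05, 0.0], [0.95, 1.0]]))
```

In this sketch the second layer is trained on the first layer's predicted classes rather than the true ones, which follows the abstract's wording that the classification results are added to both the training and testing datasets. A real setup would use protein-derived feature vectors and 27 fold labels in place of the toy values.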

Keywords: Ensemble classifiers; Fusion system; Supervised learning.

MeSH terms

Databases, Protein*
Protein Folding*
Protein Structure, Secondary
Proteins / chemistry*
Proteins / genetics
Sequence Analysis, Protein / methods*

Substances

Proteins