Enhanced Protein Fold Prediction Method Through a Novel Feature Extraction Technique

IEEE Trans Nanobioscience. 2015 Sep;14(6):649-59. doi: 10.1109/TNB.2015.2450233.

Abstract

Information about protein 3-dimensional (3D) structures plays an essential role in molecular biology, cell biology, biomedicine, and drug design. Protein fold prediction is considered an immediate step toward deciphering protein 3D structures, and is therefore one of the fundamental problems in structural bioinformatics. Numerous taxonomic methods have recently been developed for protein fold prediction. Although much progress has been made, the overall prediction accuracies achieved by existing taxonomic methods remain unsatisfactory. To address this problem, we propose a novel taxonomic method, called PFPA, which feeds a novel feature set into an ensemble classifier. Specifically, the sequential evolution information from PSI-BLAST profiles and the local and global secondary structure information from PSI-PRED profiles are combined to construct a comprehensive feature set. Experimental results demonstrate that PFPA outperforms state-of-the-art predictors. When tested on the independent testing set of a benchmark dataset, PFPA achieves an overall accuracy of 73.6%, the highest accuracy reported to date. Moreover, PFPA performs well, without significant performance degradation, on three updated large-scale datasets, indicating its robustness and generalization ability. A webserver that implements PFPA is currently freely available at http://121.192.180.204:8080/PFPA/Index.html.
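The general idea of combining evolutionary and secondary-structure information into one feature vector, and of letting several classifiers vote, can be illustrated with a minimal sketch. This is not the PFPA feature set or its ensemble, neither of which is specified in the abstract. The function names, the sigmoid scaling, the two-segment split, and the majority-vote rule are all assumptions chosen for illustration. The inputs are assumed to be a PSSM given as a list of 20-score rows (one row per residue) and a three-state secondary-structure string over H, E, and C.

```python
from collections import Counter
from math import exp

SS_STATES = "HEC"  # assumed three-state alphabet: helix, strand, coil


def pssm_column_means(pssm):
    """Evolutionary features: mean sigmoid-scaled score per PSSM column."""
    n = len(pssm)
    scaled = [[1.0 / (1.0 + exp(-s)) for s in row] for row in pssm]
    return [sum(row[j] for row in scaled) / n for j in range(len(pssm[0]))]


def ss_composition(ss):
    """Global features: fraction of residues in each state (zeros if empty)."""
    if not ss:
        return [0.0] * len(SS_STATES)
    counts = Counter(ss)
    return [counts[s] / len(ss) for s in SS_STATES]


def ss_segment_composition(ss, n_segments=2):
    """Local features: state fractions inside equal-length chain segments."""
    size = len(ss) / n_segments
    feats = []
    for k in range(n_segments):
        segment = ss[int(k * size):int((k + 1) * size)]
        feats.extend(ss_composition(segment))
    return feats


def build_features(pssm, ss):
    """Concatenate evolutionary, global, and local features into one vector."""
    if len(pssm) != len(ss):
        raise ValueError("PSSM and secondary-structure string differ in length")
    return pssm_column_means(pssm) + ss_composition(ss) + ss_segment_composition(ss)


def majority_vote(predicted_folds):
    """Hypothetical ensemble rule: return the most frequently predicted fold."""
    return Counter(predicted_folds).most_common(1)[0][0]


if __name__ == "__main__":
    toy_pssm = [[0] * 20 for _ in range(4)]  # placeholder scores, not real data
    toy_ss = "HHEC"
    features = build_features(toy_pssm, toy_ss)
    print(len(features))
    print(majority_vote(["fold_a", "fold_b", "fold_a"]))
```

With 20 PSSM columns, three states, and two segments, the sketch yields a 29-dimensional vector. A real predictor would use richer descriptors and trained classifiers in place of the placeholder vote.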

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Protein Folding*
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Proteins / metabolism*
  • Sequence Analysis, Protein / methods*

Substances

  • Proteins