Protein Remote Homology Detection Based on an Ensemble Learning Approach

Biomed Res Int. 2016:2016:5813645. doi: 10.1155/2016/5813645. Epub 2016 May 8.

Abstract

Protein remote homology detection is one of the central problems in bioinformatics. Although several computational methods have been proposed, the problem is still far from solved. In this paper, an ensemble classifier for protein remote homology detection, called SVM-Ensemble, is proposed; it uses a weighted voting strategy. SVM-Ensemble combines three basic classifiers built on different feature spaces: Kmer, ACC, and SC-PseAAC. These features characterize proteins from different perspectives, incorporating both sequence composition and sequence-order information along the protein sequence. Experimental results on a widely used benchmark dataset showed that SVM-Ensemble clearly improves predictive performance for protein remote homology detection and outperforms other state-of-the-art methods.
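
To make the two ideas in the abstract concrete, the following is a minimal Python sketch of (a) a Kmer-style composition feature and (b) weighted voting over the labels of three base classifiers. It is an illustration under stated assumptions, not the authors' implementation: the function names `kmer_features` and `weighted_vote`, the +1/-1 label encoding, the tie-breaking rule, and the example weights are all hypothetical, and the abstract does not say how the actual weights were chosen. The ACC and SC-PseAAC features and the SVM training step are omitted.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_features(sequence, k=2):
    """Return the normalized k-mer frequency vector of a protein sequence.

    The vector has 20**k entries, ordered lexicographically over
    AMINO_ACIDS. k-mers containing non-standard residues are ignored.
    """
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def weighted_vote(labels, weights):
    """Combine per-classifier labels (+1 or -1) by a weighted vote.

    Assumption: ties (score exactly 0) are resolved as +1.
    """
    score = sum(w * y for w, y in zip(weights, labels))
    return 1 if score >= 0 else -1


# Hypothetical usage: three base classifiers (e.g. trained on Kmer, ACC,
# and SC-PseAAC features) disagree; the illustrative weights decide.
features = kmer_features("ACAC", k=2)
decision = weighted_vote(labels=[1, -1, -1], weights=[0.5, 0.3, 0.1])
```

In this sketch a single highly weighted classifier can override two weaker ones, which is the practical difference between weighted and simple majority voting.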

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Models, Theoretical
  • Proteins / genetics*
  • Sequence Homology, Amino Acid*

Substances

  • Proteins