Improved protein fold assignment using support vector machines

Int J Bioinform Res Appl. 2005;1(3):319-34. doi: 10.1504/IJBRA.2005.007909.

Abstract

Because of the relatively large gap between the number of known protein sequences and the number of known protein structures, the ability to construct computational models that predict structure from sequence information has become an important area of research. Knowledge of a protein's structure is crucial to understanding its biological role. In this work, we present a support vector machine based method for recognising a protein's fold from sequence information alone, in cases where the sequence has low similarity to sequences of known structure. We have focused on improving multi-class classification, parameter tuning, descriptor design, and feature selection. The current implementation demonstrates better prediction accuracy than previous similar approaches and performs comparably to straightforward threading.

MeSH terms

  • Amino Acid Sequence
  • Proteins* / chemistry
  • Sequence Analysis, Protein
  • Software
  • Support Vector Machine*

Substances

  • Proteins