Prediction of the parallel/antiparallel orientation of beta-strands using amino acid pairing preferences and support vector machines

J Theor Biol. 2010 Apr 7;263(3):360-8. doi: 10.1016/j.jtbi.2009.12.019. Epub 2009 Dec 24.

Abstract

In principle, structural information for protein sequences with no detectable homology to a protein of known structure could be obtained by predicting the arrangement of their secondary structural elements. Although some ab initio methods for protein structure prediction have been reported, the long-range interactions required to accurately predict the tertiary structures of beta-sheet-containing proteins are still difficult to simulate. To address this problem and facilitate de novo prediction of beta-sheet-containing protein structures, we developed a support vector machine (SVM) approach that classifies the parallel or antiparallel orientation of beta-strands using interstrand amino acid pairing preferences. Based on second-order statistics of the relative frequencies of each possible interstrand amino acid pair, we defined an average amino acid pairing encoding matrix (APEM) for encoding beta-strands as input to the prediction model. A prediction accuracy of 86.89% and a Matthews correlation coefficient of 0.71 were achieved in 7-fold cross-validation on a non-redundant protein dataset from PISCES. Although several issues remain to be studied, the method presented here indicates, to some extent, the important contribution of amino acid pairs to beta-strand orientation, and it could be combined with other algorithms toward a full 'identification' of beta-strands.
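
The two figures reported above, accuracy and the Matthews correlation coefficient (MCC), are both derived from the confusion matrix of a binary classifier. The following is a minimal sketch of how they are computed, not the authors' code. The function names and the example counts are hypothetical, and treating "parallel" as the positive class is an assumption made only for illustration.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of strand pairs whose orientation was predicted correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator vanishes, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the paper:
    # positive class = parallel, negative class = antiparallel (assumed).
    tp, tn, fp, fn = 50, 40, 5, 5
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.4f}")
```

Unlike accuracy, the MCC accounts for all four cells of the confusion matrix, which makes it more informative when the two classes are unevenly represented in the dataset.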

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acids / chemistry*
  • Models, Theoretical*
  • Proteins / chemistry

Substances

  • Amino Acids
  • Proteins