Machine learning methods for protein structure prediction

IEEE Rev Biomed Eng. 2008:1:41-9. doi: 10.1109/RBME.2008.2008239.

Abstract

Machine learning methods are widely used in bioinformatics and computational and systems biology. Here, we review the development of machine learning methods for protein structure prediction, one of the most fundamental problems in structural biology and bioinformatics. Protein structure prediction is such a complex problem that it is often decomposed and attacked at four different levels: 1-D prediction of structural features along the primary sequence of amino acids; 2-D prediction of spatial relationships between amino acids; 3-D prediction of the tertiary structure of a protein; and 4-D prediction of the quaternary structure of a multiprotein complex. A diverse set of both supervised and unsupervised machine learning methods has been applied over the years to tackle these problems and has contributed significantly to advancing the state of the art in protein structure prediction. In this paper, we review the development and application of hidden Markov models, neural networks, support vector machines, Bayesian methods, and clustering methods in 1-D, 2-D, 3-D, and 4-D protein structure prediction.
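
As an illustration of how the 2-D and 3-D levels relate, the sketch below derives a residue contact map (a common 2-D target) from 3-D coordinates. It is a minimal example and not a method from the paper: the function name `contact_map`, the 8 angstrom distance cutoff, and the toy coordinates are all assumptions chosen for illustration.

```python
import math

def contact_map(coords, threshold=8.0):
    """Return a symmetric 0/1 matrix marking residue pairs in contact.

    coords:    list of (x, y, z) tuples, one representative point per residue
               (assumed here; e.g. an alpha-carbon position).
    threshold: distance cutoff in angstroms (8.0 is an assumed, commonly
               used value, not one taken from the paper).
    """
    n = len(coords)
    return [
        [
            1 if i != j and math.dist(coords[i], coords[j]) <= threshold else 0
            for j in range(n)
        ]
        for i in range(n)
    ]

# Hypothetical toy coordinates: four residues spaced 3.8 angstroms apart
# along a straight line. This is not real protein data.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cmap = contact_map(toy_coords)

for row in cmap:
    print(row)
```

A 2-D predictor would try to estimate such a matrix directly from the amino acid sequence, and a 3-D method can then use the predicted contacts as distance restraints.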

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Review

MeSH terms

  • Bayes Theorem
  • Markov Chains
  • Models, Molecular*
  • Neural Networks, Computer*
  • Protein Conformation*
  • Proteins / chemistry*
  • Proteins / genetics*
  • Sequence Analysis, Protein / methods*

Substances

  • Proteins