An information-theoretic classification of amino acids for the assessment of interfaces in protein-protein docking

J Mol Model. 2013 Sep;19(9):3901-10. doi: 10.1007/s00894-013-1916-7. Epub 2013 Jul 5.

Abstract

Docking is a versatile and powerful method for predicting the geometry of protein-protein complexes. However, despite significant methodological advances, identifying good docking solutions among a large number of false solutions remains a difficult task. We have previously demonstrated that the formalism of mutual information (MI) from information theory can be adapted to protein docking, and we have now extended this approach to enhance its robustness and applicability. A large dataset consisting of 22,934 docking decoys derived from 203 different protein-protein complexes was used for an MI-based optimization of reduced amino acid alphabets representing the protein-protein interfaces. This optimization relied on a clustering analysis that allows one to estimate the mutual information of whole amino acid alphabets by considering all structural features simultaneously, rather than by treating them individually. This clustering approach is fast and can be applied in a similar fashion to the generation of reduced alphabets for other biological problems such as fold recognition, sequence data mining, or secondary structure prediction. The reduced alphabets derived from the present work were converted into a scoring function for the evaluation of docking solutions, which is available for public use via the web service score-MI: http://score-MI.biochem.uni-erlangen.de.
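
To make the idea of scoring a reduced alphabet by mutual information more concrete, the following is a minimal sketch in Python. It is not the clustering procedure or the scoring function from the paper, whose details the abstract does not give. The grouping in `EXAMPLE_ALPHABET`, the helper `reduce_pair`, and the toy contact data are all hypothetical and chosen only for illustration. The sketch computes the standard mutual information I(X;Y) between a discrete interface feature (here, the reduced-alphabet class of a residue-residue contact) and a binary label marking whether the contact comes from a near-native or a false decoy.

```python
import math
from collections import Counter

# Hypothetical reduced alphabet (assumption, NOT the alphabet derived in the paper):
# each one-letter amino acid code is mapped to a coarse group label.
EXAMPLE_ALPHABET = {
    **dict.fromkeys("AVLIMC", "h"),   # hydrophobic
    **dict.fromkeys("FWY", "a"),      # aromatic
    **dict.fromkeys("STNQGPH", "p"),  # polar / small
    **dict.fromkeys("KR", "+"),       # positively charged
    **dict.fromkeys("DE", "-"),       # negatively charged
}


def reduce_pair(res_a, res_b, alphabet):
    """Map a residue-residue contact to an unordered pair of group labels."""
    return tuple(sorted((alphabet[res_a], alphabet[res_b])))


def mutual_information(xs, ys):
    """Return I(X;Y) in bits for two equal-length sequences of discrete values."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


if __name__ == "__main__":
    # Toy data (invented for illustration): interface contacts and a label
    # saying whether each contact was observed in a near-native decoy.
    contacts = [("K", "D"), ("R", "E"), ("L", "V"), ("K", "R"), ("D", "E"), ("F", "L")]
    near_native = [True, True, True, False, False, True]

    features = [reduce_pair(a, b, EXAMPLE_ALPHABET) for a, b in contacts]
    print(f"MI of example alphabet: {mutual_information(features, near_native):.3f} bits")
```

In a sketch like this, different candidate groupings could be compared by the MI they yield on the same decoy set, with a higher value indicating that the grouping carries more information about decoy quality. How the paper combines several structural features and searches the space of alphabets is not specified in the abstract and is not reproduced here.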

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Amino Acids / chemistry*
  • Binding Sites*
  • Internet
  • Models, Molecular*
  • Molecular Sequence Data
  • Protein Binding
  • Protein Conformation
  • Protein Interaction Domains and Motifs
  • Proteins / chemistry*
  • Proteins / metabolism
  • Software
  • User-Computer Interface

Substances

  • Amino Acids
  • Proteins