A critical assessment of information-guided protein-protein docking predictions

Mol Cell Proteomics. 2013 Mar;12(3):679-86. doi: 10.1074/mcp.M112.020198. Epub 2012 Dec 13.

Abstract

The structures of protein complexes are increasingly predicted by protein-protein docking (PPD) guided by ambiguous interaction data. These data are often incomplete and contain errors, and can therefore lead to incorrect docking predictions. In this study, we performed a series of PPD simulations to examine how incompletely and incorrectly assigned interface residues affect the success rate of PPD predictions. We applied a new interface information-driven PPD (IPPD) method, developed in this work, to a widely used PPD benchmark dataset. The success rate for an acceptable top-ranked model varied with the information content used: from as high as 95% when contact relationships (though not contact distances) were known for all residues, to 78% when only the interface/non-interface state of the residues was known. However, the success rate fell rapidly to ∼40% when the interface/non-interface state of 20% of the residues was assigned incorrectly, and to less than 5% when 40% were assigned incorrectly. We compared these results with those obtained by re-ranking a global search and with those reported for other data-guided PPD methods. In general, IPPD performed better than re-ranking when the information used was more complete and more accurate, but worse when it was not. When using bioinformatics-predicted information on interface residues, IPPD and other data-guided PPD methods performed poorly, at a level similar to simulations with a 40% incorrect assignment. These results provide guidelines for using information about interface residues to improve PPD predictions, and reveal a bottleneck for such improvement imposed by the low accuracy of current bioinformatic interface residue predictions.
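The idea of an "incorrect assignment" of interface/non-interface state at a fixed rate (20% or 40% of residues) can be illustrated with a small sketch. The abstract does not describe how the errors were actually introduced, so everything here is an assumption made for illustration: the function names `corrupt_interface_labels` and `assignment_error_rate`, the choice of uniform random flipping, and the toy protein of 100 residues with 20 at the interface. It is not the IPPD method itself.

```python
import random


def corrupt_interface_labels(labels, error_fraction, seed=0):
    """Return a copy of `labels` with a fixed fraction of entries flipped.

    `labels` is a list of booleans (True = interface residue).
    Uniform random flipping is an assumed noise model, chosen for illustration.
    """
    if not 0.0 <= error_fraction <= 1.0:
        raise ValueError("error_fraction must be between 0 and 1")
    rng = random.Random(seed)
    n_flip = round(error_fraction * len(labels))
    # Pick distinct residue positions whose state will be misassigned.
    flip_positions = set(rng.sample(range(len(labels)), n_flip))
    return [(not state) if i in flip_positions else state
            for i, state in enumerate(labels)]


def assignment_error_rate(true_labels, noisy_labels):
    """Fraction of residues whose interface state differs from the truth."""
    mismatches = sum(t != n for t, n in zip(true_labels, noisy_labels))
    return mismatches / len(true_labels)


# Hypothetical toy protein: 20 interface residues out of 100.
true_labels = [True] * 20 + [False] * 80
noisy = corrupt_interface_labels(true_labels, 0.20)
print(assignment_error_rate(true_labels, noisy))
```

Labels corrupted this way could then be supplied to a docking run in place of the true assignment, and the success rate compared across error fractions.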

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Binding Sites
  • Computational Biology / methods*
  • Computer Simulation
  • Models, Molecular
  • Protein Binding
  • Protein Interaction Mapping / methods*
  • Protein Structure, Tertiary
  • Proteins / chemistry
  • Proteins / metabolism*
  • Reproducibility of Results

Substances

  • Proteins