Comparing protein-ligand docking programs is difficult

Jason C Cole; Christopher W Murray; J Willem M Nissink; Richard D Taylor; Robin Taylor

doi:10.1002/prot.20497

Proteins. 2005 Aug 15;60(3):325-32. doi: 10.1002/prot.20497.

Authors

Jason C Cole¹, Christopher W Murray, J Willem M Nissink, Richard D Taylor, Robin Taylor

Affiliation

¹ Cambridge Crystallographic Data Centre, Cambridge, United Kingdom.

PMID: 15937897

Abstract

There is currently great interest in comparing protein-ligand docking programs. A review of recent comparisons shows that it is difficult to draw conclusions of general applicability. Statistical hypothesis testing is required to ensure that differences in pose-prediction success rates and enrichment rates are significant. Numerical measures such as root-mean-square deviation need careful interpretation and may profitably be supplemented by interaction-based measures and visual inspection of dockings. Test sets must be of appropriate diversity and of good experimental reliability. The effects of crystal-packing interactions may be important. The method used for generating starting ligand geometries and positions may have an appreciable effect on docking results. For fair comparison, programs must be given search problems of equal complexity (e.g. binding-site regions of the same size) and approximately equal time in which to solve them. Comparisons based on rescoring require local optimization of the ligand in the space of the new objective function. Re-implementations of published scoring functions may give significantly different results from the originals. Ostensibly minor details in methodology may have a profound influence on headline success rates.
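The point about hypothesis testing can be illustrated with a minimal sketch. It assumes two docking programs were run on the same set of complexes, and that each result is recorded as a pose-prediction success or failure (for example, top-ranked pose within some RMSD cutoff of the crystal pose). Because the outcomes are paired, an exact McNemar test on the discordant complexes is one reasonable choice; it is not necessarily the test the authors have in mind. The function name `mcnemar_exact`, the labels "program A" and "program B", and all counts below are hypothetical and serve only as illustration.

```python
from math import comb


def mcnemar_exact(success_a, success_b):
    """Exact two-sided McNemar test for paired success/failure outcomes.

    success_a[i] and success_b[i] are booleans for the same complex i.
    Returns (a_only, b_only, p_value), where a_only counts complexes
    solved by A but not B, and b_only the reverse.
    """
    if len(success_a) != len(success_b):
        raise ValueError("both programs must be scored on the same complexes")

    a_only = sum(1 for a, b in zip(success_a, success_b) if a and not b)
    b_only = sum(1 for a, b in zip(success_a, success_b) if b and not a)
    n = a_only + b_only  # only discordant pairs carry information

    if n == 0:
        return a_only, b_only, 1.0

    # Under the null hypothesis each discordant pair favours A or B with
    # probability 0.5, so the smaller count follows Binomial(n, 0.5).
    k = min(a_only, b_only)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return a_only, b_only, min(1.0, 2.0 * tail)


# Hypothetical outcomes on 100 complexes (illustrative numbers only):
# 62 solved by both, 13 by A only, 8 by B only, 17 by neither.
program_a = [True] * 62 + [True] * 13 + [False] * 8 + [False] * 17
program_b = [True] * 62 + [False] * 13 + [True] * 8 + [False] * 17

rate_a = sum(program_a) / len(program_a)  # headline success rate of A
rate_b = sum(program_b) / len(program_b)  # headline success rate of B
a_only, b_only, p_value = mcnemar_exact(program_a, program_b)

print(f"A: {rate_a:.0%}  B: {rate_b:.0%}  "
      f"discordant: {a_only} vs {b_only}  p = {p_value:.3f}")
```

In this made-up case a five-point gap in headline success rate (75% versus 70%) rests on only 21 discordant complexes, and the p-value is far above conventional significance thresholds. This shows how an apparent ranking of two programs may not survive a formal test on a test set of this size.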

Publication types

Review

MeSH terms

Algorithms
Artificial Intelligence
Binding Sites
Computational Biology / methods*
Computer Simulation
Crystallization
Crystallography, X-Ray
Databases, Protein
Ligands
Models, Molecular
Molecular Structure
Programming Languages
Protein Binding
Proteins / chemistry*
Proteomics / methods*
Reproducibility of Results
Software*

Substances

Ligands
Proteins