DOCKGROUND system of databases for protein recognition studies: unbound structures for docking

Proteins. 2007 Dec 1;69(4):845-51. doi: 10.1002/prot.21714.

Abstract

Computational docking approaches are important as a source of protein-protein complexes structures and as a means to understand the principles of protein association. A key element in designing better docking approaches, including search procedures, potentials, and scoring functions is their validation on experimentally determined structures. Thus, the databases of such structures (benchmark sets) are important. The previous, first release of the DOCKGROUND resource (Douguet et al., Bioinformatics 2006; 22:2612-2618) implemented a comprehensive database of cocrystallized (bound) protein-protein complexes in a relational database of annotated structures. The current release adds important features to the set of bound structures, such as regularly updated downloadable datasets: automatically generated nonredundant set, built according to most common criteria, and a manually curated set that includes only biological nonobligate complexes along with a number of additional useful characteristics. The main focus of the current release is unbound (experimental and simulated) protein-protein complexes. Complexes from the bound dataset are used to identify crystallized unbound analogs. If such analogs do not exist, the unbound structures are simulated by rotamer library optimization. Thus, the database contains comprehensive sets of complexes suitable for large scale benchmarking of docking algorithms. Advanced methodologies for simulating unbound conformations are being explored for the next release. The future releases will include datasets of modeled protein-protein complexes, and systematic sets of docking decoys obtained by different docking algorithms. The growing DOCKGROUND resource is designed to become a comprehensive public environment for developing and validating new docking methodologies.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Computer Simulation*
  • Crystallography, X-Ray / methods
  • Databases, Protein
  • Dimerization
  • Genomics
  • Models, Molecular
  • Molecular Conformation
  • Protein Binding
  • Protein Conformation
  • Protein Interaction Mapping
  • Proteins / chemistry*
  • Proteomics / methods*
  • Software

Substances

  • Proteins