SuperStar: improved knowledge-based interaction fields for protein binding sites

J Mol Biol. 2001 Mar 30;307(3):841-59. doi: 10.1006/jmbi.2001.4452.

Abstract

SuperStar is an empirical method for identifying interaction sites in proteins, based entirely on experimental information about non-bonded interactions occurring in small-molecule crystal structures, taken from the IsoStar database. We describe recent modifications and additions to SuperStar, validating the results on a test set of 122 X-ray structures of protein-ligand complexes. In this validation, propensity maps are generated for all the binding sites of these proteins, using four different probes: a charged NH₃⁺ nitrogen atom, a carbonyl oxygen atom, a hydroxyl oxygen atom and a methyl carbon atom. Next, the maps are compared with the experimentally observed positions of ligand atoms of these types. A peak-searching algorithm is introduced that highlights potential interaction hot spots. For the three hydrogen-bonding probes (NH₃⁺ nitrogen atom, carbonyl oxygen atom and hydroxyl oxygen atom), the average distance from the ligand atom to the nearest SuperStar peak is 1.0-1.2 Å (0.8-1.0 Å for solvent-inaccessible ligand atoms). For the methyl carbon atom probe, this distance is about 1.5 Å, probably because interactions to methyl groups are much less directional. The most important addition to SuperStar is the enabling of propensity maps around metal centres (Ca²⁺, Mg²⁺ and Zn²⁺) in protein binding sites. The results are validated on a test set of 24 protein-ligand complexes that have a metal ion in their binding site. Coordination geometries are derived automatically, using only the protein atoms that coordinate to the metal ion. The correct coordination geometry is derived in approximately 75% of the cases. If the derived geometry is assumed during the SuperStar calculation, the average distance from a ligand atom coordinating to the metal ion to the nearest peak in the propensity map for an oxygen probe is 0.87(7) Å. If the correct coordination geometry is imposed, this distance reduces to 0.59(7) Å. This indicates that the SuperStar predictions around metal-binding sites are at least as good as those around other protein groups. Using clustering techniques, a non-redundant set of probes is selected from the set of probes available in the IsoStar database. The performance in SuperStar of all these probes is tested on the test set of protein-ligand complexes. With the exception of the "ether oxygen" probe and the "any NH⁺" probe, all new probes perform as well as the four probes introduced first.
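
The validation figure quoted above, the average distance from each observed ligand atom to the nearest propensity-map peak, can be illustrated with a minimal sketch. This is not the SuperStar implementation: the function names and the coordinates are hypothetical, and the peaks are assumed to be already available as Cartesian positions in Å.

```python
import math

def nearest_peak_distance(atom, peaks):
    """Distance (in Å) from one ligand atom to its closest map peak."""
    return min(math.dist(atom, peak) for peak in peaks)

def mean_nearest_peak_distance(atoms, peaks):
    """Average, over all ligand atoms, of the distance to the nearest peak."""
    if not atoms or not peaks:
        raise ValueError("need at least one ligand atom and one peak")
    return sum(nearest_peak_distance(a, peaks) for a in atoms) / len(atoms)

# Hypothetical coordinates for illustration only (not data from the paper).
ligand_atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
peaks = [(0.0, 0.0, 1.0), (3.0, 0.0, 2.0)]

# First atom is 1.0 Å from its nearest peak, second atom is 2.0 Å.
mean_distance = mean_nearest_peak_distance(ligand_atoms, peaks)
print(f"mean nearest-peak distance: {mean_distance:.2f} Å")
```

A lower mean indicates that the map peaks sit closer to where ligand atoms of the probe type are actually observed.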

MeSH terms

  • Algorithms
  • Binding Sites
  • Carbon / metabolism
  • Cluster Analysis
  • Computer Simulation*
  • Crystallography, X-Ray
  • Databases as Topic
  • Hydrogen / metabolism
  • Ligands
  • Metals / metabolism*
  • Models, Molecular
  • Nitrogen / metabolism
  • Oxygen / metabolism
  • Pliability
  • Protein Binding
  • Protein Conformation
  • Proteins / chemistry*
  • Proteins / metabolism*
  • Reproducibility of Results
  • Water / chemistry
  • Water / metabolism

Substances

  • Ligands
  • Metals
  • Proteins
  • Water
  • Carbon
  • Hydrogen
  • Nitrogen
  • Oxygen