SuperStar: a knowledge-based approach for identifying interaction sites in proteins

J Mol Biol. 1999 Jun 18;289(4):1093-108. doi: 10.1006/jmbi.1999.2809.

Abstract

An empirical method for identifying interaction sites in proteins is described and validated. The method is based entirely on experimental information about non-bonded interactions occurring in small-molecule crystal structures. These data are used in the form of scatterplots that show the experimentally observed distribution of one functional group (the "contact group" or "probe") around another. A template molecule (e.g. a protein binding site) is broken down into structure fragments and the scatterplots, showing the distribution of a chosen probe around these structure fragments, are superimposed on the corresponding parts of the template. The scatterplots are then translated into a three-dimensional map that shows the propensity of the probe at different positions around the template molecule. The method is illustrated for L-arabinose-binding protein, complexed with L-arabinose and with D-fucose, and for dihydrofolate reductase complexed with methotrexate. The method is validated on 122 X-ray structures of protein-ligand complexes. For all the binding sites of these proteins, propensity maps are generated for four different probes: a charged NH3+ nitrogen, a carbonyl oxygen, a hydroxyl oxygen and a methyl carbon atom. Next, the maps are compared with the experimentally observed positions of ligand atoms of these types. For 74% of these ligand atoms (84% of the solvent-inaccessible ones) the calculated propensity of the matching probe at the experimental positions is higher than expected by chance. For 68% of the atoms (82% of the solvent-inaccessible ones) the propensity of the matching probe is higher than that of the other three probes. These results indicate that the approach generally gives good predictions for protein-ligand interactions. The potential applications of the propensity maps range from an aid in manual docking and structure-based drug design to their use in pharmacophore development.
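
The step from superimposed scatterplots to a propensity map, and the two validation checks, can be illustrated with a minimal sketch. It is not the published implementation: the abstract does not give the propensity formula, so the sketch assumes that propensity is the contact count in a grid cell divided by the count expected if the same contacts were spread uniformly over the grid, which makes values above 1 mean "higher than expected by chance". The function names, the grid bounds and the spacing are all hypothetical.

```python
import numpy as np


def propensity_map(points, lo, hi, spacing):
    """Bin superimposed scatterplot contacts onto a regular 3D grid.

    Assumption (not from the abstract): propensity is the observed count
    per cell divided by the count expected for a uniform distribution.
    """
    points = np.asarray(points, dtype=float)
    lo = np.asarray(lo, dtype=float)
    hi = np.asarray(hi, dtype=float)
    n_cells = np.round((hi - lo) / spacing).astype(int)
    edges = [lo[d] + spacing * np.arange(n_cells[d] + 1) for d in range(3)]
    counts, _ = np.histogramdd(points, bins=edges)
    if counts.sum() == 0:
        raise ValueError("no contact positions fall inside the grid")
    expected = counts.sum() / counts.size  # uniform (chance) count per cell
    return counts / expected


def propensity_at(grid, lo, spacing, position):
    """Look up the propensity of the grid cell that contains `position`."""
    offset = np.asarray(position, dtype=float) - np.asarray(lo, dtype=float)
    idx = np.floor(offset / spacing).astype(int)
    return float(grid[tuple(idx)])


def best_probe(maps, lo, spacing, position):
    """Return the probe whose map has the highest propensity at `position`."""
    return max(maps, key=lambda name: propensity_at(maps[name], lo, spacing, position))


# Toy data: eight contacts clustered in one cell of a 2 x 2 x 2 grid.
lo, hi, spacing = (0.0, 0.0, 0.0), (2.0, 2.0, 2.0), 1.0
carbonyl = propensity_map([(0.5, 0.5, 0.5)] * 8, lo, hi, spacing)
methyl = propensity_map([(1.5, 1.5, 1.5)] * 8, lo, hi, spacing)
maps = {"carbonyl O": carbonyl, "methyl C": methyl}

ligand_atom = (0.4, 0.6, 0.5)  # hypothetical observed carbonyl oxygen position
above_chance = propensity_at(carbonyl, lo, spacing, ligand_atom) > 1.0
matching_is_best = best_probe(maps, lo, spacing, ligand_atom) == "carbonyl O"
```

Under these assumptions, `above_chance` corresponds to the first validation criterion (propensity of the matching probe higher than expected by chance) and `matching_is_best` to the second (matching probe scoring higher than the other probes), here reduced to two probes for brevity.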

MeSH terms

  • Artificial Intelligence*
  • Binding Sites
  • Carrier Proteins / chemistry
  • Carrier Proteins / metabolism
  • Escherichia coli Proteins
  • Protein Conformation
  • Proteins / chemistry
  • Proteins / metabolism*
  • Software Validation
  • Tetrahydrofolate Dehydrogenase / chemistry
  • Tetrahydrofolate Dehydrogenase / metabolism

Substances

  • AraF protein, E coli
  • Carrier Proteins
  • Escherichia coli Proteins
  • Proteins
  • Tetrahydrofolate Dehydrogenase