Protein-ligand interfaces are polarized: discovery of a strong trend for intermolecular hydrogen bonds to favor donors on the protein side with implications for predicting and designing ligand complexes

J Comput Aided Mol Des. 2018 Apr;32(4):511-528. doi: 10.1007/s10822-018-0105-2. Epub 2018 Feb 12.

Abstract

Understanding how proteins encode ligand specificity is fascinating and similar in importance to deciphering the genetic code. For protein-ligand recognition, the combination of an almost infinite variety of interfacial shapes and patterns of chemical groups makes the problem especially challenging. Here we analyze data across non-homologous proteins in complex with small biological ligands to address observations made in our inhibitor discovery projects: that proteins favor donating H-bonds to ligands and avoid using groups with both H-bond donor and acceptor capacity. The resulting clear and significant chemical group matching preferences elucidate the code for protein-native ligand binding, similar to the dominant patterns found in nucleic acid base-pairing. On average, 90% of the keto and carboxylate oxygens occurring in the biological ligands formed direct H-bonds to the protein. A two-fold preference was found for protein atoms to act as H-bond donors and ligand atoms to act as acceptors, and 76% of all intermolecular H-bonds involved an amine donor. Together, the tight chemical and geometric constraints associated with satisfying donor groups generate a hydrogen-bonding lock that can be matched only by ligands bearing the right acceptor-rich key. Measuring an index of H-bond preference based on the observed chemical trends proved sufficient to predict other protein-ligand complexes and can be used to guide molecular design. The resulting Hbind and Protein Recognition Index software packages are being made available for rigorously defining intermolecular H-bonds and measuring the extent to which H-bonding patterns in a given complex match the preference key.
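The donor-side statistics quoted in the abstract (the protein-to-ligand donor ratio and the share of H-bonds with an amine donor) can be illustrated with a minimal sketch. This is not the Hbind or Protein Recognition Index implementation. The `HBond` record, the `donor_side_summary` function, and the toy data are hypothetical names and values invented for illustration, and the sketch assumes the intermolecular H-bonds have already been identified by some other tool.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class HBond(NamedTuple):
    """One intermolecular H-bond (hypothetical record layout)."""
    donor_side: str   # "protein" or "ligand": which molecule donates the hydrogen
    donor_group: str  # chemical class of the donor, e.g. "amine" or "hydroxyl"


def donor_side_summary(hbonds: Iterable[HBond]) -> Dict[str, float]:
    """Summarize which side of the interface donates the H-bonds."""
    bonds = list(hbonds)
    if not bonds:
        raise ValueError("need at least one H-bond")

    sides = Counter(b.donor_side for b in bonds)
    protein_donated = sides["protein"]
    ligand_donated = sides["ligand"]

    # Ratio of protein-donated to ligand-donated bonds; infinite if the
    # ligand donates none.
    ratio = protein_donated / ligand_donated if ligand_donated else float("inf")

    # Share of all bonds whose donor is an amine, on either side.
    amine_fraction = sum(b.donor_group == "amine" for b in bonds) / len(bonds)

    return {
        "protein_donor_fraction": protein_donated / len(bonds),
        "protein_to_ligand_donor_ratio": ratio,
        "amine_donor_fraction": amine_fraction,
    }


# Toy data, invented for illustration only (not taken from the paper).
example = [
    HBond("protein", "amine"),
    HBond("protein", "amine"),
    HBond("protein", "amine"),
    HBond("protein", "hydroxyl"),
    HBond("ligand", "amine"),
    HBond("ligand", "hydroxyl"),
]
summary = donor_side_summary(example)
print(summary)
```

In this toy set the protein donates four bonds and the ligand two, so the ratio is 2, the same kind of two-fold polarization the abstract reports across its dataset.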

Keywords: Drug design; Interaction patterns; Ligand optimization; Lipinski’s Rule of 5; Protein–ligand recognition; Specificity determinants.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Amino Acids
  • Databases, Protein
  • Drug Design
  • Hydrogen Bonding
  • Hydrophobic and Hydrophilic Interactions
  • Ligands
  • Models, Molecular*
  • Molecular Structure
  • Protein Binding
  • Proteins / chemistry*
  • Software
  • Structure-Activity Relationship
  • Surface Properties

Substances

  • Amino Acids
  • Ligands
  • Proteins