Electrostatic-field and surface-shape similarity for virtual screening and pose prediction

J Comput Aided Mol Des. 2019 Oct;33(10):865-886. doi: 10.1007/s10822-019-00236-6. Epub 2019 Oct 24.

Abstract

We introduce a new method, called "eSim", for rapid computation of 3D molecular similarity that combines electrostatic field comparison with comparison of molecular surface-shape and directional hydrogen-bonding preferences. Rather than employing heuristic "colors" or user-defined molecular feature types to represent conformation-dependent molecular electrostatics, eSim calculates the similarity of the electrostatic fields of two molecules (in addition to shape and hydrogen-bonding). We present detailed virtual screening performance data on the standard 102-target DUD-E set. In its moderately fast screening mode, eSim running on a single computing core is capable of processing over 60 molecules per second. In this mode, eSim performed significantly better than all alternate methods for which full DUD-E data were available (mean ROC area of 0.74, p [Formula: see text], by paired t-test, compared with the best performing alternate method). In addition, for 92 targets of the DUD-E set where multiple ligand-bound crystal structures were available, screening performance was assessed using alternate ligands or sets thereof (in their bound poses) as similarity targets. Using the joint alignment of five ligands for each protein target, mean ROC area exceeded 0.82 for the 92 targets. Design-focused application of ligand similarity methods depends on accurate predictions of geometric molecular relationships. We comprehensively assessed pose prediction accuracy by curating nearly 400,000 bound ligand pose pairs across the DUD-E targets. Overall, beginning from agnostic initial poses, we observed an 80% success rate for RMSD [Formula: see text] Å among the top 20 predicted eSim poses. These examples were split roughly 50/50 into cases with high direct atomic overlap (where a shared scaffold exists between a pair) and low direct atomic overlap (where a ligand pair has dissimilar scaffolds but largely occupies the same space). Within the high direct atomic overlap subset, the pose prediction success rate was 93%. For the more challenging subset (where dissimilar scaffolds are to be aligned), the success rate was 70%. The eSim approach enables both large-scale screening and rational design of ligands and is rooted in physically meaningful, non-heuristic, molecular comparisons.
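The two evaluation measures quoted in the abstract, ROC area for virtual screening and a top-N RMSD success rate for pose prediction, can be illustrated with a minimal Python sketch. This is not the eSim implementation or the authors' evaluation code: the function names (`roc_area`, `top_n_success`, `success_rate`) are hypothetical, the scores and RMSD values in the usage note are invented for illustration, and the RMSD cutoff is left as a parameter because its value is not recoverable from the abstract text. The sketch also assumes that higher similarity scores rank better and that a pose counts as a success when its RMSD is at or below the cutoff.

```python
from typing import Sequence


def roc_area(active_scores: Sequence[float], decoy_scores: Sequence[float]) -> float:
    """ROC area as the probability that a randomly chosen active
    scores higher than a randomly chosen decoy (ties count one half)."""
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy score")
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


def top_n_success(rmsds_by_rank: Sequence[float], n: int, cutoff: float) -> bool:
    """True if any of the first n ranked poses has RMSD <= cutoff.

    rmsds_by_rank holds the RMSD (in Angstroms) of each predicted pose
    to the reference pose, ordered from best-scored to worst-scored.
    """
    return any(r <= cutoff for r in rmsds_by_rank[:n])


def success_rate(cases: Sequence[Sequence[float]], n: int, cutoff: float) -> float:
    """Fraction of ligand pairs for which a top-n pose meets the cutoff."""
    if not cases:
        raise ValueError("need at least one case")
    hits = sum(1 for c in cases if top_n_success(c, n, cutoff))
    return hits / len(cases)
```

As a usage illustration with made-up numbers, `roc_area([0.9, 0.3], [0.5, 0.1])` evaluates to 0.75, because three of the four active/decoy pairs are ordered correctly, and `success_rate([[3.0, 1.5], [4.0, 5.0]], n=2, cutoff=2.0)` evaluates to 0.5. The pairwise ROC computation is quadratic in the number of molecules; a rank-based formulation would be preferable for full screening libraries.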

Keywords: ForceGen; Ligand alignment; Ligand-based modeling; Molecular similarity; Pose prediction; ROCS; Surflex; Virtual-screening.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Drug Design*
  • Drug Evaluation, Preclinical*
  • Humans
  • Ligands
  • Models, Molecular
  • Molecular Structure
  • Pharmaceutical Preparations / chemistry*
  • Pharmaceutical Preparations / metabolism
  • Protein Binding
  • Protein Conformation
  • Proteins / chemistry*
  • Proteins / metabolism
  • Static Electricity*

Substances

  • Ligands
  • Pharmaceutical Preparations
  • Proteins