HIt Discovery using docking ENriched by GEnerative Modeling (HIDDEN GEM): A novel computational workflow for accelerated virtual screening of ultra-large chemical libraries

Mol Inform. 2024 Jan;43(1):e202300207. doi: 10.1002/minf.202300207. Epub 2023 Dec 19.

Abstract

The recent rapid expansion of make-on-demand, purchasable chemical libraries comprising tens of billions or even trillions of molecules has challenged the efficient application of traditional structure-based virtual screening methods that rely on molecular docking. We present a novel computational methodology termed HIDDEN GEM (HIt Discovery using Docking ENriched by GEnerative Modeling) that greatly accelerates virtual screening. This workflow uniquely integrates machine learning, generative chemistry, massive chemical similarity searching, and molecular docking of small, selected libraries at the beginning and the end of the workflow. For each target, HIDDEN GEM nominates a small number of top-scoring virtual hits prioritized from ultra-large chemical libraries. We have benchmarked HIDDEN GEM by conducting virtual screening campaigns for 16 diverse protein targets using the Enamine REAL Space library comprising 37 billion molecules. We show that HIDDEN GEM yields the highest enrichment factors compared with state-of-the-art accelerated virtual screening methods, while requiring the least computational resources. HIDDEN GEM can be executed with any docking software and employed by users with limited computational resources.
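
The abstract reports enrichment factors but does not define them. As a rough illustration only, the sketch below computes the commonly used form of the metric: the hit rate within the top-ranked fraction of a screened set divided by the hit rate of the whole set. The function name `enrichment_factor`, its parameters, and the convention that a lower docking score is better are assumptions made for this example; the paper's own definition of a hit and of the reference set may differ.

```python
def enrichment_factor(scores, is_hit, top_fraction=0.01):
    """Enrichment factor of the top-ranked fraction of a screened set.

    Assumptions (not taken from the paper): lower score means better
    predicted binding, and `is_hit` flags the molecules counted as hits.
    """
    if len(scores) != len(is_hit):
        raise ValueError("scores and is_hit must have the same length")
    n_total = len(scores)
    n_hits_total = sum(1 for flag in is_hit if flag)
    if n_total == 0 or n_hits_total == 0:
        raise ValueError("need at least one molecule and at least one hit")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    # Size of the selected subset, never smaller than one molecule.
    n_selected = max(1, int(round(n_total * top_fraction)))

    # Rank by score, best (lowest) first, and count hits in the top subset.
    ranked = sorted(zip(scores, is_hit), key=lambda pair: pair[0])
    n_hits_selected = sum(1 for _, flag in ranked[:n_selected] if flag)

    # Ratio of the hit rate in the selection to the hit rate overall.
    return (n_hits_selected / n_selected) / (n_hits_total / n_total)


# Toy data: 10 molecules, 2 hits, both ranked in the top 20%.
toy_scores = [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1]
toy_hits = [True, True, False, False, False, False, False, False, False, False]
ef_top20 = enrichment_factor(toy_scores, toy_hits, top_fraction=0.2)
ef_all = enrichment_factor(toy_scores, toy_hits, top_fraction=1.0)
```

On this toy data the top 20% contains only hits while the full set has a 20% hit rate, so the enrichment factor is 5; selecting the entire set always gives 1, which is the no-enrichment baseline.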

Keywords: Generative Modeling; High Throughput Virtual Screening; Molecular Docking; Similarity Search.

MeSH terms

  • Molecular Docking Simulation
  • Small Molecule Libraries* / chemistry
  • Software*
  • Workflow

Substances

  • Small Molecule Libraries