Using Domain-Specific Fingerprints Generated Through Neural Networks to Enhance Ligand-Based Virtual Screening

J Chem Inf Model. 2021 Feb 22;61(2):664-675. doi: 10.1021/acs.jcim.0c01208. Epub 2021 Jan 26.

Abstract

Similarity-based virtual screening is a fundamental tool in the early drug discovery process and relies heavily on molecular fingerprints. We propose a novel strategy of generating domain-specific fingerprints by training neural networks on target-specific bioactivity datasets and using the activation as a new molecular representation. The neural network is expected to combine information on already known bioactive compounds with information unique to the molecular structure, thereby enriching the fingerprint. We evaluate this strategy on a large kinase-specific bioactivity dataset. A comparison of five neural network architectures and their fingerprints with the well-established extended-connectivity fingerprint (ECFP) and an autoencoder shows that our neural fingerprint produces better results in the similarity search. Most importantly, the neural fingerprint performs well even when specific targets are not included during training. Surprisingly, while Graph Neural Networks (GNNs) are thought to offer an advantageous alternative, the best-performing neural fingerprints were based on traditional fully connected layers using the ECFP4 as the input. The neural fingerprint is freely available at: https://github.com/kochgroup/kinase_nnfp.
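The idea in the abstract can be illustrated with a small sketch: a binary fingerprint is passed through a fully connected layer, the resulting activation vector serves as the new fingerprint, and a compound library is ranked by similarity to a query. This is not the authors' implementation (that is in the linked repository). Everything here is an assumption made for illustration: the function names, the ReLU nonlinearity, the use of cosine similarity for the continuous fingerprint, the toy bit sets standing in for ECFP4, and the random weights standing in for a network trained on bioactivity data.

```python
import math
import random


def tanimoto(a, b):
    """Tanimoto coefficient of two binary fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def hidden_activation(on_bits, weights, biases):
    """ReLU activations of one fully connected layer for a binary input.

    weights[i][j] connects input bit i to hidden unit j. Because the input
    is binary, the matrix-vector product is the sum of the rows of on-bits.
    """
    return [
        max(0.0, bias + sum(weights[i][j] for i in on_bits))
        for j, bias in enumerate(biases)
    ]


def cosine(u, v):
    """Cosine similarity of two real-valued vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rank_by_similarity(query, library, similarity):
    """Return library keys sorted from most to least similar to the query."""
    return sorted(library, key=lambda name: similarity(query, library[name]), reverse=True)


# Toy 16-bit fingerprints (hypothetical stand-ins for ECFP4 bit vectors).
N_BITS, N_HIDDEN = 16, 4
query_bits = {1, 2, 3, 9}
library_bits = {"cpd_a": {1, 2, 3, 10}, "cpd_b": {2, 3, 4, 12}, "cpd_c": {7, 8, 15}}

# Random weights as a placeholder; in the described strategy these would come
# from a network trained on a target-specific bioactivity dataset.
rng = random.Random(0)
W = [[rng.uniform(-1.0, 1.0) for _ in range(N_HIDDEN)] for _ in range(N_BITS)]
b = [0.1] * N_HIDDEN

# Baseline: rank on the binary fingerprint with Tanimoto similarity.
baseline_ranking = rank_by_similarity(query_bits, library_bits, tanimoto)

# Neural fingerprint: rank on the activation vectors instead.
query_nnfp = hidden_activation(query_bits, W, b)
library_nnfp = {name: hidden_activation(bits, W, b) for name, bits in library_bits.items()}
nnfp_ranking = rank_by_similarity(query_nnfp, library_nnfp, cosine)

print("binary fingerprint ranking:", baseline_ranking)
print("neural fingerprint ranking:", nnfp_ranking)
```

With untrained weights the second ranking carries no chemical meaning; the sketch only shows where a learned representation would replace the binary fingerprint in a similarity search.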

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Drug Discovery*
  • Ligands
  • Molecular Structure
  • Neural Networks, Computer*

Substances

  • Ligands