A machine learning model trained on a high-throughput antibacterial screen increases the hit rate of drug discovery

PLoS Comput Biol. 2022 Oct 13;18(10):e1010613. doi: 10.1371/journal.pcbi.1010613. eCollection 2022 Oct.

Abstract

Screening for novel antibacterial compounds in small molecule libraries has a low success rate. We applied machine learning (ML)-based virtual screening for antibacterial activity and evaluated its predictive power by experimental validation. We first binarized 29,537 compounds according to their growth inhibitory activity (hit rate 0.87%) against the antibiotic-resistant bacterium Burkholderia cenocepacia and described their molecular features with a directed message-passing neural network (D-MPNN). Then, we used the data to train an ML model that achieved an area under the receiver operating characteristic curve (ROC-AUC) of 0.823 on the test set. Finally, we predicted antibacterial activity in virtual libraries corresponding to 1,614 compounds from the Food and Drug Administration (FDA)-approved list and 224,205 natural products. Hit rates of 26% and 12%, respectively, were obtained when we tested the top-ranked predicted compounds for growth inhibitory activity against B. cenocepacia, which represents at least a 14-fold increase over the 0.87% hit rate of the original screen. In addition, more than 51% of the predicted antibacterial natural compounds inhibited ESKAPE pathogens, showing that predictions extend beyond the organism-specific dataset to a broad range of bacteria. Overall, the developed ML approach can be used for compound prioritization before screening, increasing the typical hit rate of drug discovery.
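
The fold increase quoted in the abstract can be checked with simple arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical and not taken from the paper, and the inputs are the rounded percentages reported in the abstract, so the results are approximate.

```python
def fold_enrichment(hit_rate: float, baseline_rate: float) -> float:
    """Return how many times higher hit_rate is than baseline_rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return hit_rate / baseline_rate


# Rounded values as reported in the abstract (fractions, not percentages).
BASELINE_HIT_RATE = 0.0087  # original high-throughput screen
PREDICTED_HIT_RATES = {
    "FDA-approved": 0.26,      # top-ranked predicted FDA-approved compounds
    "natural products": 0.12,  # top-ranked predicted natural products
}

# Fold increase of each validated library over the original screen.
folds = {
    library: fold_enrichment(rate, BASELINE_HIT_RATE)
    for library, rate in PREDICTED_HIT_RATES.items()
}

for library, fold in folds.items():
    print(f"{library}: {fold:.1f}-fold over the original screen")
```

With these rounded inputs the natural product library comes out at roughly 13.8-fold and the FDA-approved library at roughly 29.9-fold, which is consistent with the reported figure of about 14-fold as a lower bound once rounding of the published percentages is taken into account.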

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Anti-Bacterial Agents / pharmacology
  • Drug Discovery*
  • Machine Learning
  • Small Molecule Libraries* / pharmacology
  • United States

Substances

  • Small Molecule Libraries
  • Anti-Bacterial Agents

Grants and funding