A machine learning model trained on a high-throughput antibacterial screen increases the hit rate of drug discovery

A S M Zisanur Rahman; Chengyou Liu; Hunter Sturm; Andrew M Hogan; Rebecca Davis; Pingzhao Hu; Silvia T Cardona

doi:10.1371/journal.pcbi.1010613


PLoS Comput Biol. 2022 Oct 13;18(10):e1010613. doi: 10.1371/journal.pcbi.1010613. eCollection 2022 Oct.

Authors

A S M Zisanur Rahman¹, Chengyou Liu², Hunter Sturm³, Andrew M Hogan¹, Rebecca Davis³, Pingzhao Hu²,⁴,⁵, Silvia T Cardona¹,⁶

Affiliations

¹ Department of Microbiology, University of Manitoba, Winnipeg, Manitoba, Canada.
² Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, Manitoba, Canada.
³ Department of Chemistry, University of Manitoba, Winnipeg, Manitoba, Canada.
⁴ Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada.
⁵ Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba, Canada.
⁶ Department of Medical Microbiology & Infectious Diseases, University of Manitoba, Winnipeg, Canada.

Abstract

Screening for novel antibacterial compounds in small molecule libraries has a low success rate. We applied machine learning (ML)-based virtual screening for antibacterial activity and evaluated its predictive power by experimental validation. We first binarized 29,537 compounds according to their growth inhibitory activity (hit rate 0.87%) against the antibiotic-resistant bacterium Burkholderia cenocepacia and described their molecular features with a directed-message passing neural network (D-MPNN). Then, we used the data to train an ML model that achieved a receiver operating characteristic (ROC) score of 0.823 on the test set. Finally, we predicted antibacterial activity in virtual libraries corresponding to 1,614 compounds from the Food and Drug Administration (FDA)-approved list and 224,205 natural products. When we tested the top-ranked predicted compounds for growth inhibitory activity against B. cenocepacia, we obtained hit rates of 26% and 12%, respectively, which represents at least a 14-fold increase over the 0.87% hit rate of the original screen. In addition, more than 51% of the predicted antibacterial natural compounds inhibited ESKAPE pathogens, showing that predictions extend beyond the organism-specific dataset to a broad range of bacteria. Overall, the developed ML approach can be used for compound prioritization before screening, increasing the typical hit rate of drug discovery.
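The fold-increase claim can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the rounded percentages quoted in the abstract, and the function and variable names are hypothetical rather than taken from the study. With rounded inputs the natural-product figure comes out slightly under 14; the abstract's "at least 14-fold" presumably rests on the unrounded counts, which are not given here.

```python
def fold_enrichment(hit_rate: float, baseline_rate: float) -> float:
    """Return the ratio of a hit rate to a baseline hit rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return hit_rate / baseline_rate


# Rounded values quoted in the abstract (not the underlying counts).
BASELINE_HIT_RATE = 0.0087  # 0.87% in the original high-throughput screen
PREDICTED_HIT_RATES = {
    "FDA-approved": 0.26,      # 26% among top-ranked predictions
    "natural products": 0.12,  # 12% among top-ranked predictions
}

enrichment = {
    library: fold_enrichment(rate, BASELINE_HIT_RATE)
    for library, rate in PREDICTED_HIT_RATES.items()
}

for library, fold in enrichment.items():
    print(f"{library}: {fold:.1f}-fold over the screening baseline")
```

On these rounded inputs, the FDA-approved set shows roughly a 30-fold increase and the natural products roughly a 14-fold increase over the screening baseline.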

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Anti-Bacterial Agents / pharmacology
Drug Discovery*
Machine Learning
Small Molecule Libraries* / pharmacology
United States

Substances

Small Molecule Libraries
Anti-Bacterial Agents

Grants and funding

169121 /CIHR/Canada