Using Machine Learning and Molecular Docking to Leverage Urease Inhibition Data for Virtual Screening

Int J Mol Sci. 2023 May 3;24(9):8180. doi: 10.3390/ijms24098180.

Abstract

Urease is a metalloenzyme that catalyzes the hydrolysis of urea, and its modulation plays an important role in both agriculture and medicine. Although numerous molecules have been tested against ureases from different species, their clinical translation has been limited by issues of chemical and metabolic stability as well as side effects. Screening new compounds against urease is therefore of interest, in part because of rising concerns about antibiotic resistance. In this work, we collected and curated a diverse set of 2640 publicly available small-molecule inhibitors of jack bean urease and developed a random forest classifier with high predictive performance. In addition, the physicochemical features of the compounds were paired with molecular docking and protein-ligand fingerprint analysis to gain insight into the current activity landscape. We observed that the docking score could not differentiate active from inactive compounds within each chemical family, but scores were correlated with compound activity when all compounds were considered together. Additionally, a decision tree model was built on 2D and 3D Morgan fingerprints to mine patterns in the known active-class compounds. The final machine learning model showed good predictive performance on the test set (81% and 77% precision for active and inactive compounds, respectively). Finally, as a proof of concept, this model was applied to an in-house library to predict new hits, which were then tested against urease and found to be active. To date, this is the largest and most diverse dataset of compounds used to develop predictive in silico models of urease inhibition. Overall, the results highlight the usefulness of machine learning classifiers and molecular docking for predicting novel urease inhibitors.
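The per-class precision figures quoted in the abstract can be read as follows: for each class, precision is the fraction of compounds predicted to be in that class that truly belong to it. The snippet below is a minimal sketch of that calculation, not the authors' evaluation code. The function name `per_class_precision` and the toy labels are assumptions made for illustration and are not data from the study.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Return {label: precision} for every label that was predicted at least once.

    precision(label) = correct predictions of label / all predictions of label
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    predicted = Counter(y_pred)
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / n for label, n in predicted.items()}


# Toy labels invented for illustration only (hypothetical, not from the paper).
y_true = ["active", "active", "active", "inactive",
          "inactive", "inactive", "inactive", "inactive", "active"]
y_pred = ["active", "active", "active", "active",
          "inactive", "inactive", "inactive", "inactive", "inactive"]

precision = per_class_precision(y_true, y_pred)
for label, value in sorted(precision.items()):
    print(f"{label}: {value:.2f}")
```

In this toy case, 3 of the 4 compounds predicted active are truly active, and 4 of the 5 predicted inactive are truly inactive. Reporting precision separately for each class, as the abstract does, shows how trustworthy a predicted "active" is independently of how trustworthy a predicted "inactive" is.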

Keywords: H. pylori; QSAR; jack bean urease; machine learning; protein–ligand interactions; random forest; urease.

MeSH terms

  • Computer Simulation
  • Enzyme Inhibitors* / chemistry
  • Molecular Docking Simulation
  • Urea
  • Urease* / metabolism

Substances

  • Urease
  • Enzyme Inhibitors
  • Urea