Developing a SARS-CoV-2 main protease binding prediction random forest model for drug repurposing for COVID-19 treatment

Exp Biol Med (Maywood). 2023 Nov;248(21):1927-1936. doi: 10.1177/15353702231209413. Epub 2023 Nov 24.

Abstract

The coronavirus disease 2019 (COVID-19) global pandemic resulted in millions of people becoming infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and close to seven million deaths worldwide. It is essential to further explore and design effective COVID-19 treatment drugs that target the SARS-CoV-2 main protease, a major drug target for COVID-19. In this study, machine learning was applied to predict the SARS-CoV-2 main protease binding of Food and Drug Administration (FDA)-approved drugs to assist in identifying potential repurposing candidates for COVID-19 treatment. Ligands bound to the SARS-CoV-2 main protease in the Protein Data Bank and compounds experimentally tested in SARS-CoV-2 main protease binding assays in the literature were curated. These chemicals were divided into training (516 chemicals) and testing (360 chemicals) data sets. To identify SARS-CoV-2 main protease binders as potential candidates for repurposing to treat COVID-19, 1188 FDA-approved drugs were obtained from the Liver Toxicity Knowledge Base. A random forest algorithm was used to construct predictive models based on molecular descriptors calculated using Mold2 software. Model performance was evaluated using 100 iterations of fivefold cross-validation, which yielded a balanced accuracy of 78.8%. The random forest model constructed from the whole training data set was then used to predict SARS-CoV-2 main protease binding for the testing set and the FDA-approved drugs. Analysis of the model applicability domain and prediction confidence for the drugs predicted as main protease binders identified 10 FDA-approved drugs as potential candidates for repurposing to treat COVID-19. Our results demonstrate that machine learning is an efficient method for drug repurposing and, thus, may accelerate drug development targeting SARS-CoV-2.
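
The evaluation protocol described in the abstract (repeated fivefold cross-validation scored by balanced accuracy) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code. Several things here are assumptions made for illustration: the folds are stratified by class, balanced accuracy is computed once per iteration over the pooled out-of-fold predictions, and a toy nearest-centroid classifier on a single made-up descriptor stands in for the random forest trained on Mold2 descriptors. All function names are hypothetical.

```python
import random
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for labels 1 (binder) / 0 (non-binder)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with roughly equal class proportions."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        idx = by_class[label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal indices round-robin across folds
    return folds


def repeated_cv(X, y, fit, predict, k=5, repeats=100, seed=0):
    """Average balanced accuracy over `repeats` iterations of k-fold CV."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        y_pred = [None] * len(y)
        for held_out in stratified_folds(y, k, rng):
            held = set(held_out)
            train = [i for i in range(len(y)) if i not in held]
            model = fit([X[i] for i in train], [y[i] for i in train])
            for i in held_out:
                y_pred[i] = predict(model, X[i])
        scores.append(balanced_accuracy(y, y_pred))
    return sum(scores) / len(scores)


# Hypothetical stand-in learner (NOT a random forest): nearest class centroid
# on the first descriptor only.
def fit_centroids(X_train, y_train):
    sums, counts = defaultdict(float), defaultdict(int)
    for row, label in zip(X_train, y_train):
        sums[label] += row[0]
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_centroid(model, row):
    return min(model, key=lambda label: abs(row[0] - model[label]))


if __name__ == "__main__":
    # Made-up, well-separated toy data: five non-binders and five binders.
    X = [[float(v)] for v in (0, 1, 2, 3, 4, 10, 11, 12, 13, 14)]
    y = [0] * 5 + [1] * 5
    print(repeated_cv(X, y, fit_centroids, predict_centroid, k=5, repeats=100))
```

In practice, a library such as scikit-learn offers the same building blocks (`RandomForestClassifier`, `RepeatedStratifiedKFold`, `balanced_accuracy_score`); the sketch above only makes the bookkeeping explicit.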

Keywords: SARS-CoV-2; binding activity; drug repurposing; main protease; predictive model; random forest.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Antiviral Agents / pharmacology
  • Antiviral Agents / therapeutic use
  • COVID-19 Drug Treatment
  • COVID-19*
  • Coronavirus 3C Proteases
  • Drug Repositioning / methods
  • Humans
  • Molecular Docking Simulation
  • Protease Inhibitors / chemistry
  • Protease Inhibitors / metabolism
  • Protease Inhibitors / therapeutic use
  • Random Forest
  • SARS-CoV-2

Substances

  • 3C-like proteinase, SARS-CoV-2
  • Antiviral Agents
  • Coronavirus 3C Proteases
  • Protease Inhibitors