Modeling structure-activity relationships with machine learning to identify GSK3-targeted small molecules as potential COVID-19 therapeutics

Front Endocrinol (Lausanne). 2023 Mar 6;14:1084327. doi: 10.3389/fendo.2023.1084327. eCollection 2023.

Abstract

Coronaviruses induce severe upper respiratory tract infections, which can spread to the lungs. The nucleocapsid protein (N protein) plays an important role in genome replication, transcription, and virion assembly in SARS-CoV-2, the virus causing COVID-19, and in other coronaviruses. Activated glycogen synthase kinase 3 (GSK3) phosphorylates the viral N protein. Interfering with the dependence of the N protein on GSK3 may therefore be a viable strategy for combating COVID-19 and future coronavirus outbreaks. Toward this end, this study aimed to construct robust machine learning models that identify GSK3 inhibitors in Food and Drug Administration-approved and investigational drug libraries using the quantitative structure-activity relationship (QSAR) approach. A non-redundant dataset of 495 compounds for GSK3α and 3070 compounds for GSK3β was acquired from the ChEMBL database. Twelve sets of molecular descriptors were used to characterize these inhibitors, and machine learning algorithms were selected using the LazyPredict package. Histogram-based gradient boosting and light gradient boosting machine (LightGBM) algorithms were used to develop predictive models, which were evaluated by root mean square error and R-squared value. Finally, the two drugs with the highest predicted activity (negative log of the half-maximal inhibitory concentration, pIC50), selinexor and ruboxistaurin, were selected for molecular dynamics simulation to further investigate the structural stability of the protein-ligand complexes. This artificial intelligence-based virtual high-throughput screening approach is an effective strategy for accelerating drug discovery and finding novel pharmacological targets while reducing cost and time.
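
The quantities named in the abstract (pIC50 as the prediction target, root mean square error and R-squared as evaluation metrics) can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, and the sketch assumes IC50 values are reported in nanomolar, a common convention for bioactivity records that the abstract does not state.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed unit) to pIC50.

    pIC50 = -log10(IC50 in mol/L); since 1 nM = 1e-9 mol/L,
    this equals 9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def rmse(y_true, y_pred) -> float:
    """Root mean square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A higher pIC50 corresponds to a lower IC50, that is, a more potent predicted inhibitor, which is why the screening ranks candidates by the highest predicted pIC50. A lower RMSE and an R-squared closer to 1 indicate a better fit of predicted to observed pIC50 values.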

Keywords: GSK3; QSAR; coronaviruses; machine learning; molecular descriptors.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence
  • COVID-19*
  • Glycogen Synthase Kinase 3 / metabolism
  • Humans
  • Machine Learning
  • SARS-CoV-2
  • Structure-Activity Relationship
  • United States

Substances

  • Glycogen Synthase Kinase 3

Grants and funding

The Korea Drug Development Fund funded by the Ministry of Science and ICT; the Ministry of Trade, Industry, and Energy; and the Ministry of Health and Welfare (HN21C1058) supported this work. The National Research Foundation of Korea (NRF-2022M3A9G1014520, 2019M3D1A1078940, and 2019R1A6A1A11051471) also supported this study.