Prediction of bioactivities of microsomal prostaglandin E2 synthase-1 inhibitors by machine learning algorithms

Chem Biol Drug Des. 2023 Jun;101(6):1307-1321. doi: 10.1111/cbdd.14214. Epub 2023 Feb 20.

Abstract

There is strong interest in developing microsomal prostaglandin E2 synthase-1 (mPGES-1) inhibitors because of their potential to treat inflammation safely and effectively. Herein, 70 QSAR models were built on a dataset of 735 mPGES-1 inhibitors characterized with RDKit descriptors, using multiple linear regression (MLR), support vector machine (SVM), random forest (RF), deep neural networks (DNN), and eXtreme Gradient Boosting (XGBoost). Three further regression models were built on the dataset represented by SMILES, using self-attention recurrent neural networks (RNN) and graph convolutional networks (GCN). The best model (Model C2), developed by SVM with RDKit descriptors, achieved a coefficient of determination (R²) of 0.861 and a root mean squared error (RMSE) of 0.235 on the test set, and an R² of 0.692 and an RMSE of 0.383 on the external test set. We investigated the applicability domain (AD) of Model C2 with the rivality index (RI): the predictions of Model C2 were reliable for 78.92% of molecules in the test set and 78.33% of molecules in the external test set. After dissecting the RDKit descriptors of Model C2, we identified important physicochemical properties of highly active mPGES-1 inhibitors. In addition, by analyzing the attention weight of each atom of each inhibitor from the attention layer, we found that the benzamide group and the trifluoromethyl cyclohexane group are favorable substructures for mPGES-1 inhibitors.
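
For readers unfamiliar with the two regression metrics quoted above, the following is a minimal sketch of how a coefficient of determination and an RMSE are commonly computed. It assumes the standard definitions (one minus the ratio of residual to total sum of squares, and the square root of the mean squared residual). The abstract does not state which formula the authors used. The function names and the activity values are hypothetical and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination, assuming the usual 1 - SS_res / SS_tot form."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted activities (illustration only,
# not data from the paper).
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 6.5, 8.0]

print(f"R2   = {r_squared(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
```

Under these definitions, a higher coefficient of determination and a lower RMSE both indicate closer agreement between predicted and observed activities, which is the sense in which the test-set figures are better than the external-test-set figures.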

Keywords: applicability domain (AD); machine learning (ML); microsomal prostaglandin E2 synthase-1 (mPGES-1) inhibitor; quantitative structure-activity relationship (QSAR); self-attention recurrent neural networks (RNN).

MeSH terms

  • Algorithms*
  • Machine Learning
  • Prostaglandin-E Synthases
  • Prostaglandins
  • Quantitative Structure-Activity Relationship*
  • Support Vector Machine

Substances

  • Prostaglandin-E Synthases
  • Prostaglandins