Curated Database and Preliminary AutoML QSAR Model for 5-HT1A Receptor

Pharmaceutics. 2021 Oct 16;13(10):1711. doi: 10.3390/pharmaceutics13101711.

Abstract

Bringing a new drug to market is a challenging and resource-intensive process. Predictive models built with artificial intelligence could meet the growing need for efficient tools that offer both practical and scientific benefits, but they require large amounts of high-quality data. The aim of our project was to develop a quantitative structure-activity relationship (QSAR) model that predicts serotonergic activity toward the 5-HT1A receptor from a purpose-built, curated database. The dataset was compiled from the ZINC and ChEMBL databases. It contains 9440 unique compounds, making it the largest available database of 5-HT1A ligands with specified pKi values to date. The predictive model was developed using automated machine learning (AutoML) methods. In 10-fold cross-validation (10-CV), the root-mean-squared error (RMSE) was 0.5437 and the coefficient of determination (R²) was 0.74. Moreover, the Shapley Additive Explanations (SHAP) method was applied to gain a more in-depth understanding of how the variables influence the model's predictions. In line with the problem definition, the developed model can efficiently predict the affinity of new molecules for the 5-HT1A receptor from their structure, encoded as molecular descriptors. Using this model in screening can significantly improve the discovery of new drugs for mental disorders and anticancer therapy.
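As an illustration of how the two reported metrics can be obtained, the following minimal sketch runs a 10-fold cross-validation and computes the RMSE and the coefficient of determination from the out-of-fold predictions. It is not the authors' pipeline: the data are synthetic, the one-descriptor least-squares model is only a placeholder for the AutoML model, and pooling the predictions from all folds before scoring is an assumption, since the abstract does not say whether the metrics were pooled or averaged per fold. All function names are hypothetical.

```python
import math
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (placeholder model)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(y_true, y_pred):
    """Root-mean-squared error between observed and predicted values."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_validate(xs, ys, k=10, seed=0):
    """Return (RMSE, R2) computed on pooled out-of-fold predictions."""
    folds = k_fold_indices(len(xs), k, seed)
    preds = [0.0] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        # Each sample is predicted by a model that never saw it in training.
        for i in held_out:
            preds[i] = slope * xs[i] + intercept
    return rmse(ys, preds), r2(ys, preds)


# Synthetic stand-in data (assumption): one descriptor and a noisy pKi-like target.
rng = random.Random(42)
descriptor = [rng.uniform(0.0, 10.0) for _ in range(200)]
pki = [5.0 + 0.4 * x + rng.gauss(0.0, 0.5) for x in descriptor]

cv_rmse, cv_r2 = cross_validate(descriptor, pki, k=10)
print(f"10-CV RMSE = {cv_rmse:.3f}, R2 = {cv_r2:.3f}")
```

Because pKi is a logarithmic quantity, an RMSE on this scale is expressed in log units; an error of about 0.54 corresponds to roughly a 3.5-fold error in Ki.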

Keywords: 5-HT1A receptor; AutoML; Mordred descriptors; QSAR; SHAP; curated database; pKi.