Convergent QSAR Models for the Prediction of Cruzain Inhibitors

ACS Omega. 2023 Oct 13;8(42):38961-38982. doi: 10.1021/acsomega.3c03376. eCollection 2023 Oct 24.

Abstract

Chagas disease is a parasitosis caused by Trypanosoma cruzi. Cruzain, the major cysteine protease from T. cruzi, is an excellent therapeutic target in the search for antichagasic drugs. It plays an important role in cell invasion, replication, differentiation, and metabolism of the parasite. In this work, we developed and assessed multiple quantitative structure-activity relationship (QSAR) models for a set of 61 cruzain inhibitors. These models include two-dimensional (2D) QSAR; three-dimensional (3D) QSAR, such as comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA); and hologram QSAR (HQSAR). In total, we generated 10 major and 114 minor model variations. The molecules were successfully aligned by molecular docking. All CoMFA and CoMSIA models, which incorporate multiple fields, demonstrated robustness in our analysis. Steric fields exhibited satisfactory convergence in the contour maps, while the electrostatic field converged into a single small region. The HQSAR model that considers only Atoms and Connectivity, with fragment sizes ranging from two to five atoms, was considered the best of the HQSAR variations, despite exhibiting a higher level of deviance. In total, 78 model variations met the minimum requirements to be considered acceptable. We found that 2D-QSAR can yield robust results with as few as five descriptors. Models such as Random Forest, Tree Ensemble, Linear Regression, and HQSAR are recommended for working with large data sets, while the 3D-QSAR models are intended for studying the geometry of the ligands in order to optimize them into new and better-performing antichagasics. Virtual screening of a set of hydrazones, guided by the top-performing models, identified promising candidates for experimental validation. Among them, dv007 and dv015 exhibited consistently high predicted pIC50 values (7.26 and 7.24, respectively), making them compelling candidates for further drug development.
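
For readers less familiar with the pIC50 scale, the short sketch below converts the predicted values quoted above into concentrations. It assumes the conventional definition, pIC50 = -log10(IC50 in mol/L), which the abstract does not state explicitly; the function names are illustrative and are not taken from the paper.

```python
import math


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 value to an IC50 in nanomolar.

    Assumes pIC50 = -log10(IC50 in mol/L), the conventional definition.
    """
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # mol/L -> nmol/L


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar back to pIC50 (same assumption)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_nm * 1e-9)


if __name__ == "__main__":
    # Predicted pIC50 values reported in the abstract
    for name, pic50 in {"dv007": 7.26, "dv015": 7.24}.items():
        print(f"{name}: pIC50 {pic50:.2f} -> IC50 ~ {pic50_to_ic50_nm(pic50):.1f} nM")
```

Under this assumption, the predicted values for dv007 and dv015 correspond to IC50 values of roughly 55 nM and 58 nM, respectively. These are model predictions, not measured potencies.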