Classification models for predicting the bioactivity of pan-TRK inhibitors and SAR analysis

Xiaoman Zhao; Yue Kong; Yueshan Ji; Xiulan Xin; Liang Chen; Guang Chen; Changyuan Yu

doi:10.1007/s11030-023-10735-2

Mol Divers. 2023 Nov 1. doi: 10.1007/s11030-023-10735-2. Online ahead of print.

Authors

Xiaoman Zhao¹,², Yue Kong¹, Yueshan Ji¹, Xiulan Xin², Liang Chen², Guang Chen¹, Changyuan Yu³

Affiliations

¹ College of Life Science and Technology, Beijing University of Chemical Technology, 15 BeiSanHuan East Road, Beijing, 100029, People's Republic of China.
² College of Bioengineering, No. 9 Liangshuihe 1st Street, Beijing, 100176, People's Republic of China.
³ College of Life Science and Technology, Beijing University of Chemical Technology, 15 BeiSanHuan East Road, Beijing, 100029, People's Republic of China. yucy@buct.edu.cn.

PMID: 37910346
DOI: 10.1007/s11030-023-10735-2

Abstract

Tropomyosin receptor kinases (TRKs) are important broad-spectrum anticancer targets. Oncogenic rearrangement of the NTRK gene disrupts the extracellular domain and the epitopes recognized by therapeutic antibodies, making small-molecule inhibitors essential for treating NTRK fusion-driven tumors. In this work, several algorithms were used to construct descriptor-based and non-descriptor-based models, which were evaluated by outer 10-fold cross-validation. To find a model with good generalization ability, the dataset was partitioned by random splitting and by cluster-based splitting to construct in-domain and cross-domain models, respectively. Among the 48 models built, the combination of the deep neural network (DNN) algorithm with extended connectivity fingerprint 4 (ECFP4) descriptors achieved excellent performance under both dataset divisions. The results indicate that the DNN algorithm generalizes well and that feature richness plays a vital role in predicting molecules from regions of chemical space not seen during training. Additionally, we combined the clustering results with decision tree models built on fingerprint descriptors to perform structure-activity relationship (SAR) analysis. Nitrogen-containing aromatic heterocyclic and benzo-heterocyclic structures were found to play a crucial role in enhancing the activity of TRK inhibitors.

Graphical abstract: Workflow for generating predictive models for TRK inhibitors.
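The difference between the random (in-domain) and cluster-based (cross-domain) splits described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the clustering algorithm or similarity threshold, so the greedy leader clustering, the 0.5 Tanimoto cutoff, the function names, and the toy fingerprints below are all assumptions made for illustration. Fingerprints are represented as plain sets of on-bit indices; in practice ECFP4 bits would come from a cheminformatics toolkit.

```python
import random


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def leader_cluster(fps, threshold=0.5):
    """Greedy leader clustering (an assumed stand-in for the paper's method).

    Each molecule joins the first cluster whose leader is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    leaders, clusters = [], []
    for idx, fp in enumerate(fps):
        for c, leader in enumerate(leaders):
            if tanimoto(fp, fps[leader]) >= threshold:
                clusters[c].append(idx)
                break
        else:
            leaders.append(idx)
            clusters.append([idx])
    return clusters


def cluster_split(fps, test_fraction=0.2, threshold=0.5, seed=0):
    """Cross-domain split: whole clusters go to either train or test."""
    clusters = leader_cluster(fps, threshold)
    random.Random(seed).shuffle(clusters)
    target = test_fraction * len(fps)
    train, test = [], []
    for members in clusters:
        # Fill the test set with whole clusters until it reaches the target size.
        (test if len(test) < target else train).extend(members)
    return sorted(train), sorted(test)


def random_split(n, test_fraction=0.2, seed=0):
    """In-domain split: molecules are assigned without regard to structure."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(test_fraction * n)
    return sorted(idx[n_test:]), sorted(idx[:n_test])


# Hypothetical toy fingerprints: four structural families with shared bits.
toy_fps = [
    {1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 4, 5},
    {10, 11, 12, 13}, {10, 11, 12, 14}, {10, 11, 12, 13, 14},
    {20, 21, 22}, {20, 21, 23},
    {30, 31, 32}, {30, 31, 32, 33},
]

toy_clusters = leader_cluster(toy_fps)
cd_train, cd_test = cluster_split(toy_fps)
id_train, id_test = random_split(len(toy_fps))
```

With a cluster-based split, no structural family appears on both sides, so test performance reflects prediction on unfamiliar scaffolds. A random split may place near-duplicates of training molecules in the test set, which typically gives a more optimistic estimate.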

Keywords: Classification model; Deep neural network (DNN); Structure clustering; Structure–activity relationship (SAR) analysis; Tropomyosin receptor kinases (TRKs) inhibitor.

Grants and funding

SG030801/The Research on National Reference Material and Product Development of Natural Products