Classification models for predicting the bioactivity of pan-TRK inhibitors and SAR analysis

Mol Divers. 2023 Nov 1. doi: 10.1007/s11030-023-10735-2. Online ahead of print.

Abstract

Tropomyosin receptor kinases (TRKs) are important broad-spectrum anticancer targets. Oncogenic rearrangement of the NTRK gene disrupts the extracellular domain and the epitopes for therapeutic antibodies, making small-molecule inhibitors essential for treating NTRK fusion-driven tumors. In this work, several algorithms were used to construct descriptor-based and non-descriptor-based models, which were evaluated by outer 10-fold cross-validation. To find a model with good generalization ability, the dataset was partitioned by random splitting and by cluster splitting to construct in-domain and cross-domain models, respectively. Among the 48 models built, the model combining the deep neural network (DNN) algorithm with extended connectivity fingerprints 4 (ECFP4) descriptors achieved excellent performance under both dataset divisions. The results indicate that the DNN algorithm generalizes well and that the richness of features plays a vital role in predicting molecules from unknown regions of chemical space. Additionally, we combined the clustering results with decision tree models built on fingerprint descriptors to perform structure-activity relationship analysis. Nitrogen-containing aromatic heterocyclic and benzo-heterocyclic structures were found to play a crucial role in enhancing the activity of TRK inhibitors.

Graphical abstract: Workflow for generating predictive models for TRK inhibitors.
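The difference between the random (in-domain) and cluster-based (cross-domain) partitions described above can be illustrated with a minimal sketch. The abstract does not say which clustering method, similarity threshold, or split ratio the authors used, so everything here is an assumption: fingerprints are toy sets of on-bit indices rather than real ECFP4 vectors, a simple leader-style clustering on Tanimoto similarity is swapped in, and the function names (`tanimoto`, `leader_cluster`, `random_split`, `cluster_split`) are hypothetical.

```python
import random


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def leader_cluster(fps, threshold=0.5):
    """Assumed clustering: join the first cluster whose leader is at least
    `threshold` similar, otherwise start a new cluster with this molecule."""
    leaders, clusters = [], []
    for idx, fp in enumerate(fps):
        for c, leader in enumerate(leaders):
            if tanimoto(fp, fps[leader]) >= threshold:
                clusters[c].append(idx)
                break
        else:
            leaders.append(idx)
            clusters.append([idx])
    return clusters


def random_split(n, test_fraction=0.2, seed=0):
    """In-domain split: molecules are shuffled individually."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(n * test_fraction))
    return sorted(idx[n_test:]), sorted(idx[:n_test])


def cluster_split(clusters, test_fraction=0.2, seed=0):
    """Cross-domain split: whole clusters go to either train or test."""
    order = list(range(len(clusters)))
    random.Random(seed).shuffle(order)
    target = sum(len(c) for c in clusters) * test_fraction
    train, test = [], []
    for c in order:
        # Fill the test set with whole clusters until the target size is reached.
        (test if len(test) < target else train).extend(clusters[c])
    return sorted(train), sorted(test)


# Toy fingerprints (made up for illustration): four structural series.
fps = [
    {1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 4, 5},
    {10, 11, 12, 13}, {10, 11, 12, 14},
    {20, 21, 22}, {20, 21, 23},
    {30, 31, 32, 33}, {30, 31, 32, 34}, {30, 31, 32, 33, 34},
]
clusters = leader_cluster(fps)
rand_train, rand_test = random_split(len(fps))
clus_train, clus_test = cluster_split(clusters)
```

In the cluster split, no structural series appears on both sides, so a model is scored on scaffolds it has not seen during training. In the random split, close analogues of test molecules typically remain in the training set, which is why random-split scores tend to be more optimistic.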

Keywords: Classification model; Deep neural network (DNN); Structure clustering; Structure–activity relationship (SAR) analysis; Tropomyosin receptor kinases (TRKs) inhibitor.