Prediction of compound-target interaction using several artificial intelligence algorithms and comparison with a consensus-based strategy

Karina Jimenes-Vargas; Alejandro Pazos; Cristian R Munteanu; Yunierkis Perez-Castillo; Eduardo Tejera

doi:10.1186/s13321-024-00816-1

J Cheminform. 2024 Mar 7;16(1):27. doi: 10.1186/s13321-024-00816-1.

Authors

Karina Jimenes-Vargas¹ ², Alejandro Pazos³ ⁴ ⁵, Cristian R Munteanu³ ⁴ ⁵, Yunierkis Perez-Castillo⁶, Eduardo Tejera⁷

Affiliations

¹ Bio-Cheminformatics Research Group, Universidad de Las Américas, Quito, 170504, Ecuador. karina.jimenes@udla.edu.ec.
² Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruña, Campus Elviña s/n, 15071, A Coruña, Spain. karina.jimenes@udla.edu.ec.
³ Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruña, Campus Elviña s/n, 15071, A Coruña, Spain.
⁴ CITIC-Research Center of Information and Communication Technologies, Universidade da Coruña, 15071, A Coruña, Spain.
⁵ Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006, A Coruña, Spain.
⁶ Bio-Cheminformatics Research Group, Universidad de Las Américas, Quito, 170504, Ecuador.
⁷ Bio-Cheminformatics Research Group, Universidad de Las Américas, Quito, 170504, Ecuador. eduardo.tejera@udla.edu.ec.

Abstract

Predicting the possible protein targets of a chemical compound is crucial for understanding its mechanism of action and side effects, as well as for drug discovery. This study examines 15 target-centric models (TCM) developed with different molecular descriptions and machine learning algorithms. They were compared with 17 third-party models implemented as web tools (WTCM). For both sets of models, consensus strategies were implemented as a potential improvement over individual predictions. The findings indicate that the TCM reach F1-score values greater than 0.8. Comparing both approaches, the best TCM achieves values of 0.75, 0.61, 0.25 and 0.38 for the true positive rate (TPR), true negative rate (TNR), false negative rate (FNR) and false positive rate (FPR), respectively, outperforming the best WTCM. Moreover, the consensus strategy yields its most relevant results in the top 20% of target profiles. The TCM consensus reaches TPR and FNR values of 0.98 and 0, while the WTCM consensus reaches values of 0.75 and 0.24. The computational tool implementing the TCM and their consensus strategy is available at: https://bioquimio.udla.edu.ec/tidentification01/ . Scientific Contribution: We compare and discuss the performance of 17 public compound-target interaction prediction models and 15 newly constructed ones. We also explore a compound-target interaction prioritization strategy using a consensus approach, and we analyze the challenges involved in interaction modeling.

Keywords: Ligand-based modeling; Machine learning; QSAR; Target fishing; Target identification.
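
The rates quoted in the abstract (TPR, TNR, FNR, FPR) follow the standard confusion-matrix definitions, and the consensus idea can be illustrated with a small sketch. The Python below is an illustration only: the counts are hypothetical, and the consensus rule (average the per-model scores for each target, then keep the top fraction) is an assumption made for this example, since the abstract does not specify how the authors combine model outputs. The names `Confusion`, `rates`, and `consensus_top_fraction` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts for binary compound-target predictions."""
    tp: int  # interactions correctly predicted
    fn: int  # interactions missed
    tn: int  # non-interactions correctly rejected
    fp: int  # non-interactions wrongly predicted


def rates(c: Confusion) -> dict[str, float]:
    """Standard rate definitions: TPR + FNR = 1 and TNR + FPR = 1."""
    positives = c.tp + c.fn
    negatives = c.tn + c.fp
    return {
        "TPR": c.tp / positives,
        "FNR": c.fn / positives,
        "TNR": c.tn / negatives,
        "FPR": c.fp / negatives,
    }


def consensus_top_fraction(scores_by_model: list[dict[str, float]],
                           fraction: float = 0.2) -> list[str]:
    """Assumed consensus rule: rank targets by mean score across models
    and keep the top `fraction` of the ranking (at least one target)."""
    # Only targets scored by every model are ranked.
    targets = sorted(set.intersection(*(set(s) for s in scores_by_model)))
    n_models = len(scores_by_model)
    mean_score = {t: sum(s[t] for s in scores_by_model) / n_models
                  for t in targets}
    ranked = sorted(targets, key=lambda t: mean_score[t], reverse=True)
    keep = max(1, round(fraction * len(ranked)))
    return ranked[:keep]


# Hypothetical counts and scores, not taken from the paper.
example_rates = rates(Confusion(tp=75, fn=25, tn=61, fp=39))
example_models = [
    {"T1": 0.9, "T2": 0.2, "T3": 0.5, "T4": 0.1, "T5": 0.4},
    {"T1": 0.8, "T2": 0.3, "T3": 0.6, "T4": 0.2, "T5": 0.1},
    {"T1": 0.7, "T2": 0.1, "T3": 0.4, "T4": 0.3, "T5": 0.2},
]
example_top = consensus_top_fraction(example_models, fraction=0.2)
```

Because each pair of rates is normalized by the same class total, TPR and FNR should sum to 1, as should TNR and FPR, up to rounding. Other consensus rules, such as majority voting or rank averaging, would fit the same interface.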