Enhancing Acute Oral Toxicity Predictions by using Consensus Modeling and Algebraic Form-Based 0D-to-2D Molecular Encodes

César R García-Jacas; Yovani Marrero-Ponce; Fernando Cortés-Guzmán; José Suárez-Lezcano; Felix O Martinez-Rios; Luis A García-González; Mario Pupo-Meriño; Karina Martinez-Mayorga

doi:10.1021/acs.chemrestox.9b00011

Chem Res Toxicol. 2019 Jun 17;32(6):1178-1192. doi: 10.1021/acs.chemrestox.9b00011. Epub 2019 May 17.

Authors

César R García-Jacas¹, Yovani Marrero-Ponce²,³, Fernando Cortés-Guzmán⁴, José Suárez-Lezcano⁵, Felix O Martinez-Rios⁶, Luis A García-González⁷, Mario Pupo-Meriño⁷, Karina Martinez-Mayorga⁴

Affiliations

¹ Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada, Ensenada, Baja California, México.
² Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional, Colegio de Ciencias de la Salud, Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador.
³ Grupo de Investigación Ambiental, Programas Ambientales, Facultad de Ingenierías, Fundacion Universitaria Tecnologico Comfenalco-Cartagena, Cr44 DN 30 A, 91, Cartagena, Bolívar, Colombia.
⁴ Instituto de Química, Universidad Nacional Autónoma de México, Ciudad de México, México.
⁵ Pontificia Universidad Católica del Ecuador Sede Esmeraldas, Esmeraldas, Ecuador.
⁶ Facultad de Ingeniería, Universidad Panamericana, Ciudad de México, México.
⁷ Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas, La Habana, Cuba.

PMID: 31066547
DOI: 10.1021/acs.chemrestox.9b00011

Abstract

Quantitative structure-activity relationship (QSAR) models are introduced to predict acute oral toxicity (AOT), using the QuBiLS-MAS framework (acronym for quadratic, bilinear and N-Linear maps based on graph-theoretic electronic-density matrices and atomic weightings) for the molecular encoding. Three training sets were employed to build the models: the EPA training set (5931 compounds), the EPA-full training set (7413 compounds), and the Zhu training set (10 152 compounds). The EPA test set (1482 compounds) was used to validate the QSAR models built on the EPA training set, while the ProTox (425 compounds) and T3DB (284 compounds) external sets were used to assess all the models. The k-nearest neighbor, multilayer perceptron, random forest, and support vector machine procedures were employed to build several base (individual) models. The base models with R_EPA-training ≥ 0.75 (R = correlation coefficient) and MAE_EPA-training ≤ 0.5 (MAE = mean absolute error) were retained to build consensus models. As a result, two consensus models based on the minimum operator, denoted as M19 and M22, and one consensus model based on the weighted average operator, denoted as M24, were selected as the best ones for each training set considered. According to the applicability domain (AD) analysis performed, model M19 (built on the EPA training set) has MAE_test-AD = 0.4044, MAE_ProTox-AD = 0.4067, and MAE_T3DB-AD = 0.2586 on the EPA test set, ProTox external set, and T3DB external set, respectively. Model M22 (built on the EPA-full set) has MAE_ProTox-AD = 0.3992 and MAE_T3DB-AD = 0.2286 on the two external sets, and model M24 (built on the Zhu set) has MAE_ProTox-AD = 0.3773 and MAE_T3DB-AD = 0.2471. These outcomes were compared and statistically validated against 14 QSAR methods (e.g., admetSAR, ProTox-II) from the literature. As a result, model M22 presents the best overall performance.
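The retention rule and the two consensus operators named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example statistics, and the weights are hypothetical, and the abstract does not say how the weights of the weighted average operator were chosen.

```python
from typing import Dict, List, Sequence, Tuple


def retain_base_models(
    stats: Dict[str, Tuple[float, float]],
    r_min: float = 0.75,
    mae_max: float = 0.5,
) -> List[str]:
    """Keep base models whose training R >= r_min and training MAE <= mae_max.

    `stats` maps a (hypothetical) model name to its (R, MAE) on the training set.
    """
    return [name for name, (r, mae) in stats.items() if r >= r_min and mae <= mae_max]


def consensus_minimum(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Minimum operator: for each compound, take the smallest base prediction.

    `predictions` holds one list per base model, all in the same compound order.
    """
    return [min(column) for column in zip(*predictions)]


def consensus_weighted_average(
    predictions: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Weighted average operator: one weight per base model.

    The weighting scheme is an assumption; the source does not specify it.
    """
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, column)) / total
        for column in zip(*predictions)
    ]


# Hypothetical training statistics (R, MAE) for four base models.
example_stats = {
    "knn": (0.80, 0.45),
    "mlp": (0.70, 0.40),  # dropped: R below 0.75
    "rf": (0.85, 0.55),   # dropped: MAE above 0.5
    "svm": (0.78, 0.50),
}
retained = retain_base_models(example_stats)

# Hypothetical predictions of the two retained models for two compounds.
example_predictions = [[2.0, 3.0], [2.5, 2.0]]
min_consensus = consensus_minimum(example_predictions)
avg_consensus = consensus_weighted_average(example_predictions, weights=[1.0, 3.0])
```

With these made-up inputs, only `knn` and `svm` pass both thresholds, and each consensus function returns one value per compound.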
In addition, a retrospective study was performed on 261 drugs withdrawn because of their toxic or side effects, to assess the usefulness of prospectively using the proposed QSAR models in the labeling of chemicals. A comparison with the methods from the literature was also made. As a result, model M22 has the best ability to label a compound as toxic according to the Globally Harmonized System of Classification and Labelling of Chemicals (GHS). Therefore, it can be concluded that the proposed models, especially model M22, constitute prominent tools for studying AOT, providing the best results among all the methods examined. Freely available software was also developed for use in virtual screening tasks (http://tomocomd.com/apps/ptoxra).
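As an illustration of GHS labeling, the sketch below maps an oral LD50 in mg/kg body weight to a GHS acute oral toxicity category, using the standard GHS cut-offs (5, 50, 300, 2000, and 5000 mg/kg). The conversion helper assumes that a model predicts -log10(LD50) with LD50 in mol/kg; the abstract does not state the prediction units, so that helper and all function names are assumptions made only for this example.

```python
from typing import Optional

# Upper LD50 bounds (mg/kg body weight) for GHS acute oral toxicity categories 1-5.
GHS_ORAL_UPPER_BOUNDS = [(5.0, 1), (50.0, 2), (300.0, 3), (2000.0, 4), (5000.0, 5)]


def ghs_oral_category(ld50_mg_per_kg: float) -> Optional[int]:
    """Return the GHS acute oral category (1-5), or None if LD50 > 5000 mg/kg."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be positive")
    for upper_bound, category in GHS_ORAL_UPPER_BOUNDS:
        if ld50_mg_per_kg <= upper_bound:
            return category
    return None  # above 5000 mg/kg: not classified


def neg_log_molar_ld50_to_mg_per_kg(p_ld50: float, mol_weight: float) -> float:
    """Convert -log10(LD50 [mol/kg]) to LD50 [mg/kg].

    Assumption: predictions are expressed as -log10(mol/kg); `mol_weight` is in g/mol.
    """
    return (10.0 ** -p_ld50) * mol_weight * 1000.0


# Hypothetical compound: predicted -log10(LD50) = 3.0, molecular weight 100 g/mol.
example_ld50 = neg_log_molar_ld50_to_mg_per_kg(3.0, 100.0)
example_category = ghs_oral_category(example_ld50)
```

Category 5 (2000 to 5000 mg/kg) is optional in the GHS and is not adopted by every jurisdiction, so a given labeling scheme may stop at category 4.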

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Administration, Oral
Animals
Cluster Analysis*
Humans
Quantitative Structure-Activity Relationship
Support Vector Machine*
Toxicity Tests, Acute*