Usage of model combination in computational toxicology

Pablo Rodríguez-Belenguer; Eric March-Vila; Manuel Pastor; Victor Mangas-Sanjuan; Emilio Soria-Olivas

doi:10.1016/j.toxlet.2023.10.013

Usage of model combination in computational toxicology

Toxicol Lett. 2023 Nov 1:389:34-44. doi: 10.1016/j.toxlet.2023.10.013. Epub 2023 Oct 27.

Authors

Pablo Rodríguez-Belenguer¹, Eric March-Vila², Manuel Pastor², Victor Mangas-Sanjuan³, Emilio Soria-Olivas⁴

Affiliations

¹ Research Programme on Biomedical Informatics (GRIB), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Hospital del Mar Medical Research Institute, 08003 Barcelona, Spain; Department of Pharmacy and Pharmaceutical Technology and Parasitology, Universitat de València, 46100 Valencia, Spain.
² Research Programme on Biomedical Informatics (GRIB), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Hospital del Mar Medical Research Institute, 08003 Barcelona, Spain.
³ Department of Pharmacy and Pharmaceutical Technology and Parasitology, Universitat de València, 46100 Valencia, Spain; Interuniversity Research Institute for Molecular Recognition and Technological Development, Universitat Politècnica de València, 46100 Valencia, Spain.
⁴ IDAL, Intelligent Data Analysis Laboratory, ETSE, Universitat de València, 46100 Valencia, Spain. Electronic address: emilio.soria@uv.es.

PMID: 37890682
DOI: 10.1016/j.toxlet.2023.10.013

Abstract

New Approach Methodologies (NAMs) have ushered in a new era in the field of toxicology, aiming to replace animal testing. However, despite these advancements, they are not exempt from the inherent complexities associated with the study's endpoint. In this review, we have identified three major groups of complexities: mechanistic, chemical space, and methodological. The mechanistic complexity arises from interconnected biological processes within a network that are challenging to model in a single step. In the second group, chemical space complexity exhibits significant dissimilarity between compounds in the training and test series. The third group encompasses algorithmic and molecular descriptor limitations and typical class imbalance problems. To address these complexities, this work provides a guide to the usage of a combination of predictive Quantitative Structure-Activity Relationship (QSAR) models, known as metamodels. This combination of low-level models (LLMs) enables a more precise approach to the problem by focusing on different sub-mechanisms or sub-processes. For mechanistic complexity, multiple Molecular Initiating Events (MIEs) or levels of information are combined to form a mechanistic-based metamodel. Regarding the complexity arising from chemical space, two types of approaches were reviewed to construct a fragment-based chemical space metamodel: those with and without structure sharing. Metamodels with structure sharing utilize unsupervised strategies to identify data patterns and build low-level models for each cluster, which are then combined. For situations without structure sharing due to pharmaceutical industry intellectual property, the use of prediction sharing, and federated learning approaches have been reviewed. Lastly, to tackle methodological complexity, various algorithms are combined to overcome their limitations, diverse descriptors are employed to enhance problem definition and balanced dataset combinations are used to address class imbalance issues (methodological-based metamodels). Remarkably, metamodels consistently outperformed classical QSAR models across all cases, highlighting the importance of alternatives to classical QSAR models when faced with such complexities.

Keywords: Complexities; Machine learning; Metamodel; NAMs; QSAR.

Publication types

Review

MeSH terms

Algorithms*
Animals
Quantitative Structure-Activity Relationship*