A review on machine learning methods for in silico toxicity prediction

Gabriel Idakwo; Joseph Luttrell; Minjun Chen; Huixiao Hong; Zhaoxian Zhou; Ping Gong; Chaoyang Zhang

doi:10.1080/10590501.2018.1537118

J Environ Sci Health C Environ Carcinog Ecotoxicol Rev. 2018;36(4):169-191. doi: 10.1080/10590501.2018.1537118. Epub 2019 Jan 10.

Authors

Gabriel Idakwo¹, Joseph Luttrell¹, Minjun Chen², Huixiao Hong², Zhaoxian Zhou¹, Ping Gong³, Chaoyang Zhang¹

Affiliations

¹ School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, Mississippi, USA.
² Division of Bioinformatics and Biostatistics, National Center for Toxicological Science, US Food and Drug Administration, Jefferson, Arkansas, USA.
³ Environmental Laboratory, US Army Engineer Research and Development Center, Vicksburg, Mississippi, USA.

PMID: 30628866

Abstract

In silico toxicity prediction plays an important role in regulatory decision making and in the selection of leads in drug design, as in vitro and in vivo methods are often limited by ethics, time, budget, and other resources. Many computational methods have been employed to predict the toxicity profiles of chemicals. This review provides a detailed end-to-end overview of the application of machine learning algorithms to Structure-Activity Relationship (SAR)-based predictive toxicology. From raw data to model validation, the importance of data quality is stressed, as it greatly affects the predictive power of derived models. Commonly overlooked challenges such as data imbalance, activity cliffs, model evaluation, and definition of the applicability domain are highlighted, and plausible solutions for alleviating these challenges are discussed.

Keywords: Toxicity prediction; machine learning; model reliability; molecular descriptors; prediction accuracy; structure-activity relationship.

Publication types

Research Support, U.S. Gov't, Non-P.H.S.
Review

MeSH terms

Algorithms
Computer Simulation
Environmental Pollutants / toxicity*
Machine Learning
Quantitative Structure-Activity Relationship
Support Vector Machine
Toxicity Tests / methods*

Substances

Environmental Pollutants