A review on machine learning methods for in silico toxicity prediction

J Environ Sci Health C Environ Carcinog Ecotoxicol Rev. 2018;36(4):169-191. doi: 10.1080/10590501.2018.1537118. Epub 2019 Jan 10.

Abstract

In silico toxicity prediction plays an important role in regulatory decision making and in the selection of leads in drug design, as in vitro and in vivo methods are often limited by ethics, time, budget, and other resources. Many computational methods have been employed to predict the toxicity profiles of chemicals. This review provides a detailed end-to-end overview of the application of machine learning algorithms to Structure-Activity Relationship (SAR)-based predictive toxicology. From raw data to model validation, the importance of data quality is stressed, as it greatly affects the predictive power of the derived models. Commonly overlooked challenges such as data imbalance, activity cliffs, model evaluation, and the definition of the applicability domain are highlighted, and plausible solutions for alleviating these challenges are discussed.
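As a brief illustration of why data imbalance and model evaluation are linked, the following minimal sketch compares plain accuracy with two imbalance-aware metrics (balanced accuracy and the Matthews correlation coefficient). The data set is hypothetical (95 non-toxic and 5 toxic compounds), the "model" is a trivial one that labels everything non-toxic, and the function names are illustrative assumptions rather than anything taken from the review.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = toxic and 0 = non-toxic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of all compounds classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity and specificity; an undefined rate is treated as 0."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical imbalanced data set: 95 non-toxic (0) and 5 toxic (1) compounds.
y_true = [0] * 95 + [1] * 5
# Trivial classifier (an assumption for illustration): predict non-toxic for everything.
y_pred = [0] * 100

counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)
bal_acc = balanced_accuracy(*counts)
mcc_value = mcc(*counts)

print(f"accuracy={acc:.2f} balanced_accuracy={bal_acc:.2f} mcc={mcc_value:.2f}")
```

In this contrived case accuracy looks high even though no toxic compound is detected, while the two imbalance-aware metrics expose the failure. Which metric is appropriate for a given toxicity endpoint is a judgment call that this sketch does not settle.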

Keywords: Toxicity prediction; machine learning; model reliability; molecular descriptors; prediction accuracy; structure-activity relationship.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Review

MeSH terms

  • Algorithms
  • Computer Simulation
  • Environmental Pollutants / toxicity*
  • Machine Learning
  • Quantitative Structure-Activity Relationship
  • Support Vector Machine
  • Toxicity Tests / methods*

Substances

  • Environmental Pollutants