Machine learning in the identification, prediction and exploration of environmental toxicology: Challenges and perspectives

J Hazard Mater. 2022 Sep 15:438:129487. doi: 10.1016/j.jhazmat.2022.129487. Epub 2022 Jun 27.

Abstract

Over the past few decades, data-driven machine learning (ML) has distinguished itself from hypothesis-driven studies and has recently received much attention in environmental toxicology. However, the use of ML in environmental toxicology remains in the early stages, with knowledge gaps, technical bottlenecks in data quality, high-dimensional/heterogeneous/small-sample data analysis and model interpretability, and a lack of an in-depth understanding of environmental toxicology. Given the above problems, we review the recent progress in the literature and highlight state-of-the-art toxicological studies using ML (such as learning and predicting toxicity in complicated biosystems and multiple-factor environmental scenarios of long-term and large-scale pollution). Beyond predicting simple biological endpoints by integrating untargeted omics and adverse outcome pathways, ML development should focus on revealing toxicological mechanisms. The integration of data-driven ML with other methods (e.g., omics analysis and adverse outcome pathway frameworks) endows ML with widely promising application in revealing toxicological mechanisms. High-quality databases and interpretable algorithms are urgently needed for toxicology and environmental science. Addressing the core issues and future challenges for ML in this review may narrow the knowledge gap between environmental toxicity and computational science and facilitate the control of environmental risk in the future.

Keywords: Big data; Chemical; Environmental health; Machine learning; Toxicity.

Publication types

  • Review
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Databases, Factual
  • Ecotoxicology*
  • Hazardous Substances
  • Machine Learning*

Substances

  • Hazardous Substances