Data-Driven Machine Learning in Environmental Pollution: Gains and Problems

Environ Sci Technol. 2022 Feb 15;56(4):2124-2133. doi: 10.1021/acs.est.1c06157. Epub 2022 Jan 27.

Abstract

The complexity and dynamics of the environment make it extremely difficult to directly predict and trace temporal and spatial changes in pollution. In the past decade, the unprecedented accumulation of data, the growth of high-performance computing power, and the rise of diverse machine learning (ML) methods have provided new opportunities for environmental pollution research. ML has been used to process satellite data to estimate ground-level concentrations of atmospheric pollutants, to apportion pollution sources, and to model the spatial distribution of water pollutants. However, in contrast to the active use of ML in chemical toxicity prediction, advanced algorithms such as deep neural networks are still rarely applied in studies of the environmental processes of pollutants. In addition, over 40% of the environmental applications of ML concern air pollution, and both the range of applications and the acceptance of ML in other areas of environmental science have room to grow. Using ML methods to revolutionize environmental science and its problem-solving scenarios brings its own challenges. Several issues should be taken into consideration, such as the tradeoff between model performance and interpretability, the prerequisites of ML models, model selection, and data sharing.

Keywords: Artificial intelligence; Big data; Environmental processes; Machine learning; Pollution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Air Pollutants* / analysis
  • Air Pollution* / analysis
  • Algorithms
  • Environmental Monitoring / methods
  • Environmental Pollutants*
  • Machine Learning

Substances

  • Air Pollutants
  • Environmental Pollutants