Understanding the driving mechanisms of site contamination in China through a data-driven approach

Environ Pollut. 2024 Feb 1:342:123105. doi: 10.1016/j.envpol.2023.123105. Epub 2023 Dec 7.

Abstract

China currently faces significant environmental risks stemming from contaminated sites. The driving mechanism of site contamination, influenced by various drivers, remains obscured due to a dearth of quantitative methodologies and comprehensive data. Here, we used a data-driven causality inference approach to construct an interpretable random forest (RF) model. Results show that: (1) the trained RF model demonstrated remarkable predictive accuracy for identifying contaminated sites, with an accuracy of 0.89. In contrast to conventional correlation analysis, the RF model excels in discerning the key drivers through non-linear and genuine causal relationships between these drivers and site contamination. (2) Among the 25 potential drivers, we identified 18 key drivers of site contamination. These drivers encompass a broad spectrum of factors, including production and operational data, pollutant control level, site protection capability, pollutant characteristics, and physical-geographical conditions. (3) Each key driver exerts a different impact on site pollution, varying in direction, intensity, and underlying pattern. The partial dependence plots (PDPs) illuminate the role of each key driver, its critical value contributing to site pollution, and the interplay between these drivers. The key drivers facilitate the realization of three primary contamination processes: uncontrolled release, effective migration, and persistent accumulation. In light of our findings, environmental managers can proactively prevent site contamination by regulating single, dual, and multiple key drivers to disrupt critical pollution processes. This research offers valuable insights for devising targeted strategies and interventions aimed at mitigating environmental risks associated with contaminated sites in China.
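As an illustration of how a partial dependence curve of the kind mentioned above is computed, the following is a minimal, model-agnostic sketch and not the authors' implementation. The predictor `toy_predict`, the two features, the threshold of 50, and the sample rows are all hypothetical stand-ins; in practice a fitted classifier such as a random forest would take the predictor's place.

```python
from statistics import mean
from typing import Callable, List, Sequence

Row = List[float]


def partial_dependence(
    predict: Callable[[Row], float],
    rows: Sequence[Row],
    feature: int,
    grid: Sequence[float],
) -> List[float]:
    """Average prediction when `feature` is forced to each grid value."""
    curve = []
    for value in grid:
        preds = []
        for row in rows:
            modified = list(row)       # copy so the data set is not mutated
            modified[feature] = value  # force the feature of interest
            preds.append(predict(modified))
        curve.append(mean(preds))
    return curve


def toy_predict(row: Row) -> float:
    """Hypothetical stand-in for a fitted model's contamination probability.

    Feature 0 acts as a driver with a critical value at 50 (an assumption);
    feature 1 acts as a protective factor that lowers the score.
    """
    base = 0.8 if row[0] > 50 else 0.2
    return base - 0.1 * row[1]


# Hypothetical sites: [driver value, protection indicator]
rows = [[10.0, 0.0], [60.0, 1.0], [90.0, 0.0], [30.0, 1.0]]
grid = [0.0, 25.0, 50.0, 75.0, 100.0]

curve = partial_dependence(toy_predict, rows, feature=0, grid=grid)
for value, avg in zip(grid, curve):
    print(f"driver={value:6.1f}  mean predicted probability={avg:.2f}")
```

In this toy setup the curve stays near 0.15 up to a driver value of 50 and steps to about 0.75 beyond it, which is how a critical value would show up on a PDP. Forcing two features at once over a two-dimensional grid would give the corresponding view of the interplay between a pair of drivers.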

Keywords: Driving mechanism; Interpretable random forest model; Key driver; Partial dependence plot; Site contamination.

MeSH terms

  • Automobile Driving*
  • China
  • Environmental Pollutants* / analysis
  • Environmental Pollution / analysis

Substances

  • Environmental Pollutants