Machine Learning Models of Groundwater Arsenic Spatial Distribution in Bangladesh: Influence of Holocene Sediment Depositional History

Environ Sci Technol. 2020 Aug 4;54(15):9454-9463. doi: 10.1021/acs.est.0c03617. Epub 2020 Jul 23.

Abstract

Recent advances in machine learning methods offer the opportunity to improve risk assessment and to decipher factors influencing the spatial variability of groundwater arsenic concentrations ([As]gw). A systematic comparison reveals that boosted regression trees (BRT) and random forest (RF) outperform logistic regression. The probability of [As]gw exceeding 5 μg/L (approximate median value of Bangladesh [As]gw), 10 μg/L (WHO provisional guideline value), and 50 μg/L (Bangladesh drinking water standard) is modeled by BRT and RF methods for Bangladesh and its four subregions demarcated by major rivers. Of the 109 geo-environmental and hydrochemical predictor variables, phosphorus and iron emerge as the most important across spatial scales, consistent with known As mobilization mechanisms. Well depth is significant only when hydrochemical parameters are not considered, consistent with prior studies. A peak in the probability of [As]gw exceedance at ∼30 m depth is evident in the partial dependence plots (PDPs) for spatial-parameter-only models but not in the equivalent all-parameter models, suggesting that sediment depositional history explains interdependent spatial patterns of groundwater As-P-Fe in Holocene aquifers. The South region exhibits a decrease in the probability of [As]gw exceedance below 150 m depth in PDPs for both spatial-parameter-only and all-parameter models, supporting the view that the deeper Pleistocene aquifer is a low-As water resource.
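The two computational ideas in the abstract, binary exceedance targets at fixed thresholds and partial dependence on a predictor such as well depth, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy rows, and the stand-in `toy_predict` model are all hypothetical, and a fitted BRT or RF classifier would take the place of `toy_predict` in practice. Only the three threshold values come from the abstract.

```python
from statistics import mean
from typing import Callable, List, Sequence

# Thresholds named in the abstract, in micrograms per litre.
THRESHOLDS_UG_L = {"approx_median": 5.0, "who_guideline": 10.0, "bangladesh_standard": 50.0}


def exceedance_labels(as_ug_l: Sequence[float], threshold: float) -> List[int]:
    """Binary target: 1 where the arsenic concentration exceeds the threshold."""
    return [1 if c > threshold else 0 for c in as_ug_l]


def partial_dependence(
    predict: Callable[[Sequence[float]], float],
    rows: Sequence[Sequence[float]],
    feature_index: int,
    grid: Sequence[float],
) -> List[float]:
    """For each grid value, force one feature to that value in every row
    and average the model's predicted probability over all rows."""
    curve = []
    for value in grid:
        preds = []
        for row in rows:
            modified = list(row)
            modified[feature_index] = value
            preds.append(predict(modified))
        curve.append(mean(preds))
    return curve


# Hypothetical stand-in for a fitted classifier (assumption, not a real model):
# column 0 is well depth in metres, column 1 is an unused second predictor.
def toy_predict(row: Sequence[float]) -> float:
    return 0.8 if 20.0 <= row[0] <= 40.0 else 0.2


toy_rows = [[10.0, 1.0], [30.0, 2.0], [200.0, 0.5]]
depth_curve = partial_dependence(toy_predict, toy_rows, feature_index=0, grid=[10.0, 30.0, 200.0])
who_labels = exceedance_labels([3.0, 12.0, 75.0], THRESHOLDS_UG_L["who_guideline"])
```

Plotting `depth_curve` against the depth grid gives the kind of curve the abstract calls a PDP. Comparing curves from a model trained on spatial predictors only with one trained on all predictors is the comparison the abstract describes.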

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arsenic* / analysis
  • Bangladesh
  • Environmental Monitoring
  • Geologic Sediments
  • Groundwater*
  • Machine Learning
  • Water Pollutants, Chemical* / analysis

Substances

  • Water Pollutants, Chemical
  • Arsenic