Modeling regional-scale groundwater arsenic hazard in the transboundary Ganges River Delta, India and Bangladesh: Infusing physically-based model with machine learning

Sci Total Environ. 2020 Dec 15:748:141107. doi: 10.1016/j.scitotenv.2020.141107. Epub 2020 Jul 25.

Abstract

For the last few decades, toxic levels of arsenic (As) in groundwater from the aquifers of the Ganges River delta, India and Bangladesh, have been known to cause serious public health concerns. Numerous studies have attributed As mobilization, flow, and distribution patterns within the delta to geomorphologic, geologic, hydrogeologic, biogeochemical, and anthropogenic controls. We have developed transboundary regional-scale models that compute the probability of groundwater As concentrations exceeding the WHO permissible threshold for drinking water of 10 μg/L within the Ganges River delta, as a function of the various geomorphologic-(hydro)geologic-hydrostratigraphic-anthropogenic controlling factors. The models use statistical methods and artificial intelligence (AI) [i.e., machine learning] techniques, namely Random Forest (RF), Boosted Regression Trees (BRT), and Logistic Regression (LR) algorithms, followed by probabilistic delineation of the high As-hazard zones within the delta. A "hybrid multi-modeling approach" was adopted for this study, which involved the introduction of hydrostratigraphic parameters (aquifer connectivity and surficial aquitard thickness), derived from a high-resolution transboundary hydrostratigraphic model developed for the Ganges River delta aquifer system, as predictors for modeling groundwater As probabilities within the delta. The RF model outperforms the BRT and LR models. Model outputs suggest the dominant influence of surficial aquitard thickness and groundwater-fed irrigated area (%) on groundwater As. While the north-central and southern regions of the Ganges River delta show low As hazard (<10 μg/L), the western and north-eastern regions show elevated hazard levels (>10 μg/L). An estimated 30.3 million people are found to be exposed to elevated groundwater As within the study area.
Thus, our study demonstrates that such hybrid, predictive models are not only helpful in delineating the regional-scale distribution of groundwater As-hazard zones in areas with limited As data but are also useful in identifying the possible exogenous forcing that may have led to the worst natural pollution in human history.
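
The exceedance-probability workflow summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only logistic regression (the simplest of the three algorithms named), written with the Python standard library, and it runs on synthetic data. The function names, the two scaled predictors, the relation used to generate arsenic values, and the grid-cell populations are all hypothetical assumptions made for illustration.

```python
import math
import random

WHO_LIMIT_UG_L = 10.0  # WHO drinking-water threshold for arsenic cited in the abstract


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_wells(n, seed=0):
    """Return (features, arsenic) lists. The generating relation is invented."""
    rng = random.Random(seed)
    X, arsenic = [], []
    for _ in range(n):
        aquitard = rng.random()   # scaled surficial aquitard thickness (hypothetical)
        irrigated = rng.random()  # scaled groundwater-fed irrigated area (hypothetical)
        # Assumed, purely illustrative relation between predictors and log(As).
        log_as = 1.8 + 2.0 * irrigated - 1.5 * aquitard + rng.gauss(0.0, 0.5)
        X.append((aquitard, irrigated))
        arsenic.append(math.exp(log_as))
    return X, arsenic


def fit_logistic(X, y, lr=0.5, epochs=1500):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def exceedance_probability(x, w, b):
    """Predicted probability that As exceeds the threshold at location x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def exposed_population(cells, w, b, cutoff=0.5):
    """Sum population over cells whose exceedance probability is above cutoff."""
    return sum(pop for x, pop in cells if exceedance_probability(x, w, b) > cutoff)


# Binary target: does the measured concentration exceed 10 µg/L?
X, arsenic = make_synthetic_wells(300)
y = [1 if a > WHO_LIMIT_UG_L else 0 for a in arsenic]
w, b = fit_logistic(X, y)

# Hypothetical grid cells: ((aquitard, irrigated), population).
cells = [((0.9, 0.1), 1200), ((0.2, 0.8), 3400), ((0.5, 0.5), 800)]
print("weights:", w, "bias:", b)
print("population in predicted high-hazard cells:", exposed_population(cells, w, b))
```

In a setup like this, the concentration data are first reduced to a binary label at the 10 μg/L threshold, so the fitted model returns a probability of exceedance rather than a concentration. The hazard zones and the exposed-population estimate then follow from thresholding that probability; the 0.5 cutoff here is an arbitrary choice for the sketch.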

Keywords: Arsenic hazard; Ganges River; Hydrostratigraphy; Machine learning; Random forest model.