Application of data-mining technique and hydro-chemical data for evaluating vulnerability of groundwater in Indo-Gangetic Plain

J Environ Manage. 2022 Sep 15:318:115582. doi: 10.1016/j.jenvman.2022.115582. Epub 2022 Jun 27.

Abstract

Assessing groundwater vulnerability is critical for the sustainable development of groundwater resources, especially in the freshwater-limited coastal Indo-Gangetic plains. Here, we develop a novel integrated approach for delineating groundwater vulnerability using hydro-chemical analysis and data-mining methods, namely Decision Tree (DT) and K-Nearest Neighbor (KNN), with a k-fold cross-validation (CV) technique. A total of 110 groundwater samples were obtained during the dry and wet seasons to generate an inventory map. A four-fold CV approach was used to delineate the vulnerable region from sixteen vulnerability causal factors. Statistical metrics, namely the area under the receiver operating characteristic curve (AUC-ROC) and other advanced metrics, were adopted to validate model outcomes. The results demonstrated the excellent ability of the proposed models to recognize vulnerable groundwater zones in the Indo-Gangetic plain. The DT model showed the higher performance (AUC = 0.97), followed by the KNN model (AUC = 0.95). The north-central and north-eastern parts are more vulnerable due to high salinity and high nitrate (NO3-), fluoride (F-) and arsenic (As) concentrations. Policy-makers and groundwater managers can utilize the proposed approach and the resulting groundwater vulnerability maps to attain sustainable groundwater development and safeguard human-induced activities at the regional level.
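The validation scheme described above (four-fold CV scored by AUC-ROC) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the data are synthetic stand-ins with two made-up features instead of the sixteen causal factors, only a simple KNN scorer is shown (no decision tree), and all function names and parameter values (`k_fold_indices`, `knn_score`, `roc_auc`, `cross_validated_auc`, `k=5`, the seeds) are assumptions chosen for illustration.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_score(train_X, train_y, x, k=5):
    """Fraction of the k nearest training points (squared Euclidean) labeled 1."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    return sum(label for _, label in dists[:k]) / k

def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cross_validated_auc(X, y, n_folds=4, k=5, seed=0):
    """Hold out each fold in turn, score it with KNN fit on the rest, return per-fold AUCs."""
    aucs = []
    for held_out in k_fold_indices(len(X), n_folds, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        scores = [knn_score(train_X, train_y, X[i], k) for i in held_out]
        aucs.append(roc_auc([y[i] for i in held_out], scores))
    return aucs

# Synthetic stand-in data (NOT the study's samples): 110 points, two features,
# label 1 = "vulnerable", label 0 = "not vulnerable".
rng = random.Random(42)
X, y = [], []
for i in range(110):
    label = i % 2
    X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)])
    y.append(label)

fold_aucs = cross_validated_auc(X, y)
mean_auc = sum(fold_aucs) / len(fold_aucs)
print("per-fold AUC:", [round(a, 3) for a in fold_aucs])
print("mean AUC:", round(mean_auc, 3))
```

Averaging the per-fold AUCs gives a single summary figure of the kind reported for each model; in practice a library such as scikit-learn would likely replace these hand-written helpers.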

Keywords: Groundwater development; Indo-Gangetic plain; Machine learning; Vulnerability; Water resource.

MeSH terms

  • Arsenic* / analysis
  • Data Mining
  • Environmental Monitoring / methods
  • Fluorides / analysis
  • Groundwater* / analysis
  • Humans
  • Water Pollutants, Chemical* / analysis

Substances

  • Water Pollutants, Chemical
  • Arsenic
  • Fluorides