Innovative interpretable AI-guided water quality evaluation with risk adversarial analysis in river streams considering spatial-temporal effects

Environ Pollut. 2024 Apr 22:350:124015. doi: 10.1016/j.envpol.2024.124015. Online ahead of print.

Abstract

Water security remains a critical issue given the looming threat of industrial pollution, necessitating comprehensive water quality assessments that account for seasonal fluctuations and influential factors and that support decision makers in formulating effective strategies. This study introduces a novel approach for evaluating water quality within a complex riverine zone of the Han River in South Korea, which encompasses five river streams situated at the junctions of the North and South streams (including Gyeongan Stream) that ultimately lead to Paldang Lake. Using monthly water characteristic data from 2013 to 2022 across 14 locations, the significant seasonal trends and potential influences on water quality are identified. Water quality is calculated with the proposed sub-index water quality index (s-WQI). For each location, s-WQI is predicted with a combinatorial approach in which data preprocessing steps, including Hampel filtering and feature selection, are applied prior to the machine learning predictions. Among the algorithms compared, light gradient boosting (LGB) is the most accurate predictor, especially in the LGB-Pearson and LGB-Spearman combinations for the North and South stream intersections and the LGB-Pearson combination for Paldang Lake. To further evaluate the robustness of this evaluation and extend the results to foreseeable scenarios, a season-based Monte-Carlo simulation with 10,000 trials, targeting the water characteristic distributions obtained from each location considered, is carried out to identify the risk bounds. The results are further interpreted with SHAP analysis to identify the contribution of each water characteristic to water quality at both the local and global levels. This research yields practical implications, offering tailored strategies for water quality enhancement and early warning systems. The integration of AI-based prediction and feature selection underscores the transformative potential of computational techniques in advancing data-driven water quality assessments, shaping the future of environmental science research.
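
The abstract names Hampel filtering as one of the preprocessing steps but does not give its settings. As a minimal sketch of how such a filter is commonly applied to a monthly series, the following Python example flags points that deviate from the rolling median by more than a multiple of the scaled median absolute deviation and replaces them with that median. The function name `hampel_filter`, the window half-width, the threshold of 3, and the sample values are illustrative assumptions, not the study's configuration or data.

```python
import numpy as np

def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Replace outliers with the local median (hypothetical helper).

    half_window and n_sigmas are assumed defaults, not values from the study.
    Returns the cleaned series and a boolean mask of replaced points.
    """
    x = np.asarray(values, dtype=float)
    cleaned = x.copy()
    outlier_mask = np.zeros(x.size, dtype=bool)
    scale = 1.4826  # makes the MAD consistent with the std. dev. for Gaussian data

    for i in range(x.size):
        # The window is truncated at the ends of the series.
        lo = max(0, i - half_window)
        hi = min(x.size, i + half_window + 1)
        window = x[lo:hi]

        local_median = np.median(window)
        mad = scale * np.median(np.abs(window - local_median))

        if np.abs(x[i] - local_median) > n_sigmas * mad:
            cleaned[i] = local_median
            outlier_mask[i] = True

    return cleaned, outlier_mask

# Made-up monthly readings with one obvious spike at index 4.
readings = [2.0, 2.1, 1.9, 2.0, 9.5, 2.1, 2.0, 1.9, 2.1]
cleaned, outlier_mask = hampel_filter(readings)
```

In this sketch only the spike is replaced, while ordinary month-to-month variation is left untouched; a wider window or a larger threshold would make the filter more conservative.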

Keywords: Feature selection; Integrated water quality management; Interpretable AI; Machine learning; Monte-Carlo simulation risk assessment; Water quality assessment.