Optimization of water quality index models using machine learning approaches

Water Res. 2023 Sep 1:243:120337. doi: 10.1016/j.watres.2023.120337. Epub 2023 Jul 11.

Abstract

To optimize the water quality index (WQI) assessment model, this study upgraded the parameter weight values and the aggregation functions. We determined combined weights based on machine learning and game theory to improve the accuracy of the models, and proposed new aggregation functions to reduce model uncertainty. A new water quality assessment system was established, with the Chaobai River Basin taken as a case study. To optimize the weights, two combined weights were established based on game theory. The weight CWAE combined the Analytic Hierarchy Process (AHP) and the Entropy Weight Method (EWM). The weight CWAL combined AHP and machine learning (LightGBM). CWAL was judged to be the optimal composite weight by comparing the coefficient of variation (CV) values and the Kaiser-Meyer-Olkin (KMO) extracted values. To reduce the uncertainty of the model, we proposed two aggregation functions, the Sinusoidal Weighted Mean (SWM) and the Log-weighted Quadratic Mean (LQM). Three water quality assessment models (WQIS, WQIL and WQIW) were then established based on the optimal weights. All three models had good reliability. The WQIS and WQIW models had low eclipsing problems (25.49% and 18.63%, respectively). The accuracy of the models was ranked as WQIS > WQIW > WQIL. The uncertainty of WQIS (0.000) in assessing poor water quality was low, as was that of WQIW (0.259) in assessing good water quality. Overall, the WQIS model was recommended for assessing poor water quality and the WQIW model for assessing good water quality. The assessment results of WQIS showed that the Chaobai River Basin was "slightly polluted", and that water quality upstream was better than downstream. TN was the main pollutant in the basin, and there was slight pollution with CODMn, CODCr, BOD5, etc. There was little metal contamination; only a few months exceeded Class I.
The models established in this study can serve as a reference for similar water quality assessment work, and the assessment results can provide a scientific basis for protecting the regional water environment.
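
The abstract does not give the formulas behind the combined weights, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes the textbook Entropy Weight Method and the usual game-theory combination, in which the coefficients of the two weight vectors are chosen to minimize the deviation between the combined weight and each base weight. The function names, the 2x2 solver and the fallback for parallel vectors are all hypothetical choices made for illustration.

```python
import math


def entropy_weights(rows):
    """Entropy Weight Method on a samples-by-parameters table.

    Assumes non-negative values, at least two samples, and that no
    column sums to zero. Real data would usually be standardized first.
    """
    m = len(rows)
    n = len(rows[0])
    divergences = []
    for j in range(n):
        column = [row[j] for row in rows]
        total = sum(column)
        # Share of each sample in the column total; zero shares add nothing.
        shares = [x / total for x in column if x > 0]
        entropy = -sum(p * math.log(p) for p in shares) / math.log(m)
        divergences.append(1.0 - entropy)
    scale = sum(divergences)
    return [d / scale for d in divergences]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def game_theory_combine(w1, w2):
    """Combine two weight vectors with game-theory coefficients.

    Solves [[w1.w1, w1.w2], [w2.w1, w2.w2]] @ [x1, x2] = [w1.w1, w2.w2],
    then normalizes |x1|, |x2| so the two coefficients sum to 1.
    """
    a11, a12, a22 = dot(w1, w1), dot(w1, w2), dot(w2, w2)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # Assumption: parallel (e.g. identical) vectors get equal coefficients.
        alpha = (0.5, 0.5)
    else:
        # Cramer's rule for the 2x2 system above.
        x1 = (a11 * a22 - a12 * a22) / det
        x2 = (a11 * a22 - a12 * a11) / det
        total = abs(x1) + abs(x2)
        alpha = (abs(x1) / total, abs(x2) / total)
    combined = [alpha[0] * a + alpha[1] * b for a, b in zip(w1, w2)]
    return combined, alpha


# Made-up illustration data: three samples of three parameters,
# and a hypothetical subjective (AHP-style) weight vector.
samples = [[1.0, 2.0, 3.0], [2.0, 2.0, 1.0], [3.0, 2.5, 2.0]]
w_subjective = [0.5, 0.3, 0.2]
w_entropy = entropy_weights(samples)
w_combined, coefficients = game_theory_combine(w_subjective, w_entropy)
print(w_combined, coefficients)
```

Under the same assumptions, a second objective weight vector (for example, LightGBM feature importances rescaled to sum to 1) could be passed in place of `w_entropy` to sketch a CWAL-style weight.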

Keywords: Chaobai River Basin; Combined weight; Game theory; LightGBM; Machine learning; Water quality assessment.

MeSH terms

  • China
  • Environmental Monitoring / methods
  • Machine Learning
  • Reproducibility of Results
  • Rivers
  • Water Pollutants, Chemical* / analysis
  • Water Quality*

Substances

  • Water Pollutants, Chemical