Prediction of long-term water quality using machine learning enhanced by Bayesian optimisation

Environ Pollut. 2023 Feb 1;318:120870. doi: 10.1016/j.envpol.2022.120870. Epub 2022 Dec 13.

Abstract

Water quality assessment is critical to recognising the importance of water in human society. This study proposes a new framework for predicting long-term water quality that uses Bayesian-optimised machine learning methods and key pollution indicators collected from monitoring stations in the Pearl River Estuary, Guangdong, China. The optimised stacked generalisation (SG-op) model performed best, with the highest accuracy (0.992) and Kappa coefficient (0.987). The feature importance of the prediction model was consistent with the key pollution indicators. The Spearman rank correlation coefficient was used to determine the significance level of the variation trends of the different pollution indicators. The results show that, among the key pollution indicators, total phosphorus (TOP), dissolved oxygen (DO), chemical oxygen demand (COD), and petroleum (PET) were on an upward trend in the study area. The framework can be applied to efficiently predict future water quality and to provide technical support for emergency pollution control.
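
The abstract names three statistics without defining them: accuracy, the Kappa coefficient, and the Spearman rank correlation coefficient. The following is a minimal pure-Python sketch of how such quantities are commonly computed. It is not the authors' code. It assumes that "Kappa" means Cohen's kappa and that a trend is assessed by correlating an indicator with time. The class labels, years, and concentrations are hypothetical illustration values, not data from the study, and the significance test on the correlation is omitted.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for chance, (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical water quality classes (observed vs. predicted).
observed = ["I", "II", "II", "III", "IV", "III", "II", "I"]
predicted = ["I", "II", "III", "III", "IV", "III", "II", "I"]

# Hypothetical yearly mean concentrations of one indicator (mg/L).
years = [2015, 2016, 2017, 2018, 2019, 2020]
concentration = [0.08, 0.09, 0.085, 0.11, 0.12, 0.13]

print(f"accuracy = {accuracy(observed, predicted):.3f}")
print(f"kappa    = {cohen_kappa(observed, predicted):.3f}")
print(f"rho      = {spearman_rho(years, concentration):.3f}")
```

A positive rho indicates that the indicator tends to rise over time and a negative rho that it tends to fall. Whether the trend is significant depends on the sample size and the chosen significance level, which this sketch does not evaluate.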

Keywords: Bayesian optimisation; Key pollution indicators; Machine learning; Trend analysis; Water quality prediction.

MeSH terms

  • Bayes Theorem
  • China
  • Environmental Monitoring / methods
  • Humans
  • Machine Learning
  • Phosphorus / analysis
  • Rivers
  • Water Pollutants, Chemical* / analysis
  • Water Quality*

Substances

  • Water Pollutants, Chemical
  • Phosphorus