A novel multivariate time series prediction of crucial water quality parameters with Long Short-Term Memory (LSTM) networks

J Contam Hydrol. 2023 Nov:259:104262. doi: 10.1016/j.jconhyd.2023.104262. Epub 2023 Oct 30.

Abstract

Intelligent prediction of water quality plays a pivotal role in water pollution control, water resource protection, emergency decision-making for sudden water pollution incidents, and the tracking and evaluation of water quality changes in river basins, and it is crucial to ensuring water security. The methodology employed in this paper for water quality prediction is as follows: (1) the comprehensive pollution index method and the Mann-Kendall (MK) trend analysis method are used to assess the pollution status and change trend within the basin, and the principal water quality parameters are extracted based on their respective pollution share rates; (2) the Spearman correlation method is used to identify the factors influencing each key parameter; (3) a water quality parameter prediction model based on Long Short-Term Memory (LSTM) networks is constructed using the outcomes of this driving factor analysis. The LSTM model developed in this study showed good prediction performance. The average coefficient of determination (R²) for the prediction of crucial water quality parameters such as total nitrogen (TN) and dissolved oxygen (DO) reached 0.82 and 0.86, respectively. Additionally, the error analysis of the water quality index (WQI) prediction results showed that more than 75% of the prediction errors were in the range of 0-0.15. The comparative analysis revealed that the LSTM model outperforms the random forest (RF) model in time series prediction and demonstrates superior robustness and applicability compared with the AutoRegressive Moving Average with eXogenous inputs (ARMAX) model. Hence, the model developed in this study offers valuable technical support for water quality prediction and early warning systems, particularly in economically disadvantaged regions with limited monitoring capabilities. This contribution facilitates resource optimization and promotes sustainable development.
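The trend analysis in step (1) can be illustrated with a minimal sketch of the standard Mann-Kendall test, assuming a single water quality parameter sampled at regular intervals. The function name `mann_kendall` and the sample values are hypothetical and are not taken from the paper; the abstract does not state which variant of the test (for example, seasonal or pre-whitened) the authors applied.

```python
import math
from collections import Counter


def mann_kendall(series):
    """Standard Mann-Kendall trend test with tie correction.

    Returns (S, Z, p), where S is the MK statistic, Z the normal
    approximation score, and p the two-sided p-value.
    Illustrative sketch only; not the authors' implementation.
    """
    n = len(series)
    if n < 3:
        raise ValueError("need at least 3 observations")

    # S counts later-minus-earlier sign agreements over all pairs i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S, corrected for groups of tied values.
    tie_sizes = Counter(series).values()
    var_s = (
        n * (n - 1) * (2 * n + 5)
        - sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)
    ) / 18.0

    if var_s == 0:
        # All values identical: no trend can be detected.
        return s, 0.0, 1.0

    # Continuity-corrected Z score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value from the standard normal CDF.
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return s, z, p


if __name__ == "__main__":
    # Hypothetical total nitrogen readings (mg/L), for illustration only.
    tn = [1.8, 1.9, 2.1, 2.0, 2.3, 2.4, 2.6, 2.5, 2.8]
    s, z, p = mann_kendall(tn)
    print(f"S={s}, Z={z:.3f}, p={p:.4f}")
```

A positive Z with a small p-value suggests an increasing trend and a negative Z a decreasing one. The pairwise loop is quadratic in the series length, which is acceptable for monitoring records of modest size.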

Keywords: Driving factors; LSTM; Machine learning; Time series prediction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Factor Analysis, Statistical
  • Memory, Short-Term*
  • Time Factors
  • Water Pollution
  • Water Quality*