Statistical comparison between SARIMA and ANN's performance for surface water quality time series prediction

Environ Sci Pollut Res Int. 2021 Feb 27. doi: 10.1007/s11356-021-13086-3. Online ahead of print.

Abstract

Previous studies comparing the performance of the autoregressive integrated moving average (ARIMA) model and the artificial neural network (ANN) have mostly compared model structures selected through trial and error, so their conclusions are strongly influenced by model structure uncertainty. This research aims to address that shortcoming. First, a surface water quality prediction case study covering eight monitoring sites in China is introduced. Second, the performance of ARIMA and ANN is compared statistically across 6912 seasonal ARIMA (SARIMA) models and 110,592 feedforward ANN models with different structures, based on the mean square error (MSE) distributions depicted by boxplots. Statistically, the ANN models obtained a significantly lower median and a more concentrated distribution of validation MSEs, which indicates less overfitting and better generalization ability. Furthermore, in the case study the performance of even the optimal SARIMA models is inferior to the median of the ANN models. Compared with previous comparisons among selected models, the statistical comparison in this study is subject to lower uncertainty.
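The comparison described above can be illustrated with a minimal sketch: enumerate a grid of model structures, collect one validation MSE per structure, and summarize each model family's MSE distribution by the median and interquartile range that a boxplot displays. The function names and the order ranges used below are hypothetical and chosen only for illustration; they are not the grid or the code used in the study, and the model fitting step is left out.

```python
from itertools import product
from statistics import median, quantiles


def sarima_grid(p, d, q, P, D, Q):
    """Enumerate every (p, d, q, P, D, Q) combination from the given ranges.

    The ranges are supplied by the caller; the ones in the usage note are
    hypothetical and are NOT the grid behind the study's 6912 SARIMA models.
    """
    return list(product(p, d, q, P, D, Q))


def mse(y_true, y_pred):
    """Mean square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def summarize(mses):
    """Return the boxplot-style summary of a list of validation MSEs."""
    q1, _, q3 = quantiles(mses, n=4)  # quartiles, default 'exclusive' method
    return {"median": median(mses), "iqr": q3 - q1}


def lower_median_and_tighter(a, b):
    """True if summary `a` has both a lower median and a smaller IQR than `b`."""
    return a["median"] < b["median"] and a["iqr"] < b["iqr"]
```

As a usage sketch, `sarima_grid(range(3), range(2), range(3), range(2), range(2), range(2))` yields 3 × 2 × 3 × 2 × 2 × 2 = 144 candidate structures. After each one has been fitted and scored with `mse` on a validation set, `summarize` can be applied to the SARIMA and ANN lists separately and the two summaries compared with `lower_median_and_tighter`. This comparison is purely descriptive; a formal significance test would be a separate step.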

Keywords: ANN; ARIMA; Grid sampling; Statistical comparison; Surface water quality; Time series prediction.