A stacked machine learning model for multi-step ahead prediction of lake surface water temperature

Sci Total Environ. 2023 Sep 10:890:164323. doi: 10.1016/j.scitotenv.2023.164323. Epub 2023 May 20.

Abstract

Lake surface water temperature is one of the most important physical and ecological indices of lakes and has frequently been used as an indicator of the impact of climate change on lakes. Knowing the dynamics of lake surface water temperature is thus of great significance. The past decades have witnessed the development of different modeling tools to forecast lake surface water temperature, yet simple models that use few input variables while maintaining high forecasting accuracy are scarce, and the impact of forecast horizon on model performance has seldom been investigated. To fill this gap, this study employed a novel machine learning algorithm that stacks a multilayer perceptron and a random forest (MLP-RF) to forecast daily lake surface water temperature using daily air temperature as the exogenous input variable, with Bayesian optimization applied to tune the hyperparameters. Prediction models were developed using long-term observed data from eight Polish lakes. The MLP-RF stacked model showed very good forecasting capabilities for all lakes and forecast horizons, far better than a shallow multilayer perceptron neural network, a model coupling wavelet transform and multilayer perceptron neural network, non-linear regression, and air2water models. Model performance decreased as the forecast horizon increased. Nevertheless, the model still performed well at a forecast horizon of several days (e.g., 7 days ahead, testing stage: R² in [0.932, 0.990], RMSE in [0.77, 1.83] °C, MAE in [0.55, 1.38] °C). In addition, the MLP-RF stacked model proved reliable for intermediate temperatures as well as for minimum and maximum peaks. The model proposed in this study will be useful to the scientific community in predicting lake surface water temperature, thus contributing to studies of sensitive aquatic ecosystems such as lakes.
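As an illustration of the forecasting setup and the three scores reported in the abstract, the following is a minimal sketch in plain Python. It is not the authors' code: the function names, the use of a fixed window of lagged air temperatures (`n_lags`), and the toy values are all assumptions made for this example, and the stacked MLP-RF learner itself is omitted. R² is computed here as the coefficient of determination; the abstract does not state which definition the study used.

```python
import math


def make_horizon_samples(air_temp, water_temp, n_lags, horizon):
    """Pair a window of past air temperatures with the water
    temperature `horizon` days ahead (hypothetical framing)."""
    if len(air_temp) != len(water_temp):
        raise ValueError("series must have equal length")
    features, targets = [], []
    # t is the last day whose air temperature is known at forecast time
    for t in range(n_lags - 1, len(air_temp) - horizon):
        features.append(air_temp[t - n_lags + 1 : t + 1])
        targets.append(water_temp[t + horizon])
    return features, targets


def rmse(obs, pred):
    """Root mean square error, in the units of the data (here °C)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error, in the units of the data (here °C)."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Toy daily series in °C (invented values, for illustration only)
air = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]
water = [8.0, 8.5, 9.0, 9.5, 10.0, 10.5]

# Two lagged air temperatures as input, target two days ahead
X, y = make_horizon_samples(air, water, n_lags=2, horizon=2)
```

Increasing `horizon` moves each target further from the last known air temperature, which is the setting in which the abstract reports a gradual loss of accuracy. Any regressor, including a stacked one, could then be fitted on `X` and `y` and scored with `rmse`, `mae`, and `r2` on held-out data.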

Keywords: Air temperature; Ensemble models; Lakes; Machine learning; Surface water temperature.

MeSH terms

  • Bayes Theorem
  • Ecosystem*
  • Lakes*
  • Machine Learning
  • Temperature
  • Water

Substances

  • Water