Time series and regression methods for univariate environmental forecasting: An empirical evaluation

Sci Total Environ. 2023 Jun 1;875:162580. doi: 10.1016/j.scitotenv.2023.162580. Epub 2023 Mar 9.

Abstract

Forecasting is one of the most common and valuable applications of science to the environment, as it affects human lives in many respects. However, it is not yet clear which class of methods, conventional time series or regression, delivers the highest performance in univariate time series forecasting. This study attempts to answer that question with a large-scale comparative evaluation covering 68 environmental variables at three frequencies (hourly, daily, monthly), forecast one to twelve steps into the future using six statistical time series methods and fourteen regression methods. Results suggest that the strongest representatives of the time series methods (ARIMA, Theta) achieve high accuracy, but certain regression methods (Huber, Extra Trees, Random Forest, Light Gradient Boosting Machines, Gradient Boosting Machines, Ridge, Bayesian Ridge) deliver even more promising results across all forecasting horizons. Finally, the choice of method should depend on the specific use case, as certain methods are better suited to particular frequencies and some offer a favorable trade-off between computational time and performance.
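
To make concrete how a regression method can be applied to a univariate series, the following is a minimal sketch, not the pipeline used in the study. It assumes the common approach of turning past observations into lagged features, fits a closed-form ridge regression (Ridge being one of the methods named above), and forecasts twelve steps ahead recursively. The synthetic data, the number of lags, the regularization strength, the recursive strategy, and all function names are illustrative assumptions.

```python
import numpy as np

def make_lagged(series, n_lags):
    """Build (X, y): each row of X holds the n_lags values that precede y."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def recursive_forecast(series, coef, intercept, horizon):
    """Forecast `horizon` steps ahead, feeding each prediction back in as a lag."""
    window = list(series[-len(coef):])
    preds = []
    for _ in range(horizon):
        nxt = float(np.dot(coef, window) + intercept)
        preds.append(nxt)
        window = window[1:] + [nxt]  # slide the lag window forward by one step
    return np.array(preds)

# Hypothetical monthly-like series: a 12-step seasonal cycle plus noise.
rng = np.random.default_rng(0)
t = np.arange(240)
series = 10 + 3 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.3, t.size)

# Hold out the last 12 observations as the forecast target.
train, test = series[:-12], series[-12:]
X, y = make_lagged(train, n_lags=12)
coef, intercept = fit_ridge(X, y, alpha=1.0)
forecast = recursive_forecast(train, coef, intercept, horizon=12)

mae = np.mean(np.abs(forecast - test))           # regression forecast error
naive_mae = np.mean(np.abs(train[-12:] - test))  # seasonal naive baseline error
print(f"ridge MAE: {mae:.3f}, seasonal naive MAE: {naive_mae:.3f}")
```

Comparing the error against a seasonal naive baseline is only one simple sanity check; a comparison of the kind described above would repeat this over many variables, frequencies, and horizons.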

Keywords: Environment; Forecasting; Machine learning; Regression; Time series.