An ensemble learning strategy for panel time series forecasting of excess mortality during the COVID-19 pandemic

Appl Soft Comput. 2022 Oct:128:109422. doi: 10.1016/j.asoc.2022.109422. Epub 2022 Aug 1.

Abstract

Quantifying and analyzing excess mortality in crises such as the ongoing COVID-19 pandemic is crucial for policymakers. Traditional measures fail to take into account differences in the level, long-term secular trends, and seasonal patterns in all-cause mortality across countries and regions. This paper develops a novel, flexible, and dynamic ensemble learning strategy with model selection (DELMS) and empirically investigates its performance in forecasting seasonal time series of monthly respiratory disease deaths across a pool of 61 heterogeneous countries. The strategy is based on Bayesian model averaging (BMA) of heterogeneous time series methods and involves three steps: selecting the subset of best forecasters (the model confidence set), identifying the best holdout period for each contributing model, and determining optimal weights from out-of-sample predictive accuracy. A model selection strategy is also developed to remove outlier models and to combine only the models with reasonable accuracy in the ensemble. The results of this large set of experiments show that DELMS significantly improves the accuracy of the BMA approach when a flexible, dynamic holdout period is selected and outlier models are removed. Additionally, the forecasts of respiratory disease deaths for each country are highly accurate and exhibit a high correlation (94%) with COVID-19 deaths in 2020.
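
The general idea of weighting forecasters by holdout accuracy after discarding outlier models can be illustrated with a minimal Python sketch. This is not the DELMS procedure from the paper: the abstract does not specify the weighting formula or the outlier rule, so the sketch assumes inverse squared-error weights as a simple stand-in for BMA posterior weights, and assumes that a model is an outlier when its holdout RMSE exceeds a multiple of the median RMSE. The function names, the `outlier_factor` parameter, and the model labels are all hypothetical, and the holdout window is taken as a fixed input rather than being selected per model.

```python
import math
from statistics import median


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def ensemble_weights(holdout_actual, holdout_forecasts, outlier_factor=2.0):
    """Return {model_name: weight} from holdout accuracy.

    Assumption (not from the paper): a model is dropped as an outlier when
    its holdout RMSE exceeds outlier_factor times the median RMSE, and the
    surviving models receive weights proportional to 1 / RMSE^2.
    """
    errors = {
        name: rmse(holdout_actual, forecast)
        for name, forecast in holdout_forecasts.items()
    }
    # With outlier_factor >= 1 the best model always survives this cutoff.
    cutoff = outlier_factor * median(errors.values())
    kept = {name: err for name, err in errors.items() if err <= cutoff}

    eps = 1e-12  # guards against division by zero for a perfect holdout fit
    raw = {name: 1.0 / (err ** 2 + eps) for name, err in kept.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


def combine(weights, future_forecasts):
    """Weighted average of the retained models' forecasts, per horizon step."""
    horizon = len(next(iter(future_forecasts.values())))
    return [
        sum(w * future_forecasts[name][h] for name, w in weights.items())
        for h in range(horizon)
    ]


# Toy data: three hypothetical models evaluated on a three-month holdout.
holdout_actual = [10, 12, 11]
holdout_forecasts = {
    "ets": [11, 13, 12],
    "arima": [12, 14, 13],
    "bad": [30, 32, 31],
}
future_forecasts = {"ets": [10, 20], "arima": [20, 30], "bad": [100, 100]}

weights = ensemble_weights(holdout_actual, holdout_forecasts)
combined = combine(weights, future_forecasts)
```

In this toy example the model labelled "bad" is removed by the outlier rule, and the remaining two models receive weights of roughly 0.8 and 0.2, so the more accurate model dominates the combined forecast.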

Keywords: Bayesian model averaging (BMA); Ensemble learning; Forecasting; Layered learning; Machine learning; Multiple learning processes; Panel data; Respiratory disease deaths; SARS-CoV-2; Time series.