Spatio-temporal prediction of the COVID-19 pandemic in US counties: modeling with a deep LSTM neural network

Sci Rep. 2021 Nov 5;11(1):21715. doi: 10.1038/s41598-021-01119-3.

Abstract

Predicting complex epidemiological processes such as COVID-19 is challenging on many grounds. Commonly used compartmental models struggle to handle an epidemiological process that evolves rapidly and is spatially heterogeneous. Machine learning methods, on the other hand, are limited at the beginning of a pandemic because little data is available for training. We propose a deep learning approach to predict future COVID-19 infection cases and deaths 1 to 4 weeks ahead at the fine granularity of US counties. The multivariate Long Short-Term Memory (LSTM) recurrent neural network is trained on multiple time series samples at the same time, including a mobility series. Results show that adding mobility as a variable and using multiple samples to train the network improve predictive performance in terms of both the bias and the variance of the forecasts. We also show that the predictions have accuracy and spatial patterns similar to those of a standard ensemble model used as a benchmark. The model is attractive in many respects, including the fine geographic granularity of its predictions and strong predictive performance several weeks ahead. Furthermore, data requirements and computational intensity are reduced by substituting a single model for the multiple models combined in an ensemble.
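As a rough illustration of what training one multivariate network on several county series at once can involve, the sketch below pools sliding-window samples from two toy counties into a single training set. It is not the authors' pipeline: the abstract does not state the window length, the time step, or the input layout, so the 8-week lookback, the weekly (cases, mobility) tuples, the function names `make_windows` and `pool_counties`, and the synthetic numbers are all assumptions made for the example. Only the 1-to-4-week horizon and the use of a mobility series come from the text.

```python
def make_windows(series, lookback, horizon):
    """Turn one county's weekly series into supervised samples.

    series: list of (cases, mobility) tuples, one per week (assumed layout).
    Returns (inputs, targets): each input is `lookback` weeks of both
    variables; each target is the case count for the next `horizon` weeks.
    """
    inputs, targets = [], []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):  # empty if the series is too short
        split = start + lookback
        inputs.append([list(week) for week in series[start:split]])
        targets.append([week[0] for week in series[split:split + horizon]])
    return inputs, targets


def pool_counties(county_series, lookback=8, horizon=4):
    """Concatenate the windows of every county into one training set."""
    X, y = [], []
    for series in county_series.values():
        xs, ys = make_windows(series, lookback, horizon)
        X.extend(xs)
        y.extend(ys)
    return X, y


# Synthetic toy data (hypothetical): 20 weeks of (cases, mobility) per county.
counties = {
    "county_a": [(10 + 2 * w, 0.50) for w in range(20)],
    "county_b": [(3 + w, 0.80) for w in range(20)],
}

X, y = pool_counties(counties)
# samples, weeks per sample, variables per week, weeks ahead
print(len(X), len(X[0]), len(X[0][0]), len(y[0]))  # 18 8 2 4
```

Each county contributes 9 windows here, so a single model sees 18 samples instead of fitting two separate 9-sample models. The resulting (samples, time steps, variables) layout is the kind of input a recurrent network typically expects.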

MeSH terms

  • Algorithms
  • COVID-19 / epidemiology*
  • Deep Learning*
  • Geography
  • Humans
  • Machine Learning
  • Memory, Short-Term
  • Models, Statistical
  • Monte Carlo Method
  • Neural Networks, Computer*
  • Population Dynamics
  • Public Health Informatics
  • Reproducibility of Results
  • SARS-CoV-2
  • Time Factors
  • United States / epidemiology