Machine learning models for the prediction of the SEIRD variables for the COVID-19 pandemic based on a deep dependence analysis of variables

Comput Biol Med. 2021 Jul:134:104500. doi: 10.1016/j.compbiomed.2021.104500. Epub 2021 May 24.

Abstract

The SEIRD (Susceptible, Exposed, Infected, Recovered, and Dead) model is a mathematical model based on dynamic equations that is widely used to characterize the COVID-19 pandemic. This paper takes a different approach: it develops predictive models for the SEIRD variables based on the historical data collected and on the context variables of the place where the model is applied. In particular, the context variables examined in this paper are total population, number of people over 65 years old, poverty index, morbidity rates, average age, and population density. To construct the SEIRD predictive models, the study carries out a deep analysis of the dependence among the SEIRD variables and of their relationship with the context variables. Hence, before the predictive models are developed with machine learning techniques, a methodology for analyzing the interdependence of the SEIRD variables is proposed. The dependence on the context variables is also analyzed in order to avoid the curse of dimensionality and multicollinearity problems, which leads to better results and a lower computational cost. Finally, several prediction models based on different machine learning techniques and inputs are considered; the inputs include temporal interdependence, temporal intra-dependence, and dependence on the context variables. Each predictive model is studied, together with its quality of prediction. The paper focuses on the quality of this approach as applied in Colombia and reports the performance of the predictive models for the SEIRD variables. The results are encouraging: the values obtained for the quality metrics are good across different prediction horizons.
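For readers unfamiliar with the dynamic-equation formulation that the abstract contrasts with its data-driven approach, the following is a minimal sketch of a textbook SEIRD model integrated with forward Euler steps. It is not the authors' method. The function name `simulate_seird`, the rate parameters `beta`, `sigma`, `gamma`, and `mu`, and all numeric values are illustrative assumptions that do not come from the paper.

```python
def simulate_seird(initial, beta, sigma, gamma, mu, days, steps_per_day=10):
    """Integrate a textbook SEIRD model with forward Euler steps.

    All names and parameters are hypothetical, chosen for illustration:
      initial: (S, E, I, R, D) counts at day 0
      beta:    transmission rate per day
      sigma:   rate at which exposed people become infectious, per day
      gamma:   recovery rate per day
      mu:      death rate of infectious people, per day
    Returns one (S, E, I, R, D) tuple per day, including day 0.
    """
    s, e, i, r, d = (float(x) for x in initial)
    n = s + e + i + r + d  # total population, held constant in this sketch
    dt = 1.0 / steps_per_day
    history = [(s, e, i, r, d)]
    for _day in range(days):
        for _step in range(steps_per_day):
            # Flows between compartments during one small time step.
            new_exposed = beta * s * i / n * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            new_dead = mu * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered - new_dead
            r += new_recovered
            d += new_dead
        history.append((s, e, i, r, d))
    return history


# Illustrative run: 1,000 infectious people in a population of one million.
trajectory = simulate_seird(
    initial=(999_000, 0, 1_000, 0, 0),
    beta=0.3, sigma=0.2, gamma=0.1, mu=0.01,
    days=60,
)
```

Each flow leaves one compartment and enters another, so the sum of the five variables stays constant over time. A data-driven model of the kind described in the abstract would instead learn each variable from lagged observations and context variables rather than from fixed rate parameters.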

Keywords: COVID-19; Data dependence analysis; Machine learning; Prediction model.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • COVID-19*
  • Humans
  • Machine Learning
  • Pandemics*
  • SARS-CoV-2