Estimation of COVID-19 epidemic curves using genetic programming algorithm

Health Informatics J. 2021 Jan-Mar;27(1):1460458220976728. doi: 10.1177/1460458220976728.

Abstract

This paper investigates whether a Genetic Programming (GP) algorithm can be applied to a publicly available COVID-19 data set to obtain mathematical models for estimating confirmed, deceased, and recovered cases, and for estimating the epidemiology curve, both for specific countries with a high number of cases, such as China, Italy, Spain, and the USA, and on the global scale. The investigation shows that the best mathematical models for estimating confirmed and deceased cases achieved R² scores of 0.999, while the models for estimating recovered cases achieved an R² score of 0.998. The equations generated for confirmed, deceased, and recovered cases were then combined to estimate the epidemiology curve for the specific countries and on the global scale. The estimated epidemiology curve for each country obtained from these equations is almost identical to the real data contained in the data set.
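The R² scores quoted above are coefficients of determination. As a minimal sketch of how such a score is usually computed, the function below compares a series of observed counts with a model's estimates. The function name `r2_score` and the sample numbers are illustrative assumptions, not values or code from the paper.

```python
def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares: error left after applying the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are equal")
    return 1.0 - ss_res / ss_tot


# Made-up daily counts for illustration only (not from the data set).
observed = [10.0, 20.0, 40.0, 80.0, 160.0]
predicted = [12.0, 19.0, 41.0, 78.0, 161.0]
score = r2_score(observed, predicted)
print(f"R^2 = {score:.4f}")
```

A score of 1 means the estimates match the observations exactly, so values such as 0.999 indicate that the residual error is very small relative to the overall variation in the observed series.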

Keywords: COVID-19; disease spread modeling; evolutionary computing; genetic programming; machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • COVID-19 / diagnosis
  • COVID-19 / epidemiology*
  • COVID-19 / mortality
  • Epidemics
  • Epidemiologic Methods
  • Humans
  • Machine Learning*
  • Models, Theoretical*
  • SARS-CoV-2