Social vulnerability and initial COVID-19 community spread in the US South: a machine learning approach

BMJ Health Care Inform. 2023 Jul;30(1):e100703. doi: 10.1136/bmjhci-2022-100703.

Abstract

Background and objectives: More than 93 million COVID-19 cases and more than 1 million COVID-19 deaths had been reported in the USA by August 2022. The pandemic's disproportionate and severe impact on vulnerable communities raised concerns. This research aimed to identify and rank the Social Vulnerability Index (SVI) factors most predictive of the spread of COVID-19 in the US South at the beginning of the pandemic.

Methods: We trained an Extreme Gradient Boosting (XGBoost) machine learning model on SVI data and COVID-19 case counts for all counties in the US South to predict the number of positive cases within 30 days of a county's first case.
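
The prediction target described in the Methods (positive cases within 30 days of a county's first case) might be derived from a county's case time series roughly as follows. This is a minimal sketch, not the authors' pipeline: the function name `cases_within_window`, the input layout of `(date, cumulative_cases)` pairs, and the choice to read the cumulative count on the last reported day inside the window are all assumptions made for illustration.

```python
from datetime import date, timedelta


def cases_within_window(records, window_days=30):
    """Return the cumulative case count reached within `window_days`
    of a county's first reported case.

    `records` is a list of (date, cumulative_cases) pairs for one county,
    in any order (hypothetical layout). Returns None if the county never
    reports a case.
    """
    records = sorted(records)  # order by date
    # Date of the first record with at least one case.
    first = next((d for d, c in records if c > 0), None)
    if first is None:
        return None
    cutoff = first + timedelta(days=window_days)
    # Cumulative counts reported from the first case up to the cutoff date.
    in_window = [c for d, c in records if first <= d <= cutoff]
    # Use the last report in the window; cumulative counts can be revised,
    # so the last value is preferred over the maximum.
    return in_window[-1]


# Made-up example series for a single county (not real data).
example = [
    (date(2020, 3, 1), 0),
    (date(2020, 3, 5), 1),    # first case
    (date(2020, 3, 20), 12),
    (date(2020, 4, 4), 40),   # exactly 30 days after the first case
    (date(2020, 4, 10), 75),  # outside the window
]
target = cases_within_window(example)
```

Anchoring the window to each county's own first case, rather than to a fixed calendar date, puts counties that were reached by the virus at different times on a comparable footing.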

Results: The percentage of mobile homes is the most important feature in predicting the increase in COVID-19 cases. Population density per square mile, per capita income, percentage of housing in structures with 10+ units, percentage of people below poverty and percentage of people with no high school diploma are also important predictors of COVID-19 community spread, in that order.
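
To make the idea of ranking features by importance concrete, the sketch below fits a small gradient-boosted ensemble of one-split trees (stumps) and ranks features by their share of the total squared-error reduction. It is a simplified, standard-library stand-in for XGBoost, not the study's model. The data are synthetic and constructed so that the first feature dominates, and the feature names are illustrative labels only; nothing here reproduces the paper's data or results.

```python
import random


def best_stump(X, residuals):
    """Find the single-feature threshold split that most reduces squared error."""
    n, n_features = len(X), len(X[0])
    total = sum(residuals)
    base_score = total * total / n  # score of the unsplit node
    best = None  # (gain, feature, threshold, left_mean, right_mean)
    for j in range(n_features):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - base_score
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best


def fit_boosted_stumps(X, y, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared error; returns normalized gain per feature."""
    n = len(y)
    pred = [sum(y) / n] * n  # start from the mean
    importance = [0.0] * len(X[0])
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = best_stump(X, residuals)
        if stump is None:
            break
        gain, j, thr, left_val, right_val = stump
        importance[j] += gain
        for i in range(n):
            step = left_val if X[i][j] <= thr else right_val
            pred[i] += learning_rate * step
    total_gain = sum(importance) or 1.0
    return [g / total_gain for g in importance]


# Synthetic data: the first feature is built to matter most (an assumption).
random.seed(0)
feature_names = ["pct_mobile_homes", "pop_density", "unrelated_noise"]
X = [[random.random() for _ in feature_names] for _ in range(200)]
y = [3.0 * row[0] + 1.0 * row[1] + random.gauss(0, 0.1) for row in X]

importances = fit_boosted_stumps(X, y)
ranked = sorted(zip(feature_names, importances), key=lambda t: t[1], reverse=True)
```

Gain-based importance of this kind measures how much each feature helps the model fit the target; it does not by itself indicate a causal effect of that feature on case counts.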

Conclusions: SVI can help assess the vulnerability or resilience of communities to the spread of COVID-19 and can help identify communities at high risk of COVID-19 spread.

Keywords: COVID-19.

MeSH terms

  • COVID-19*
  • Humans
  • Machine Learning
  • Pandemics
  • Poverty
  • Social Vulnerability*