Machine learning approach to dynamic risk modeling of mortality in COVID-19: a UK Biobank study

Sci Rep. 2021 Aug 19;11(1):16936. doi: 10.1038/s41598-021-95136-x.

Abstract

The COVID-19 pandemic has created an urgent need for robust, scalable monitoring tools supporting stratification of high-risk patients. This research aims to develop and validate prediction models, using the UK Biobank, to estimate COVID-19 mortality risk in confirmed cases. From the 11,245 participants testing positive for COVID-19, we develop a data-driven random forest classification model with excellent performance (AUC: 0.91), using baseline characteristics, pre-existing conditions, symptoms, and vital signs, such that the score could dynamically assess mortality risk with disease deterioration. We also identify several significant novel predictors of COVID-19 mortality with equivalent or greater predictive value than established high-risk comorbidities, such as detailed anthropometrics and prior acute kidney failure, urinary tract infection, and pneumonias. The model design and feature selection enable utility in outpatient settings. Possible applications include supporting individual-level risk profiling and monitoring disease progression across patients with COVID-19 at scale, especially in hospital-at-home settings.
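
As a reading aid for the reported performance metric, the sketch below shows one common way to compute an AUC from binary outcomes and predicted risk scores: the fraction of (death, survivor) pairs in which the death received the higher score, with ties counted as one half. This is not the authors' code, and it says nothing about how the study's random forest was trained. The function name `roc_auc` and the toy labels and scores are hypothetical, chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for the positive outcome (here, death), 0 otherwise.
    scores: predicted risk; higher means higher predicted risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = died, 0 = survived.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.70, 0.90]
toy_auc = roc_auc(toy_labels, toy_scores)
```

On this toy input, 14 of the 15 (death, survivor) pairs are ranked correctly, so `toy_auc` is 14/15. An AUC of 0.91, as reported in the abstract, can be read the same way: in roughly 91% of such pairs, the model assigns the higher risk score to the patient who died.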

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Aged, 80 and over
  • Biological Specimen Banks
  • COVID-19 / epidemiology*
  • COVID-19 / mortality
  • Cohort Studies
  • Comorbidity
  • Female
  • Humans
  • Machine Learning
  • Male
  • Middle Aged
  • Models, Statistical*
  • Pandemics
  • Prognosis
  • Risk Factors
  • SARS-CoV-2 / physiology*
  • United Kingdom / epidemiology