Predictive Modeling of Morbidity and Mortality in Patients Hospitalized With COVID-19 and its Clinical Implications: Algorithm Development and Interpretation

Joshua M Wang; Wenke Liu; Xiaoshan Chen; Michael P McRae; John T McDevitt; David Fenyö

doi:10.2196/29514

J Med Internet Res. 2021 Jul 9;23(7):e29514. doi: 10.2196/29514.

Authors

Joshua M Wang¹ ² ³, Wenke Liu¹ ², Xiaoshan Chen⁴, Michael P McRae⁵, John T McDevitt⁵, David Fenyö¹ ² ⁶

Affiliations

¹ Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, United States.
² Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY, United States.
³ Vilcek Institute of Graduate Biomedical Sciences, NYU Grossman School of Medicine, New York, NY, United States.
⁴ Department of Medicine, NYU Grossman School of Medicine, New York, NY, United States.
⁵ Department of Biomaterials, Bioengineering Institute, New York University, New York, NY, United States.
⁶ NYU Langone Health, New York, NY, United States.

PMID: 34081611
PMCID: PMC8274681
DOI: 10.2196/29514

Abstract

Background: The COVID-19 pandemic began in early 2020 and placed significant strains on health care systems worldwide. There remains a compelling need to analyze factors that predict which patients are at elevated risk of morbidity and mortality.

Objective: The goal of this retrospective study of patients who tested positive for COVID-19 and were treated at NYU (New York University) Langone Health was to identify clinical markers predictive of disease severity in order to assist in clinical decision-making and triage and to provide additional biological insights into disease progression.

Methods: The clinical activity of 3740 patients at NYU Langone Hospital was obtained between January and August 2020; patient data were deidentified. Models were trained on clinical data from different parts of each patient's hospital stay to predict three clinical outcomes: deceased, ventilated, or admitted to the intensive care unit (ICU).
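To illustrate what training on "different parts of the hospital stay" could look like in practice, the following is a minimal sketch that summarizes timestamped measurements within a time window, such as the first or final 24 hours. The function name `window_features`, the record layout, the use of a mean as the summary statistic, and all sample values are assumptions made for illustration; the abstract does not describe how the authors aggregated their data.

```python
from datetime import datetime, timedelta
from statistics import mean


def window_features(observations, start, end):
    """Average each measurement recorded between start and end (inclusive).

    `observations` is a list of (timestamp, name, value) tuples.
    Using the mean as the summary is an assumption for this sketch.
    """
    grouped = {}
    for time, name, value in observations:
        if start <= time <= end:
            grouped.setdefault(name, []).append(value)
    return {name: mean(values) for name, values in grouped.items()}


# Hypothetical stay: admission and end-of-stay timestamps are made up.
admit = datetime(2020, 4, 1, 8, 0)
end_of_stay = datetime(2020, 4, 6, 8, 0)

# Hypothetical vitals; these values are not from the study.
observations = [
    (admit + timedelta(hours=2), "respiration_rate", 18.0),
    (admit + timedelta(hours=10), "respiration_rate", 22.0),
    (admit + timedelta(hours=3), "spo2", 96.0),
    (end_of_stay - timedelta(hours=5), "respiration_rate", 30.0),
    (end_of_stay - timedelta(hours=4), "spo2", 88.0),
]

day = timedelta(hours=24)
first_24h = window_features(observations, admit, admit + day)
final_24h = window_features(observations, end_of_stay - day, end_of_stay)

print("first 24 hours:", first_24h)
print("final 24 hours:", final_24h)
```

Each resulting dictionary could serve as one feature row for a model trained on that window, so that separate models are fit for separate parts of the stay.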

Results: The XGBoost (eXtreme Gradient Boosting) model that was trained on clinical data from the final 24 hours excelled at predicting mortality (area under the curve [AUC]=0.92; specificity=86%; and sensitivity=85%). Respiration rate was the most important feature, followed by SpO₂ (peripheral oxygen saturation) and being aged 75 years or older. The ability of this model to predict the deceased outcome extended to 5 days prior, with AUC=0.81, specificity=70%, and sensitivity=75%. When only clinical data from the first 24 hours were used, AUCs of 0.79, 0.80, and 0.77 were obtained for the deceased, ventilated, and ICU-admitted outcomes, respectively. Although respiration rate and SpO₂ levels offered the highest feature importance, other canonical markers, including diabetic history, age, and temperature, offered minimal gain. When lab values were incorporated, prediction of mortality benefited the most from blood urea nitrogen and lactate dehydrogenase (LDH). Features that were predictive of morbidity included LDH, calcium, glucose, and C-reactive protein.
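For readers less familiar with the reported metrics, the sketch below shows how AUC, sensitivity, and specificity can be computed from predicted risk scores and true outcomes. It uses only the Python standard library. The labels, scores, and the 0.5 decision threshold are invented for illustration and are unrelated to the study data, and the abstract does not state which threshold the authors used.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when score >= threshold means positive."""
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as half).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = outcome occurred, 0 = outcome did not occur.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
area = auc(labels, scores)
print(f"AUC={area:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

AUC does not depend on a threshold, whereas sensitivity and specificity trade off against each other as the threshold moves, which is why the abstract reports all three together.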

Conclusions: Together, this work summarizes efforts to systematically examine the importance of a wide range of features across different endpoint outcomes and at different hospitalization time points.

Keywords: COVID-19; New York City; SARS-CoV-2; coronavirus; decision making; hospital; machine learning; marker; model; morbidity; mortality; outcome; prediction; predictive modeling; severity; symptom.

©Joshua M Wang, Wenke Liu, Xiaoshan Chen, Michael P McRae, John T McDevitt, David Fenyö. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 09.07.2021.

MeSH terms

Adolescent
Adult
Aged
Algorithms*
Area Under Curve
COVID-19 / diagnosis*
COVID-19 / mortality*
Child
Child, Preschool
Diabetes Mellitus
Female
Hospitalization*
Hospitals
Humans
Infant
Infant, Newborn
Intensive Care Units
Male
Middle Aged
Morbidity
New York City / epidemiology
Pandemics
Retrospective Studies
SARS-CoV-2
Triage
Young Adult
