Predicting Intensive Care Delirium with Machine Learning: Model Development and External Validation

Kirby D Gong; Ryan Lu; Teya S Bergamaschi; Akaash Sanyal; Joanna Guo; Han B Kim; Hieu T Nguyen; Joseph L Greenstein; Raimond L Winslow; Robert D Stevens

doi:10.1097/ALN.0000000000004478

Anesthesiology. 2023 Mar 1;138(3):299-311. doi: 10.1097/ALN.0000000000004478.

Affiliations

¹ Johns Hopkins University School of Medicine, Baltimore, Maryland.
² Northwestern University, Evanston, Illinois.
³ Massachusetts Institute of Technology, Cambridge, Massachusetts.
⁴ Johns Hopkins University, Baltimore, Maryland.
⁵ Whiting School of Engineering at Johns Hopkins University, Baltimore, Maryland.

PMID: 36538354

Abstract

Background: Delirium poses significant risks to patients, but countermeasures can be taken to mitigate negative outcomes. Accurately forecasting delirium in intensive care unit (ICU) patients could guide proactive intervention. Our primary objective was to predict ICU delirium by applying machine learning to clinical and physiologic data routinely collected in electronic health records.

Methods: Two prediction models were trained and tested using a multicenter database (data collected 2014 to 2015) and externally validated on two single-center databases (2001 to 2012 and 2008 to 2019). The primary outcome was delirium, defined as a positive Confusion Assessment Method for the ICU screen or an Intensive Care Delirium Screening Checklist score of 4 or greater. The first model, named the "24-hour model," used data from the first 24 h after ICU admission to predict delirium at any time afterward. The second model, designated the "dynamic model," predicted the onset of delirium up to 12 h in advance. Model performance was compared with that of a widely cited reference model.
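The outcome definition above can be expressed as a simple labeling rule. The following is a minimal sketch, not the authors' code: the function name, argument names, and the handling of missing screens (treated as not positive) are assumptions made for illustration.

```python
from typing import Optional

# Threshold stated in the abstract: checklist score of 4 or greater.
ICDSC_THRESHOLD = 4


def is_delirium_positive(cam_icu_positive: Optional[bool],
                         icdsc_score: Optional[int]) -> bool:
    """Return True if either screen meets the abstract's delirium definition.

    Assumption (not from the source): a missing screen (None) is treated
    as not positive rather than as unknown.
    """
    if cam_icu_positive is True:
        return True
    if icdsc_score is not None and icdsc_score >= ICDSC_THRESHOLD:
        return True
    return False
```

Either screen alone is sufficient for a positive label, so a stay with a negative Confusion Assessment Method for the ICU screen but a checklist score of 4 would still count as delirium under this rule.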

Results: For the 24-h model, delirium was identified in 2,536 of 18,305 (13.9%), 768 of 5,299 (14.5%), and 5,955 of 36,194 (16.5%) patient stays in the development sample and the two validation samples, respectively. For the 12-h lead time dynamic model, delirium was identified in 3,791 of 22,234 (17.0%), 994 of 6,166 (16.1%), and 5,955 of 28,440 (20.9%) patient stays, respectively. Mean area under the receiver operating characteristics curve (AUC) (95% CI) for the 24-h model was 0.785 (0.769 to 0.801), significantly higher than that of the modified reference model, which had an AUC of 0.730 (0.704 to 0.757). The dynamic model had a mean AUC of 0.845 (0.831 to 0.859) when predicting delirium 12 h in advance. Calibration was similar in both models (mean Brier Score [95% CI] 0.102 [0.097 to 0.108] and 0.111 [0.106 to 0.116]). Model discrimination and calibration were maintained when tested on the validation datasets.
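For readers unfamiliar with the two metrics reported above, the sketch below shows how AUC (discrimination) and the Brier Score (calibration) are conventionally computed from predicted probabilities and observed outcomes. It is illustrative only: the abstract does not describe the authors' implementation, the function names are hypothetical, and the four-patient example uses made-up values rather than study data.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (0 or 1).
    Lower is better; 0 is a perfect score."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case is scored
    higher than a randomly chosen negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not from the study).
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(labels, probs)        # 3 of 4 positive/negative pairs ranked correctly
toy_brier = brier_score(labels, probs)
```

This pairwise AUC is quadratic in sample size and is meant for clarity; at the scale of the cohorts reported here, a rank-based implementation from a statistics library would be the practical choice.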

Conclusions: Machine learning models trained with routinely collected electronic health record data accurately predict ICU delirium, supporting dynamic time-sensitive forecasting.

Publication types

Multicenter Study
Research Support, Non-U.S. Gov't

MeSH terms

Critical Care / methods
Delirium* / diagnosis
Hospitalization
Humans
Intensive Care Units
Machine Learning