A machine learning approach to identifying delirium from electronic health records

Jae Hyun Kim; May Hua; Robert A Whittington; Junghwan Lee; Cong Liu; Casey N Ta; Edward R Marcantonio; Terry E Goldberg; Chunhua Weng

doi:10.1093/jamiaopen/ooac042

JAMIA Open. 2022 May 24;5(2):ooac042. doi: 10.1093/jamiaopen/ooac042. eCollection 2022 Jul.

Authors

Jae Hyun Kim¹, May Hua²,³, Robert A Whittington², Junghwan Lee¹, Cong Liu¹, Casey N Ta¹, Edward R Marcantonio⁴,⁵,⁶, Terry E Goldberg²,⁷, Chunhua Weng¹

Affiliations

¹ Department of Biomedical Informatics, Columbia University, New York, New York, USA.
² Department of Anesthesiology, Columbia University Medical Center, New York Presbyterian Hospital, New York, New York, USA.
³ Department of Epidemiology, Columbia University Mailman School of Public Health, New York, New York, USA.
⁴ Harvard Medical School, Boston, Massachusetts, USA.
⁵ Division of General Medicine, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA.
⁶ Division of Gerontology, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA.
⁷ Department of Psychiatry, Columbia University Irving Medical Center, New York, New York, USA.

Abstract

Identifying delirium in electronic health records (EHRs) remains difficult because of inadequate assessment or under-documentation. This study presents a classification model that identifies delirium from retrospective EHR data. Delirium was confirmed with the Confusion Assessment Method for the Intensive Care Unit (CAM-ICU). Age, sex, Elixhauser comorbidity index, drug exposures, and diagnoses were used as features. The model was developed on Columbia University Irving Medical Center EHR data and further validated with the Medical Information Mart for Intensive Care III (MIMIC-III) dataset. Seventy-six patients from the Surgical/Cardiothoracic ICU were included in the model. The logistic regression model achieved the best performance in identifying delirium, with a mean AUC of 0.874 ± 0.033 and a mean positive predictive value of 0.80. The model shows promise for identifying delirium cases from EHR data, thereby enabling a sustainable infrastructure for building retrospective delirium cohorts.

Keywords: Confusion Assessment Method for the Intensive Care Unit (CAM-ICU); delirium; electronic health records; logistic regression; machine learning model.
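
The two metrics reported in the abstract, AUC and positive predictive value, can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the 0.5 decision threshold, the toy labels and scores, and the per-fold AUC values are all hypothetical, and reading the "±" in the abstract as a standard deviation across repeated evaluations is an assumption that the abstract does not confirm.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def positive_predictive_value(labels, scores, threshold=0.5):
    """Fraction of cases predicted positive (score >= threshold) that are
    truly positive. The 0.5 threshold is an illustrative assumption."""
    predicted_positive = [y for y, s in zip(labels, scores) if s >= threshold]
    if not predicted_positive:
        raise ValueError("no positive predictions at this threshold")
    return sum(predicted_positive) / len(predicted_positive)


# Hypothetical toy data: 1 = delirium confirmed (e.g. CAM-ICU positive), 0 = not.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

auc = roc_auc(labels, scores)
ppv = positive_predictive_value(labels, scores)

# Hypothetical per-fold AUCs, summarized as mean and sample standard deviation
# (assuming the "±" denotes a standard deviation).
fold_aucs = [0.90, 0.80, 0.85]
auc_mean, auc_sd = mean(fold_aucs), stdev(fold_aucs)

print(f"AUC = {auc:.4f}, PPV = {ppv:.2f}")
print(f"mean AUC = {auc_mean:.3f} ± {auc_sd:.3f}")
```

On the toy data, 15 of the 16 positive/negative pairs are ranked correctly, so the AUC is 0.9375, and 3 of the 4 cases scored at or above the threshold are true positives, so the PPV is 0.75.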
