Postoperative delirium prediction using machine learning models and preoperative electronic health record data

Andrew Bishara; Catherine Chiu; Elizabeth L Whitlock; Vanja C Douglas; Sei Lee; Atul J Butte; Jacqueline M Leung; Anne L Donovan

doi:10.1186/s12871-021-01543-y

BMC Anesthesiol. 2022 Jan 3;22(1):8. doi: 10.1186/s12871-021-01543-y.

Authors

Andrew Bishara¹ ², Catherine Chiu¹, Elizabeth L Whitlock¹, Vanja C Douglas³, Sei Lee⁴, Atul J Butte², Jacqueline M Leung¹, Anne L Donovan⁵

Affiliations

¹ Department of Anesthesia and Perioperative Care, University of California, San Francisco, 521 Parnassus Avenue, San Francisco, CA, 94143, USA.
² Bakar Computational Health Sciences Institute, University of California San Francisco, 490 Illinois Street, San Francisco, CA, 94143, USA.
³ Weill Institute for Neurosciences and Department of Neurology, University of California, 505 Parnassus Avenue, San Francisco, CA, 94143, USA.
⁴ Division of Geriatrics, University of California, San Francisco, 505 Parnassus Avenue, San Francisco, CA, 94143, USA.
⁵ Department of Anesthesia and Perioperative Care, University of California, San Francisco, 521 Parnassus Avenue, San Francisco, CA, 94143, USA. anne.donovan@ucsf.edu.

Abstract

Background: Accurate, pragmatic risk stratification for postoperative delirium (POD) is necessary to target preventative resources toward high-risk patients. Machine learning (ML) offers a novel approach to leveraging electronic health record (EHR) data for POD prediction. We sought to develop and internally validate an ML-derived POD risk prediction model using preoperative risk features, and to compare its performance to models developed with traditional logistic regression.

Methods: This was a retrospective analysis of preoperative EHR data from 24,885 adults undergoing a procedure requiring anesthesia care, recovering in the main post-anesthesia care unit, and staying in the hospital at least overnight between December 2016 and December 2019 at either of two hospitals in a tertiary care health system. One hundred fifteen preoperative risk features including demographics, comorbidities, nursing assessments, surgery type, and other preoperative EHR data were used to predict postoperative delirium (POD), defined as any instance of Nursing Delirium Screening Scale ≥2 or positive Confusion Assessment Method for the Intensive Care Unit within the first 7 postoperative days. Two ML models (Neural Network and XGBoost), two traditional logistic regression models ("clinician-guided" and "ML hybrid"), and a previously described delirium risk stratification tool (AWOL-S) were evaluated using the area under the receiver operating characteristic curve (AUC-ROC), sensitivity, specificity, positive likelihood ratio, and positive predictive value. Model calibration was assessed with a calibration curve. Patients with no POD assessments charted or at least 20% of input variables missing were excluded.
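The outcome definition, exclusion rule, and threshold-based metrics described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the `Assessment` record layout, the field names, the day-indexing convention (`postop_day <= 7`), and the function names are all hypothetical, chosen only for illustration. AUC-ROC is left out because it depends on predicted probabilities rather than on a single decision threshold.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class Assessment:
    """One charted delirium screen (hypothetical record layout)."""
    postop_day: int                          # assumed: days since surgery
    nudesc: Optional[int] = None             # Nu-DESC score, if charted
    cam_icu_positive: Optional[bool] = None  # CAM-ICU result, if charted


def has_pod(assessments: Sequence[Assessment], window_days: int = 7) -> bool:
    """POD: any Nu-DESC >= 2 or positive CAM-ICU within the window."""
    return any(
        a.postop_day <= window_days
        and ((a.nudesc is not None and a.nudesc >= 2)
             or a.cam_icu_positive is True)
        for a in assessments
    )


def is_excluded(assessments: Sequence[Assessment],
                features: Dict[str, Optional[float]],
                max_missing: float = 0.20) -> bool:
    """Exclude if no assessments were charted or >= 20% of inputs are missing."""
    if not assessments or not features:
        return True
    missing = sum(v is None for v in features.values())
    return missing / len(features) >= max_missing


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, LR+ and PPV from 0/1 labels and 0/1 predictions.

    Assumes both classes occur in y_true and at least one positive prediction.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # LR+ = sensitivity / (1 - specificity); infinite when there are no false positives
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "positive_likelihood_ratio": lr_pos,
        "positive_predictive_value": tp / (tp + fp),
    }
```

For example, `has_pod([Assessment(postop_day=2, nudesc=2)])` is true, while a patient whose only screens are a Nu-DESC of 1 and a negative CAM-ICU is labeled negative.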

Results: POD incidence was 5.3%. The AUC-ROC for Neural Net was 0.841 [95% CI 0.816-0.863] and for XGBoost was 0.851 [95% CI 0.827-0.874], which was significantly better than the clinician-guided (AUC-ROC 0.763 [0.734-0.793], p < 0.001) and ML hybrid (AUC-ROC 0.824 [0.800-0.849], p < 0.001) regression models and AWOL-S (AUC-ROC 0.762 [95% CI 0.713-0.812], p < 0.001). Neural Net, XGBoost, and ML hybrid models demonstrated excellent calibration, while calibration of the clinician-guided and AWOL-S models was moderate; they tended to overestimate delirium risk in those already at highest risk.
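A calibration curve of the kind referred to here compares mean predicted risk with the observed event rate within groups of patients. The sketch below is one common way to build such a curve, assuming equal-width probability bins; the abstract does not state how the authors binned predictions, and the function name `calibration_bins` is hypothetical.

```python
from typing import List, Sequence, Tuple


def calibration_bins(y_true: Sequence[int],
                     y_prob: Sequence[float],
                     n_bins: int = 10) -> List[Tuple[float, float, int]]:
    """Return (mean predicted risk, observed rate, count) per non-empty bin.

    Bins are equal-width over [0, 1]; this binning choice is an assumption.
    """
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # A probability of exactly 1.0 falls into the last bin
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    curve = []
    for members in bins:
        if members:
            mean_pred = sum(p for p, _ in members) / len(members)
            observed = sum(y for _, y in members) / len(members)
            curve.append((mean_pred, observed, len(members)))
    return curve
```

In this representation, overestimation among the highest-risk patients would appear as a mean predicted risk above the observed rate in the top bins.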

Conclusion: Using pragmatically collected EHR data, two ML models predicted POD in a broad perioperative population with high discrimination. Optimal application of the models would provide automated, real-time delirium risk stratification to improve perioperative management of surgical patients at risk for POD.

Keywords: Delirium prevention; Geriatric surgery; Machine learning; Postoperative delirium; Risk prediction model.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Aged
Cohort Studies
Delirium / diagnosis*
Electronic Health Records / statistics & numerical data*
Female
Humans
Machine Learning*
Male
Middle Aged
Postoperative Complications / diagnosis*
Predictive Value of Tests
Preoperative Period
Reproducibility of Results
Retrospective Studies
