Clinical outcome prediction using observational supervision with electronic health records and audit logs

Nandita Bhaskhar; Wui Ip; Jonathan H Chen; Daniel L Rubin

doi:10.1016/j.jbi.2023.104522

J Biomed Inform. 2023 Nov:147:104522. doi: 10.1016/j.jbi.2023.104522. Epub 2023 Oct 11.

Authors

Nandita Bhaskhar¹, Wui Ip², Jonathan H Chen³, Daniel L Rubin⁴

Affiliations

¹ Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA. Electronic address: nanbhas@stanford.edu.
² Department of Pediatrics, Stanford School of Medicine, Palo Alto, CA 94305, USA.
³ Center for Biomedical Informatics Research, Stanford University, Stanford, CA 94305, USA; Division of Hospital Medicine, Stanford School of Medicine, Palo Alto, CA 94305, USA; Clinical Excellence Research Center, Stanford School of Medicine, Palo Alto, CA 94305, USA.
⁴ Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; Department of Radiology, Stanford University, Stanford, CA 94305, USA; Department of Medicine, Stanford School of Medicine, Palo Alto, CA 94305, USA.

PMID: 37827476
DOI: 10.1016/j.jbi.2023.104522

Abstract

Objective: Audit logs in electronic health record (EHR) systems capture interactions of providers with clinical data. We determine whether machine learning (ML) models trained using audit logs in conjunction with clinical data ("observational supervision") outperform ML models trained using clinical data alone in clinical outcome prediction tasks, and whether they are more robust to temporal distribution shifts in the data.

Materials and methods: Using clinical and audit log data from Stanford Healthcare, we trained and evaluated various ML models, including logistic regression, support vector machine (SVM) classifiers, neural networks, random forests, and gradient boosted machines (GBMs), on clinical EHR data with and without audit logs for two clinical outcome prediction tasks: major adverse kidney events within 120 days of ICU admission (MAKE-120) in acute kidney injury (AKI) patients and 30-day readmission in acute stroke patients. We further tested the best performing models using patient data acquired during different time intervals to evaluate the impact of temporal distribution shifts on model performance.
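The temporal evaluation described above can be pictured as a split by admission date: the model is fit on records before a cutoff and then scored separately on each later interval. The sketch below is only an illustration of that idea and is not the authors' pipeline; the `admit_date` field, the `split_by_interval` helper, and the one-year interval width are all assumptions.

```python
from collections import defaultdict
from datetime import date


def split_by_interval(records, train_end, interval_years=1):
    """Split records into a training set and later evaluation intervals.

    records: list of dicts, each with a hypothetical 'admit_date' (datetime.date).
    train_end: cutoff date; assumed here to fall on January 1 so that
               calendar years line up with interval boundaries.
    Returns (train, later), where later maps the starting year of each
    interval to the records admitted in that interval.
    """
    train = [r for r in records if r["admit_date"] < train_end]
    later = defaultdict(list)
    for r in records:
        if r["admit_date"] >= train_end:
            # Number of whole intervals between the cutoff year and this record.
            offset = (r["admit_date"].year - train_end.year) // interval_years
            later[train_end.year + offset * interval_years].append(r)
    return train, dict(later)


# Toy records with made-up admission dates (not study data).
records = [
    {"id": 1, "admit_date": date(2015, 3, 1)},
    {"id": 2, "admit_date": date(2016, 7, 9)},
    {"id": 3, "admit_date": date(2017, 2, 14)},
    {"id": 4, "admit_date": date(2018, 11, 30)},
]
train, later = split_by_interval(records, train_end=date(2017, 1, 1))
```

Scoring one fixed model on each entry of `later` then shows how performance changes as the evaluation data moves further from the training period.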

Results: Performance generally improved for all models when trained with clinical EHR data and audit log data compared with those trained with only clinical EHR data, with GBMs tending to have the overall best performance. GBMs trained with clinical EHR data and audit logs outperformed GBMs trained without audit logs in both clinical outcome prediction tasks: AUROC 0.88 (95% CI: 0.85-0.91) vs. 0.79 (95% CI: 0.77-0.81), respectively, for MAKE-120 prediction in AKI patients, and AUROC 0.74 (95% CI: 0.71-0.77) vs. 0.63 (95% CI: 0.62-0.64), respectively, for 30-day readmission prediction in acute stroke patients. The performance of GBM models trained using audit log and clinical data degraded less in later time intervals than that of models trained using only clinical data.
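For readers unfamiliar with how an AUROC and its 95% CI can be obtained, the following is a minimal sketch using only the Python standard library. The abstract does not state how the intervals were computed, so the percentile bootstrap used here is an assumption, and the function names `auroc` and `bootstrap_ci` are hypothetical.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC (an assumed
    method, chosen for illustration)."""
    auroc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with one class has no defined AUROC
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy labels and scores (not study data).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
point = auroc(labels, scores)
lower, upper = bootstrap_ci(labels, scores, n_boot=200)
```

Applying the same two functions to the predictions of a model trained with audit logs and one trained without would give a pair of intervals comparable in form to those reported above.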

Conclusion: Observational supervision with audit logs improved the performance of ML models trained to predict important clinical outcomes in patients with AKI and acute stroke, and improved robustness to temporal distribution shifts.

Keywords: 30-day hospital readmission; Acute ischemic stroke; Acute kidney injury (AKI); Audit logs; Electronic health records (EHR); Major adverse kidney event (MAKE); Observational supervision.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Acute Kidney Injury*
Electronic Health Records
Hospitalization
Humans
Prognosis
Stroke*
