Early Prediction of Mortality for Septic Patients Visiting Emergency Room Based on Explainable Machine Learning: A Real-World Multicenter Study

Sang Won Park; Na Young Yeo; Seonguk Kang; Taejun Ha; Tae-Hoon Kim; DooHee Lee; Dowon Kim; Seheon Choi; Minkyu Kim; DongHoon Lee; DoHyeon Kim; Woo Jin Kim; Seung-Joon Lee; Yeon-Jeong Heo; Da Hye Moon; Seon-Sook Han; Yoon Kim; Hyun-Soo Choi; Dong Kyu Oh; Su Yeon Lee; MiHyeon Park; Chae-Man Lim; Jeongwon Heo; Korean Sepsis Alliance (KSA) Investigators

doi:10.3346/jkms.2024.39.e53

J Korean Med Sci. 2024 Feb 5;39(5):e53. doi: 10.3346/jkms.2024.39.e53.

Authors

Sang Won Park^#^{1,2}, Na Young Yeo^#³, Seonguk Kang⁴, Taejun Ha⁵, Tae-Hoon Kim⁶, DooHee Lee⁷, Dowon Kim⁷, Seheon Choi⁷, Minkyu Kim⁷, DongHoon Lee⁷, DoHyeon Kim⁷, Woo Jin Kim^{1,8,9}, Seung-Joon Lee^{8,9}, Yeon-Jeong Heo^{8,9}, Da Hye Moon^{8,9}, Seon-Sook Han^{8,9}, Yoon Kim^{6,10}, Hyun-Soo Choi^{6,11}, Dong Kyu Oh¹², Su Yeon Lee¹², MiHyeon Park¹², Chae-Man Lim¹², Jeongwon Heo^{8,13}; Korean Sepsis Alliance (KSA) Investigators

Affiliations

¹ Department of Medical Informatics, School of Medicine, Kangwon National University, Chuncheon, Korea.
² Institute of Medical Science, School of Medicine, Kangwon National University, Chuncheon, Korea.
³ Department of Medical Bigdata Convergence, Kangwon National University, Chuncheon, Korea.
⁴ Department of Convergence Security, Kangwon National University, Chuncheon, Korea.
⁵ Department of Biomedical Research Institute, Kangwon National University Hospital, Chuncheon, Korea.
⁶ University-Industry Cooperation Foundation, Kangwon National University, Chuncheon, Korea.
⁷ Department of Research and Development, ZIOVISION Co. Ltd., Chuncheon, Korea.
⁸ Department of Internal Medicine, Kangwon National University Hospital, Chuncheon, Korea.
⁹ Department of Internal Medicine, School of Medicine, Kangwon National University, Chuncheon, Korea.
¹⁰ Department of Computer Science and Engineering, Kangwon National University, Chuncheon, Korea.
¹¹ Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, Korea.
¹² Department of Pulmonary and Critical Care Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea.
¹³ Department of Internal Medicine, School of Medicine, Kangwon National University, Chuncheon, Korea. doctorhjw@naver.com.

^# Contributed equally.

Abstract

Background: Worldwide, sepsis is the leading cause of death in hospitals. If mortality rates in patients with sepsis can be predicted early, medical resources can be allocated efficiently. We constructed machine learning (ML) models to predict the mortality of patients with sepsis in a hospital emergency department.

Methods: This study prospectively collected nationwide data from an ongoing multicenter cohort of patients with sepsis identified in the emergency department. Patients were enrolled from 19 hospitals between September 2019 and December 2020. Using data from 3,657 survivors and 1,455 non-survivors, six ML models (logistic regression, support vector machine, random forest, extreme gradient boosting [XGBoost], light gradient boosting machine, and categorical boosting [CatBoost]) were constructed with fivefold cross-validation to predict mortality. With these models, 44 clinical variables measured on the day of admission were compared with six sequential organ failure assessment (SOFA) components (PaO₂/FIO₂ [PF], platelets [PLT], bilirubin, cardiovascular, Glasgow Coma Scale score, and creatinine). The confidence interval (CI) was obtained by randomly resampling the test dataset 10,000 times. All results were explained and interpreted using SHapley Additive exPlanations (SHAP).
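The CI procedure described in the Methods can be illustrated with a percentile bootstrap of the AUC. The following is a minimal sketch, not the authors' implementation: it assumes resampling with replacement and a percentile interval (the abstract states only that the test set was randomly sampled 10,000 times), and the function names `auc` and `bootstrap_auc_ci` and the toy labels and scores are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_rounds=10_000, alpha=0.05, seed=0):
    """Point AUC plus a percentile bootstrap CI (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_rounds:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample holds only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_rounds)]
    upper = stats[int((1 - alpha / 2) * n_rounds) - 1]
    return auc(labels, scores), lower, upper


# Hypothetical toy test set: 1 = death, 0 = survival, with predicted risks.
labels = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3, 0.6, 0.65, 0.35, 0.15]
point, lower, upper = bootstrap_auc_ci(labels, scores, n_rounds=1000)
print(f"AUC {point:.3f} (95% CI, {lower:.3f}-{upper:.3f})")
```

With only a dozen toy samples the interval is wide. On a real held-out set, the pairwise AUC loop would normally be replaced by a rank-based routine from a statistics library.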

Results: Among the 5,112 participants, CatBoost achieved the highest area under the curve (AUC), 0.800 (95% CI, 0.756-0.840), using clinical variables. Using the SOFA components for the same patients, XGBoost achieved the highest AUC, 0.678 (95% CI, 0.626-0.730). As interpreted by SHAP, albumin, lactate, blood urea nitrogen, and international normalized ratio significantly affected the predictions. Among the SOFA components, PF and PLT significantly influenced the predictions.
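SHAP attributes a prediction to individual variables using Shapley values. As a rough illustration of the underlying idea only, the sketch below computes exact Shapley values by brute force for a three-variable toy risk function. Everything in it is an assumption made for illustration: the coefficients, the baseline and patient values, and the names `shapley_values`, `toy_risk`, and `value_fn` are invented and are not taken from the study, which used the SHAP method on its trained models.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values: weighted mean marginal contribution over all coalitions."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for k in range(len(others) + 1):
            # Weight of a coalition of size k that does not contain feature i.
            weight = factorial(k) * factorial(n_features - k - 1) / factorial(n_features)
            for subset in combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical values; feature order is albumin, lactate, blood urea nitrogen.
FEATURES = ["albumin", "lactate", "bun"]
baseline = [4.0, 1.0, 15.0]   # invented reference patient
patient = [2.5, 4.0, 40.0]    # invented patient to explain


def toy_risk(x):
    """Invented risk score with one interaction term; not a clinical model."""
    albumin, lactate, bun = x
    return -0.5 * albumin + 0.3 * lactate + 0.01 * bun + 0.005 * lactate * bun


def value_fn(coalition):
    """Score when features in the coalition take the patient's values, others the baseline."""
    x = [patient[j] if j in coalition else baseline[j] for j in range(len(FEATURES))]
    return toy_risk(x)


phi = shapley_values(value_fn, len(FEATURES))
for name, contribution in zip(FEATURES, phi):
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the property that makes per-variable attributions additive. Brute-force enumeration grows exponentially with the number of variables, so it is only practical for toy cases like this one.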

Conclusion: Newly established ML-based models achieved good prediction of mortality in patients with sepsis. Using several clinical variables acquired at baseline can provide more accurate early predictions than using SOFA components. Additionally, the impact of each variable was identified.

Keywords: Clinical Decision Support System (CDSS); Explainable Artificial Intelligence (XAI); Machine Learning; Mortality Prediction; Sepsis.

Publication types

Multicenter Study

MeSH terms

Albumins
Emergency Service, Hospital*
Humans
Lactic Acid
Machine Learning
Sepsis* / diagnosis

Substances

Albumins
Lactic Acid
