Machine Learning Model to Identify Sepsis Patients in the Emergency Department: Algorithm Development and Validation

Pei-Chen Lin; Kuo-Tai Chen; Huan-Chieh Chen; Md Mohaimenul Islam; Ming-Chin Lin

doi:10.3390/jpm11111055

J Pers Med. 2021 Oct 21;11(11):1055. doi: 10.3390/jpm11111055.

Authors

Pei-Chen Lin¹,², Kuo-Tai Chen³, Huan-Chieh Chen⁴,⁵, Md Mohaimenul Islam¹,⁶,⁷, Ming-Chin Lin¹,⁵,⁸

Affiliations

¹ Graduate Institute of Biomedical Informatics, College of Medicine Science and Technology, Taipei Medical University, Taipei 106, Taiwan.
² Emergency Department, Taoyuan General Hospital, Ministry of Health and Welfare, Taoyuan 330, Taiwan.
³ Emergency Department, Chi-Mei Medical Center, Tainan 710, Taiwan.
⁴ Department of Neurosurgery, Taipei Medical University-Wan Fang Hospital, Taipei 116, Taiwan.
⁵ Taipei Neuroscience Institute, Taipei Medical University, Taipei 110, Taiwan.
⁶ International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei 110, Taiwan.
⁷ Research Center of Big Data and Meta-Analysis, Wan Fang Hospital, Taipei Medical University, Taipei 116, Taiwan.
⁸ Department of Neurosurgery, Shuang Ho Hospital, Taipei Medical University, New Taipei City 235, Taiwan.

Abstract

Accurate stratification of sepsis can effectively guide the triage of patient care and shared decision making in the emergency department (ED). However, previous research on sepsis identification models focused mainly on ICU patients, and discrepancies in model performance between the development and external validation datasets are rarely evaluated. The aim of our study was to develop and externally validate a machine learning (ML) model to stratify sepsis patients in the ED. We retrospectively collected clinical data from two geographically separate institutes that provided different levels of care during different time periods. The Sepsis-3 criteria were used as the reference standard in both datasets for identifying true sepsis cases. An eXtreme Gradient Boosting (XGBoost) model was developed to stratify sepsis patients, and its performance was compared with two traditional clinical sepsis tools: the quick Sequential Organ Failure Assessment (qSOFA) and the Systemic Inflammatory Response Syndrome (SIRS) criteria. The development dataset comprised 8296 patients, of whom 1752 (21%) were septic; the external validation dataset comprised 1744 patients, of whom 506 (29%) were septic. The mortality of septic patients in the development and validation datasets was 13.5% and 17%, respectively. In the internal validation, XGBoost achieved an area under the receiver operating characteristic curve (AUROC) of 0.86, exceeding SIRS (0.68) and qSOFA (0.56). The performance of XGBoost deteriorated in the external validation (the AUROC of XGBoost, SIRS and qSOFA was 0.75, 0.57 and 0.66, respectively). Heterogeneity in patient characteristics between datasets, such as sepsis prevalence, severity, age, comorbidity and infection focus, could reduce model performance. Our model showed good discriminative capability for the identification of sepsis patients and outperformed the existing sepsis identification tools. Implementation of the ML model in the ED can facilitate timely sepsis identification and treatment. However, dataset discrepancies should be carefully evaluated before implementing the ML approach in clinical practice. This finding reinforces the necessity for future studies to perform external validation to ensure the generalisability of any developed ML approaches.
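
As a rough illustration of how a bedside score such as qSOFA can be evaluated with the AUROC metric reported above, the following minimal Python sketch may help. It is not the authors' code: the function names `qsofa_score` and `auroc` and the six toy patient records are hypothetical, and the thresholds are the standard published qSOFA cut-offs (respiratory rate of at least 22 breaths/min, systolic blood pressure of at most 100 mmHg, Glasgow Coma Scale below 15). The AUROC is computed from its pairwise definition, namely the probability that a randomly chosen septic patient receives a higher score than a randomly chosen non-septic patient, with ties counted as one half.

```python
from typing import Sequence


def qsofa_score(resp_rate: float, systolic_bp: float, gcs: int) -> int:
    """Return the qSOFA score (0-3): one point per criterion met."""
    return (
        int(resp_rate >= 22)      # respiratory rate >= 22 breaths/min
        + int(systolic_bp <= 100)  # systolic blood pressure <= 100 mmHg
        + int(gcs < 15)            # altered mentation (GCS below 15)
    )


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison of positive and negative cases.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy records: (resp_rate, systolic_bp, gcs, septic label).
# These values are made up for illustration and are not study data.
patients = [
    (24, 95, 14, 1),
    (18, 120, 15, 0),
    (22, 110, 15, 1),
    (16, 98, 15, 0),
    (20, 130, 15, 0),
    (26, 105, 13, 1),
]

labels = [p[3] for p in patients]
scores = [qsofa_score(p[0], p[1], p[2]) for p in patients]
toy_auroc = auroc(labels, scores)
print(f"qSOFA scores: {scores}, toy AUROC: {toy_auroc:.3f}")
```

The same `auroc` function could be applied to predicted probabilities from any classifier, which is how a model and a rule-based score can be compared on one scale. The pairwise form is quadratic in the number of patients, so a rank-based implementation from an established library would be preferable for datasets of the size described here.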

Keywords: emergency department; intensive care unit; machine learning; sepsis; septic shock.