Early Prediction of Mortality, Severity, and Length of Stay in the Intensive Care Unit of Sepsis Patients Based on Sepsis 3.0 by Machine Learning Models

Longxiang Su; Zheng Xu; Fengxiang Chang; Yingying Ma; Shengjun Liu; Huizhen Jiang; Hao Wang; Dongkai Li; Huan Chen; Xiang Zhou; Na Hong; Weiguo Zhu; Yun Long

doi:10.3389/fmed.2021.664966

Front Med (Lausanne). 2021 Jun 28:8:664966. doi: 10.3389/fmed.2021.664966. eCollection 2021.

Authors

Longxiang Su¹, Zheng Xu², Fengxiang Chang², Yingying Ma², Shengjun Liu¹, Huizhen Jiang³, Hao Wang¹, Dongkai Li¹, Huan Chen¹, Xiang Zhou¹, Na Hong², Weiguo Zhu³,⁴, Yun Long¹

Affiliations

¹ Department of Critical Care Medicine, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, China.
² Digital Health China Technologies Co., Ltd., Beijing, China.
³ Department of Information Center, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, China.
⁴ Department of Primary Care and Family Medicine, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, China.

Abstract

Background: Early prediction of clinical outcomes in patients with sepsis is of great significance: it can guide treatment and reduce mortality. However, making such predictions at the bedside is difficult for clinicians.

Methods: A total of 2,224 patients with sepsis in the intensive care unit (ICU) of Peking Union Medical College Hospital over a 3-year period (2016-2018) were included. Using all key medical data from the first 6 h in the ICU, three machine learning models (logistic regression, random forest, and XGBoost) were used to predict mortality, severity (sepsis/septic shock), and length of ICU stay (LOS) (>6 days, ≤6 days). Missing data imputation and oversampling were performed on the dataset before it was passed to the models.

Results: Compared with the mortality and LOS predictions, the severity prediction achieved the best classification results, based on the area under the receiver operating characteristic curve (AUC), with the random forest classifier (sensitivity = 0.65, specificity = 0.73, F1 score = 0.72, AUC = 0.79). The random forest model also showed the best overall performance among the three models (mortality prediction: sensitivity = 0.50, specificity = 0.84, F1 score = 0.66, AUC = 0.74; LOS prediction: sensitivity = 0.79, specificity = 0.66, F1 score = 0.69, AUC = 0.76). The predictive ability of the SOFA score alone was inferior to that of all three models.

Conclusions: Using the random forest classifier on data from the first 6 h of ICU admission can provide a comprehensive early warning for sepsis, which will contribute to the formulation of clinical decisions and to the allocation and management of resources.

Keywords: machine learning; outcome; prediction; sepsis; sequential (sepsis-related) organ failure assessment.
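
For readers less familiar with the four metrics reported in the abstract (sensitivity, specificity, F1 score, and AUC), the following is a minimal Python sketch of their standard textbook definitions for a binary outcome. It is not the authors' code. The function names, the example labels and scores, and the 0.5 decision threshold are all hypothetical and chosen only for illustration, and the abstract does not state which F1 variant (for example, positive-class or class-averaged) was used, so the positive-class form shown here is an assumption.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positive rate: tp / (tp + fn)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True negative rate: tn / (tn + fp)."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def f1_positive(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class: 2*tp / (2*tp + fp + fn).

    Assumption: the source does not say which F1 variant it reports.
    """
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via its rank interpretation.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 4 positive and 6 negative cases with model scores.
example_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.35, 0.2, 0.2, 0.1, 0.05]
example_preds = [1 if s >= 0.5 else 0 for s in example_scores]  # assumed threshold

print("sensitivity:", sensitivity(example_labels, example_preds))
print("specificity:", specificity(example_labels, example_preds))
print("F1 (positive class):", f1_positive(example_labels, example_preds))
print("AUC:", auc(example_labels, example_scores))
```

Sensitivity, specificity, and F1 depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds. This is one reason a model can show modest sensitivity at a given cut-off while still having a comparatively high AUC.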