Evaluate prognostic accuracy of SOFA component score for mortality among adults with sepsis by machine learning method

BMC Infect Dis. 2023 Feb 6;23(1):76. doi: 10.1186/s12879-023-08045-x.

Abstract

Introduction: Sepsis is common among ICU patients and carries high mortality. Early assessment of disease severity and stratification of mortality risk, followed by targeted intervention, are therefore very important in patients with sepsis. The purpose of this study was to develop machine learning models based on sequential organ failure assessment (SOFA) component scores for early prediction of in-hospital mortality in ICU patients with sepsis, and to evaluate model performance.

Methods: Patients admitted to the ICU with a diagnosis of sepsis were extracted from the MIMIC-IV database for retrospective analysis and were randomly divided into a training set and a test set at a ratio of 2:1. Six variables were included in this study, one for each of the six organ system scores in the SOFA score. The machine learning models were trained on the training set and evaluated on the test set. Six machine learning methods, including linear regression analysis, least absolute shrinkage and selection operator (LASSO), logistic regression analysis (LR), Gaussian Naive Bayes (GNB) and support vector machines (SVM), were used to construct the mortality risk prediction models. Accuracy, area under the receiver operating characteristic curve (AUROC), decision curve analysis (DCA) and K-fold cross-validation were used to evaluate the prediction performance of the developed models.
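The evaluation steps described in the Methods can be illustrated with a minimal sketch. It assumes nothing about the authors' actual code: the function names `split_two_to_one`, `kfold_indices` and `auroc` are hypothetical, the data are toy values, and the AUROC is computed with the standard pairwise-ranking definition rather than any particular library routine.

```python
import random


def split_two_to_one(records, seed=0):
    """Shuffle records and return (train, test) at a 2:1 ratio (illustrative only)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


def kfold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, val_idx
        start += size


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy usage: 1 = died in hospital, 0 = survived; scores are predicted risks.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of patients, so for a cohort of this size a rank-based or library implementation would be the practical choice; the sketch only shows what the metric measures.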

Results: A total of 23,889 patients with sepsis were enrolled, of whom 3,659 died in hospital. Three feature variables (renal system score, central nervous system score and cardiovascular system score) were used to establish the prediction models. The accuracies of LR, GNB and SVM were 0.851, 0.844 and 0.862, respectively, which were higher than those of linear regression analysis (0.123) and LASSO (0.130). The AUROCs of LR, GNB and SVM were 0.76, 0.76 and 0.67, respectively. K-fold cross-validation showed that the average AUROCs of LR, GNB and SVM were 0.757 ± 0.005, 0.762 ± 0.006 and 0.630 ± 0.013, respectively. For probability thresholds of 5-50%, the LR and GNB models both showed positive net benefit.
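The net benefit reported by decision curve analysis can be made concrete with a short sketch. It uses the standard definition, net benefit = TP/N - (FP/N) × pt/(1 - pt) at probability threshold pt, and is not taken from the study's code; the function names are hypothetical, and the only study figures used are the cohort size and in-hospital deaths quoted above.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def treat_all_net_benefit(prevalence, threshold):
    """Net benefit of the 'treat everyone' reference strategy."""
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# In-hospital mortality in the cohort: 3,659 deaths among 23,889 patients.
prevalence = 3659 / 23889

# Reference curve across the 5-50% threshold range discussed in the abstract.
thresholds = [t / 100 for t in range(5, 55, 5)]
reference_curve = [treat_all_net_benefit(prevalence, t) for t in thresholds]
```

A model is useful at a given threshold when its net benefit exceeds both zero (treat no one) and the treat-everyone reference, which turns negative once the threshold rises above the cohort mortality rate.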

Conclusion: Two machine learning models based on SOFA components (LR and GNB) can be used to predict in-hospital mortality in patients with sepsis admitted to the ICU.

Keywords: Hospital mortality; Intensive care unit; Machine learning; Sepsis; Sequential organ failure assessment.

Publication types

  • Randomized Controlled Trial

MeSH terms

  • Adult
  • Bayes Theorem
  • Hospital Mortality
  • Humans
  • Intensive Care Units
  • Machine Learning
  • Organ Dysfunction Scores*
  • Prognosis
  • ROC Curve
  • Retrospective Studies
  • Sepsis* / diagnosis