A comparative study of explainable ensemble learning and logistic regression for predicting in-hospital mortality in the emergency department

Zahra Rahmatinejad; Toktam Dehghani; Benyamin Hoseini; Fatemeh Rahmatinejad; Aynaz Lotfata; Hamidreza Reihani; Saeid Eslami

doi:10.1038/s41598-024-54038-4

Sci Rep. 2024 Feb 10;14(1):3406. doi: 10.1038/s41598-024-54038-4.

Authors

Zahra Rahmatinejad¹, Toktam Dehghani¹,², Benyamin Hoseini³, Fatemeh Rahmatinejad¹, Aynaz Lotfata⁴, Hamidreza Reihani⁵, Saeid Eslami⁶,⁷,⁸

Affiliations

¹ Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
² Toos Institute of Higher Education, Mashhad, Iran.
³ Pharmaceutical Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran.
⁴ Department of Pathology, Microbiology, and Immunology, School of Veterinary Medicine, University of California, Davis, CA, USA.
⁵ Department of Emergency Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran. reihanihr@mums.ac.ir.
⁶ Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran. eslamis@mums.ac.ir.
⁷ Pharmaceutical Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran. eslamis@mums.ac.ir.
⁸ Department of Medical Informatics, Amsterdam UMC - Location AMC, University of Amsterdam, Amsterdam, The Netherlands. eslamis@mums.ac.ir.

Abstract

This study addresses the challenges associated with emergency department (ED) overcrowding and emphasizes the need for efficient risk stratification tools to identify high-risk patients for early intervention. While several scoring systems, often based on logistic regression (LR) models, have been proposed to indicate patient illness severity, this study aims to compare the predictive performance of ensemble learning (EL) models with LR for in-hospital mortality in the ED. A cross-sectional single-center study was conducted at the ED of Imam Reza Hospital in northeast Iran from March 2016 to March 2017. The study included adult patients with emergency severity index levels 1 to 3. EL models using Bagging, AdaBoost, random forests (RF), Stacking, and extreme gradient boosting (XGB) algorithms, along with an LR model, were constructed. ED visits were randomly divided into training (80%) and validation (20%) sets. After training the proposed models using tenfold cross-validation, their predictive performance was evaluated. Model performance was compared using the Brier score (BS), the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUCPR), the Hosmer-Lemeshow (H-L) goodness-of-fit test, precision, sensitivity, accuracy, F1-score, and Matthews correlation coefficient (MCC). The study included 2025 unique patients admitted to the hospital's ED, with an overall in-hospital mortality of approximately 19%. In the training group and the validation group, 274 of 1476 (18.6%) and 152 of 728 (20.8%) patients died during hospitalization, respectively. According to the evaluation of the presented framework, EL models, particularly Bagging, predicted in-hospital mortality with the highest AUROC (0.839, CI 0.802-0.875) and AUCPR (0.64), comparable in discrimination power with LR (AUROC 0.826, CI 0.787-0.864; AUCPR 0.61).
XGB achieved the highest precision (0.83), sensitivity (0.831), accuracy (0.842), F1-score (0.833), and MCC (0.48). Additionally, the most accurate model on the unbalanced dataset was RF, with the lowest BS (0.128). Although all studied models overestimate mortality risk and have insufficient calibration (P > 0.05), Stacking demonstrated relatively good agreement between predicted and actual mortality. EL models are not superior to LR in predicting in-hospital mortality in the ED. Both EL and LR models can be considered as screening tools to identify patients at risk of mortality.
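
As an illustration of three of the evaluation metrics named in the abstract (BS, AUROC, and MCC), the following is a minimal Python sketch of their standard definitions. It is not the authors' code: the function names, the 0.5 classification threshold, and the toy labels and probabilities are all assumptions made for illustration, and none of the numbers come from the study.

```python
from math import sqrt


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Probability that a random death is scored above a random survivor.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def mcc(y_true, y_prob, threshold=0.5):
    """Matthews correlation coefficient after thresholding the risks.

    The 0.5 threshold is an assumption, not a value from the study.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 1)
    tn = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 0)
    fp = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data: 1 = died in hospital, 0 = survived.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.6, 0.4, 0.2, 0.8, 0.9]

print("Brier score:", brier_score(y_true, y_prob))
print("AUROC:", auroc(y_true, y_prob))
print("MCC:", mcc(y_true, y_prob))
```

A lower BS indicates predicted risks closer to the observed outcomes, while AUROC and MCC are higher for better models. MCC depends on the chosen threshold; BS and AUROC do not.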

Keywords: Emergency department; Ensemble models; In-hospital mortality; Machine learning; Prognostic models.

MeSH terms

Adult
Cross-Sectional Studies
Emergency Service, Hospital*
Hospital Mortality
Humans
Logistic Models
Machine Learning*
Retrospective Studies

Grants and funding

4000506/Mashhad University of Medical Sciences