Background: The lack of transparency is a prevalent issue among the current machine-learning (ML) algorithms utilized for predicting mortality risk. Herein, we aimed to improve transparency by utilizing the latest ML explicable technology, SHapley Additive exPlanation (SHAP), to develop a predictive model for critically ill patients.
Methods: We extracted data from the Medical Information Mart for Intensive Care IV database, encompassing all intensive care unit admissions. We employed nine different methods to develop the models. The most accurate model, with the highest area under the receiver operating characteristic curve, was selected as the optimal model. Additionally, we used SHAP to explain the workings of the ML model.
Results: The study included 21 395 critically ill patients, with a median age of 68 years (interquartile range, 56-79 years), and most patients were male (56.9%). The cohort was randomly split into a training set (N = 16 046) and a validation set (N = 5349). Among the nine models developed, the Random Forest model had the highest accuracy (87.62%) and the best area under the receiver operating characteristic curve value (0.89). The SHAP summary analysis showed that Glasgow Coma Scale, urine output, and blood urea nitrogen were the top three risk factors for outcome prediction. Furthermore, SHAP dependency analysis and SHAP force analysis were used to interpret the Random Forest model at the factor level and individual level, respectively.
Conclusion: A transparent ML model for predicting outcomes in critically ill patients using SHAP methodology is feasible and effective. SHAP values significantly improve the explainability of ML models.
Keywords: critical illness; explainable artificial intelligence; machine learning; mortality.
© The Author(s) 2024. Published by Oxford University Press on behalf of Fellowship of Postgraduate Medicine. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.