Explainable SHAP-XGBoost models for in-hospital mortality after myocardial infarction

Constantine Tarabanis; Evangelos Kalampokis; Mahmoud Khalil; Carlos L Alviar; Larry A Chinitz; Lior Jankelson

doi:10.1016/j.cvdhj.2023.06.001

Cardiovasc Digit Health J. 2023 Jun 14;4(4):126-132. doi: 10.1016/j.cvdhj.2023.06.001. eCollection 2023 Aug.

Authors

Constantine Tarabanis¹, Evangelos Kalampokis², Mahmoud Khalil³, Carlos L Alviar¹, Larry A Chinitz¹, Lior Jankelson¹

Affiliations

¹ Leon H. Charney Division of Cardiology, NYU Langone Health, New York University School of Medicine, New York, New York.
² Information Systems Laboratory, University of Macedonia, Thessaloniki, Greece.
³ Department of Internal Medicine, Lincoln Medical Centre, Bronx, New York.

Abstract

Background: A lack of explainability in published machine learning (ML) models limits clinicians' understanding of how predictions are made, in turn undermining uptake of the models into clinical practice.

Objective: The purpose of this study was to develop explainable ML models to predict in-hospital mortality in patients hospitalized for myocardial infarction (MI).

Methods: Adult patients hospitalized for an MI were identified in the National Inpatient Sample between January 1, 2012, and September 30, 2015. The resulting cohort comprised 457,096 patients described by 64 predictor variables relating to demographic/comorbidity characteristics and in-hospital complications. The gradient boosting algorithm eXtreme Gradient Boosting (XGBoost) was used to develop explainable models for in-hospital mortality prediction in the overall cohort and patient subgroups based on MI type and/or sex.
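To make the gradient boosting step concrete, the following is a minimal sketch of second-order (Newton) boosting with depth-1 trees and an L2 penalty on leaf weights, which is the core idea behind XGBoost. It is not the XGBoost library and not the authors' pipeline: it omits XGBoost's complexity penalty and deeper trees, the function names (`best_stump`, `fit`, `predict_proba`) are hypothetical, and the ten toy patients (age, PCI flag, death) are invented purely for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def best_stump(X, grad, hess, lam=1.0):
    """Return (feature, threshold, left_weight, right_weight) for the split
    with the largest regularized gain, or None if no split beats the root."""
    G, H = sum(grad), sum(hess)
    best, best_gain = None, 1e-12
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [i for i, row in enumerate(X) if row[j] <= t]
            GL = sum(grad[i] for i in left)
            HL = sum(hess[i] for i in left)
            GR, HR = G - GL, H - HL
            gain = GL**2 / (HL + lam) + GR**2 / (HR + lam) - G**2 / (H + lam)
            if gain > best_gain:
                best_gain = gain
                # Newton-step leaf weights with L2 regularization lam
                best = (j, t, -GL / (HL + lam), -GR / (HR + lam))
    return best

def fit(X, y, n_rounds=20, eta=0.3):
    """Boost stumps on the logistic loss; eta is the learning rate."""
    scores = [0.0] * len(y)  # raw log-odds per patient
    model = []
    for _ in range(n_rounds):
        probs = [sigmoid(s) for s in scores]
        grad = [p - yi for p, yi in zip(probs, y)]   # first derivative of log loss
        hess = [p * (1.0 - p) for p in probs]        # second derivative of log loss
        stump = best_stump(X, grad, hess)
        if stump is None:
            break
        j, t, wl, wr = stump
        model.append((j, t, eta * wl, eta * wr))
        scores = [s + (eta * wl if row[j] <= t else eta * wr)
                  for s, row in zip(scores, X)]
    return model

def predict_proba(model, row):
    score = sum(wl if row[j] <= t else wr for j, t, wl, wr in model)
    return sigmoid(score)

# Invented toy cohort: [age in years, underwent PCI (1/0)] and in-hospital death (1/0).
X = [[45, 1], [50, 1], [55, 0], [60, 1], [65, 0],
     [70, 1], [75, 0], [80, 0], [85, 0], [90, 1]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

model = fit(X, y)
print(predict_proba(model, [85, 0]), predict_proba(model, [45, 1]))
```

In practice a study of this kind would rely on the XGBoost library itself, with a held-out test set and tuned hyperparameters; the sketch only shows how each boosting round fits a small tree to the gradient and curvature of the loss.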

Results: The resulting models exhibited an area under the receiver operating characteristic curve (AUC) ranging from 0.876 to 0.942, specificity of 82% to 87%, and sensitivity of 75% to 87%. All models exhibited a high negative predictive value (≥0.974). The SHapley Additive exPlanation (SHAP) framework was applied to explain the models. The top predictors of increased and decreased mortality were age and undergoing percutaneous coronary intervention, respectively. Other notable findings included a decreased mortality risk associated with certain patient subpopulations with hyperlipidemia and a comparatively greater risk of death among women below age 55 years.
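The attribution idea behind SHAP can be illustrated with an exact Shapley value computation on a tiny model. This is a sketch under stated assumptions, not the SHAP library and not the study's models: the two-feature log-odds function `toy_log_odds` and its coefficients are invented, and "missing" features are replaced by a single reference patient rather than averaged over a background dataset as SHAP does.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values: weighted average of each feature's marginal
    contribution over all coalitions of the remaining features."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

# Hypothetical log-odds model (coefficients invented for illustration only).
def toy_log_odds(older, pci):
    return -3.0 + 1.5 * older - 1.0 * pci + 0.5 * older * pci

patient = (1, 1)    # older patient who underwent PCI
reference = (0, 0)  # assumed baseline: younger patient, no PCI

def value_fn(coalition):
    # Features in the coalition take the patient's values; the rest take the reference's.
    x = [patient[k] if k in coalition else reference[k] for k in range(2)]
    return toy_log_odds(*x)

phi = shapley_values(value_fn, n_features=2)
print("age contribution:", phi[0], "PCI contribution:", phi[1])
```

The contributions are additive: they sum to the difference between the patient's log-odds and the reference's. In this invented example age pushes the prediction up and PCI pushes it down, mirroring the direction of the reported top predictors.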

Conclusion: The literature lacks explainable ML models predicting in-hospital mortality after an MI. In a national registry, explainable ML models performed best in ruling out in-hospital death post-MI, and their explanation illustrated their potential for guiding hypothesis generation and future study design.
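Why a model can be better at ruling out death than ruling it in follows from Bayes' rule: when the outcome is uncommon, negative predictive value (NPV) stays high even at moderate sensitivity, while positive predictive value (PPV) stays low. The sketch below uses the lowest sensitivity and specificity reported in the abstract (75% and 82%, which need not belong to the same model) together with a 5% mortality prevalence. The prevalence is an illustrative assumption; the abstract does not report it.

```python
def npv(sensitivity, specificity, prevalence):
    """P(survives | predicted to survive) via Bayes' rule."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

def ppv(sensitivity, specificity, prevalence):
    """P(dies | predicted to die) via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

sens, spec = 0.75, 0.82      # lowest values reported in the abstract
assumed_prevalence = 0.05    # hypothetical, for illustration only

print("NPV:", round(npv(sens, spec, assumed_prevalence), 3))
print("PPV:", round(ppv(sens, spec, assumed_prevalence), 3))
```

Under these assumptions the NPV is far higher than the PPV, which is consistent with the conclusion that the models are most useful for identifying patients unlikely to die in hospital.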

Keywords: Acute coronary syndrome; Explainable machine learning; In-hospital mortality; SHAP; Myocardial infarction.