Predicting the Prognosis of Patients in the Coronary Care Unit: A Novel Multi-Category Machine Learning Model Using XGBoost

Front Cardiovasc Med. 2022 May 12:9:764629. doi: 10.3389/fcvm.2022.764629. eCollection 2022.

Abstract

Background: Early prediction and classification of prognosis are essential for patients in the coronary care unit (CCU). We applied a machine learning (ML) model based on the eXtreme Gradient Boosting (XGBoost) algorithm to predict the prognosis of CCU patients and compared XGBoost with traditional classification models.

Methods: Data on CCU patients were extracted from the MIMIC-III v1.4 clinical database and divided into four groups based on time to death: <30 days, 30 days-1 year, 1-5 years, and ≥5 years. Four classification models, XGBoost, naïve Bayes (NB), logistic regression (LR), and support vector machine (SVM), were constructed in Python. The four models were tested and compared on accuracy, F1 score, Matthews correlation coefficient (MCC), and area under the receiver operating characteristic curve (AUC). Subsequently, the Local Interpretable Model-Agnostic Explanations (LIME) method was applied to improve the interpretability of the XGBoost model. We also constructed sub-models of each model for the different categories of death time and compared them by decision curve analysis. The optimal model was further analyzed using a clinical impact curve. Finally, feature ablation curves of the XGBoost model were generated to obtain a simplified model.
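As an illustration of the four-group outcome definition above, the following is a minimal sketch of how a time to death could be mapped to a category. It is not the authors' code: the function name `death_time_category`, the use of 365 days per year, and the treatment of each boundary as the start of the later group are assumptions made for this example.

```python
from bisect import bisect_right

# Cut points in days. Assumption: 1 year = 365 days, 5 years = 1825 days.
CUT_POINTS = [30, 365, 5 * 365]
LABELS = ["<30 days", "30 days-1 year", "1-5 years", ">=5 years"]


def death_time_category(days_to_death: float) -> str:
    """Map a time to death (in days) to one of the four outcome groups.

    Hypothetical helper: a value equal to a cut point falls into the
    later group (e.g. exactly 30 days -> "30 days-1 year").
    """
    if days_to_death < 0:
        raise ValueError("days_to_death must be non-negative")
    # bisect_right counts how many cut points are <= days_to_death,
    # which is exactly the index of the matching label.
    return LABELS[bisect_right(CUT_POINTS, days_to_death)]


if __name__ == "__main__":
    for days in (12, 30, 400, 2000):
        print(days, "->", death_time_category(days))
```

The abstract does not state how boundary values were assigned, so the boundary convention here should be checked against the full methods before reuse.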

Results: Overall, 5360 CCU patients were included. Compared with NB, LR, and SVM, the XGBoost model showed better accuracy (0.663 vs. 0.605, 0.632, and 0.622, respectively), micro-AUC (0.873 vs. 0.811, 0.841, and 0.818, respectively), and MCC (0.337 vs. 0.317, 0.250, and 0.182, respectively). In subgroup analysis, the XGBoost model showed better predictive performance in the acute myocardial infarction subgroup. The decision curve and clinical impact curve analyses verified the clinical utility of the XGBoost model for different categories of patients. Finally, we obtained a simplified model with 30 features.
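To make two of the reported metrics concrete, the sketch below computes accuracy and the multi-class generalization of MCC from true and predicted labels using only the standard library. It is an illustrative sketch, not the authors' pipeline; the function names and the toy labels are hypothetical, and the abstract does not say which implementation was used.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient.

    Uses the standard generalization:
        (c*s - sum_k p_k*t_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    where c = number correct, s = number of samples,
    t_k = true count of class k, p_k = predicted count of class k.
    """
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)
    p_counts = Counter(y_pred)
    classes = set(t_counts) | set(p_counts)
    numerator = c * s - sum(p_counts[k] * t_counts[k] for k in classes)
    denominator = math.sqrt(
        (s * s - sum(v * v for v in p_counts.values()))
        * (s * s - sum(v * v for v in t_counts.values()))
    )
    # By convention, return 0.0 when the denominator vanishes
    # (e.g. every prediction is the same class).
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Toy labels for the four outcome groups (0..3); not study data.
    y_true = [0, 0, 1, 1, 2, 2, 3, 3]
    y_pred = [0, 1, 1, 1, 2, 3, 3, 3]
    print("accuracy:", accuracy(y_true, y_pred))
    print("MCC:", multiclass_mcc(y_true, y_pred))
```

Unlike accuracy, MCC accounts for how predictions are distributed across all classes, which is one reason it is often reported alongside accuracy when class sizes are unequal.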

Conclusions: For CCU physicians, the XGBoost-based ML technique is a potential tool for predicting prognosis in patients with different conditions, and it may contribute to improvements in prognosis.

Keywords: MIMIC-III; XGBoost; coronary care unit (CCU); machine learning; multi-category; prognosis.