Machine learning algorithms for predicting mortality after coronary artery bypass grafting

Front Cardiovasc Med. 2022 Aug 24:9:977747. doi: 10.3389/fcvm.2022.977747. eCollection 2022.

Abstract

Background: As the era of big data analytics unfolds, machine learning (ML) might be a promising tool for predicting clinical outcomes. This study aimed to evaluate the predictive ability of ML models for estimating mortality after coronary artery bypass grafting (CABG).

Materials and methods: Various baseline and follow-up features were obtained from the CABG data registry, established in 2005 at Tehran Heart Center. After selecting key variables using the random forest method, prediction models were developed using: Logistic Regression (LR), Support Vector Machine (SVM), Naïve Bayes (NB), K-Nearest Neighbors (KNN), Extreme Gradient Boosting (XGBoost), and Random Forest (RF) algorithms. Area Under the Curve (AUC) and other indices were used to assess the performance.

Results: A total of 16,850 patients with isolated CABG (mean age: 67.34 ± 9.67 years) were included. Among them, 16,620 had one-year follow-up, from which 468 died. Eleven features were chosen to train the models. Total ventilation hours and left ventricular ejection fraction were by far the most predictive factors of mortality. All the models had AUC > 0.7 (acceptable performance) for 1-year mortality. Nonetheless, LR (AUC = 0.811) and XGBoost (AUC = 0.792) outperformed NB (AUC = 0.783), RF (AUC = 0.783), SVM (AUC = 0.738), and KNN (AUC = 0.715). The trend was similar for two-to-five-year mortality, with LR demonstrating the highest predictive ability.

Conclusion: Various ML models showed acceptable performance for estimating CABG mortality, with LR illustrating the highest prediction performance. These models can help clinicians make decisions according to the risk of mortality in patients undergoing CABG.

Keywords: coronary artery bypass; feature selection; machine learning; mortality; prediction.