Machine learning algorithms for predicting mortality after coronary artery bypass grafting

Amirmohammad Khalaji; Amir Hossein Behnoush; Mana Jameie; Ali Sharifi; Ali Sheikhy; Aida Fallahzadeh; Saeed Sadeghian; Mina Pashang; Jamshid Bagheri; Seyed Hossein Ahmadi Tafti; Kaveh Hosseini

doi:10.3389/fcvm.2022.977747

Front Cardiovasc Med. 2022 Aug 24;9:977747. doi: 10.3389/fcvm.2022.977747. eCollection 2022.

Authors

Amirmohammad Khalaji¹,²,³, Amir Hossein Behnoush¹,²,³, Mana Jameie¹,³,⁴, Ali Sharifi⁵, Ali Sheikhy¹,³,⁴, Aida Fallahzadeh¹,³,⁴, Saeed Sadeghian¹,², Mina Pashang¹,², Jamshid Bagheri¹, Seyed Hossein Ahmadi Tafti¹, Kaveh Hosseini¹,³

Affiliations

¹ Tehran Heart Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran.
² School of Medicine, Tehran University of Medical Sciences, Tehran, Iran.
³ Cardiac Primary Prevention Research Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran.
⁴ Non-communicable Diseases Research Center, Endocrinology and Metabolism Population Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran.
⁵ Faculty of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran.

Abstract

Background: As the era of big data analytics unfolds, machine learning (ML) might be a promising tool for predicting clinical outcomes. This study aimed to evaluate the predictive ability of ML models for estimating mortality after coronary artery bypass grafting (CABG).

Materials and methods: Various baseline and follow-up features were obtained from the CABG data registry, established in 2005 at Tehran Heart Center. After key variables were selected using the random forest method, prediction models were developed using the Logistic Regression (LR), Support Vector Machine (SVM), Naïve Bayes (NB), K-Nearest Neighbors (KNN), Extreme Gradient Boosting (XGBoost), and Random Forest (RF) algorithms. Area Under the Curve (AUC) and other indices were used to assess model performance.

Results: A total of 16,850 patients with isolated CABG (mean age: 67.34 ± 9.67 years) were included. Among them, 16,620 had one-year follow-up, of whom 468 died. Eleven features were chosen to train the models. Total ventilation hours and left ventricular ejection fraction were by far the strongest predictors of mortality. All the models had AUC > 0.7 (acceptable performance) for 1-year mortality. Nonetheless, LR (AUC = 0.811) and XGBoost (AUC = 0.792) outperformed NB (AUC = 0.783), RF (AUC = 0.783), SVM (AUC = 0.738), and KNN (AUC = 0.715). The trend was similar for two-to-five-year mortality, with LR demonstrating the highest predictive ability.
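
To make the AUC values above easier to interpret, the following is a minimal sketch of how an AUC can be computed from predicted risk scores. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the registry. It uses the standard pairwise interpretation of AUC, namely the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half.

```python
from itertools import product


def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive case (label 1, e.g. death) has the higher score.
    Tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption for illustration only):
# 1 = died within follow-up, 0 = survived; scores are predicted risks.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.25, 0.40, 0.60, 0.90]

auc = roc_auc(labels, scores)
print(round(auc, 3))  # 13 of 15 pairs are ranked correctly -> 0.867
```

Under this reading, an AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking. The pairwise loop is quadratic in the number of patients, so for a cohort of this size a rank-based or library implementation would be the more practical choice.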

Conclusion: Various ML models showed acceptable performance for estimating CABG mortality, with LR showing the highest predictive performance. These models can help clinicians make decisions according to the risk of mortality in patients undergoing CABG.

Keywords: coronary artery bypass; feature selection; machine learning; mortality; prediction.