Machine learning-based risk prediction of malignant arrhythmia in hospitalized patients with heart failure

ESC Heart Fail. 2021 Dec;8(6):5363-5371. doi: 10.1002/ehf2.13627. Epub 2021 Sep 28.

Abstract

Aims: Predicting the risk of malignant arrhythmias (MA) in hospitalized patients with heart failure (HF) is challenging. Machine learning (ML) can handle a large volume of complex data more effectively than traditional statistical methods. This study explored the feasibility of ML methods for predicting the risk of MA in hospitalized HF patients.

Methods and results: We evaluated the baseline data and MA events of 2794 hospitalized HF patients in the HF cohort in Anhui Province and randomly divided the study population into training and validation sets in a 7:3 ratio. The Lasso-logistic regression, multivariate adaptive regression splines (MARS), classification and regression tree (CART), random forest (RF), and eXtreme gradient boosting (XGBoost) algorithms were used to construct risk prediction models in the training set, and model performance was verified in the validation set. The area under the receiver operating characteristic curve (AUC) and the Brier score were used to evaluate the discrimination and calibration of the models, respectively. The clinical utility of the Lasso-logistic regression model was analysed using decision curve analysis (DCA). The median (Q1, Q3) age of the study population was 70 (61, 77) years, and 39.5% were female. MA events occurred in 117 patients (4.2%) during hospitalization. In the training set (n = 1964), the AUC of the XGBoost model was 0.998 [95% confidence interval (CI) 0.997-1.000], which was higher than that of the other models (all P < 0.001). In the validation set (n = 830), the AUCs of Lasso-logistic model 1 [AUC: 0.867 (95% CI 0.819-0.915)], Lasso-logistic model 2 [AUC: 0.828 (95% CI 0.764-0.892)], the MARS model [AUC: 0.852 (95% CI 0.793-0.910)], the RF model [AUC: 0.804 (95% CI 0.726-0.881)], and the XGBoost model [AUC: 0.864 (95% CI 0.810-0.918)] did not differ significantly (all P > 0.05), and all were higher than that of the CART model [AUC: 0.743 (95% CI 0.661-0.824); all P < 0.05]. Brier scores for all prediction models were less than 0.05. DCA results showed that the Lasso-logistic model had a net clinical benefit. Oral antiarrhythmic drug, left bundle branch block, serum magnesium, d-dimer, and random blood glucose were significant predictors in half or more of the models.
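
The three evaluation quantities named above (AUC for discrimination, Brier score for calibration, and the net benefit that underlies decision curve analysis) can be illustrated with a minimal sketch. This is not the authors' code, and the abstract does not state which software was used. The function names and the four-patient toy data are hypothetical and serve only to show the standard definitions.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen event case receives a
    higher predicted risk than a randomly chosen non-event case (ties = 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one risk threshold: TP/n - FP/n * t / (1 - t).
    A decision curve plots this value over a range of thresholds."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data (1 = MA event, 0 = no event); not from the study.
y_toy = [0, 0, 1, 1]
p_toy = [0.10, 0.40, 0.35, 0.80]

print(auc(y_toy, p_toy))
print(brier_score(y_toy, p_toy))
print(net_benefit(y_toy, p_toy, threshold=0.30))
```

In such a sketch, a higher AUC indicates better discrimination, a lower Brier score indicates predicted risks closer to the observed outcomes, and a positive net benefit at a given threshold indicates that acting on the model is preferable to treating no one at that threshold.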

Conclusions: The findings of the current study suggest that ML models based on the Lasso-logistic regression, MARS, RF, and XGBoost algorithms can effectively predict the risk of MA in hospitalized HF patients. The Lasso-logistic model had better clinical interpretability and ease of use than the other models.

Keywords: Heart failure; Machine learning; Tachycardia, Ventricular; Ventricular fibrillation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arrhythmias, Cardiac / epidemiology
  • Arrhythmias, Cardiac / etiology
  • Female
  • Heart Failure* / complications
  • Heart Failure* / epidemiology
  • Humans
  • Logistic Models
  • Machine Learning*
  • ROC Curve