A machine learning model to predict heart failure readmission: toward optimal feature set

Front Artif Intell. 2024 Feb 21;7:1363226. doi: 10.3389/frai.2024.1363226. eCollection 2024.

Abstract

Background: Hospital readmissions for heart failure patients remain high despite efforts to reduce them. Predictive modeling using big data provides opportunities to identify high-risk patients and inform care management. However, the size and complexity of large datasets can limit model performance.

Objective: This study aimed to develop a machine learning-based prediction model that leverages a nationwide hospitalization database to predict 30-day heart failure readmissions. A second objective was to identify the feature set that yields the highest AUC value in the prediction model.

Material and methods: Heart failure patient data were extracted from the 2020 Nationwide Readmissions Database. A heuristic feature selection process built up logistic regression and random forest models incrementally, adding at each step the predictor that yielded the largest increase in the AUC metric. Model performance was evaluated using accuracy, sensitivity, specificity, and AUC.
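The incremental selection described above can be sketched as a greedy forward search. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`auc`, `forward_select`, `evaluate_subset`), the toy columns (`age_band`, `ckd`, `noise`), and the stopping rule (`min_gain`) are all hypothetical. The scorer here simply sums the selected columns as a stand-in for a fitted model. In practice, `evaluate_subset` would fit a logistic regression or random forest and return the AUC on held-out data.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def forward_select(
    candidates: Sequence[str],
    evaluate: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> Tuple[List[str], List[Tuple[str, float]]]:
    """Greedily add the feature that raises AUC the most.

    Stops when no remaining feature improves AUC by more than min_gain
    (an assumed stopping rule; the source does not specify one).
    """
    selected: List[str] = []
    history: List[Tuple[str, float]] = []
    remaining = list(candidates)
    best = 0.5  # AUC of an uninformative model
    while remaining:
        top_score, top_feature = max(
            ((evaluate(selected + [f]), f) for f in remaining),
            key=lambda pair: pair[0],
        )
        if top_score - best <= min_gain:
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
        history.append((top_feature, top_score))
    return selected, history


# Synthetic toy data, for illustration only (1 = readmitted within 30 days).
LABELS = [1, 1, 1, 0, 0, 0]
COLUMNS: Dict[str, List[int]] = {
    "age_band": [3, 2, 3, 1, 2, 1],
    "ckd": [1, 1, 0, 0, 0, 1],
    "noise": [1, 0, 1, 1, 0, 1],
}


def evaluate_subset(features: List[str]) -> float:
    """Stand-in scorer: sum the selected columns, then compute AUC."""
    scores = [sum(COLUMNS[f][i] for f in features) for i in range(len(LABELS))]
    return auc(LABELS, scores)


selected, history = forward_select(list(COLUMNS), evaluate_subset)
for feature, score in history:
    print(f"added {feature}: AUC = {score:.3f}")
```

On this toy data the search keeps `age_band` and `ckd` and stops before adding `noise`, because adding `noise` would lower the AUC. A greedy search of this kind evaluates far fewer subsets than an exhaustive one, at the cost of possibly missing the globally best combination.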

Results: A total of 566,019 discharges with a heart failure diagnosis were identified. The readmission rate was 8.9% for same-cause and 20.6% for all-cause diagnoses. Random forest outperformed logistic regression, achieving AUCs of 0.607 and 0.576 for same-cause and all-cause readmissions, respectively. Heuristic feature selection identified optimal feature sets of 20 and 22 variables from pools of 30 and 31 features for the same-cause and all-cause datasets, respectively. Key predictors included age, payment method, chronic kidney disease, disposition status, number of ICD-10-CM diagnoses, and post-care encounters.
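With a readmission rate near 8.9%, accuracy alone can be misleading, which is one reason to report sensitivity, specificity, and AUC alongside it. The sketch below uses a synthetic cohort of 1,000 discharges with 89 readmissions (not study data) and a hypothetical helper, `classification_metrics`, to show that a classifier that never predicts readmission still reaches about 91% accuracy while detecting no readmissions.

```python
from typing import Dict, Sequence


def classification_metrics(
    labels: Sequence[int], predictions: Sequence[int]
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from binary labels and predictions."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    nan = float("nan")  # returned when a rate is undefined
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Synthetic cohort at roughly the same-cause readmission rate (8.9%).
labels = [1] * 89 + [0] * 911
always_negative = [0] * len(labels)  # classifier that never flags a patient

metrics = classification_metrics(labels, always_negative)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```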

Conclusion: The proposed model attained discrimination comparable to prior analyses that used smaller datasets. However, reducing the sample size improved performance, which suggests that the complexity of big data can hinder modeling. Improved techniques such as heuristic feature selection enabled effective use of the nationwide data. This study provides meaningful insights into predictive modeling methodologies and influential features for forecasting heart failure readmissions.

Keywords: clinical decision making; feature selection; heart failure; machine learning; readmission.

Grants and funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.