Machine Learning Model for Predicting Mortality Risk in Patients With Complex Chronic Conditions: Retrospective Analysis

Online J Public Health Inform. 2023 Dec 28:15:e52782. doi: 10.2196/52782. eCollection 2023.

Abstract

Background: The health care system is undergoing a shift toward a more patient-centered approach for individuals with chronic and complex conditions, which presents a series of challenges, such as predicting hospital needs and optimizing resources. At the same time, the exponential increase in health data availability has made it possible to apply advanced statistics and artificial intelligence techniques to develop decision-support systems and improve resource planning, diagnosis, and patient screening. These methods are key to automating the analysis of large volumes of medical data and reducing professional workloads.

Objective: This article presents a machine learning model and a case study in a cohort of patients with highly complex conditions. The objective was to predict mortality within the following 4 years and early mortality within 6 months following diagnosis. The method used easily accessible variables and health care resource utilization information.

Methods: Six classification models were implemented and evaluated using a stratified cross-validation strategy with k=10 and a 70/30 train-test split, and the best-performing algorithm was selected. The evaluation metrics were accuracy, recall, precision, F1-score, and area under the receiver operating characteristic curve (AUROC).
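The evaluation procedure described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the function names `stratified_folds` and `classification_metrics`, the 0.5 decision threshold, and the coding of death as label 1 are all assumptions made for illustration. The first function assigns samples to k folds while preserving the class ratio; the second computes the five reported metrics from true labels and predicted scores.

```python
import random
from typing import Dict, List, Sequence


def stratified_folds(y: Sequence[int], k: int = 10, seed: int = 0) -> List[int]:
    """Assign each sample a fold id in [0, k), keeping class ratios similar per fold."""
    rng = random.Random(seed)
    fold_of = [0] * len(y)
    for label in set(y):
        # Shuffle the indices of one class, then deal them out round-robin.
        idx = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            fold_of[i] = pos % k
    return fold_of


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, recall, precision, F1, and AUROC for binary labels (1 = died, assumed)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    # AUROC as the probability that a random positive outscores a random
    # negative, counting ties as one half (pairwise definition, O(n^2)).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
        auroc = wins / (len(pos) * len(neg))
    else:
        auroc = float("nan")

    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "auroc": auroc,
    }
```

In a setup like the one described, a classifier would be fitted on the training folds and `classification_metrics` applied to its predicted scores on each held-out fold, with the per-fold values then summarized. The pairwise AUROC is quadratic in sample size, so a rank-based implementation would be preferable on large cohorts.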

Results: The model predicted patient death with 87% accuracy, 87% recall, 82% precision, an F1-score of 84%, and an area under the curve (AUC) of 0.88 using the best model, the Extreme Gradient Boosting (XGBoost) classifier. The results were worse when predicting premature death (within the following 6 months), with 83% accuracy (recall=55%, precision=64%, F1-score=57%, and AUC=0.88) using the Gradient Boosting (GRBoost) classifier.
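As a reading aid, the F1-score is the harmonic mean of precision and recall. The worked example below plugs in the precision and recall reported for the 4-year XGBoost model; the symbols P and R are notation introduced here, not taken from the article.

```latex
F_1 = \frac{2 \, P \, R}{P + R}
    = \frac{2 \times 0.82 \times 0.87}{0.82 + 0.87}
    = \frac{1.4268}{1.69}
    \approx 0.84
```

This identity is exact only for metrics computed from a single confusion matrix. When precision, recall, and F1 are each averaged across folds or classes, the averaged F1 need not equal the harmonic mean of the averaged precision and recall. The abstract does not state how the metrics were aggregated.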

Conclusions: This study shows encouraging results in predicting mortality among patients with complex chronic conditions. The variables used are easily accessible, and the inclusion of the patient's health care resource utilization information, which has not been used by current state-of-the-art approaches, shows promising predictive power. The proposed prediction model is designed to efficiently identify cases that need customized care and to help health care providers anticipate the demand for critical resources.

Keywords: algorithm; algorithms; artificial intelligence; chronic; chronicity; classification; complex; complexity; death; health data; machine learning; mortality; mortality prediction; mortality risk; predict; prediction; predictive; risk prediction.