A prediction model for COPD readmissions: catching up, catching our breath, and improving a national problem

J Community Hosp Intern Med Perspect. 2012 Apr 30;2(1). doi: 10.3402/jchimp.v2i1.9915. Print 2012.

Abstract

Frequent COPD exacerbations have a large impact on morbidity, mortality, and health-care expenditures. The World Health Organization expects COPD and COPD exacerbations to be the third leading cause of death worldwide by 2020. Furthermore, in 2005 it was estimated that COPD exacerbations cost the U.S. health-care system 38 billion dollars. Studies attempting to determine factors related to COPD readmissions are still very limited. Moreover, few have used an organized machine-learning, sensitivity-analysis approach, such as a Random Forest (RF) statistical model, to analyze this problem. This study used the RF machine-learning algorithm to determine factors that predict risk for multiple COPD exacerbations in a single year. This was a retrospective study with a data set of 106 patients. These patients were divided randomly into training (80%) and validation (20%) data sets, 100 times, initially using approximately sixty variables that prior studies had found to be associated with patient readmission for COPD exacerbation. In an interactive manner, an RF model was created using the training data set and validated on the validation data set. Mean area-under-curve (AUC) statistics, sensitivity, specificity, and negative/positive predictive values (NPV, PPV) were calculated for the 100 runs. The following variables were found to be important predictors of patients having at least two COPD exacerbations within one year: employment, body mass index, number of previous surgeries, administration of azithromycin/ceftriaxone/moxifloxacin, and admission albumin level. The mean AUC was 0.72, with a sensitivity of 0.75, specificity of 0.56, PPV of 0.7, and NPV of 0.63. Histograms were used to confirm consistent accuracy. The RF design has consistently demonstrated encouraging results. We expect to validate our results on new patient groups and improve accuracy by increasing our training data set. We hope that identifying patients at risk for frequent readmissions will improve patient outcomes and save valuable hospital resources.

Keywords: COPD; prediction; random forest; readmission.
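
The evaluation procedure described in the abstract (100 random 80%/20% splits, with AUC, sensitivity, specificity, PPV, and NPV averaged over the runs) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the 0.5 decision threshold, the rounding of the split size, and the `fit` callable are all assumptions introduced here. The Random Forest itself is not implemented; `fit` is a hypothetical placeholder that would wrap whatever classifier is used and return a function scoring one patient record.

```python
import random
from statistics import mean

NAN = float("nan")


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV from binary labels (1 = frequent exacerbator)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else NAN

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return NAN
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_holdout(X, y, fit, n_runs=100, train_frac=0.8, threshold=0.5, seed=0):
    """Average hold-out metrics over repeated random train/validation splits.

    `fit(train_X, train_y)` is a hypothetical hook: it must return a function
    mapping one feature row to a risk score in [0, 1].
    """
    rng = random.Random(seed)
    idx = list(range(len(y)))
    runs = []
    for _ in range(n_runs):
        rng.shuffle(idx)
        cut = int(round(train_frac * len(idx)))  # assumed rounding rule
        train, test = idx[:cut], idx[cut:]
        score = fit([X[i] for i in train], [y[i] for i in train])
        scores = [score(X[i]) for i in test]
        truth = [y[i] for i in test]
        metrics = confusion_metrics(truth, [1 if s >= threshold else 0 for s in scores])
        metrics["auc"] = auc(truth, scores)
        runs.append(metrics)

    summary = {}
    for key in runs[0]:
        # Skip runs where a metric was undefined (NaN is the only value != itself).
        defined = [r[key] for r in runs if r[key] == r[key]]
        summary[key] = mean(defined) if defined else NAN
    return summary
```

With 106 patients and these assumptions, each run trains on 85 records and validates on 21. Collecting the per-run values in `runs` rather than only their means would also allow the kind of histogram check of consistency that the abstract mentions.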