Common sampling and modeling approaches to analyzing readmission risk that ignore clustering produce misleading results

BMC Med Res Methodol. 2020 Nov 25;20(1):281. doi: 10.1186/s12874-020-01162-0.

Abstract

Background: There is little consensus on how to sample hospitalizations and analyze multiple variables to model readmission risk. The purpose of this study was to compare readmission rates and the accuracy of predictive models based on different sampling and multivariable modeling approaches.

Methods: We conducted a retrospective cohort study of 17,284 adult diabetes patients with 44,203 discharges from an urban academic medical center between 1/1/2004 and 12/31/2012. Models for all-cause 30-day readmission were developed using four strategies: logistic regression using the first discharge per patient (LR-first), logistic regression using all discharges (LR-all), generalized estimating equations (GEE) using all discharges, and cluster-weighted generalized estimating equations (CWGEE) using all discharges. Multiple sets of models were developed and internally validated across a range of sample sizes.
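
As a rough illustration of how the samples behind these strategies differ, the Python sketch below builds a first-discharge-per-patient sample and an all-discharges sample from a few made-up records. It also computes inverse-cluster-size weights, which is how cluster weighting is commonly described; the abstract does not give the weighting details, so that part is an assumption. The record layout, the function names, and the data are all hypothetical and are not taken from the study.

```python
from collections import Counter

# Hypothetical toy records: (patient_id, discharge_date, readmitted_within_30_days).
# These values are invented for illustration only.
discharges = [
    ("B", "2010-03-02", 1),
    ("A", "2010-01-05", 0),
    ("B", "2010-02-11", 1),
    ("C", "2010-09-30", 0),
    ("B", "2010-03-25", 0),
    ("C", "2010-04-19", 0),
]


def first_discharge_per_patient(records):
    """Keep only each patient's earliest discharge (an LR-first style sample)."""
    first = {}
    # ISO-formatted date strings sort chronologically.
    for rec in sorted(records, key=lambda r: r[1]):
        first.setdefault(rec[0], rec)
    return list(first.values())


def inverse_cluster_size_weights(records):
    """Weight each discharge by 1 / (number of discharges for that patient).

    Assumption: this mirrors the usual description of cluster weighting,
    where every patient contributes a total weight of 1.
    """
    sizes = Counter(rec[0] for rec in records)
    return [1.0 / sizes[rec[0]] for rec in records]


lr_first_sample = first_discharge_per_patient(discharges)  # one row per patient
all_sample = discharges                                    # every discharge
weights = inverse_cluster_size_weights(all_sample)         # one weight per discharge

print(len(lr_first_sample), len(all_sample))
print(weights)
```

With these weights, a patient with many discharges counts no more in total than a patient with one, whereas the unweighted all-discharges sample lets frequently admitted patients dominate. Fitting the actual regression or GEE models would need a statistics library and is not shown here.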

Results: The readmission rate was 10.2% among first discharges and 20.3% among all discharges, revealing that sampling only first discharges underestimates a population's readmission rate. The number of discharges was highly correlated with the number of readmissions (r = 0.87, P < 0.001). Accounting for clustering with GEE and CWGEE yielded more conservative estimates of model performance than LR-all. LR-first produced falsely optimistic Brier scores. Model performance was unstable in samples smaller than 6,000 to 8,000 discharges and stable in larger samples. GEE and CWGEE performed better in larger samples than in smaller samples.
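
The gap between the two readmission rates can be reproduced in miniature. The Python sketch below uses invented records in which patients with more discharges also have more readmissions, then compares the rate among first discharges with the rate among all discharges. It also computes a Brier score (the mean squared difference between predicted probability and observed outcome) for a constant prediction, to show that the score depends on the outcome rate of the sample it is evaluated on. Whether that is the mechanism behind the optimistic LR-first scores in the study is not stated in the abstract, so treat it as an assumption. All names and data are hypothetical.

```python
# Hypothetical toy records: (patient_id, discharge_order, readmitted_within_30_days).
discharges = [
    ("A", 1, 0),
    ("B", 1, 0), ("B", 2, 1), ("B", 3, 1), ("B", 4, 0),
    ("C", 1, 0), ("C", 2, 0),
    ("D", 1, 1), ("D", 2, 1), ("D", 3, 0),
]


def readmission_rate(records):
    """Share of discharges followed by a 30-day readmission."""
    return sum(rec[2] for rec in records) / len(records)


def brier_score(predicted, observed):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(observed)


first_only = [rec for rec in discharges if rec[1] == 1]

rate_first = readmission_rate(first_only)  # 1 readmission in 4 first discharges
rate_all = readmission_rate(discharges)    # 4 readmissions in 10 discharges

# An uninformative model that always predicts the sample's own base rate.
brier_first = brier_score([rate_first] * len(first_only),
                          [rec[2] for rec in first_only])
brier_all = brier_score([rate_all] * len(discharges),
                        [rec[2] for rec in discharges])

print(f"rate, first discharges: {rate_first:.2f}")
print(f"rate, all discharges:   {rate_all:.2f}")
print(f"Brier, first discharges: {brier_first:.4f}")
print(f"Brier, all discharges:   {brier_all:.4f}")
```

In this toy data the first-discharge sample has both a lower readmission rate and a lower Brier score, even though the constant prediction carries no information about individual patients.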

Conclusions: Hospital readmission risk models should be based on all discharges as opposed to just the first discharge per patient and utilize methods that account for clustered data.

Keywords: Clustering; Logistic models; Patient readmission; Predictive modeling; Sampling strategies.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Cluster Analysis
  • Hospitalization
  • Humans
  • Patient Discharge*
  • Patient Readmission*
  • Retrospective Studies