Exploring relationships between in-hospital mortality and hospital case volume using random forest: results of a cohort study based on a nationwide sample of German hospitals, 2016-2018

BMC Health Serv Res. 2022 Jan 2;22(1):1. doi: 10.1186/s12913-021-07414-z.

Abstract

Background: The relationship between in-hospital mortality and case volume has been investigated for various patient groups in many empirical studies, with mixed results. Those studies typically relied on (semi-)parametric statistical models such as logistic regression, which impose strong assumptions on the functional form of the relationship between outcome and case volume. The aim of this study was to determine associations between in-hospital mortality and hospital case volume using random forest, a flexible, nonparametric machine learning method.

Methods: We analyzed a sample of 753,895 hospital cases treated in 233 German hospitals over the period 2016-2018. The sample comprised six patient groups: stroke, myocardial infarction, ventilation > 24 h, COPD, pneumonia, and colorectal cancer with colorectal resection. For each of the six patient groups, we derived partial dependence functions from random forest estimates, capturing the relationship between the patient-specific probability of in-hospital death and hospital case volume.
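
A partial dependence function averages a model's predictions over all patients while one feature, here hospital case volume, is held at a fixed value. The following is a minimal sketch of that averaging step only, not the authors' implementation. The record fields (`age`, `case_volume`), the grid values, and `toy_predict` are hypothetical stand-ins; `toy_predict` takes the place of a fitted random forest's predicted probability of in-hospital death, and its numbers are invented for illustration, not taken from the study.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Patient = Dict[str, float]  # hypothetical patient record: feature name -> value


def partial_dependence(
    predict: Callable[[Patient], float],
    patients: Sequence[Patient],
    feature: str,
    grid: Sequence[float],
) -> List[Tuple[float, float]]:
    """Return (grid value, mean prediction) pairs for one feature.

    For each grid value, every patient's `feature` is overwritten with that
    value (on a copy), the model is evaluated, and predictions are averaged.
    """
    curve = []
    for value in grid:
        preds = [predict({**p, feature: value}) for p in patients]
        curve.append((value, mean(preds)))
    return curve


def toy_predict(p: Patient) -> float:
    """Hypothetical stand-in for a fitted model's predicted death probability."""
    base = 0.05 + 0.002 * (p["age"] - 60)
    # Assumed, purely illustrative: extra risk below a small-volume threshold.
    volume_effect = 0.10 if p["case_volume"] < 20 else 0.0
    return min(max(base + volume_effect, 0.0), 1.0)


# Invented example records, not study data.
patients = [
    {"age": 60, "case_volume": 10},
    {"age": 70, "case_volume": 150},
    {"age": 80, "case_volume": 400},
]

pd_curve = partial_dependence(toy_predict, patients, "case_volume", [10, 50, 200])
for volume, avg_prob in pd_curve:
    print(f"case volume {volume}: mean predicted probability {avg_prob:.3f}")
```

Because the averaging never assumes a functional form, the resulting curve can be nonlinear or nonmonotonic if the underlying model's predictions are. In practice `predict` would wrap a model fitted to real data.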

Results: Across all patient groups, the smallest hospital volumes were consistently related to the highest predicted probabilities of in-hospital death. We found strong relationships between in-hospital mortality and hospital case volume for hospitals treating a (very) small number of cases. Slightly higher case volumes were associated with substantially lower mortality. The estimated relationships between in-hospital mortality and case volume were nonlinear and nonmonotonic.

Conclusion: Our analysis revealed strong relationships between in-hospital mortality and hospital case volume in hospitals treating a small number of cases. The nonlinearity and nonmonotonicity of the estimated relationships indicate that studies applying conventional statistical approaches such as logistic regression should account for these features of the relationship adequately.

Keywords: Cohort study; Hospital mortality; Nonparametric modelling; Random Forest; Risk factors; Volume-outcome relationship.

MeSH terms

  • Cohort Studies
  • Hospital Mortality
  • Hospitals*
  • Humans
  • Logistic Models
  • Models, Statistical*