Risk prediction for heterogeneous populations with application to hospital admission prediction

Biometrics. 2018 Jun;74(2):557-565. doi: 10.1111/biom.12769. Epub 2017 Oct 26.

Abstract

This article is motivated by the increasing need to model risk for large hospital and health care systems that provide services to diverse and complex patients. Often, heterogeneity across a population is determined by a set of factors such as chronic conditions. When these stratifying factors result in overlapping subpopulations, it is likely that the covariate effects for the overlapping groups have some similarity. We exploit this similarity by imposing structural constraints on the importance of variables in predicting outcomes such as hospital admission. Our basic assumption is that if a variable is important for a subpopulation with one of the chronic conditions, then it should be important for the subpopulation with both conditions. However, a variable can be important for the subpopulation with two particular chronic conditions but not for the subpopulations of people with just one of those two conditions. This assumption and its generalization to more conditions are reasonable and aid greatly in borrowing strength across the subpopulations. We prove an oracle property for our estimation method and show that even when the structural assumptions are misspecified, our method will still include all of the truly nonzero variables in large samples. We demonstrate impressive performance of our method in extensive numerical studies and on an application in hospital admission prediction and validation for the Medicare population of a large health care provider.
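The structural assumption described in the abstract can be illustrated with a small check on selection patterns. The sketch below is not the authors' estimator and does not perform any penalized fitting. It only tests whether a given set of per-subpopulation coefficient vectors respects the stated hierarchy: a variable that is nonzero in a subpopulation defined by a set of conditions must also be nonzero in every subpopulation defined by a larger set containing those conditions. The function name `hierarchy_violations`, the dictionary layout, the restriction to subpopulations with at least one condition, and the example coefficients are all hypothetical, chosen for illustration.

```python
def hierarchy_violations(coefs, tol=0.0):
    """List violations of the hierarchical selection assumption.

    coefs maps a frozenset of condition names (one subpopulation) to a list
    of coefficients, one per variable, in the same variable order everywhere.
    A violation is a variable that is nonzero in a subpopulation but zero in
    a subpopulation whose condition set strictly contains it.
    Returns a list of (variable_index, smaller_set, larger_set) triples.
    """
    violations = []
    for small, beta_small in coefs.items():
        for large, beta_large in coefs.items():
            if not small < large:  # only compare proper subsets
                continue
            for j, (b_s, b_l) in enumerate(zip(beta_small, beta_large)):
                if abs(b_s) > tol and abs(b_l) <= tol:
                    violations.append((j, small, large))
    return violations


# Hypothetical example with two conditions and three variables.
D, H = "diabetes", "heart_failure"

# Respects the assumption: variable 2 matters only when both conditions are
# present, which the assumption explicitly allows.
consistent = {
    frozenset({D}): [0.8, 0.0, 0.0],
    frozenset({H}): [0.0, 0.5, 0.0],
    frozenset({D, H}): [0.6, 0.4, 0.3],
}

# Breaks the assumption: variable 0 is selected for diabetes alone but
# dropped for the subpopulation with both conditions.
inconsistent = {
    frozenset({D}): [0.8, 0.0, 0.0],
    frozenset({H}): [0.0, 0.5, 0.0],
    frozenset({D, H}): [0.0, 0.4, 0.3],
}

print(hierarchy_violations(consistent))
print(hierarchy_violations(inconsistent))
```

The check is one-directional, matching the abstract: importance propagates from single-condition subpopulations up to the overlapping one, but a variable may be important only in the overlap without implying anything about the single-condition groups.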

Keywords: Heterogeneity; Hierarchical penalization; Risk prediction; Variable selection.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Area Under Curve
  • Biometry / methods*
  • Computer Simulation
  • Diabetes Mellitus
  • Heart Failure
  • Hospitalization / economics
  • Hospitalization / statistics & numerical data
  • Humans
  • Medicare
  • Patient Admission / economics
  • Patient Admission / statistics & numerical data*
  • Population Groups / statistics & numerical data*
  • Pulmonary Disease, Chronic Obstructive
  • Risk
  • United States