Multilevel Regression and Poststratification: A Modeling Approach to Estimating Population Quantities From Highly Selected Survey Samples

Am J Epidemiol. 2018 Aug 1;187(8):1780-1790. doi: 10.1093/aje/kwy070.

Abstract

Investigators in large-scale population health studies face increasing difficulties in recruiting representative samples of participants. Nonparticipation, item nonresponse, and, where follow-up is involved, attrition often result in highly selected samples even in well-designed studies. We aimed to assess the potential value of multilevel regression and poststratification, a method previously used to successfully forecast US presidential election results, for addressing biases due to nonparticipation in the estimation of population descriptive quantities in large cohort studies. The investigation was performed as an extensive case study using baseline data (2013-2014) from a large national health survey of Australian males (Ten to Men: The Australian Longitudinal Study on Male Health). Analyses were performed in the open-source Bayesian computational package RStan. Compared with estimates obtained using conventional survey sampling weights, estimates from multilevel regression and poststratification showed greater consistency and precision across population subsets of varying sizes. Estimates for smaller population subsets exhibited a greater degree of shrinkage towards the national estimate. Multilevel regression and poststratification provides a promising analytical approach to addressing potential participation bias in the estimation of population descriptive quantities from large-scale health surveys and cohort studies.
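
The two ideas named in the abstract, poststratification and shrinkage of small-subset estimates towards the overall estimate, can be illustrated with a minimal sketch. This is not the authors' RStan model: the cell labels, counts, variances, and the function names `poststratify` and `partial_pool` are all hypothetical, and the shrinkage formula assumes a simple normal-normal model with known variances rather than a fitted multilevel regression.

```python
# Illustrative sketch only. All cell labels, counts, and estimates are
# hypothetical; none are taken from the Ten to Men study.

def poststratify(cell_estimates, cell_population_counts):
    """Combine cell-level estimates into one population estimate.

    Each cell's estimate is weighted by the known population count
    of that cell (for example, from census data).
    """
    total = sum(cell_population_counts[c] for c in cell_estimates)
    weighted = sum(
        cell_estimates[c] * cell_population_counts[c] for c in cell_estimates
    )
    return weighted / total


def partial_pool(cell_mean, n, overall_mean, within_var, between_var):
    """Shrink a cell's sample mean towards the overall mean.

    Assumes a normal-normal model with known variances: the weight on
    the cell's own data grows with its sample size n, so small cells
    are pulled more strongly towards the overall mean.
    """
    data_precision = n / within_var
    prior_precision = 1.0 / between_var
    w = data_precision / (data_precision + prior_precision)
    return w * cell_mean + (1.0 - w) * overall_mean


# Hypothetical two-cell population.
estimates = {"cell_a": 0.30, "cell_b": 0.20}
counts = {"cell_a": 4000, "cell_b": 6000}
national = poststratify(estimates, counts)  # (0.30*4000 + 0.20*6000) / 10000

# Same raw cell mean, different sample sizes: the smaller sample shrinks more.
small = partial_pool(0.50, n=5, overall_mean=national,
                     within_var=0.25, between_var=0.01)
large = partial_pool(0.50, n=500, overall_mean=national,
                     within_var=0.25, between_var=0.01)
print(national, small, large)
```

In this sketch the estimate based on 5 respondents ends up much closer to the overall value than the one based on 500, which mirrors the pattern the abstract reports for smaller population subsets.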

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Australia
  • Bayes Theorem
  • Child
  • Health Surveys*
  • Humans
  • Longitudinal Studies
  • Male
  • Middle Aged
  • Models, Statistical*
  • Monte Carlo Method
  • Research Design*
  • Selection Bias