On the analysis of two-phase designs in cluster-correlated data settings

Stat Med. 2019 Oct 15;38(23):4611-4624. doi: 10.1002/sim.8321. Epub 2019 Jul 29.

Abstract

In public health research, information that is readily available may be insufficient to address the primary question(s) of interest. One cost-efficient way forward, especially in resource-limited settings, is to conduct a two-phase study in which the population is initially stratified, at phase I, by the outcome and/or some categorical risk factor(s). At phase II, detailed covariate data are ascertained on a subsample within each phase I stratum. While analysis methods for two-phase designs are well established, they have focused exclusively on settings in which participants are assumed to be independent. As such, when participants are naturally clustered (eg, patients within clinics), these methods may yield invalid inference. To address this, we develop a novel analysis approach based on inverse-probability weighting that permits researchers to specify a working covariance structure, appropriately accounts for the sampling design, and ensures valid inference via a robust sandwich estimator, for which a closed-form expression is provided. To enhance statistical efficiency, we propose a calibrated inverse-probability weighting estimator that makes use of information available at phase I but not used in the design. In addition to describing the technique, we provide practical guidance for the cluster-correlated data settings that we consider. A comprehensive simulation study is conducted to evaluate small-sample operating characteristics, including the impact of using naïve methods that ignore correlation due to clustering, as well as to investigate design considerations. Finally, the methods are illustrated using data from a one-time survey of the national antiretroviral treatment program in Malawi.
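The general idea of inverse-probability weighting with a cluster-robust sandwich variance can be sketched as follows. This is a minimal illustration, not the estimator derived in the paper: it assumes a linear mean model, an independence working covariance, and weights treated as fixed and known, and it makes no attempt at the calibration step or at the closed-form design-based variance mentioned in the abstract. The function names `inverse_sampling_weights` and `ipw_independence_fit` are hypothetical.

```python
import numpy as np


def inverse_sampling_weights(strata, sampled):
    """Weight N_k / n_k for each phase II unit in phase I stratum k.

    Assumption: simple random sampling within each stratum.
    Units not sampled at phase II receive weight 0.
    """
    strata = np.asarray(strata)
    sampled = np.asarray(sampled, dtype=bool)
    weights = np.zeros(strata.shape[0])
    for k in np.unique(strata):
        in_k = strata == k
        n_k = np.sum(in_k & sampled)
        if n_k == 0:
            raise ValueError(f"no phase II units sampled in stratum {k!r}")
        weights[in_k & sampled] = np.sum(in_k) / n_k
    return weights


def ipw_independence_fit(X, y, weights, cluster_ids):
    """Solve sum_i w_i x_i (y_i - x_i' beta) = 0 and return (beta, cov).

    cov is a sandwich estimate whose "meat" sums the weighted scores
    within each cluster before taking outer products, so that
    within-cluster correlation is not ignored.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    cluster_ids = np.asarray(cluster_ids)

    # "Bread": weighted Gram matrix of the design.
    XtWX = X.T @ (w[:, None] * X)
    beta = np.linalg.solve(XtWX, X.T @ (w * y))

    # Per-observation weighted score contributions.
    scores = (w * (y - X @ beta))[:, None] * X

    # "Meat": outer products of cluster-level score totals.
    meat = np.zeros_like(XtWX)
    for c in np.unique(cluster_ids):
        s_c = scores[cluster_ids == c].sum(axis=0)
        meat += np.outer(s_c, s_c)

    bread_inv = np.linalg.inv(XtWX)
    cov = bread_inv @ meat @ bread_inv
    return beta, cov
```

In use, one would compute the weights on the full phase I sample, keep only the rows with positive weight, and pass those rows (with, say, clinic identifiers as `cluster_ids`) to `ipw_independence_fit`. Replacing the cluster-level sums in the meat with per-observation outer products gives the naïve variance that ignores clustering.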

Keywords: calibration; generalized estimating equations; inverse-probability weighting; two-phase study.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Anti-Retroviral Agents / therapeutic use
  • Clinical Trials as Topic
  • Cluster Analysis*
  • Computer Simulation
  • HIV Infections / drug therapy
  • Humans
  • Malawi
  • Models, Statistical*
  • National Health Programs
  • Research Design*
  • Risk Factors

Substances

  • Anti-Retroviral Agents