Two-stage g-computation: Evaluating Treatment and Intervention Impacts in Observational Cohorts When Exposure Information Is Partly Missing

Epidemiology. 2020 Sep;31(5):695-703. doi: 10.1097/EDE.0000000000001233.

Abstract

Illustrations of the g-computation algorithm to evaluate population average treatment and intervention effects have been predominantly implemented in settings with complete exposure information. Thus, worked examples of approaches to handle missing data in this causal framework are needed to facilitate wider use of these estimators. We illustrate two-stage g-computation estimators that leverage partially observed information on the full study sample and complete exposure information on a subset to estimate causal effects. In a hypothetical cohort of 1,623 human immunodeficiency virus (HIV)-positive women, 30% of whom had complete opioid prescription information, we illustrate a two-stage extrapolation g-computation estimator for the average treatment effect of shorter or longer duration opioid prescriptions; we further illustrate two-stage inverse probability weighting and imputation g-computation estimators for the average intervention effect of shortening the duration of prescriptions relative to the status quo. Two-stage g-computation estimators approximated the true risk differences for the population average treatment and intervention effects, while g-computation fit to the subset of complete cases was biased. In 10,000 Monte Carlo simulations, two-stage approaches considerably reduced bias and mean squared error and improved the coverage of 95% confidence limits. Although missing data threaten validity and precision, two-stage g-computation designs offer principled approaches to handling missing information.
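The idea behind a two-stage inverse probability weighted g-computation estimator can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the authors' implementation: the data-generating values, the single binary confounder `w`, and the function names (`simulate_cohort`, `two_stage_ipw_effect`, `complete_case_effect`) are all hypothetical. It assumes the outcome and confounder are observed for everyone, and that whether exposure is observed depends only on the confounder and the outcome. Stage 1 estimates the probability of having complete exposure information in the full sample; stage 2 fits a weighted, stratified outcome model among complete cases and standardizes it to the full-sample confounder distribution.

```python
import numpy as np

def simulate_cohort(n, rng):
    """Simulate a hypothetical cohort (all parameter values are illustrative)."""
    w = rng.binomial(1, 0.4, n)                      # binary confounder
    a = rng.binomial(1, 0.3 + 0.3 * w)               # 1 = longer prescription
    y = rng.binomial(1, 0.10 + 0.15 * a + 0.10 * w)  # binary outcome
    # r = 1 when exposure is observed; depends on w and y, about 30% overall
    r = rng.binomial(1, 0.2 + 0.3 * y + 0.1 * w)
    return w, a, y, r

def two_stage_ipw_effect(w, a, y, r):
    """Risk if all prescriptions were shorter (a = 0) minus status quo risk."""
    # Stage 1: estimate P(r = 1 | w, y) in the full sample, one value per stratum
    weights = np.ones(len(y))
    for wv in (0, 1):
        for yv in (0, 1):
            stratum = (w == wv) & (y == yv)
            weights[stratum] = 1.0 / r[stratum].mean()
    # Stage 2: weighted outcome risk among complete cases with a = 0,
    # standardized to the confounder distribution of the full sample
    risk_short = 0.0
    for wv in (0, 1):
        cell = (r == 1) & (a == 0) & (w == wv)
        cell_risk = np.sum(y[cell] * weights[cell]) / np.sum(weights[cell])
        risk_short += np.mean(w == wv) * cell_risk
    # The status quo risk uses the outcome, which is observed for everyone
    return risk_short - y.mean()

def complete_case_effect(w, a, y, r):
    """Same contrast, but using only records with observed exposure."""
    keep = r == 1
    w, a, y = w[keep], a[keep], y[keep]
    risk_short = 0.0
    for wv in (0, 1):
        cell = (a == 0) & (w == wv)
        risk_short += np.mean(w == wv) * y[cell].mean()
    return risk_short - y.mean()

rng = np.random.default_rng(2020)
w, a, y, r = simulate_cohort(200_000, rng)
# Under the illustrative values above, the true effect is 0.14 - 0.203 = -0.063
print("two-stage IPW :", round(two_stage_ipw_effect(w, a, y, r), 3))
print("complete case :", round(complete_case_effect(w, a, y, r), 3))
```

Because exposure is more likely to be recorded for people who experienced the outcome in this simulated setup, the complete-case contrast is distorted, while reweighting the complete cases by the inverse of their estimated probability of being observed restores the full-sample distribution. A real analysis would typically replace the stratified means with regression models and obtain confidence limits by bootstrap or a robust variance estimator.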

MeSH terms

  • Bias
  • Causality
  • Cohort Studies*
  • Data Interpretation, Statistical*
  • Humans
  • Monte Carlo Method
  • Observational Studies as Topic*
  • Probability
  • Treatment Outcome