Two-stage g-computation: Evaluating Treatment and Intervention Impacts in Observational Cohorts When Exposure Information Is Partly Missing

Tiffany L Breger; Jessie K Edwards; Stephen R Cole; Daniel Westreich; Brian W Pence; Adaora A Adimora

doi:10.1097/EDE.0000000000001233

Epidemiology. 2020 Sep;31(5):695-703. doi: 10.1097/EDE.0000000000001233.

Authors

Tiffany L Breger¹, Jessie K Edwards¹, Stephen R Cole¹, Daniel Westreich¹, Brian W Pence¹, Adaora A Adimora¹,²

Affiliations

¹ From the Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC.
² Department of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC.

Abstract

Illustrations of the g-computation algorithm to evaluate population average treatment and intervention effects have been predominantly implemented in settings with complete exposure information. Thus, worked examples of approaches to handle missing data in this causal framework are needed to facilitate wider use of these estimators. We illustrate two-stage g-computation estimators that leverage partially observed information on the full study sample and complete exposure information on a subset to estimate causal effects. In a hypothetical cohort of 1,623 human immunodeficiency virus (HIV)-positive women with 30% complete opioid prescription information, we illustrate a two-stage extrapolation g-computation estimator for the average treatment effect of shorter or longer duration opioid prescriptions; we further illustrate two-stage inverse probability weighting and imputation g-computation estimators for the average intervention effect of shortening the duration of prescriptions relative to the status quo. Two-stage g-computation estimators approximated the true risk differences for the population average treatment and intervention effects while g-computation fit to the subset of complete cases was biased. In 10,000 Monte Carlo simulations, two-stage approaches considerably reduced bias and mean squared error and improved the coverage of 95% confidence limits. Although missing data threaten validity and precision, two-stage g-computation designs offer principled approaches to handling missing information.
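The general idea of borrowing information from the full sample while modeling the outcome only where exposure is observed can be sketched with a small nonparametric example. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a single categorical covariate `z`, a binary exposure `x` (1 = shorter prescription, 0 = longer, `None` = missing), a binary outcome `y`, and that exposure is missing at random given `z`. The function name `standardized_risk`, the helper `rows`, and the toy records are hypothetical.

```python
from collections import defaultdict


def standardized_risk(records, exposure_level):
    """Risk had everyone received `exposure_level`, standardized over z.

    Assumption: exposure is missing at random given z, so stratum-specific
    risks learned where exposure is observed apply to the whole sample.
    """
    # Stage 1: covariate distribution from ALL records passed in,
    # including those whose exposure is missing.
    z_counts = defaultdict(int)
    for r in records:
        z_counts[r["z"]] += 1
    n = len(records)

    # Stage 2: stratum-specific outcome risks, using only records whose
    # exposure is observed and equals the level of interest.
    events = defaultdict(int)
    totals = defaultdict(int)
    for r in records:
        if r["x"] == exposure_level:  # a missing exposure (None) never matches
            totals[r["z"]] += 1
            events[r["z"]] += r["y"]

    risk = 0.0
    for z, count in z_counts.items():
        if totals[z] == 0:
            raise ValueError(f"no observed exposure={exposure_level} in stratum z={z}")
        risk += (count / n) * (events[z] / totals[z])
    return risk


def rows(z, x, y, k):
    """Hypothetical helper: k identical toy records."""
    return [{"z": z, "x": x, "y": y} for _ in range(k)]


# Toy data (fabricated for illustration only; not from the study).
records = (
    rows(0, 1, 1, 1) + rows(0, 1, 0, 3)    # z=0, shorter duration
    + rows(0, 0, 1, 2) + rows(0, 0, 0, 2)  # z=0, longer duration
    + rows(0, None, 0, 2)                  # z=0, exposure missing
    + rows(1, 1, 1, 1) + rows(1, 1, 0, 1)  # z=1, shorter duration
    + rows(1, 0, 1, 2)                     # z=1, longer duration
    + rows(1, None, 1, 6)                  # z=1, exposure missing
)

# Two-stage style estimate: z distribution from the full sample.
two_stage_rd = standardized_risk(records, 1) - standardized_risk(records, 0)

# Complete-case estimate: z distribution from the exposure-observed subset only.
complete = [r for r in records if r["x"] is not None]
complete_case_rd = standardized_risk(complete, 1) - standardized_risk(complete, 0)

print(two_stage_rd, complete_case_rd)
```

Because exposure is missing more often in one stratum of `z` in this toy data, the complete-case subset has a different covariate distribution from the full sample, so the two risk differences disagree. In practice the outcome model would usually be parametric and confidence limits would come from a resampling method such as the bootstrap.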

MeSH terms

Bias
Causality
Cohort Studies*
Data Interpretation, Statistical*
Humans
Monte Carlo Method
Observational Studies as Topic*
Probability
Treatment Outcome
