Practical considerations for sandwich variance estimation in two-stage regression settings

Am J Epidemiol. 2023 Nov 27:kwad234. doi: 10.1093/aje/kwad234. Online ahead of print.

Abstract

We present a practical approach for computing the sandwich variance estimator in two-stage regression model settings. As a motivating example for two-stage regression, we consider regression calibration, a popular approach for addressing covariate measurement error. The sandwich variance approach has rarely been applied in regression calibration, even though it requires less computation time than popular resampling approaches for variance estimation, specifically the bootstrap. This is likely because it requires specialized statistical coding. We first outline the steps needed to compute the sandwich variance estimator. We then develop a convenient method of computation in R for sandwich variance estimation, which leverages standard regression model outputs and existing R functions and can be applied to either a simple random sample or a complex survey design. We use a simulation study to compare the sandwich variance estimator with a resampling variance approach in both settings. Finally, we compare these two variance estimation approaches in data examples from the Women's Health Initiative (WHI) and the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). The sandwich variance estimator typically had good numerical performance, but simple Wald bootstrap confidence intervals were unstable or over-covered in certain settings, particularly when there was high correlation between covariates or large measurement error.
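
For orientation, the general form of a sandwich variance estimator for a two-stage model can be sketched with stacked estimating equations. The notation is assumed for illustration and is not taken from the paper: alpha denotes the stage 1 (for example, calibration model) parameters, beta the stage 2 (outcome model) parameters, O_i the data for observation i, and psi_1, psi_2 the stage-specific estimating functions. The sketch assumes a simple random sample of size n.

```latex
% Assumed notation (illustrative, not the paper's):
% \theta = (\alpha^\top, \beta^\top)^\top stacks stage 1 and stage 2 parameters.
\begin{align}
  \sum_{i=1}^{n} \psi(O_i; \hat\theta)
    &= \sum_{i=1}^{n}
       \begin{pmatrix}
         \psi_1(O_i; \hat\alpha) \\
         \psi_2(O_i; \hat\alpha, \hat\beta)
       \end{pmatrix}
     = 0, \\
  A_n &= -\frac{1}{n} \sum_{i=1}^{n}
         \left. \frac{\partial \psi(O_i; \theta)}{\partial \theta^\top}
         \right|_{\theta = \hat\theta}
       = \begin{pmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{pmatrix}, \\
  B_n &= \frac{1}{n} \sum_{i=1}^{n}
         \psi(O_i; \hat\theta)\, \psi(O_i; \hat\theta)^\top, \\
  \widehat{\operatorname{Var}}(\hat\theta)
      &= \frac{1}{n}\, A_n^{-1} B_n \left( A_n^{-1} \right)^\top .
\end{align}
```

Under these assumptions, the variance of the stage 2 estimates is the lower-right block of the final matrix. The off-diagonal block A_21 is what carries the uncertainty from estimating alpha into the variance of beta; treating the stage 1 predictions as fixed amounts to setting A_21 to zero, which generally understates that variance.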

Keywords: Bootstrap; measurement error; regression calibration; robust variance; sandwich variance; two-stage.