Improving External Validity of Epidemiologic Cohort Analyses: A Kernel Weighting Approach

Lingxiao Wang; Barry I Graubard; Hormuzd A Katki; Yan Li

doi:10.1111/rssa.12564

J R Stat Soc Ser A Stat Soc. 2020 Jun;183(3):1293-1311. doi: 10.1111/rssa.12564. Epub 2020 Apr 25.

Authors

Lingxiao Wang¹ ², Barry I Graubard², Hormuzd A Katki², Yan Li¹

Affiliations

¹ The Joint Program in Survey Methodology, University of Maryland, College Park, U.S.A.
² National Cancer Institute, Division of Cancer Epidemiology & Genetics, Biostatistics Branch, U.S.A.

Abstract

For various reasons, cohort studies generally forgo the probability sampling required to obtain population-representative samples. As a result, such cohorts lack population representativeness, which invalidates estimates of population prevalences for novel health factors available only in cohorts. To improve the external validity of estimates from cohorts, we propose a kernel weighting (KW) approach that uses survey data as a reference to create pseudo-weights for cohorts. A jackknife variance estimator is proposed for the KW estimates. In simulations, the KW method outperformed two existing propensity-score-based weighting methods in mean-squared error while maintaining confidence interval coverage. We applied all methods to estimating US population mortality and prevalences of various diseases from the non-representative US NIH-AARP cohort, using the sample from the US-representative National Health Interview Survey (NHIS) as the reference. Assuming that the NHIS estimates are correct, the KW approach yielded generally less biased estimates than the existing propensity-score-based weighting methods.
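
As a rough illustration of how pseudo-weights might be built from a reference survey, the sketch below shares each survey unit's sampling weight among cohort members in proportion to a kernel on the distance between their scores. It is not the authors' implementation, and the abstract does not specify these details. The function name `kernel_pseudo_weights`, the Gaussian kernel, the fixed `bandwidth`, and the toy numbers are all assumptions made for illustration. The scores are assumed to be precomputed (for example, estimated propensity scores on the logit scale), and that estimation step is omitted.

```python
import math


def kernel_pseudo_weights(cohort_scores, survey_scores, survey_weights, bandwidth):
    """Share each survey unit's sampling weight among cohort members.

    Hypothetical helper: the Gaussian kernel and fixed bandwidth are
    illustrative assumptions, not details taken from the abstract.
    """
    pseudo = [0.0] * len(cohort_scores)
    for s_i, d_i in zip(survey_scores, survey_weights):
        # Kernel similarity between survey unit i and every cohort member.
        k = [math.exp(-0.5 * ((s_j - s_i) / bandwidth) ** 2) for s_j in cohort_scores]
        total = sum(k)
        # Survey unit i hands out exactly its weight d_i, split by similarity.
        for j, k_ij in enumerate(k):
            pseudo[j] += d_i * k_ij / total
    return pseudo


# Toy inputs (made up for illustration only).
cohort_scores = [-1.0, 0.0, 0.2, 1.5]
survey_scores = [-0.8, 0.1, 1.4]
survey_w = [100.0, 250.0, 50.0]
outcome = [1, 0, 0, 1]  # hypothetical binary outcome observed in the cohort

pseudo_w = kernel_pseudo_weights(cohort_scores, survey_scores, survey_w, bandwidth=0.5)

# Pseudo-weighted prevalence estimate from the cohort.
prevalence = sum(w * y for w, y in zip(pseudo_w, outcome)) / sum(pseudo_w)
```

Because each survey unit distributes exactly its own weight, the pseudo-weights in this sketch sum to the total of the survey weights, so the weighted cohort stands in for the same population size as the reference survey.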

Keywords: Cohort studies; Jackknife variance estimation; Taylor series linearization variance; complex survey sample; kernel smoothing; propensity score weighting.

Grants and funding

ZIA CP010181/ImNIH/Intramural NIH HHS/United States