Statistical methods for biomarker data pooled from multiple nested case-control studies

Biostatistics. 2021 Jul 17;22(3):541-557. doi: 10.1093/biostatistics/kxz051.

Abstract

Pooling biomarker data across multiple studies allows for examination of a wider exposure range than generally possible in individual studies, evaluation of population subgroups and disease subtypes with more statistical power, and more precise estimation of biomarker-disease associations. However, circulating biomarker measurements often require calibration to a single reference assay prior to pooling due to assay and laboratory variability across studies. We propose several methods for calibrating and combining biomarker data from nested case-control studies when reference assay data are obtained from a subset of controls in each contributing study. Specifically, we describe a two-stage calibration method and two aggregated calibration methods, named the internalized and full calibration methods, to evaluate the main effect of the biomarker exposure on disease risk and whether that association is modified by a potential covariate. The internalized method uses the reference laboratory measurement in the analysis when available and otherwise uses the estimated value derived from calibration models. The full calibration method uses calibrated biomarker measurements for all subjects, including those with reference laboratory measurements. Under the two-stage method, investigators complete study-specific analyses in the first stage followed by meta-analysis in the second stage. Our results demonstrate that the full calibration method is the preferred aggregated approach to minimize bias in point estimates. We also observe that the two-stage and full calibration methods provide similar effect and variance estimates but that their variance estimates are slightly larger than those from the internalized approach. As an illustrative example, we apply the three methods in a pooling project of nested case-control studies to evaluate (i) the association between circulating vitamin D levels and risk of stroke and (ii) how body mass index modifies the association between circulating vitamin D levels and risk of cardiovascular disease.
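
The difference between the internalized and full calibration methods can be illustrated with a minimal sketch for a single study. It is not the authors' implementation. It assumes a simple linear calibration model (reference = a + b * local), which the abstract does not specify, and it uses made-up numbers. The function names `fit_linear_calibration` and `calibrated_exposures` are hypothetical, and the toy data ignore case-control status, whereas the abstract states that reference assay data come from a subset of controls.

```python
from statistics import mean


def fit_linear_calibration(local, reference):
    """Least-squares fit of reference = a + b * local; returns (a, b).

    ASSUMPTION: a linear calibration model, chosen only for illustration.
    """
    mx, my = mean(local), mean(reference)
    sxx = sum((x - mx) ** 2 for x in local)
    sxy = sum((x - mx) * (y - my) for x, y in zip(local, reference))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def calibrated_exposures(local, reference, a, b):
    """Build the exposure used by each aggregated method.

    `reference` holds None for subjects without a reference-lab measurement.
    full:         calibrated value for every subject.
    internalized: reference value when available, calibrated value otherwise.
    """
    full = [a + b * x for x in local]
    internalized = [r if r is not None else f for r, f in zip(reference, full)]
    return internalized, full


# Toy data (illustrative only): local-lab values for five subjects,
# three of whom were also re-assayed at the reference laboratory.
local = [10.0, 20.0, 30.0, 40.0, 50.0]
reference = [13.0, None, 31.0, None, 55.0]

# Fit the calibration model on the subjects with both measurements.
paired = [(x, r) for x, r in zip(local, reference) if r is not None]
a, b = fit_linear_calibration([x for x, _ in paired], [r for _, r in paired])

internalized, full = calibrated_exposures(local, reference, a, b)
```

In this sketch the two exposure vectors differ only for subjects who have a reference measurement: the internalized vector keeps the observed reference value, while the full vector replaces it with the model prediction. Either vector would then enter the pooled disease model (conditional logistic regression, per the keywords), a step that is not shown here.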

Keywords: Aggregation; Calibration; Conditional logistic regression; Nested case–control study; Pooling.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural

MeSH terms

  • Bias
  • Biomarkers
  • Calibration
  • Case-Control Studies
  • Humans
  • Research Design*

Substances

  • Biomarkers