An empirical Bayes approach to improving population-specific genetic association estimation by leveraging cross-population data

Genet Epidemiol. 2023 Feb;47(1):45-60. doi: 10.1002/gepi.22501. Epub 2022 Sep 18.

Abstract

Populations of non-European ancestry are substantially underrepresented in genome-wide association studies (GWAS). As genetic effects can differ between ancestries due to possibly different causal variants or linkage disequilibrium patterns, a meta-analysis that includes GWAS of all populations yields biased estimates in each of the populations, and the bias disproportionately impacts non-European ancestry populations. This is because meta-analysis combines study-specific estimates using inverse-variance weights, which pulls the pooled estimate towards the studies with the largest sample sizes, typically those of European ancestry populations. In this paper, we propose two empirical Bayes (EB) estimators that borrow strength across populations while accounting for between-population heterogeneity. Extensive simulation studies show that the proposed EB estimators are largely unbiased and improve efficiency compared to the population-specific estimator. In contrast, even though the meta-analysis estimator has a much smaller variance, it yields significant bias when the genetic effect is heterogeneous across populations. We apply the proposed EB estimators to a large-scale trans-ancestry GWAS of stroke and demonstrate that the EB estimators reduce the variance of the population-specific estimator substantially, while the effect estimates remain close to the population-specific estimates.
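
The weighting argument in the abstract can be illustrated with a small numerical sketch. The abstract does not give the form of the two proposed EB estimators, so the code below is not the authors' method: it shows standard fixed-effect inverse-variance meta-analysis, followed by a generic shrinkage estimator chosen here purely for illustration. The function names, the shrinkage weight, and the input numbers are all assumptions.

```python
import numpy as np


def inverse_variance_meta(betas, ses):
    """Fixed-effect meta-analysis: weight each study by 1 / SE^2."""
    betas = np.asarray(betas, dtype=float)
    weights = 1.0 / np.asarray(ses, dtype=float) ** 2
    beta_meta = float(np.sum(weights * betas) / np.sum(weights))
    se_meta = float(np.sqrt(1.0 / np.sum(weights)))
    return beta_meta, se_meta


def shrinkage_estimate(beta_pop, se_pop, beta_meta):
    """Generic EB-type shrinkage (illustrative assumption, not the paper's estimator).

    The squared gap between the population-specific and pooled estimates
    stands in for between-population heterogeneity. A large gap keeps the
    result near beta_pop; a small gap pulls it towards beta_meta.
    """
    gap_sq = (beta_pop - beta_meta) ** 2
    k = gap_sq / (gap_sq + se_pop ** 2)  # weight on the population-specific estimate
    return k * beta_pop + (1.0 - k) * beta_meta


# Hypothetical inputs: a large study (small SE) and a small study (large SE)
# whose true effects differ.
betas = [0.10, 0.40]
ses = [0.02, 0.10]

beta_meta, se_meta = inverse_variance_meta(betas, ses)
beta_eb = shrinkage_estimate(betas[1], ses[1], beta_meta)

print(f"meta-analysis estimate: {beta_meta:.4f} (SE {se_meta:.4f})")
print(f"small-study estimate:   {betas[1]:.4f} (SE {ses[1]:.4f})")
print(f"shrinkage estimate:     {beta_eb:.4f}")
```

With these made-up inputs the pooled estimate sits close to the large study's effect, which is the bias the abstract describes for the smaller population. The shrinkage estimate lies between the pooled and the population-specific values.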

Keywords: empirical Bayes; genome-wide association study; racial/ethnic-specific genetic association.

Publication types

  • Meta-Analysis
  • Research Support, N.I.H., Extramural

MeSH terms

  • Bayes Theorem
  • Computer Simulation
  • Genome-Wide Association Study*
  • Humans
  • Linkage Disequilibrium
  • Models, Genetic*