Interpreting SNP heritability in admixed populations

bioRxiv [Preprint]. 2023 Nov 7:2023.08.04.551959. doi: 10.1101/2023.08.04.551959.

Abstract

SNP heritability ($h^2_{SNP}$) is defined as the proportion of phenotypic variance explained by genotyped SNPs and is believed to be a lower bound of heritability ($h^2$), being equal to it if all causal variants are known. Despite the simple intuition behind $h^2_{SNP}$, its interpretation and its equivalence to $h^2$ are unclear, particularly in the presence of population structure and assortative mating. It is well known that population structure can inflate the estimate $\hat{h}^2_{SNP}$. Here we use analytical theory and simulations to demonstrate that $h^2_{SNP}$ estimated with genome-wide restricted maximum likelihood (GREML) can be biased in admixed populations, even in the absence of confounding and even if all causal variants are known. This is because admixture generates linkage disequilibrium (LD), which contributes to the genetic variance and therefore to heritability. GREML implicitly assumes that this LD component is zero, which may not be true, particularly for traits under divergent or stabilizing selection in the source populations, leading to under- or overestimates of $h^2_{SNP}$ relative to $h^2$. For the same reason, GREML estimates of local ancestry heritability ($h^2_\gamma$) will also be biased. We describe the bias in $\hat{h}^2_{SNP}$ and $\hat{h}^2_\gamma$ as a function of admixture history and the genetic architecture of the trait, and discuss its implications for genome-wide association and polygenic prediction.
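The claim that admixture-induced LD contributes to the genetic variance can be illustrated with a minimal simulation sketch. It is not the preprint's own model or code. It assumes a deliberately simple setup: unlinked causal loci, individual ancestry fractions drawn uniformly between 0 and 1, genotypes drawn independently given ancestry, and hypothetical allele frequencies of 0.7 and 0.3 in the two source populations. The function name `variance_components` and all parameter values are illustrative. The "balanced" effect sizes, whose signs cancel so that the source populations have the same mean genetic value, serve only as a rough stand-in for stabilizing selection.

```python
import numpy as np


def variance_components(beta, p_a, p_b, n=20000, seed=0):
    """Split the variance of genetic values into genic and LD parts.

    Assumed toy model: each individual has ancestry fraction theta from
    source population A, and every locus is sampled independently given theta.
    """
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 1.0, size=n)  # individual ancestry fractions
    # Expected allele frequency per individual and locus under admixture.
    p = theta[:, None] * p_a[None, :] + (1.0 - theta[:, None]) * p_b[None, :]
    x = rng.binomial(2, p)                 # diploid genotypes (0, 1 or 2)
    g = x @ beta                           # additive genetic values
    total = g.var()                        # total genetic variance
    genic = np.sum(beta**2 * x.var(axis=0))  # per-locus terms only
    return total, genic, total - genic     # remainder = cross-locus (LD) term


n_loci = 100
p_a = np.full(n_loci, 0.7)  # hypothetical frequencies in source population A
p_b = np.full(n_loci, 0.3)  # hypothetical frequencies in source population B

# Divergent case: every trait-increasing allele is more common in A.
beta_divergent = np.ones(n_loci)
# Balanced case: effect signs alternate, so the source means coincide.
beta_balanced = np.where(np.arange(n_loci) % 2 == 0, 1.0, -1.0)

total_div, genic_div, ld_div = variance_components(beta_divergent, p_a, p_b)
total_bal, genic_bal, ld_bal = variance_components(beta_balanced, p_a, p_b)

print(f"divergent: total={total_div:.2f} genic={genic_div:.2f} LD={ld_div:.2f}")
print(f"balanced:  total={total_bal:.2f} genic={genic_bal:.2f} LD={ld_bal:.2f}")
```

Under these assumptions the genic variance is the same in both cases, while the LD term is positive when effects align with the allele frequency differences and negative when they cancel. A method that treats the LD term as zero would therefore misstate the genetic variance in opposite directions in the two cases.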

Publication types

  • Preprint