Genetic association models are robust to common population kinship estimation biases

Genetics. 2023 May 4;224(1):iyad030. doi: 10.1093/genetics/iyad030.

Abstract

Common genetic association models for structured populations, including principal component analysis (PCA) and linear mixed-effects models (LMMs), model the correlation structure between individuals using population kinship matrices, also known as genetic relatedness matrices. However, the most common kinship estimators can have severe biases that were only recently determined. Here we characterize the effect of these kinship biases on genetic association. We employ a large simulated admixed family and genotypes from the 1000 Genomes Project, both with simulated traits, to evaluate key kinship estimators. Remarkably, we find practically invariant association statistics for kinship matrices of different bias types (matching all other features). We then prove using statistical theory and linear algebra that LMM association tests are invariant to these kinship biases, and PCA approximately so. Our proof shows that the intercept and relatedness effect coefficients compensate for the kinship bias, an argument that extends to generalized linear models. As a corollary, association testing is also invariant to changing the reference ancestral population of the kinship matrix. Lastly, we observed that all kinship estimators, except for popkin ratio-of-means, can give improper non-positive semidefinite matrices, which can be problematic although some LMMs handle them surprisingly well, and condition numbers can be used to choose kinship estimators. Overall, we find that existing association studies are robust to kinship estimation bias, and our calculations may help improve association methods by taking advantage of this unexpected robustness, as well as help determine the effects of kinship bias in related problems.

Keywords: estimation bias; genetic association studies; kinship matrices; linear mixed-effects models; principal components analysis.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bias
  • Genotype
  • Humans
  • Linear Models
  • Models, Genetic*
  • Phenotype
  • Population Groups* / genetics