Genome-Wide Association with Uncertainty in the Genetic Similarity Matrix

J Comput Biol. 2023 Feb;30(2):189-203. doi: 10.1089/cmb.2022.0067. Epub 2022 Nov 14.

Abstract

Genome-wide association studies (GWASs) are often confounded by population stratification and structure. Linear mixed models (LMMs) are a powerful class of methods for uncovering genetic effects while controlling for such confounding. LMMs include random effects whose covariance is specified by a genetic similarity matrix, and they assume that the true genetic similarity matrix is known. However, uncertainty about the phylogenetic structure of a study population may degrade the quality of LMM results. This may happen in bacterial studies in which the number of samples or loci is small, or in studies with low-quality genotyping. In this study, we develop methods for LMMs in which the genetic similarity matrix is unknown and is derived from Markov chain Monte Carlo (MCMC) estimates of the phylogeny. We illustrate our methods on simulated data and apply our model to a GWAS of multidrug resistance in tuberculosis.
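
For orientation, the textbook form of the LMM used in GWAS can be written as below. This is a sketch under stated assumptions, not necessarily the parameterization used in the paper: the symbols y (phenotype vector), X (covariates and tested variant), β (fixed effects), u (random effect), K (genetic similarity matrix), and the variance components are notation introduced here and do not appear in the abstract. The last line shows one natural way to account for an unknown K, by averaging the likelihood over M phylogenies T_1, ..., T_M sampled by MCMC; that averaging form is an illustrative assumption.

```latex
% Standard LMM with a known genetic similarity matrix K (n samples)
\begin{aligned}
  y &= X\beta + u + \varepsilon, \\
  u &\sim \mathcal{N}\!\left(0,\ \sigma_g^2 K\right), \qquad
  \varepsilon \sim \mathcal{N}\!\left(0,\ \sigma_e^2 I_n\right), \\
  y \mid K &\sim \mathcal{N}\!\left(X\beta,\ \sigma_g^2 K + \sigma_e^2 I_n\right).
\end{aligned}

% Assumed illustration: K unknown, induced by posterior tree samples T_m
p(y) \;\approx\; \frac{1}{M} \sum_{m=1}^{M}
  \mathcal{N}\!\left(y \;\middle|\; X\beta,\ \sigma_g^2 K(T_m) + \sigma_e^2 I_n\right)
```

When K is treated as known, the sum collapses to a single term, which is the usual LMM likelihood.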

Keywords: genetic similarity; genome-wide association studies; phylogenetics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genome-Wide Association Study* / methods
  • Humans
  • Linear Models
  • Models, Genetic*
  • Phylogeny
  • Polymorphism, Single Nucleotide
  • Uncertainty