The impact of disregarding family structure on genome-wide association analysis of complex diseases in cohorts with simple pedigrees

J Appl Genet. 2020 Feb;61(1):75-86. doi: 10.1007/s13353-019-00526-7. Epub 2019 Nov 21.

Abstract

Generalized linear mixed models (GLMMs) are the standard framework for genome-wide association studies (GWAS) of complex diseases in family-based cohorts. Fitting GLMMs in very large cohorts, however, can be computationally demanding, and modified versions of GLMMs that use faster algorithms may underperform, for instance when a single nucleotide polymorphism (SNP) is correlated with fixed-effects covariates. We investigated the extent to which disregarding family structure may compromise GWAS in cohorts with simple pedigrees by contrasting logistic regression models (i.e., with no family structure) with three LMM-based models. Our analyses showed that the logistic regression models generally produced smaller P values than the LMM-based models; however, the differences in P values were mostly minor. Disregarding family structure had little impact on identifying disease-associated SNPs at the genome-wide level of significance (i.e., P < 5E-08), as the four P values produced by the tested methods for any given SNP were either all below or all above 5E-08. Nevertheless, larger discrepancies between logistic regression and LMM-based models were detected at the suggestive level of significance (i.e., 5E-08 ≤ P < 5E-06). The SNP effects estimated by the logistic regression models were not statistically different from those estimated by GLMMs that implemented Wald's test. However, several SNP effects were significantly different from their counterparts in the LMM analyses. We suggest that fitting GLMMs with Wald's test on a pre-selected subset of SNPs obtained from logistic regression models can balance the speed of analyses against the accuracy of parameter estimates.
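The thresholds and the pre-selection idea described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the SNP identifiers, and the choice of the suggestive bound as the pre-selection cutoff are all assumptions made for illustration, and the Wald P value assumes a standard normal reference distribution for the statistic beta / SE.

```python
import math

GENOME_WIDE = 5e-8  # genome-wide significance threshold stated in the abstract
SUGGESTIVE = 5e-6   # upper bound of the suggestive band stated in the abstract


def significance_band(p):
    """Classify a P value into the bands used in the abstract."""
    if p < GENOME_WIDE:
        return "genome-wide"
    if p < SUGGESTIVE:
        return "suggestive"
    return "not significant"


def methods_agree(p_values, threshold=GENOME_WIDE):
    """True when every method's P value falls on the same side of the threshold."""
    below = [p < threshold for p in p_values]
    return all(below) or not any(below)


def wald_p_value(beta, se):
    """Two-sided P value for the Wald statistic z = beta / se (normal approximation)."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def preselect(logistic_p, threshold=SUGGESTIVE):
    """Return SNP ids whose logistic-regression P value is below the cutoff.

    The cutoff is an assumption; the abstract does not specify one.
    """
    return sorted(snp for snp, p in logistic_p.items() if p < threshold)


if __name__ == "__main__":
    # Hypothetical logistic-regression P values, for illustration only.
    screen = {"rsA": 1e-7, "rsB": 0.2, "rsC": 4e-6}
    for snp in preselect(screen):
        # In practice, a GLMM with Wald's test would be refitted for each of these SNPs.
        print(snp, significance_band(screen[snp]))
```

Only the SNPs returned by `preselect` would then be passed to the slower GLMM fit, which is where the speed gain of the suggested two-stage strategy would come from.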

Keywords: Complex diseases; Family-based GWAS; GLMMs framework; Logistic regression.

MeSH terms

  • Algorithms
  • Genetic Predisposition to Disease
  • Genome-Wide Association Study* / methods
  • Genomics* / methods
  • Humans
  • Models, Genetic*
  • Multifactorial Inheritance*
  • Pedigree*
  • Polymorphism, Single Nucleotide