Powerful multi-marker association tests: unifying genomic distance-based regression and logistic regression

Genet Epidemiol. 2010 Nov;34(7):680-8. doi: 10.1002/gepi.20529.

Abstract

Many statistical tests have been proposed to detect genetic association with common and complex diseases in candidate gene or genome-wide association studies with a case-control design. Because of linkage disequilibrium (LD), multi-marker association tests can gain power over single-marker tests with a Bonferroni multiple testing adjustment. Most existing multi-marker association tests target only one of the many possible aspects of distributional differences between the genotypes of cases and controls, such as allele frequency differences, while a few newer ones target two or three aspects; all of these can be implemented in logistic regression. In contrast to logistic regression, a genomic distance-based regression (GDBR) approach aims to detect some high-order genotypic differences between cases and controls. A recent study has confirmed the high power of GDBR tests. At present, the popular logistic regression and the emerging GDBR approaches are completely unrelated; for example, one has to choose between the two. In this article, we reformulate GDBR as logistic regression, opening an avenue to constructing other powerful tests while overcoming some limitations of GDBR. For example, asymptotic distributions can replace time-consuming permutations for deriving P-values, and covariates, including gene-gene interactions, can be easily incorporated. Importantly, this reformulation facilitates combining GDBR with other existing methods in a unified framework of logistic regression. In particular, we show that Fisher's P-value combining method can boost statistical power by incorporating information from allele frequencies, Hardy-Weinberg disequilibrium, LD patterns, and other higher-order interactions among multiple markers as captured by GDBR.
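As a rough illustration of Fisher's P-value combining method mentioned above, the following minimal Python sketch combines k P-values through the statistic T = -2 Σ log(p_i), which follows a chi-square distribution with 2k degrees of freedom when the P-values are independent and the null hypothesis holds. The function name `fisher_combine` and the example P-values are hypothetical and not taken from the article. The independence assumption is a simplification: P-values from tests on the same markers may be correlated, in which case the chi-square reference distribution is only approximate and is not necessarily the calibration the authors used.

```python
import math


def fisher_combine(p_values):
    """Combine P-values with Fisher's method (hypothetical helper).

    Assumes the P-values are independent; returns (statistic, combined P-value).
    """
    if not p_values:
        raise ValueError("need at least one P-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is T / 2

    # For even degrees of freedom 2k, the chi-square upper tail has the
    # closed form exp(-T/2) * sum_{i=0}^{k-1} (T/2)^i / i!.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term

    return 2.0 * half_stat, math.exp(-half_stat) * total


# Illustrative (made-up) P-values from three separate tests on one gene.
statistic, combined_p = fisher_combine([0.04, 0.10, 0.30])
print(f"T = {statistic:.3f}, combined P = {combined_p:.4f}")
```

The closed-form tail probability keeps the sketch free of third-party dependencies; a statistics library's chi-square survival function would serve equally well.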

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Alleles
  • Amyotrophic Lateral Sclerosis / genetics
  • Computer Simulation
  • Gene Frequency
  • Genetic Markers
  • Genome-Wide Association Study / statistics & numerical data*
  • Humans
  • Linkage Disequilibrium
  • Logistic Models*
  • Models, Genetic*
  • Molecular Epidemiology
  • Polymorphism, Single Nucleotide
  • Regression Analysis

Substances

  • Genetic Markers