A haplotype-based framework for group-wise transmission/disequilibrium tests for rare variant association analysis

Bioinformatics. 2015 May 1;31(9):1452-9. doi: 10.1093/bioinformatics/btu860. Epub 2015 Jan 6.

Abstract

Motivation: A major focus of current sequencing studies in human genetics is to identify rare variants associated with complex diseases. Aside from the reduced power to detect associated rare variants, controlling for population stratification is particularly challenging for rare variants. Transmission/disequilibrium tests (TDT) based on family designs are robust to population stratification and admixture, and therefore provide an effective way to eliminate spurious associations in rare variant association studies. To increase the power of rare variant association analysis, gene-based collapsing methods have become standard approaches. Existing methods that extend this strategy to rare variants in families usually combine TDT statistics at individual variants and therefore lack the flexibility to incorporate other genetic models.
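
For orientation, the classic single-marker TDT that the Motivation refers to can be sketched as follows. This is a minimal illustration of the standard McNemar-type statistic, not code from the paper: among heterozygous parents, b counts transmissions of the allele of interest to affected offspring and c counts non-transmissions, and (b - c)^2 / (b + c) is compared with a chi-square distribution with 1 degree of freedom. The function name `tdt_statistic` is hypothetical.

```python
import math


def tdt_statistic(transmitted, untransmitted):
    """Classic single-marker TDT (McNemar-type chi-square, 1 df).

    transmitted   -- times heterozygous parents passed the allele to an
                     affected child (b)
    untransmitted -- times heterozygous parents did not pass it (c)
    Returns (chi2, p_value). Hypothetical helper for illustration only.
    """
    total = transmitted + untransmitted
    if total == 0:
        # No informative (heterozygous) parents: no evidence either way.
        return 0.0, 1.0
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Because only transmissions within families are compared, the statistic does not depend on allele frequency differences between populations, which is the source of the robustness to stratification noted above. For rare variants the counts b and c are small, so the chi-square approximation is rough and an exact binomial test may be preferable.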

Results: In this study, we describe a haplotype-based framework for group-wise TDT (gTDT) that is flexible enough to encompass a variety of genetic models such as additive, dominant and compound heterozygous (CH) (i.e. recessive) models as well as other complex interactions. Unlike existing methods, gTDT constructs haplotypes by transmission when possible and inherently takes into account the linkage disequilibrium among variants. Through extensive simulations, we showed that type I error was correctly controlled for rare variants under all models investigated, and this remained true in the presence of population stratification. Under a variety of genetic models, gTDT showed increased power compared with the single marker TDT. Application of gTDT to an autism exome sequencing dataset of 118 trios identified potentially interesting candidate genes with CH rare variants.
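
To make the genetic models concrete, the sketch below shows one way haplotype-level collapsing could be scored under additive, dominant and CH coding. It is an illustrative assumption, not the authors' gTDT implementation (which is in C++ and whose exact statistic is not given in this abstract). Each haplotype is assumed to be a tuple of 0/1 rare-allele indicators across the variants in a gene, and each trio is assumed to be summarized as the pair of haplotypes transmitted to the child plus the pair left untransmitted (a pseudo-control). All names (`carries_rare`, `genotype_score`, `discordant_counts`) are hypothetical.

```python
def carries_rare(haplotype):
    """True if the haplotype carries at least one rare allele (coded 1)."""
    return any(allele == 1 for allele in haplotype)


def genotype_score(hap_a, hap_b, model):
    """Collapse a pair of haplotypes into one gene-level score.

    additive -- number of haplotypes carrying a rare allele (0, 1 or 2)
    dominant -- 1 if at least one haplotype carries a rare allele
    ch       -- 1 only if both haplotypes carry a rare allele
                (assumption: this also counts a homozygous rare genotype)
    """
    n_carrier_haps = int(carries_rare(hap_a)) + int(carries_rare(hap_b))
    if model == "additive":
        return n_carrier_haps
    if model == "dominant":
        return int(n_carrier_haps >= 1)
    if model == "ch":
        return int(n_carrier_haps == 2)
    raise ValueError(f"unknown model: {model}")


def discordant_counts(trios, model):
    """Count trios where the child outscores the pseudo-control, and vice versa.

    trios -- iterable of ((transmitted_a, transmitted_b),
                          (untransmitted_a, untransmitted_b))
    Returns (child_higher, pseudo_control_higher); ties are uninformative.
    """
    child_higher = 0
    control_higher = 0
    for (t_a, t_b), (u_a, u_b) in trios:
        child = genotype_score(t_a, t_b, model)
        control = genotype_score(u_a, u_b, model)
        if child > control:
            child_higher += 1
        elif control > child:
            control_higher += 1
    return child_higher, control_higher


# Toy data: three trios, two variants per haplotype (made up for illustration).
example_trios = [
    (((1, 0), (0, 1)), ((0, 0), (0, 0))),  # child carries on both haplotypes
    (((1, 0), (0, 0)), ((0, 0), (0, 0))),  # child carries on one haplotype
    (((0, 0), (0, 0)), ((0, 1), (0, 0))),  # rare allele left untransmitted
]
```

Scoring whole haplotypes rather than individual sites is what lets a CH model be expressed at all, since it requires knowing that two rare alleles sit on different parental chromosomes. The two discordant counts could then be fed into a McNemar-type comparison, under the assumption that ties carry no information.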

Availability and implementation: gTDT is implemented in C++; the source code and detailed usage instructions are available on the authors' website (https://medschool.vanderbilt.edu/cgg).

Contact: bingshan.li@vanderbilt.edu or wei.chen@chp.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Autistic Disorder / genetics
  • Computer Simulation
  • Data Interpretation, Statistical
  • Exome
  • Genetic Association Studies / methods*
  • Genetic Variation*
  • Haplotypes*
  • Humans
  • Linkage Disequilibrium*
  • Models, Genetic
  • Sequence Analysis, DNA