Imputation without doing imputation: a new method for the detection of non-genotyped causal variants

Genet Epidemiol. 2014 Apr;38(3):173-90. doi: 10.1002/gepi.21792. Epub 2014 Feb 17.

Abstract

Genome-wide association studies allow detection of non-genotyped disease-causing variants through testing of nearby genotyped SNPs. This approach may fail when there are no genotyped SNPs in strong LD with the causal variant. Several genotyped SNPs in weak LD with the causal variant may, however, when considered together, provide equivalent information. This observation motivates popular but computationally intensive approaches based on imputation or haplotyping. Here we present a new method and accompanying software designed for this scenario. Our approach proceeds by selecting, for each genotyped "anchor" SNP, a nearby genotyped "partner" SNP, chosen via a specific algorithm we have developed. These two SNPs are used as predictors in linear or logistic regression analysis to generate a final significance test. In simulations, our method captures much of the signal captured by imputation, while taking a fraction of the time and disc space, and generating fewer false positives. We apply our method to a case/control study of severe malaria genotyped using the Affymetrix 500K array. Previous analysis showed that fine-scale sequencing of a Gambian reference panel in the region of the known causal locus, followed by imputation, increased the signal of association to genome-wide significance levels. Our method also increases the signal of association from P ≈ 2 × 10⁻⁶ to P ≈ 6 × 10⁻¹¹. Our method thus, in some cases, eliminates the need for more complex methods such as sequencing and imputation, and provides a useful additional test that may be used to identify genetic regions of interest.
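
The two-SNP regression step described in the abstract can be illustrated with a minimal sketch. It assumes the partner SNP has already been chosen (the abstract does not describe the selection algorithm, so none is attempted here), that genotypes are coded as 0/1/2 allele counts, and that the significance test is a 2-degree-of-freedom likelihood-ratio test of both SNPs against an intercept-only logistic model. That choice of test, the function names `logistic_loglik` and `two_snp_test`, and the simulated toy data are all assumptions made for illustration; they are not taken from the paper's accompanying software.

```python
import math
import numpy as np


def logistic_loglik(X, y, n_iter=50, tol=1e-10):
    """Fit a logistic regression by Newton-Raphson and return the
    maximised log-likelihood. X must already contain an intercept column.
    Note: fails with a singular Hessian if columns are collinear
    (e.g. anchor and partner genotypes are identical)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)                      # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    p = np.clip(1.0 / (1.0 + np.exp(-(X @ beta))), 1e-12, 1.0 - 1e-12)
    return float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


def two_snp_test(anchor, partner, y):
    """Assumed form of the test: compare intercept + anchor + partner
    against intercept only, using a likelihood-ratio statistic with 2 df."""
    ones = np.ones(len(y))
    ll_null = logistic_loglik(ones[:, None], y)
    ll_full = logistic_loglik(np.column_stack([ones, anchor, partner]), y)
    stat = max(2.0 * (ll_full - ll_null), 0.0)
    p_value = math.exp(-stat / 2.0)  # exact chi-square tail for 2 df
    return stat, p_value


if __name__ == "__main__":
    # Toy simulation (hypothetical data): an ungenotyped causal variant
    # and two genotyped SNPs that are each only noisy copies of it.
    rng = np.random.default_rng(0)
    n = 2000
    causal = rng.binomial(2, 0.3, n)

    def noisy_copy(g, keep=0.6):
        mask = rng.random(n) < keep
        return np.where(mask, g, rng.binomial(2, 0.3, n))

    anchor, partner = noisy_copy(causal), noisy_copy(causal)
    risk = 1.0 / (1.0 + np.exp(-(-1.0 + 0.5 * causal)))
    status = (rng.random(n) < risk).astype(float)

    stat, p_value = two_snp_test(anchor, partner, status)
    print(f"LR statistic = {stat:.2f}, P = {p_value:.3g}")
```

In a genome-wide scan, a function of this kind would be called once per anchor SNP with its selected partner, which is why the approach needs no reference panel or imputed genotype files.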

Keywords: GWAS; haplotype analysis; imputation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • False Positive Reactions
  • Gambia
  • Genome, Human
  • Genome-Wide Association Study
  • Genotype*
  • Haplotypes / genetics
  • Humans
  • Malaria / genetics
  • Models, Genetic
  • Polymorphism, Single Nucleotide / genetics*
  • Software
  • Time Factors