Improving Imputation Accuracy by Inferring Causal Variants in Genetic Studies

J Comput Biol. 2019 Nov;26(11):1203-1213. doi: 10.1089/cmb.2018.0139. Epub 2018 Oct 1.

Abstract

Genotype imputation is widely used in the analysis of genome-wide association studies (GWAS) for two reasons. The first is to increase the power of association studies when causal single nucleotide polymorphisms are not collected in the GWAS. The second is to aid the interpretation of a GWAS result by predicting the association statistics at untyped variants. In this article, we show that the prediction of association statistics at untyped variants that influence the trait is overly conservative. Current imputation methods assume that none of the variants in a region (a locus consisting of multiple variants) affect the trait, an assumption that is often inconsistent with the observed data. We propose a new method, CAUSAL-Imp, which imputes the association statistics at untyped variants while taking into account variants in the region that may affect the trait. Our method builds on recent methods that impute the marginal statistics for GWAS by exploiting the fact that marginal statistics follow a multivariate normal distribution. We use both simulated and real data sets to assess the performance of our method. We show that traditional imputation approaches underestimate the association statistics for variants involved in the trait, and that our approach provides less biased estimates of these association statistics.
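The underestimation described above can be illustrated with a minimal sketch of conditional-mean imputation under a zero-mean multivariate normal, which is the kind of null-model assumption the abstract attributes to existing approaches. This is not CAUSAL-Imp itself. The function name `impute_z_null`, the `ridge` parameter, the toy linkage disequilibrium (LD) matrix, and the non-centrality value `lam` are all hypothetical choices made for illustration.

```python
import numpy as np

def impute_z_null(z_typed, sigma_tt, sigma_ut, ridge=0.0):
    """Conditional mean of untyped z-scores given typed z-scores,
    assuming the joint distribution is N(0, Sigma), i.e. no causal variant.

    sigma_tt: LD (correlation) matrix among typed variants, shape (t, t)
    sigma_ut: LD between untyped and typed variants, shape (u, t)
    ridge:    optional diagonal regularization (hypothetical parameter)
    """
    sigma_tt = np.asarray(sigma_tt, dtype=float)
    sigma_ut = np.asarray(sigma_ut, dtype=float)
    reg = sigma_tt + ridge * np.eye(sigma_tt.shape[0])
    # weights = sigma_ut @ inv(reg), computed with a linear solve
    weights = np.linalg.solve(reg, sigma_ut.T).T
    return weights @ np.asarray(z_typed, dtype=float)

# Toy LD matrix (made-up values): variant 0 is untyped, variants 1 and 2 are typed.
ld = np.array([[1.0, 0.6, 0.3],
               [0.6, 1.0, 0.2],
               [0.3, 0.2, 1.0]])
sigma_tt = ld[1:, 1:]
sigma_ut = ld[:1, 1:]

# Assume the untyped variant is causal with non-centrality lam.
# The expected z-score at each typed variant is then lam times its LD
# with the causal variant.
lam = 5.0
expected_typed = lam * ld[1:, 0]

imputed = impute_z_null(expected_typed, sigma_tt, sigma_ut)
print(imputed)  # smaller than lam
```

In this toy setting the imputed statistic at the causal variant is shrunk by the fraction of its variance explained by the typed variants, which is at most 1, so a null-centered imputation recovers less than the assumed `lam` whenever the typed variants do not tag the causal variant perfectly.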

Keywords: causal variants; genome-wide association studies; imputation; summary statistics.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Genome / genetics*
  • Genome-Wide Association Study / statistics & numerical data*
  • Genotype
  • Humans
  • Phenotype
  • Polymorphism, Single Nucleotide / genetics
  • Software*