Predicting causal variants affecting expression by using whole-genome sequencing and RNA-seq from multiple human tissues

Nat Genet. 2017 Dec;49(12):1747-1751. doi: 10.1038/ng.3979. Epub 2017 Oct 23.

Abstract

Genetic association mapping produces statistical links between phenotypes and genomic regions, but identifying causal variants remains difficult. Whole-genome sequencing (WGS) can help by providing complete knowledge of all genetic variants, but it is financially prohibitive for well-powered GWAS. We performed mapping of expression quantitative trait loci (eQTLs) with WGS and RNA-seq, and found that lead eQTL variants called with WGS were more likely to be causal. Through simulations, we derived properties of causal variants and used them to develop a method for identifying likely causal SNPs. We estimated that 25-70% of causal variants were located in open-chromatin regions, depending on the tissue and experiment. Finally, we identified a set of high-confidence causal variants and showed that these were more enriched in GWAS associations than other eQTLs. Of those, we found 65 associations with GWAS traits and provide examples in which genes implicated by expression are functionally validated as being relevant for complex traits.
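
To make the term "lead eQTL variant" concrete, the following is a minimal sketch, not the pipeline used in the study. It assumes that each variant is tested against the expression of one gene by simple linear regression on genotype dosage (0, 1 or 2 copies of the alternate allele), and that the lead variant is the one with the strongest association. With a single predictor and the same samples for every variant, ranking by absolute Pearson correlation gives the same order as ranking by regression p-value. The variant names, dosages, expression values and the functions `pearson_r` and `lead_variant` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A monomorphic variant or constant expression carries no signal.
        return 0.0
    return sxy / sqrt(sxx * syy)


def lead_variant(genotypes, expression):
    """Return the variant with the largest |r| and the score of every variant.

    genotypes: dict mapping variant ID -> list of dosages (0, 1, 2) per sample
    expression: list of expression values for one gene, same sample order
    """
    scores = {vid: abs(pearson_r(dosages, expression))
              for vid, dosages in genotypes.items()}
    lead = max(scores, key=scores.get)
    return lead, scores


# Hypothetical toy data: var_B is in strong linkage disequilibrium with var_A
# (the dosages differ in one sample only); var_C is unrelated to expression.
genotypes = {
    "var_A": [0, 1, 2, 0, 1, 2, 0, 1],
    "var_B": [0, 1, 2, 0, 1, 2, 1, 1],
    "var_C": [2, 0, 1, 1, 0, 2, 1, 0],
}
expression = [1.0, 2.1, 2.9, 1.2, 1.9, 3.1, 0.9, 2.0]

lead, scores = lead_variant(genotypes, expression)
print(lead)
```

In this toy setting `var_A` and `var_B` both associate strongly with expression because they are correlated with each other, which is why a lead variant is not automatically the causal one. If a causal variant is missing from the genotype data, a correlated neighbour takes its place as the lead.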

MeSH terms

  • Chromosome Mapping
  • Gene Expression Profiling / methods*
  • Genetic Predisposition to Disease / genetics
  • Genetic Variation*
  • Genome, Human / genetics*
  • Genome-Wide Association Study / methods*
  • Genotype
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Polymorphism, Single Nucleotide
  • Quantitative Trait Loci / genetics
  • Reproducibility of Results