Quantifying deleterious effects of regulatory variants

Shan Li; Roberto Vera Alvarez; Roded Sharan; David Landsman; Ivan Ovcharenko

doi:10.1093/nar/gkw1263

Nucleic Acids Res. 2017 Mar 17;45(5):2307-2317. doi: 10.1093/nar/gkw1263.

Authors

Shan Li¹, Roberto Vera Alvarez¹, Roded Sharan², David Landsman¹, Ivan Ovcharenko¹

Affiliations

¹ Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA.
² School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel.

Abstract

The majority of genome-wide association study (GWAS) risk variants reside in non-coding DNA sequences. Understanding how these sequence modifications lead to transcriptional alterations and cell-to-cell variability can help unravel genotype-phenotype relationships. Here, we describe a computational method, dubbed CAPE, which calculates the likelihood that a genetic variant deactivates an enhancer by disrupting the binding of transcription factors (TFs) in a given cellular context. CAPE learns sequence signatures associated with putative enhancers originating from large-scale sequencing experiments (such as ChIP-seq or DNase-seq) and models the change in enhancer signature upon a single nucleotide substitution. CAPE accurately identifies causative cis-regulatory variation, including expression quantitative trait loci (eQTLs) and DNase I sensitivity quantitative trait loci (dsQTLs), in a tissue-specific manner with precision superior to that of several currently available methods. The presented method can be trained on any tissue-specific dataset of enhancers and known functional variants and applied to prioritize disease-associated variants in the corresponding tissue.
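
The general idea of scoring a variant by the change it causes in a learned sequence signature can be illustrated with a toy example. The abstract does not specify CAPE's features or models, so the sketch below is not CAPE itself. It assumes a much simpler stand-in: log-odds k-mer weights learned from enhancer versus background sequences, with the variant effect taken as the score difference between the alternate and reference alleles. All function names (kmer_counts, train_weights, score, variant_delta) and parameters are hypothetical.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count all overlapping k-mers in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def train_weights(positives, negatives, k=3, pseudocount=1.0):
    """Toy 'sequence signature': log-odds of each k-mer in enhancer
    (positive) versus background (negative) sequences.
    Assumption: this stands in for whatever model is really learned."""
    pos, neg = Counter(), Counter()
    for s in positives:
        pos.update(kmer_counts(s, k))
    for s in negatives:
        neg.update(kmer_counts(s, k))
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + pseudocount * len(vocab)
    neg_total = sum(neg.values()) + pseudocount * len(vocab)
    return {
        kmer: math.log((pos[kmer] + pseudocount) / pos_total)
        - math.log((neg[kmer] + pseudocount) / neg_total)
        for kmer in vocab
    }


def score(seq, weights, k):
    """Signature score: weighted sum of k-mer counts (unseen k-mers score 0)."""
    return sum(weights.get(kmer, 0.0) * n
               for kmer, n in kmer_counts(seq, k).items())


def variant_delta(ref_seq, pos, alt_base, weights, k):
    """Score change caused by substituting alt_base at 0-based index pos.
    Negative values suggest the substitution weakens the signature."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return score(alt_seq, weights, k) - score(ref_seq, weights, k)
```

With hand-set weights {"ACG": 1.0, "CGT": 2.0} and k = 3, the reference sequence "ACGT" scores 3.0, and substituting T at index 1 removes both weighted k-mers, so variant_delta returns -3.0. In practice a window of flanking sequence around the variant would be scored rather than a four-base string.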

Published by Oxford University Press on behalf of Nucleic Acids Research 2016.

Publication types

Research Support, N.I.H., Intramural

MeSH terms

B-Lymphocytes / cytology
B-Lymphocytes / metabolism
Base Sequence
Deoxyribonuclease I / metabolism
Enhancer Elements, Genetic*
Genetic Association Studies*
Genome, Human*
Genome-Wide Association Study
High-Throughput Nucleotide Sequencing
Humans
Likelihood Functions
Machine Learning
Organ Specificity
Polymorphism, Single Nucleotide*
Protein Binding
Quantitative Trait Loci*
Transcription Factors / genetics
Transcription Factors / metabolism*
Transcription, Genetic

Substances

Transcription Factors
Deoxyribonuclease I