Detection of regulatory SNPs in human genome using ChIP-seq ENCODE data

PLoS One. 2013 Oct 29;8(10):e78833. doi: 10.1371/journal.pone.0078833. eCollection 2013.

Abstract

A large proportion of SNPs identified in genome-wide association studies are non-coding, which heightens the need for effective identification of regulatory SNPs (rSNPs) among them. However, this task remains challenging because the regulatory part of the human genome is annotated far more poorly than coding regions. Here we describe an approach that aggregates the whole set of ENCODE ChIP-seq data to search for rSNPs, and provide experimental evidence of its efficiency. The algorithm is based on the assumption that enrichment of a genomic region with transcription factor (TF) binding loci (ChIP-seq peaks) indicates a regulatory function, and that SNPs located in such a region are therefore more likely to influence transcriptional regulation. To verify that the approach preferentially selects functionally meaningful SNPs, we performed enrichment analysis of several human SNP datasets associated with phenotypic manifestations. All of these datasets were significantly enriched with SNPs falling into regions of multiple ChIP-seq peaks compared with randomly selected SNPs. For experimental verification, 40 SNPs falling into overlapping regions of at least 7 TF binding loci were selected from OMIM. The effect of each SNP on the binding of the DNA fragment containing it to nuclear proteins from four human cell lines (HepG2, HeLaS3, HCT-116, and K562) was tested by electrophoretic mobility shift assay (EMSA). A radical change in the binding pattern was observed for 29 SNPs, and 6 more SNPs showed less pronounced changes. Taken together, the results demonstrate an effective way to search for potential rSNPs using ChIP-seq data provided by the ENCODE project.
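The selection and enrichment steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy peak data, the half-open interval convention, and the resampling-based p-value are all assumptions made for illustration, and the abstract does not state whether overlapping peaks or distinct TFs are counted (this sketch counts peaks) or which statistical test was used.

```python
import random
from bisect import bisect_right
from collections import defaultdict

MIN_PEAKS = 7  # threshold quoted in the abstract for the OMIM selection


def build_index(peaks):
    """Group (chrom, start, end, tf) peaks by chromosome, sorted by start."""
    index = defaultdict(list)
    for chrom, start, end, tf in peaks:
        index[chrom].append((start, end, tf))
    for intervals in index.values():
        intervals.sort()
    return index


def peak_count(index, chrom, pos):
    """Count peaks covering pos, assuming half-open [start, end) intervals."""
    intervals = index.get(chrom, [])
    # Only peaks with start <= pos can cover pos; the rest are skipped.
    hi = bisect_right(intervals, (pos, float("inf"), ""))
    # Linear scan of the remaining candidates: simple, not optimized.
    return sum(1 for _start, end, _tf in intervals[:hi] if end > pos)


def fraction_in_dense_regions(index, snps, min_peaks=MIN_PEAKS):
    """Fraction of SNPs lying under at least min_peaks overlapping peaks."""
    hits = sum(1 for chrom, pos in snps
               if peak_count(index, chrom, pos) >= min_peaks)
    return hits / len(snps)


def empirical_p(index, snps, background, n_iter=1000, seed=0):
    """Compare the observed fraction with equally sized random draws.

    Assumption: a resampling test stands in for whatever test the
    authors actually applied. `background` is a hypothetical pool of
    SNP positions to draw random sets from.
    """
    rng = random.Random(seed)
    observed = fraction_in_dense_regions(index, snps)
    at_least = sum(
        fraction_in_dense_regions(
            index, rng.sample(background, len(snps))) >= observed
        for _ in range(n_iter)
    )
    return observed, (at_least + 1) / (n_iter + 1)


# Toy data (hypothetical): seven staggered peaks on chr1.
toy_peaks = [("chr1", 100 + i, 200 + i, f"TF{i}") for i in range(7)]
toy_index = build_index(toy_peaks)
toy_snps = [("chr1", 150), ("chr1", 500)]
candidates = [snp for snp in toy_snps
              if peak_count(toy_index, *snp) >= MIN_PEAKS]
```

With real data, the peak list would come from the ENCODE ChIP-seq peak files and the SNP sets from the phenotype-associated datasets; here `candidates` simply holds the toy SNPs that pass the seven-peak threshold.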

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Line, Tumor
  • Chromatin Immunoprecipitation*
  • Computer Simulation
  • Genome, Human / genetics
  • Genomics / methods*
  • Humans
  • Polymorphism, Single Nucleotide*
  • Transcription Factors / metabolism

Substances

  • Transcription Factors

Grants and funding

The work was supported by State Contract no. 16.512.11.2274 of the Ministry of Education and Science of the Russian Federation; by the Integration Project of the Siberian Branch of the Russian Academy of Sciences no. 65; by the program of the Presidium of the Russian Academy of Sciences “Basis Sciences Medicine” no. 23; and by the Russian Foundation for Basic Research project no. 13-04-01077. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.