Identification of C2H2-ZF binding preferences from ChIP-seq data using RCADE

Bioinformatics. 2015 Sep 1;31(17):2879-81. doi: 10.1093/bioinformatics/btv284. Epub 2015 May 6.

Abstract

Current methods for motif discovery from chromatin immunoprecipitation followed by sequencing (ChIP-seq) data often identify motifs of transcription factors (TFs) other than the targeted one, and are further limited when peak sequences are similar because of common ancestry rather than a common binding factor. The latter problem particularly affects many proteins from the Cys2His2 zinc finger (C2H2-ZF) class of TFs, as their binding sites are often dominated by endogenous retroelements that have highly similar sequences. Here, we present recognition code-assisted discovery of regulatory elements (RCADE) for motif discovery from C2H2-ZF ChIP-seq data. RCADE combines predictions from a DNA recognition code of C2H2-ZFs with ChIP-seq data to identify models that represent the genuine DNA binding preferences of C2H2-ZF proteins. We show that RCADE is able to identify generalizable binding models even from peaks that are exclusively located within the repeat regions of the genome, where state-of-the-art motif finding approaches largely fail.
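
The general idea of checking a recognition-code prediction against ChIP-seq data can be pictured with the minimal Python sketch below. It is not RCADE's implementation, which the abstract does not describe. It assumes a seed motif is available as a position weight matrix and asks how well that motif separates peak sequences from control sequences. The matrix SEED_PWM, the toy sequences, and the functions best_log_odds and auroc are all hypothetical names and values chosen for illustration.

```python
import math

# Hypothetical seed motif (per-position base probabilities), standing in for
# a motif predicted from a C2H2-ZF recognition code. Values are invented.
SEED_PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]


def best_log_odds(seq, pwm, background=0.25):
    """Return the highest log2-odds score over all windows of seq.

    Assumes seq contains only A, C, G, T and is at least as long as the motif.
    """
    width = len(pwm)
    best = -math.inf
    for start in range(len(seq) - width + 1):
        score = sum(
            math.log2(pwm[offset][seq[start + offset]] / background)
            for offset in range(width)
        )
        best = max(best, score)
    return best


def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Computed by direct pairwise comparison.
    """
    wins = 0.0
    for pos in pos_scores:
        for neg in neg_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy data (invented): "peaks" contain the seed site AGC, "controls" do not.
peaks = ["TTAGCTT", "CAGCAAA", "GGAGCGG"]
controls = ["TTTTTTT", "CCCCCCC", "GTGTGTG"]

peak_scores = [best_log_odds(s, SEED_PWM) for s in peaks]
control_scores = [best_log_odds(s, SEED_PWM) for s in controls]
enrichment = auroc(peak_scores, control_scores)
```

In this sketch an enrichment value near 1 would indicate that the seed motif distinguishes peaks from controls, and a value near 0.5 would indicate no discrimination. Using a predicted motif as the starting point, rather than searching for any over-represented pattern, is what would keep such an analysis from simply recovering the shared sequence of a retroelement family.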

Availability and implementation: RCADE is available as a webserver and also for download at http://rcade.ccbr.utoronto.ca/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Contact: t.hughes@utoronto.ca.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Binding Sites
  • Carrier Proteins / metabolism*
  • Chromatin Immunoprecipitation / methods*
  • Gene Expression Regulation
  • Genome, Human
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Nuclear Proteins / metabolism*
  • Nucleotide Motifs / genetics*
  • Regulatory Sequences, Nucleic Acid*
  • Repressor Proteins
  • Retroelements / genetics
  • Sequence Analysis, DNA / methods
  • Transcription Factors / metabolism*
  • Zinc Fingers / genetics*

Substances

  • BCL11A protein, human
  • Carrier Proteins
  • Nuclear Proteins
  • Repressor Proteins
  • Retroelements
  • Transcription Factors