Identification of coupling DNA motif pairs on long-range chromatin interactions in human K562 cells

Bioinformatics. 2016 Feb 1;32(3):321-4. doi: 10.1093/bioinformatics/btv555. Epub 2015 Sep 26.

Abstract

Motivation: The protein-DNA interactions between transcription factors (TFs) and transcription factor binding sites (TFBSs, also known as DNA motifs) are critical to gene transcription, and identifying DNA motifs is a vital task for downstream analysis. However, information on long-range coupling between different DNA motifs is still lacking. To fill this gap, in this first-of-its-kind study, we identify coupling DNA motif pairs on long-range chromatin interactions in human K562 cells.

Results: The coupling DNA motif pairs exhibit substantially higher DNase accessibility than the background sequences. Half of the DNA motifs involved match entries in existing motif databases, while nearly all of them are enriched with at least one gene ontology term. Their motif instances are also statistically enriched in promoter and enhancer regions. In particular, we introduce a novel measurement called motif pairing multiplicity, defined as the number of motifs that are paired with a given motif on chromatin interactions. Interestingly, we observe that motif pairing multiplicity is linked to several characteristics, such as regulatory region type, motif sequence degeneracy, DNase accessibility and pairing genomic distance. Taken together, we believe the coupling DNA motif pairs identified in this study can shed light on the gene transcription mechanism under long-range chromatin interactions.
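The definition of motif pairing multiplicity above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pairing_multiplicity`, the motif labels `M1` to `M4` and the toy pair list are all hypothetical, and the sketch assumes that pairs are unordered and that repeated pairs count only once.

```python
from collections import defaultdict


def pairing_multiplicity(motif_pairs):
    """Return, for each motif, the number of distinct motifs it is paired with.

    Assumptions (not from the paper): pairs are unordered, and a pair that
    appears more than once (in either order) is counted a single time.
    """
    partners = defaultdict(set)
    for a, b in motif_pairs:
        partners[a].add(b)  # b is a partner of a
        partners[b].add(a)  # and a is a partner of b
    return {motif: len(p) for motif, p in partners.items()}


# Hypothetical toy input: each tuple is one coupling motif pair.
pairs = [("M1", "M2"), ("M1", "M3"), ("M2", "M3"), ("M1", "M4"), ("M2", "M1")]
multiplicity = pairing_multiplicity(pairs)
print(multiplicity)
```

In this toy input, `M1` is paired with three distinct motifs and `M4` with only one, so `M1` has the higher multiplicity.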

Availability and implementation: The identified motif pair data are available in compressed form in the supplementary materials associated with this manuscript.

Contact: kc.w@cityu.edu.hk

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites
  • Chromatin / chemistry*
  • Chromatin / metabolism
  • Deoxyribonuclease I
  • Genomics
  • Humans
  • K562 Cells
  • Nucleotide Motifs
  • Promoter Regions, Genetic
  • Regulatory Elements, Transcriptional*
  • Sequence Analysis, DNA / methods*
  • Transcription Factors / metabolism

Substances

  • Chromatin
  • Transcription Factors
  • Deoxyribonuclease I