Identification of Human Lineage-Specific Transcriptional Coregulators Enabled by a Glossary of Binding Modules and Tunable Genomic Backgrounds

Cell Syst. 2017 Sep 27;5(3):187-201.e7. doi: 10.1016/j.cels.2017.06.015.

Abstract

Transcription factors (TFs) control cellular processes by binding specific DNA motifs to modulate gene expression. Motif enrichment analysis of regulatory regions can identify direct and indirect TF binding sites. Here, we created a glossary of 108 non-redundant TF-8mer "modules" of shared specificity for 671 metazoan TFs from publicly available and new universal protein binding microarray data. Analysis of 239 ENCODE TF chromatin immunoprecipitation sequencing datasets and associated RNA sequencing profiles suggests that the 8mer modules are more precise than position weight matrices in identifying indirect binding motifs and their associated tethering TFs. We also developed GENRE (genomically equivalent negative regions), a tunable tool for construction of matched genomic background sequences for analysis of regulatory regions. GENRE outperformed four state-of-the-art approaches to background sequence construction. We used our TF-8mer glossary and GENRE to analyze indirect binding motifs for the co-occurrence of tethering factors, which suggested novel TF-TF interactions. We anticipate that these tools will aid in elucidating tissue-specific gene-regulatory programs.
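
As a rough illustration of the motif enrichment idea described above, the following Python sketch treats a "module" as a plain set of 8-mers and compares how often foreground regions (for example, ChIP-seq peaks) and matched background regions contain any of them on either strand. It is not the authors' method or the GENRE tool: the function names, the smoothed hit-fraction ratio, and the representation of a module as a set of strings are all assumptions made for illustration, and the background sequences are assumed to have been constructed elsewhere.

```python
from typing import Iterable, Set

# Translation table for complementing DNA bases; other characters (e.g. N) pass through.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def contains_module(seq: str, module: Set[str], k: int = 8) -> bool:
    """True if any k-mer of seq, on either strand, belongs to the module.

    `module` is a hypothetical stand-in for one glossary entry: a set of
    uppercase k-mers assumed to share binding specificity.
    """
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in module or reverse_complement(kmer) in module:
            return True
    return False


def hit_fraction(seqs: Iterable[str], module: Set[str], k: int = 8) -> float:
    """Fraction of sequences that contain at least one module k-mer."""
    seqs = list(seqs)
    if not seqs:
        return 0.0
    return sum(contains_module(s, module, k) for s in seqs) / len(seqs)


def enrichment_ratio(foreground: Iterable[str], background: Iterable[str],
                     module: Set[str], k: int = 8) -> float:
    """Ratio of smoothed hit fractions, foreground over background.

    Add-one smoothing, (hits + 1) / (n + 2), is an illustrative choice that
    keeps the ratio finite when the background has no hits.
    """
    fg, bg = list(foreground), list(background)
    fg_hits = sum(contains_module(s, module, k) for s in fg)
    bg_hits = sum(contains_module(s, module, k) for s in bg)
    fg_rate = (fg_hits + 1) / (len(fg) + 2)
    bg_rate = (bg_hits + 1) / (len(bg) + 2)
    return fg_rate / bg_rate
```

A ratio well above 1 would indicate that the module occurs more often in the foreground than in the background. A real analysis would add a statistical test and multiple-testing correction, and its result depends heavily on how well the background is matched to the foreground, which is the problem GENRE is described as addressing.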

Keywords: ChIP-seq analysis; DNA binding specificity; ENCODE; background construction; motif enrichment; transcription factors.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Binding Sites / genetics
  • Chromatin Immunoprecipitation / methods
  • Computational Biology / methods*
  • DNA-Binding Proteins / genetics
  • Gene Regulatory Networks / genetics
  • Genomics
  • Humans
  • Protein Binding / genetics
  • Regulatory Sequences, Nucleic Acid / genetics
  • Sequence Analysis, RNA / methods
  • Transcription Factors / classification
  • Transcription Factors / genetics*
  • Transcription, Genetic / genetics*
  • Transcriptional Activation / physiology

Substances

  • DNA-Binding Proteins
  • Transcription Factors