Wide-scale analysis of human functional transcription factor binding reveals a strong bias towards the transcription start site

PLoS One. 2007 Aug 29;2(8):e807. doi: 10.1371/journal.pone.0000807.

Abstract

Background: Transcription factors (TFs) regulate gene expression by binding to specific DNA sequences. A binding event is functional when it affects gene expression. The functionality of a binding site is reflected in the conservation of the binding sequence during evolution and in over-represented binding in gene groups with coherent biological functions. Functionality is governed by several parameters, including the TF-DNA binding strength, the distance of the binding site from the transcription start site (TSS), and DNA packing. Understanding how these parameters control the functionality of different TFs in different biological contexts is essential for identifying functional TF binding sites and for understanding the regulation of transcription.

Methodology/principal findings: We introduce a novel method that screens the promoters of a set of genes with a shared biological function (obtained from the functional Gene Ontology (GO) classification) against a precompiled library of motifs and finds the motifs that are statistically over-represented in the gene set. More than 8,000 human (and 23,000 mouse) genes were assigned to one of 134 GO sets. Their promoters were searched (from 200 bp downstream to 1,000 bp upstream of the TSS) for 414 known DNA motifs. We optimized the sequence similarity score threshold independently for every location window, taking into account nucleotide heterogeneity along the promoters of the target genes. The method, combined with conservation of binding sequence and location between human and mouse, identifies with high probability functional binding sites for groups of functionally related genes. We found many location-sensitive functional binding events and showed that they cluster close to the TSS. Our method and findings were tested experimentally.
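
The general idea of screening promoters for a motif and testing for over-representation in a gene set can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the scoring scheme or the statistic used, so the log-odds position weight matrix, the hypergeometric tail test, and all function names (`log_odds_matrix`, `best_hit`, `tss_relative`, `enrichment_p_value`) are assumptions made here. It also uses a single best hit per promoter rather than a threshold optimized separately for each location window, and it assumes sequences contain only A, C, G and T.

```python
from math import comb, log2

def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into log-odds scores.

    counts: list of dicts mapping "A", "C", "G", "T" to observed counts,
    one dict per motif position. A uniform background is assumed.
    """
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix

def best_hit(sequence, matrix):
    """Return (score, offset) of the best-scoring window, or None if the
    sequence is shorter than the motif."""
    width = len(matrix)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if best is None or score > best[0]:
            best = (score, start)
    return best

def tss_relative(offset, upstream=1000):
    """Convert an offset within a promoter that starts `upstream` bp before
    the TSS into a TSS-relative coordinate (negative means upstream)."""
    return offset - upstream

def enrichment_p_value(hits_in_set, set_size, hits_in_background, background_size):
    """Hypergeometric upper tail P(X >= hits_in_set): the chance of seeing at
    least this many motif-bearing promoters when drawing `set_size` genes at
    random from a background in which `hits_in_background` carry the motif."""
    upper = min(set_size, hits_in_background)
    total = comb(background_size, set_size)
    tail = sum(
        comb(hits_in_background, i)
        * comb(background_size - hits_in_background, set_size - i)
        for i in range(hits_in_set, upper + 1)
    )
    return tail / total

# Toy usage with made-up data: a TATA-like motif and a short sequence.
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]
toy_matrix = log_odds_matrix(toy_counts)
toy_score, toy_offset = best_hit("GGTATAGG", toy_matrix)
```

In a fuller version, a promoter would count as a hit only when its score exceeds a threshold chosen per location window, and the enrichment test would be repeated for every motif and gene set with a correction for multiple testing.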

Conclusions/significance: We reliably identified functional TF binding sites, which is an essential step towards constructing regulatory networks. The promoter region proximal to the TSS is of central importance for the regulation of transcription in human and mouse, just as it is in bacteria and yeast.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Binding Sites
  • Cell Cycle
  • Humans
  • Mice
  • Promoter Regions, Genetic*
  • TATA Box
  • Transcription Factors / chemistry
  • Transcription Factors / metabolism*
  • Transcription Initiation Site*

Substances

  • Transcription Factors