Inferring functional transcription factor-gene binding pairs by integrating transcription factor binding data with transcription factor knockout data

BMC Syst Biol. 2013;7 Suppl 6(Suppl 6):S13. doi: 10.1186/1752-0509-7-S6-S13. Epub 2013 Dec 13.

Abstract

Background: Chromatin immunoprecipitation (ChIP) experiments are now the most comprehensive experimental approach for mapping the binding of transcription factors (TFs) to their target genes. However, ChIP data alone are insufficient for identifying the functional binding targets of TFs, for two reasons. First, ChIP-chip and ChIP-seq experiments have inherently high false-positive and false-negative rates. Second, a binding signal in ChIP data does not necessarily imply functionality.

Methods: ChIP-chip data and TF knockout (TFKO) data are known to reveal complementary information on gene regulation: ChIP-chip data provide TF-gene binding pairs, whereas TFKO data provide TF-gene regulation pairs. We therefore propose a novel network approach for identifying functional TF-gene binding pairs by integrating ChIP-chip data with TFKO data. In our method, a TF-gene binding pair from the ChIP-chip data is regarded as functional if it is also supported by a high-confidence curated TFKO TF-gene regulatory relation or by a deduced hypostatic TF-gene regulatory relation.
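The final selection rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the curated TFKO regulation pairs and the deduced hypostatic regulation pairs have already been computed and are supplied as sets of (TF, gene) tuples, and it does not model how those relations are curated or deduced. The function name and the TF and gene labels are hypothetical.

```python
def functional_binding_pairs(chip_pairs, tfko_pairs, deduced_pairs=()):
    """Keep ChIP-chip binding pairs that also have regulatory support.

    All arguments are iterables of (tf, gene) tuples. How tfko_pairs and
    deduced_pairs are obtained is outside the scope of this sketch.
    """
    # A binding pair counts as supported if it appears in either source
    # of regulatory evidence (curated TFKO or deduced relation).
    regulatory_support = set(tfko_pairs) | set(deduced_pairs)
    return set(chip_pairs) & regulatory_support


# Hypothetical toy data, for illustration only.
chip_pairs = {("TF_A", "gene1"), ("TF_A", "gene2"), ("TF_B", "gene3")}
tfko_pairs = {("TF_A", "gene1"), ("TF_C", "gene4")}
deduced_pairs = {("TF_B", "gene3")}

functional = functional_binding_pairs(chip_pairs, tfko_pairs, deduced_pairs)
print(sorted(functional))  # [('TF_A', 'gene1'), ('TF_B', 'gene3')]
```

In this toy case, ("TF_A", "gene2") is bound but has no regulatory support, and ("TF_C", "gene4") is regulated but not bound, so neither is reported as a functional binding pair.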

Results and conclusions: We first validated our method on a gathered ground truth set. We then applied it to the ChIP-chip data to identify functional TF-gene binding pairs. The biological significance of the identified pairs was shown by assessing their functional enrichment, the prevalence of protein-protein interaction, and expression coherence. Our results outperformed those of three existing methods across all measures. The functional targets we identified for each TF were also statistically significant relative to randomly assigned TF-gene pairs. We further showed that our method is dataset independent and can be applied to ChIP-seq data and to the E. coli genome. Finally, we provided an example showing the biological applicability of our approach.
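One common way to compare identified pairs against randomly assigned TF-gene pairs is an empirical p-value from repeated random sampling. The sketch below shows that general technique under stated assumptions; the abstract does not specify the authors' actual test, and the function name, the scoring function, and all parameters here are hypothetical.

```python
import random


def empirical_p_value(observed_score, score_fn, tfs, genes, n_pairs,
                      n_trials=1000, seed=0):
    """Estimate how often random TF-gene pairs score at least as high.

    score_fn is any user-supplied function mapping a list of (tf, gene)
    pairs to a number (for example, a functional enrichment measure).
    """
    rng = random.Random(seed)
    all_pairs = [(tf, gene) for tf in tfs for gene in genes]
    at_least_as_high = 0
    for _ in range(n_trials):
        # Draw the same number of pairs as in the identified set.
        random_pairs = rng.sample(all_pairs, n_pairs)
        if score_fn(random_pairs) >= observed_score:
            at_least_as_high += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_high + 1) / (n_trials + 1)
```

A small value indicates that random assignments rarely reach the observed score. The add-one correction is a conventional choice for sampling-based tests, not something stated in the source.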

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Escherichia coli / genetics
  • Escherichia coli / metabolism
  • Gene Expression Regulation
  • Gene Knockout Techniques*
  • Genome, Bacterial / genetics
  • Protein Binding
  • Reproducibility of Results
  • Transcription Factors / deficiency
  • Transcription Factors / genetics*
  • Transcription Factors / metabolism*

Substances

  • Transcription Factors