The Role of Genome Accessibility in Transcription Factor Binding in Bacteria

PLoS Comput Biol. 2016 Apr 22;12(4):e1004891. doi: 10.1371/journal.pcbi.1004891. eCollection 2016 Apr.

Abstract

ChIP-seq enables genome-scale identification of regulatory regions that govern gene expression. However, the biological insights generated from ChIP-seq analysis have been limited to predictions of binding sites and cooperative interactions. Furthermore, ChIP-seq data often correlate poorly with in vitro measurements or predicted motifs, highlighting that binding affinity alone is insufficient to explain transcription factor (TF)-binding in vivo. One possibility is that binding sites are not equally accessible across the genome. A more comprehensive biophysical representation of TF-binding is required to improve our ability to understand, predict, and alter gene expression. Here, we show that genome accessibility is a key parameter that impacts TF-binding in bacteria. We developed a thermodynamic model that parameterizes ChIP-seq coverage in terms of genome accessibility and binding affinity. The role of genome accessibility is validated using a large-scale ChIP-seq dataset of the M. tuberculosis regulatory network. We find that accounting for genome accessibility leads to a model that explains 63% of the ChIP-seq profile variance, while a model based on motif score alone explains only 35% of the variance. Moreover, our framework enables de novo ChIP-seq peak prediction and can infer TF-binding peaks in new experimental conditions, reducing the need for additional experiments. We observe that the genome is more accessible in intergenic regions, and that increased accessibility is positively correlated with gene expression and anti-correlated with distance to the origin of replication. Our biophysically motivated model provides a more comprehensive description of TF-binding in vivo from first principles, towards a better representation of gene regulation in silico, with promising applications in systems biology.
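To make the idea of parameterizing binding by both affinity and accessibility concrete, the following is a minimal toy sketch, not the model from the paper. It assumes a two-state (bound/unbound) site whose statistical weight is the product of a hypothetical accessibility factor between 0 and 1 and a Boltzmann factor of the motif score, treated as a negative binding energy in units of kT. The function names, the `chemical_potential` parameter, and the site values are all illustrative assumptions; the synthetic "observed" occupancies are generated by the toy model itself, so the comparison only shows how ignoring accessibility can lower the variance explained.

```python
import math


def occupancy(motif_score, accessibility, chemical_potential=0.0):
    """Bound probability of one site in a toy two-state model.

    Assumptions (not from the paper): motif_score is a negative binding
    energy in kT units, and accessibility (0..1) scales the statistical
    weight of the bound state.
    """
    weight = accessibility * math.exp(motif_score + chemical_potential)
    return weight / (1.0 + weight)


def r_squared(observed, predicted):
    """Fraction of variance in `observed` explained by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


# Hypothetical sites as (motif_score, accessibility) pairs.
sites = [(3.0, 0.9), (3.0, 0.1), (1.0, 0.8), (0.5, 0.2), (2.0, 0.5)]

# Synthetic "observed" occupancy, generated by the toy model itself.
observed = [occupancy(s, a) for s, a in sites]

# Prediction that ignores accessibility (every site fully accessible).
motif_only = [occupancy(s, 1.0) for s, _ in sites]

# Prediction that uses both motif score and accessibility.
full = [occupancy(s, a) for s, a in sites]

print("R^2, motif score only:", r_squared(observed, motif_only))
print("R^2, motif + accessibility:", r_squared(observed, full))
```

In this sketch, two sites with the same motif score (3.0) differ in occupancy only through accessibility, which is the kind of discrepancy a motif-only prediction cannot capture.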

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bacteria / genetics*
  • Bacteria / metabolism*
  • Biophysical Phenomena
  • Chromatin Immunoprecipitation
  • Computational Biology
  • Gene Regulatory Networks
  • Genome, Bacterial
  • Linear Models
  • Models, Biological
  • Mycobacterium tuberculosis / genetics
  • Mycobacterium tuberculosis / metabolism
  • Protein Binding
  • Sequence Analysis, DNA
  • Systems Biology
  • Thermodynamics
  • Transcription Factors / metabolism*

Substances

  • Transcription Factors