A generative model for the behavior of RNA polymerase

Bioinformatics. 2017 Jan 15;33(2):227-234. doi: 10.1093/bioinformatics/btw599. Epub 2016 Sep 23.

Abstract

Motivation: Transcription by RNA polymerase is a highly dynamic process involving multiple distinct points of regulation. Nascent transcription assays are a relatively new set of high-throughput techniques that measure the location of actively engaged RNA polymerase genome-wide. Hence, nascent transcription is a rich source of information on the regulation of RNA polymerase activity. Fully dissecting these data requires stochastic models that can both deconvolve the stages of polymerase activity and identify significant changes in activity between experiments.

Results: We present a generative, probabilistic model of RNA polymerase that fully describes loading, initiation, elongation and termination. We fit this model genome-wide and profile the enzymatic activity of RNA polymerase across various loci and following experimental perturbation. We observe a striking correlation between predicted loading events and regulatory chromatin marks. We provide principled statistics that compute probabilities reminiscent of traveler's and divergent ratios. We finish with a systematic comparison of RNA polymerase activity at promoter versus non-promoter associated loci.

Availability and implementation: Transcription Fit (Tfit) is a freely available, open source software package written in C/C++ that requires GNU compilers 4.7.3 or greater. Tfit is available from GitHub (https://github.com/azofeifa/Tfit).

Contact: robin.dowell@colorado.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Chromatin / metabolism
  • DNA-Directed RNA Polymerases / metabolism*
  • Eukaryota / enzymology
  • Eukaryota / genetics
  • Models, Biological*
  • Models, Molecular*
  • Models, Statistical
  • Promoter Regions, Genetic*
  • Software*

Substances

  • Chromatin
  • DNA-Directed RNA Polymerases