Genome-wide prediction and validation of sigma70 promoters in Lactobacillus plantarum WCFS1

PLoS One. 2012;7(9):e45097. doi: 10.1371/journal.pone.0045097. Epub 2012 Sep 20.

Abstract

Background: In prokaryotes, sigma factors are essential for directing the transcription machinery towards promoters. Various sigma factors have been described that recognize and bind to specific DNA sequence motifs in promoter sequences. The canonical sigma factor σ(70) is commonly involved in transcription of the cell's housekeeping genes, which is mediated by the conserved σ(70) promoter sequence motifs. In this study, the σ(70)-promoter sequences in Lactobacillus plantarum WCFS1 were predicted using a genome-wide analysis. The accuracy of the transcriptionally active part of this promoter prediction was subsequently evaluated by correlating locations of predicted promoters with transcription start sites inferred from the 5'-ends of transcripts detected in high-resolution tiling array transcriptome datasets.

Results: To identify σ(70)-related promoter sequences, we performed a genome-wide sequence motif scan of the L. plantarum WCFS1 genome, focussing on the regions upstream of protein-encoding genes. We obtained several highly conserved motifs, including those resembling the conserved σ(70)-promoter consensus. Position weight matrix-based models of the recovered σ(70)-promoter sequence motif were employed to identify 3874 motifs with significant similarity (p-value < 10⁻⁴) to the model motif in the L. plantarum genome. Genome-wide transcript information deduced from whole-genome tiling-array transcriptome datasets was used to infer transcription start sites (TSSs) from the 5'-ends of transcripts. By this procedure, 1167 putative TSSs were identified and used to corroborate the transcriptionally active fraction of these predicted promoters. In total, 568 predicted promoters were found in proximity (≤ 40 nucleotides) of the putative TSSs, showing a highly significant co-occurrence of predicted promoter and TSS (p-value < 10⁻²⁶³).
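
The two computational steps described in the Results, scanning a sequence with a position weight matrix and counting predicted promoters that lie near a TSS, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the pseudocount, the uniform background frequency and the raw log-odds score threshold are all assumptions made for illustration (the study reports a p-value cut-off instead), and strand and promoter orientation relative to the TSS are ignored for brevity.

```python
import math
from bisect import bisect_left


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites.

    Assumes a uniform background frequency (hypothetical simplification).
    """
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


def count_near_tss(promoter_positions, tss_positions, max_distance=40):
    """Count promoters with at least one TSS within max_distance nucleotides.

    Positions are plain genome coordinates; strand is ignored in this sketch.
    """
    tss_sorted = sorted(tss_positions)
    count = 0
    for pos in promoter_positions:
        # First TSS at or beyond the lower edge of the distance window.
        i = bisect_left(tss_sorted, pos - max_distance)
        if i < len(tss_sorted) and tss_sorted[i] <= pos + max_distance:
            count += 1
    return count


if __name__ == "__main__":
    # Toy data only; these sites and coordinates are not from the study.
    toy_sites = ["TATAAT", "TATAAT", "TATGAT"]
    toy_pwm = build_pwm(toy_sites)
    print(scan("GGGTATAATGGG", toy_pwm, threshold=5.0))
    print(count_near_tss([100, 500, 900], [130, 560, 905]))
```

A sorted TSS list with binary search keeps the proximity check at logarithmic cost per promoter, which stays fast even with thousands of predicted motifs and TSSs. A real analysis would additionally convert scores to p-values and require the promoter to lie upstream of the TSS on the same strand.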

Conclusions: High-resolution tiling arrays provide a suitable source to infer TSSs at a genome-wide level, and allow experimental verification of in silico predicted promoter sequence motifs.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Conserved Sequence / genetics
  • DNA, Intergenic / genetics
  • DNA-Directed RNA Polymerases / genetics*
  • Genome, Bacterial / genetics*
  • Molecular Sequence Data
  • Nucleotide Motifs / genetics
  • Promoter Regions, Genetic*
  • Reproducibility of Results
  • Sigma Factor / genetics*
  • Transcription Initiation Site

Substances

  • DNA, Intergenic
  • Sigma Factor
  • RNA polymerase sigma 70
  • DNA-Directed RNA Polymerases

Grants and funding

T.J. Todt was funded by HAN University of Applied Sciences. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.