A strong structural correlation between short inverted repeat sequences and the polyadenylation signal in yeast and nucleosome exclusion by these inverted repeats

Curr Genet. 2019 Apr;65(2):575-590. doi: 10.1007/s00294-018-0907-8. Epub 2018 Nov 29.

Abstract

DNA sequences that read the same from 5' to 3' in either strand are called inverted repeat sequences or simply IRs. They are found throughout a wide variety of genomes, from prokaryotes to eukaryotes. Despite extensive research, their in vivo functions, if any, remain unclear. Using Saccharomyces cerevisiae, we performed genome-wide analyses of the distribution, occurrence frequency, sequence characteristics and relevance to chromatin structure of the IRs that reportedly have a cruciform-forming potential. Here, we provide the first comprehensive map of these IRs in the S. cerevisiae genome. Statistically significant enrichment of the IRs was found in the close vicinity of the DNA positions corresponding to polyadenylation [poly(A)] sites and ~30 to ~60 bp downstream of start codon-coding sites (referred to as 'start codons'). In the former, ApT- or TpA-rich IRs and A-tract- or T-tract-rich IRs are enriched, while in the latter, different IRs are enriched. Furthermore, we found a strong structural correlation between the former IRs and the poly(A) signal. In the chromatin formed on the gene end regions, the majority of the IRs cause low nucleosome occupancy. The IRs in the region ~30 to ~60 bp downstream of start codons are located in the +1 nucleosomes. In contrast, fewer IRs are present in the adjacent region downstream of start codons. The current study suggests that the IRs play similar roles in Escherichia coli and S. cerevisiae to regulate or complete transcription at the RNA level.
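The definition of an IR given in the abstract (a sequence that reads the same 5' to 3' on either strand) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state the length limits, spacer rules or cruciform-forming criteria used in the study, so the function names and the `min_len` and `max_len` defaults below are hypothetical, and only perfect IRs without a central spacer are detected.

```python
# Illustrative sketch only: finds perfect inverted repeats (no spacer).
# Names and default lengths are assumptions, not taken from the paper.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(seq):
    """True if seq reads the same 5' to 3' on both strands."""
    seq = seq.upper()
    return len(seq) > 0 and seq == reverse_complement(seq)


def find_inverted_repeats(seq, min_len=6, max_len=12):
    """Return (start, subsequence) for every perfect IR in seq.

    A perfect IR without a spacer always has even length, because no
    base is its own complement, so only even window sizes are scanned.
    """
    seq = seq.upper()
    first_even = min_len + (min_len % 2)  # round an odd minimum up to even
    hits = []
    for length in range(first_even, max_len + 1, 2):
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            if window == reverse_complement(window):
                hits.append((start, window))
    return hits
```

For example, `is_inverted_repeat("ATATAT")` is true for a TpA/ApT-rich sequence of the kind the abstract reports near poly(A) sites, while `find_inverted_repeats("CCGAATTCGG", max_len=10)` reports the nested IRs centred on `GAATTC`. A genome-wide scan would additionally need spacer handling and a background model for the enrichment statistics, neither of which is sketched here.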

Keywords: 3′-Untranslated region; IR map; Inverted repeat (IR); Nucleosome exclusion; Yeast genome.

MeSH terms

  • 3' Untranslated Regions
  • Chromatin / genetics
  • Chromatin / metabolism
  • Computational Biology / methods
  • Gene Expression Regulation, Fungal*
  • Genome, Fungal
  • Genomics / methods
  • Inverted Repeat Sequences*
  • Molecular Sequence Annotation
  • Nucleosomes / metabolism*
  • Polyadenylation*
  • Protein Binding
  • Yeasts / genetics*
  • Yeasts / metabolism*

Substances

  • 3' Untranslated Regions
  • Chromatin
  • Nucleosomes

Grants and funding