Secondary structural entropy in RNA switch (Riboswitch) identification

BMC Bioinformatics. 2015 Apr 28:16:133. doi: 10.1186/s12859-015-0523-2.

Abstract

Background: RNA regulatory elements play a significant role in gene regulation. Riboswitches, a widespread group of regulatory RNAs, are vital components of many bacterial genomes. These regulatory elements generally function by forming a ligand-induced alternative fold that controls access to ribosome binding sites or other regulatory sites in RNA. Riboswitch-mediated mechanisms are ubiquitous across bacterial genomes. Each class of riboswitch typically has its own structural and biological complexity, making de novo riboswitch identification a formidable task. Traditionally, riboswitches have been identified through comparative genomics based on sequence and structural homology. The limitations of structural-homology-based approaches, coupled with the assumption that there is a great diversity of undiscovered riboswitches, suggest the need for alternative methods for riboswitch identification, possibly based on features intrinsic to their structure. As yet, no such reliable method has been proposed.

Results: We used the structural entropy of riboswitch sequences as a measure of their secondary structural dynamics. Entropy values of a diverse set of riboswitches were compared to those of their mutants, their dinucleotide shuffles, and their reverse complement sequences under different stochastic context-free grammar folding models. The significance of our results was evaluated by comparison to other approaches, such as base-pairing entropy and energy landscape dynamics. Classifiers based on structural entropy, optimized via sequence and structural features, were devised as riboswitch identifiers and tested on Bacillus subtilis, Escherichia coli, and Synechococcus elongatus as an exploration of structural-entropy-based approaches. The unusually long untranslated region of the cotH gene in Bacillus subtilis, as well as upstream regions of certain genes, such as the sucC genes, were associated with significant structural entropy values in genome-wide examinations.

Conclusions: Various tests show that there is in fact a relationship between higher structural entropy and the potential for the RNA sequence to have alternative structures, within the limitations of our methodology. This relationship, though modest, is consistent across various tests. Understanding the behavior of structural entropy as a fairly new feature for RNA conformational dynamics, however, may require extensive exploratory investigation across both RNA sequences and folding models.
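
To make the central quantity concrete, the following is a minimal sketch of structural entropy understood as the Shannon entropy of a probability distribution over candidate secondary structures. It is an illustration only, not the authors' software: the function name `structural_entropy` and the two toy ensembles are hypothetical, and the probabilities are made-up numbers rather than outputs of a stochastic context-free grammar folding model such as those used in the study.

```python
import math


def structural_entropy(probabilities):
    """Shannon entropy, in bits, of a distribution over secondary structures.

    `probabilities` holds one non-negative weight per candidate structure.
    Weights are normalized here, so they do not need to sum to 1.
    (Hypothetical helper for illustration; not the authors' implementation.)
    """
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("at least one structure must have positive weight")
    entropy = 0.0
    for weight in probabilities:
        if weight > 0:  # terms with zero probability contribute nothing
            p = weight / total
            entropy -= p * math.log2(p)
    return entropy


# Toy ensembles with made-up probabilities (assumption, not data from the paper).
single_fold = [0.97, 0.01, 0.01, 0.01]   # one dominant structure
switch_like = [0.48, 0.48, 0.02, 0.02]   # two competing structures

print(structural_entropy(single_fold))
print(structural_entropy(switch_like))
```

In this toy setting the ensemble with two competing folds has the higher entropy, which matches the intuition behind the reported relationship between higher structural entropy and the potential for alternative structures. A real analysis would obtain the distribution from a folding model instead of listing structures by hand.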

Publication types

  • Research Support, N.I.H., Intramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bacillus subtilis / genetics
  • Base Pairing
  • Base Sequence
  • Binding Sites / genetics
  • Computational Biology / methods*
  • Entropy*
  • Escherichia coli / genetics
  • Molecular Sequence Data
  • Nucleic Acid Conformation*
  • RNA, Bacterial / chemistry*
  • RNA, Bacterial / genetics
  • RNA, Bacterial / metabolism
  • Regulatory Sequences, Nucleic Acid
  • Ribosomes / chemistry
  • Ribosomes / metabolism*
  • Riboswitch / genetics*
  • Software*
  • Synechococcus / genetics

Substances

  • RNA, Bacterial
  • Riboswitch