Asymptotic distribution of motifs in a stochastic context-free grammar model of RNA folding

J Math Biol. 2014 Dec;69(6-7):1743-72. doi: 10.1007/s00285-013-0750-y. Epub 2014 Jan 3.

Abstract

We analyze the distribution of RNA secondary structures given by the Knudsen-Hein stochastic context-free grammar used in the prediction program Pfold. Our main theorem gives relations among the expected numbers of structural motifs (base pairs, helices, and the different types of loops) that hold independently of the grammar probabilities. These relations follow from proving that the distributions of base pairs, of helices, and of the different types of loops are asymptotically Gaussian in this model of RNA folding. The proofs use singularity analysis of probability generating functions. We also demonstrate that these asymptotic results closely capture the expected number of RNA base pairs in native ribosomal structures, as well as certain other aspects of their predicted secondary structures. In particular, we find that the predicted structures largely satisfy the expected relations, although the native structures do not.
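
To make the model concrete, the following is a minimal sketch of drawing a random secondary structure from a grammar of this kind and counting its base pairs. The abstract does not state the production rules; the sketch assumes the commonly cited form of the Knudsen-Hein grammar (S -> LS | L, L -> s | dFd, F -> dFd | LS, where s is an unpaired base and d...d is a base pair). The function names and the rule probabilities are hypothetical and chosen only so that sampling terminates; they are not Pfold's trained values. The sketch also samples structures of unconstrained length, whereas the asymptotic results concern structures of a given length n.

```python
import random


def sample_structure(p_s_ls, p_l_s, p_f_dfd, rng, max_len=100_000):
    """Sample a dot-bracket string from the assumed Knudsen-Hein rules.

    p_s_ls  : probability of S -> L S   (otherwise S -> L)
    p_l_s   : probability of L -> s     (otherwise L -> d F d)
    p_f_dfd : probability of F -> d F d (otherwise F -> L S)
    """
    out = []
    stack = ["S"]  # symbols still to expand; the top of the stack is leftmost
    while stack:
        sym = stack.pop()
        if sym in "().":
            # Terminal symbol: emit it.
            out.append(sym)
            if len(out) > max_len:
                raise RuntimeError("structure exceeded max_len")
        elif sym == "S":
            if rng.random() < p_s_ls:
                stack.extend(["S", "L"])  # L is popped first, then S
            else:
                stack.append("L")
        elif sym == "L":
            if rng.random() < p_l_s:
                stack.append(".")
            else:
                stack.extend([")", "F", "("])  # emits "(", expands F, emits ")"
        else:  # sym == "F"
            if rng.random() < p_f_dfd:
                stack.extend([")", "F", "("])
            else:
                stack.extend(["S", "L"])
    return "".join(out)


def is_balanced(structure):
    """Return True if every '(' has a matching ')' in the dot-bracket string."""
    depth = 0
    for ch in structure:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def count_base_pairs(structure):
    """Each '(' opens exactly one base pair."""
    return structure.count("(")


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative probabilities only (assumed, not taken from the paper).
    samples = [sample_structure(0.4, 0.8, 0.6, rng) for _ in range(1000)]
    total_pairs = sum(count_base_pairs(s) for s in samples)
    total_len = sum(len(s) for s in samples)
    print("mean length:", total_len / len(samples))
    print("paired fraction:", 2 * total_pairs / total_len)
```

Under the assumed rules, F -> LS forces at least two positions inside every innermost base pair, so the sampler never produces an empty pair "()". Motif counts from such samples could be compared against the kind of expected-value relations the abstract describes, though reproducing the paper's conditioning on a fixed length would need a different sampler.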

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Base Pairing
  • Models, Chemical*
  • Normal Distribution
  • Nucleic Acid Conformation*
  • RNA / chemistry*
  • RNA Folding*
  • Stochastic Processes
  • Thermodynamics

Substances

  • RNA