RNA secondary structure prediction using stochastic context-free grammars and evolutionary history

Bioinformatics. 1999 Jun;15(6):446-54. doi: 10.1093/bioinformatics/15.6.446.

Abstract

Motivation: Many computerized methods for RNA secondary structure prediction have been developed. Few of these methods, however, employ an evolutionary model, so relevant information is often left out of the structure determination. This paper introduces a method that incorporates evolutionary history into RNA secondary structure prediction. The method is based on stochastic context-free grammars (SCFGs), which give a prior probability distribution over structures.

Results: The phylogenetic tree relating the sequences can be found by maximum likelihood (ML) estimation from the model introduced here. The tree is shown to reveal information about the structure, due to mutation patterns. The inclusion of a prior distribution of RNA structures ensures good structure predictions even for a small number of related sequences. Prediction is carried out using maximum a posteriori (MAP) estimation in a Bayesian approach. For small sequence sets, the method performs very well compared to current automated methods.
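
As a generic sketch of the two estimation steps named above, the ML tree and the MAP structure can be written as follows. The symbols are assumptions introduced here, not notation from the abstract: D is the set of aligned sequences, T a phylogenetic tree, σ a secondary structure, and P(σ) the SCFG prior. Whether the tree likelihood is conditioned on a structure, and whether the tree is held fixed at its ML estimate during structure prediction, is not stated in the abstract and is assumed here for simplicity.

```latex
\begin{align}
  \hat{T} &= \arg\max_{T} \; P(D \mid T), \\
  \hat{\sigma} &= \arg\max_{\sigma} \; P(\sigma \mid D, \hat{T})
    = \arg\max_{\sigma} \; \frac{P(D \mid \sigma, \hat{T})\, P(\sigma)}{P(D \mid \hat{T})}
    = \arg\max_{\sigma} \; P(D \mid \sigma, \hat{T})\, P(\sigma).
\end{align}
```

The denominator does not depend on σ, so it drops out of the maximization. With few sequences the likelihood term is weakly informative and the prior P(σ) carries more of the weight, which is consistent with the abstract's remark about small sequence sets.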

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Base Sequence
  • Biological Evolution
  • Computational Biology
  • Computer Simulation
  • Endoribonucleases / genetics
  • Molecular Sequence Data
  • Nucleic Acid Conformation*
  • Probability
  • RNA / chemistry*
  • RNA, Bacterial / chemistry
  • RNA, Bacterial / genetics
  • RNA, Catalytic / genetics
  • Ribonuclease P
  • Sequence Homology, Nucleic Acid
  • Stochastic Processes

Substances

  • RNA, Bacterial
  • RNA, Catalytic
  • RNA
  • Endoribonucleases
  • Ribonuclease P