StatAlign 2.0: combining statistical alignment with RNA secondary structure prediction

Bioinformatics. 2013 Mar 1;29(5):654-5. doi: 10.1093/bioinformatics/btt025. Epub 2013 Jan 17.

Abstract

Motivation: Comparative modeling of RNA is known to be important for making accurate secondary structure predictions. RNA structure prediction tools such as PPfold and RNAalifold take an aligned set of sequences as input. Obtaining a multiple alignment from a set of sequences is itself a challenging problem, and the quality of the alignment can affect the quality of the prediction. Implementing RNA secondary structure prediction in a statistical alignment framework, and predicting structures from multiple alignment samples instead of a single fixed alignment, may therefore improve predictions.

Results: We have extended the program StatAlign to make use of RNA-specific features, which include RNA secondary structure prediction from multiple alignments using either a thermodynamic approach (RNAalifold) or a stochastic context-free grammar (SCFG) approach (PPfold). We also provide the user with scores relating to the quality of a secondary structure prediction, such as information entropy values for the combined space of secondary structures and sampled alignments, and a reliability score that predicts the expected number of correctly predicted base pairs. Finally, we have created RNA secondary structure visualization plugins and automated the process of setting up Markov chain Monte Carlo runs for RNA alignments in StatAlign.
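
The scores described above can be illustrated with a minimal Python sketch. It is not StatAlign's implementation, and the abstract does not give the formulas, so everything here is an assumption: the function names are hypothetical, each alignment sample is assumed to yield a dictionary of base-pair probabilities keyed by column pair, the entropy is the plain Shannon entropy of a discrete distribution (a simplification of the combined structure and alignment space), and the reliability value is taken to be the sum of averaged pair probabilities over the predicted pairs, which is the expected number of correct pairs only if those probabilities are well calibrated.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def average_pair_probabilities(samples):
    """Average base-pair probabilities over alignment samples.

    `samples` is a list of dicts mapping a column pair (i, j) to the
    probability that i pairs with j in that sample (assumed input format).
    A pair missing from a sample counts as probability 0 for that sample.
    """
    totals = {}
    for sample in samples:
        for pair, p in sample.items():
            totals[pair] = totals.get(pair, 0.0) + p
    n = len(samples)
    return {pair: total / n for pair, total in totals.items()}


def expected_correct_pairs(structure, pair_probs):
    """Sum of probabilities of the predicted pairs (hypothetical score)."""
    return sum(pair_probs.get(pair, 0.0) for pair in structure)


# Illustrative, made-up probabilities from two alignment samples.
samples = [
    {(0, 5): 0.9, (1, 4): 0.8},
    {(0, 5): 0.7, (1, 4): 0.4, (2, 3): 0.2},
]
averaged = average_pair_probabilities(samples)
score = expected_correct_pairs([(0, 5), (1, 4)], averaged)
```

In this sketch, low entropy and a score close to the number of predicted pairs would both indicate that the samples agree on the structure; how StatAlign actually defines and reports these quantities should be checked against the software itself.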

Availability and implementation: The software is available from http://statalign.github.com/statalign/.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Base Pairing
  • Bayes Theorem
  • Markov Chains
  • Nucleic Acid Conformation
  • RNA / chemistry*
  • Sequence Alignment / methods*
  • Sequence Analysis, RNA*
  • Software*
  • Thermodynamics

Substances

  • RNA