Fast filtering for RNA homology search

Bioinformatics. 2011 Nov 15;27(22):3102-9. doi: 10.1093/bioinformatics/btr545. Epub 2011 Sep 28.

Abstract

Motivation: Homology search for RNAs can use secondary structure information to increase power by modeling base pairs, as in covariance models, but the resulting computational costs are high. Typical acceleration strategies rely on at least one filtering stage using sequence-only search.

Results: Here we present the multi-segment CYK (MSCYK) filter, which implements a heuristic of ungapped structural alignment for RNA homology search. Compared to gapped alignment, this approximation has lower computation time requirements (O(N⁴) reduced to O(N³)) and lower space requirements (O(N³) reduced to O(N²)). A vector-parallel implementation of this method gives up to 100-fold speed-up; vector-parallel implementations of standard gapped alignment at two levels of precision give 3- and 6-fold speed-ups. These approaches are combined to create a filtering pipeline that scores RNA secondary structure at all stages, with results that are synergistic with existing methods.
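To give a rough sense of what scoring structure without gaps can look like, the following is a minimal toy sketch in Python. It is not the MSCYK algorithm and does not reproduce its segment handling or its complexity bounds; it only slides a fixed-length consensus with a dot-bracket structure along a target sequence and adds a bonus for each consensus base pair that is filled by a canonical pair. The score values, function names, and the dot-bracket input are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical toy scores; not the parameters of MSCYK or of any covariance model.
CANONICAL_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                   ("C", "G"), ("G", "U"), ("U", "G")}
PAIR_SCORE = 2       # paired columns filled by a canonical pair
MISPAIR_SCORE = -2   # paired columns filled by anything else
MATCH_SCORE = 1      # unpaired column identical to the consensus
MISMATCH_SCORE = -1  # unpaired column different from the consensus


def parse_pairs(structure: str) -> List[Tuple[int, int]]:
    """Return (i, j) index pairs for matching brackets in a dot-bracket string."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for idx, ch in enumerate(structure):
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.append((stack.pop(), idx))
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return pairs


def ungapped_score(consensus: str, structure: str, window: str) -> int:
    """Score one window against the consensus with no insertions or deletions."""
    if not (len(consensus) == len(structure) == len(window)):
        raise ValueError("consensus, structure and window must have equal length")
    pairs = parse_pairs(structure)
    paired_columns = {k for pair in pairs for k in pair}
    score = 0
    # Paired columns are scored jointly, so compensatory changes are rewarded.
    for i, j in pairs:
        if (window[i], window[j]) in CANONICAL_PAIRS:
            score += PAIR_SCORE
        else:
            score += MISPAIR_SCORE
    # Unpaired columns are scored by sequence identity only.
    for k, base in enumerate(consensus):
        if k not in paired_columns:
            score += MATCH_SCORE if window[k] == base else MISMATCH_SCORE
    return score


def scan(consensus: str, structure: str, target: str) -> List[Tuple[int, int]]:
    """Return (offset, score) for every full-length window of the target."""
    n = len(consensus)
    return [(off, ungapped_score(consensus, structure, target[off:off + n]))
            for off in range(len(target) - n + 1)]
```

For example, scanning the hairpin consensus `GGGAAACCC` with structure `(((...)))` against `UUGCGAAACGCUU` gives its best score at offset 2, where the window `GCGAAACGC` differs from the consensus in sequence but still forms all three base pairs. A sequence-only comparison would penalize those two substituted positions, which is the kind of signal a structure-aware filter is meant to retain.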

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Base Sequence
  • Consensus Sequence
  • Humans
  • Nucleic Acid Conformation
  • RNA / chemistry*
  • Sequence Alignment
  • Sequence Analysis, RNA / methods*
  • Sequence Homology, Nucleic Acid*
  • Software

Substances

  • RNA