RNA motif search with data-driven element ordering

BMC Bioinformatics. 2016 May 18;17(1):216. doi: 10.1186/s12859-016-1074-x.

Abstract

Background: In this paper, we study the problem of RNA motif search in long genomic sequences. RNA motif search uses a combination of sequence and structure constraints to uncover new distant homologs of known functional RNAs. The problem is NP-hard and is traditionally solved by backtracking algorithms.

Results: We have designed a new algorithm for RNA motif search and implemented it in a new motif search tool, RNArobo. The tool enhances the RNAbob descriptor language by allowing insertions in helices, which enables better characterization of ribozymes and aptamers. A typical RNA motif consists of multiple elements, and the running time of the algorithm is highly dependent on the order in which they are searched. By approaching the element ordering problem in a principled way, we demonstrate more than 100-fold speedup of the search for complex motifs compared to previously published tools.
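The effect of element ordering on running time can be illustrated with a toy example. The sketch below is not RNArobo's algorithm: it assumes a motif made only of fixed-length, contiguous sequence elements written in a small IUPAC subset, with no helices, insertions, or pseudoknots. The names `element_matches` and `search` are hypothetical helpers introduced for illustration. The sketch counts how many element tests are needed when the elements are checked in two different orders.

```python
# Toy illustration only: fixed-length contiguous elements, no structure constraints.
# IUPAC subset used for this sketch (an assumption, not the RNArobo descriptor language).
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "N": "ACGU", "R": "AG", "Y": "CU"}


def element_matches(pattern, seq, pos):
    """Return True if `pattern` matches `seq` starting at index `pos`."""
    if pos < 0 or pos + len(pattern) > len(seq):
        return False
    return all(seq[pos + i] in IUPAC[c] for i, c in enumerate(pattern))


def search(seq, elements, order):
    """Find all motif occurrences, testing elements in the given `order`.

    Returns (hits, attempts): start positions of full matches and the
    number of element tests performed.
    """
    # Offset of each element relative to the motif start.
    offsets = []
    total = 0
    for e in elements:
        offsets.append(total)
        total += len(e)

    hits, attempts = [], 0
    for start in range(len(seq) - total + 1):
        matched = True
        for idx in order:
            attempts += 1
            if not element_matches(elements[idx], seq, start + offsets[idx]):
                matched = False  # prune: stop testing this start position
                break
        if matched:
            hits.append(start)
    return hits, attempts


if __name__ == "__main__":
    seq = "ACGU" * 20 + "ACGUAUGGAUCC" + "ACGU" * 20
    elements = ["NNNN", "RY", "GGAUCC"]  # unspecific, moderately specific, highly specific
    left_to_right = search(seq, elements, order=[0, 1, 2])
    selective_first = search(seq, elements, order=[2, 1, 0])
    print("left-to-right:  ", left_to_right)
    print("selective first:", selective_first)
```

Both orders report the same occurrences, but testing the most selective element first rejects most candidate positions after a single test. In a real motif search with variable-length elements and helices, the same effect compounds across the levels of backtracking.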

Conclusions: We have developed a new method for RNA motif search that significantly speeds up the search for complex motifs, including those with pseudoknots. Such speed improvements are crucial at a time when the rate of DNA sequencing outpaces growth in computing. RNArobo is available at http://compbio.fmph.uniba.sk/rnarobo.

Keywords: Entropy; Pseudoknot; RNA motif search; Search order.

MeSH terms

  • Algorithms
  • Entropy
  • Humans
  • Nucleotide Motifs*
  • RNA / chemistry*
  • Sequence Analysis, RNA / methods*

Substances

  • RNA