Customized strategies for discovering distant ncRNA homologs

Brief Funct Genomic Proteomic. 2009 Nov;8(6):451-60. doi: 10.1093/bfgp/elp035. Epub 2009 Sep 24.

Abstract

A large fraction of non-coding RNAs is short and/or poorly conserved in sequence. Most of the longer examples, furthermore, consist of a collection of conserved structural motifs rather than a coherent globally conserved secondary structure. As a consequence, the conceptually simple problem of homology search becomes a complex and technically demanding task. Despite the best efforts of databases such as Rfam, the situation is complicated further by the sparsity of information in many RNA families, prokaryotic ones in particular. In this contribution, we review recent efforts to customize sequence-based search tools for ncRNA applications. In particular, semi-global alignments and the development of methods for fragmented pattern search have brought significant practical advances. Current developments in this area focus on the integration of fragmented sequence pattern search with search algorithms for secondary structure patterns. We focus here, in particular, on strategies that can be successful in the 'twilight zone' where generic approaches, from BLAST to Infernal, start to fail.
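To illustrate what "semi-global alignment" means in this setting, the following is a minimal sketch, not the method of any tool covered by the review. It assumes a simple linear gap model with hypothetical scoring values (match +1, mismatch -1, gap -2): the whole query must be aligned, while unaligned target sequence before and after the hit costs nothing. The function name `semiglobal_score` is introduced here for illustration only.

```python
def semiglobal_score(query, target, match=1, mismatch=-1, gap=-2):
    """Best score for aligning ALL of `query` to ANY substring of `target`.

    Illustrative scoring values only; real tools use tuned parameters.
    Returns (best_score, end_position_in_target).
    """
    # Row 0: skipping a prefix of the target is free (all zeros).
    prev = [0] * (len(target) + 1)
    for i, q in enumerate(query, start=1):
        # Column 0: a query prefix aligned against nothing pays gap costs.
        curr = [i * gap]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap        # query character left unaligned
            left = curr[j - 1] + gap  # target character left unaligned
            curr.append(max(diag, up, left))
        prev = curr
    # Skipping a suffix of the target is free: take the best cell of the last row.
    best_end = max(range(len(prev)), key=prev.__getitem__)
    return prev[best_end], best_end


if __name__ == "__main__":
    print(semiglobal_score("GGAUCC", "AAAAGGAUCCAAAA"))
```

Compared with a purely local alignment, this forces the complete query to take part in the hit, which matters for short queries where a truncated match carries little information.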

Publication types

  • Review

MeSH terms

  • Animals
  • Base Sequence
  • Conserved Sequence
  • Humans
  • Nucleic Acid Conformation
  • RNA, Untranslated / chemistry*
  • RNA, Untranslated / genetics
  • Sequence Analysis, RNA / methods*
  • Sequence Homology, Nucleic Acid

Substances

  • RNA, Untranslated