Mining hidden polymorphic sequence motifs from divergent plant helitrons

Mob Genet Elements. 2014 Oct 30;4(5):1-5. doi: 10.4161/21592543.2014.971635. eCollection 2014 Oct.

Abstract

Transposons are a major driving force of genome evolution, and they have shed their original label of "junk" DNA as their important roles have been revealed. The recently discovered Helitron transposons have been investigated in diverse eukaryotic genomes because of their remarkable gene-capture ability and other features that are crucial to our current understanding of genome dynamics. Helitrons are not canonical transposons: they do not end in inverted repeats or create target site duplications, which makes them difficult to identify. Previous methods rely mainly on sequence alignment of conserved Helitron termini or on manual curation, and the abundance of Helitrons in genomes is still underestimated. We developed an automated and generalized tool, HelitronScanner, that identified a plethora of divergent Helitrons in many plant genomes. The local combinational variable approach, the key component of HelitronScanner, offers a more granular representation of conserved nucleotide combinations and is therefore more sensitive in finding divergent Helitrons. This commentary provides an in-depth view of the local combinational variable approach and its association with Helitron sequence patterns. Analysis of Helitron terminal sequences shows that the approach effectively represents nucleotide patterns that are imperceptible at the full-sequence level.

Keywords: Helitron; algorithm; bioinformatic analysis; local combinational variable; sequence pattern.