SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets

Bioinformatics. 2017 Mar 1;33(5):743-745. doi: 10.1093/bioinformatics/btw718.

Abstract

Motivation: Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs are challenging because of their weak structural signals and rapid diversification in sequences.

Results: Here we report SINE_Scan, a highly efficient program to predict SINE elements in genomic DNA sequences. SINE_Scan integrates hallmark of SINE transposition, copy number and structural signals to identify a SINE element. SINE_Scan outperforms the previously published de novo SINE discovery program. It shows high sensitivity and specificity in 19 plant and animal genome assemblies, of which sizes vary from 120 Mb to 3.5 Gb. It identifies numerous new families and substantially increases the estimation of the abundance of SINEs in these genomes.

Availability and implementation: The code of SINE_Scan is freely available at http://github.com/maohlzj/SINE_Scan , implemented in PERL and supported on Linux.

Contact: wangh8@fudan.edu.cn.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • DNA Transposable Elements
  • Genomics / methods*
  • Humans
  • Plants / genetics
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods*
  • Short Interspersed Nucleotide Elements*
  • Software*

Substances

  • DNA Transposable Elements