MGEScan: a Galaxy-based system for identifying retrotransposons in genomes

Bioinformatics. 2016 Aug 15;32(16):2502-4. doi: 10.1093/bioinformatics/btw157. Epub 2016 Apr 7.

Abstract

MGEScan-long terminal repeat (LTR) and MGEScan-non-LTR have been used successfully to identify LTR and non-LTR retrotransposons in eukaryotic genome sequences. However, these programs lack easy-to-use interfaces and are not well suited to visualizing data in general data formats. Here, we present MGEScan, a user-friendly system that combines these two programs with a Galaxy workflow system accelerated with MPI and Python threading on compute clusters. MGEScan and Galaxy empower researchers to identify transposable elements in a graphical user interface with ready-to-use workflows. MGEScan also visualizes custom annotation tracks for mobile genetic elements in public genome browsers. Concurrent processing and MPI yield a maximum speed-up in execution time of 3.26× on four virtual cores. MGEScan can be run in four modes: as a command-line tool, through the Galaxy Toolshed, on a Galaxy-based web server, and on a virtual cluster in the Amazon cloud.
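
To put the reported speed-up in context, the short Python sketch below converts it into a parallel efficiency and, assuming Amdahl's law describes the workload (an assumption made here for illustration, not a claim from the paper), into an implied parallelizable fraction. The function names are hypothetical and are not part of MGEScan.

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Fraction of ideal linear scaling achieved: speedup / workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return speedup / workers


def amdahl_parallel_fraction(speedup: float, workers: int) -> float:
    """Parallelizable fraction p implied by Amdahl's law.

    Amdahl's law: speedup = 1 / ((1 - p) + p / workers).
    Solving for p gives p = (1 - 1/speedup) / (1 - 1/workers).
    Assumption: the workload follows Amdahl's model, which the
    abstract does not state.
    """
    if workers < 2:
        raise ValueError("workers must be at least 2 to solve for p")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / workers)


# Figures taken from the abstract: 3.26x speed-up on four virtual cores.
speedup = 3.26
cores = 4

efficiency = parallel_efficiency(speedup, cores)
fraction = amdahl_parallel_fraction(speedup, cores)

print(f"parallel efficiency: {efficiency:.1%}")
print(f"implied parallel fraction (Amdahl): {fraction:.1%}")
```

Under these assumptions the four cores are used at roughly four fifths of ideal linear scaling, which is a rough guide only, since the abstract reports a maximum speed-up rather than a typical one.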

Availability and implementation: MGEScan tutorials and source code are available at http://mgescan.readthedocs.org/

Contact: hatang@indiana.edu or syoh@ajou.ac.kr

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms*
  • Computational Biology / methods
  • Genome
  • Programming Languages*
  • Retroelements*
  • Software
  • Systems Integration

Substances

  • Retroelements