GRAbB: Selective Assembly of Genomic Regions, a New Niche for Genomic Research

PLoS Comput Biol. 2016 Jun 16;12(6):e1004753. doi: 10.1371/journal.pcbi.1004753. eCollection 2016 Jun.

Abstract

GRAbB (Genomic Region Assembly by Baiting) is a new program dedicated to assembling specific genomic regions from NGS data. This approach is especially useful for multi-copy regions, such as the mitochondrial genome and the rDNA repeat region. These parts of the genome are often neglected or poorly assembled, although they contain interesting information from phylogenetic or epidemiologic perspectives. Single-copy regions can be assembled as well. The program is capable of targeting multiple regions within a single run. Furthermore, GRAbB can be used to extract specific loci, such as sequences used for barcoding, from NGS data based on homology. To make the assembly specific, a known part of the region, such as the sequence of a PCR amplicon or a homologous sequence from a related species, must be specified. By assembling only the region of interest, the assembly process is computationally much less demanding and may lead to assemblies of better quality. In this study the different applications and functionalities of the program are demonstrated: exhaustive assembly (rDNA region and mitochondrial genome), extraction of homologous regions or genes (IGS, RPB1, RPB2 and TEF1a), and extraction of multiple regions within a single run. The program is also compared with MITObim, which is meant for the exhaustive assembly of a single target based on a similar query sequence. GRAbB is shown to be more efficient than MITObim in terms of speed, memory and disk usage. The other functionalities of the new program (handling multiple targets simultaneously and extracting homologous regions) are not matched by other programs. The program is available with explanatory documentation at https://github.com/b-brankovics/grabb. GRAbB has been tested on Ubuntu (12.04 and 14.04), Fedora (23), CentOS (7.1.1503) and Mac OS X (10.7). Furthermore, GRAbB is available as a Docker repository: brankovics/grabb (https://hub.docker.com/r/brankovics/grabb/).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology
  • Computer Simulation
  • DNA, Fungal / genetics
  • DNA, Ribosomal / genetics
  • Fusarium / genetics
  • Genome, Fungal
  • Genome, Mitochondrial
  • Genomics / methods*
  • Genomics / statistics & numerical data
  • High-Throughput Nucleotide Sequencing / statistics & numerical data
  • Software*

Substances

  • DNA, Fungal
  • DNA, Ribosomal

Grants and funding

The investigations are supported by the Division for Earth and Life Sciences (ALW) with financial aid from the Netherlands Organization for Scientific Research (NWO, http://www.nwo.nl/) under grant number 833.13.006. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.