Flexbar 3.0 - SIMD and multicore parallelization

Bioinformatics. 2017 Sep 15;33(18):2941-2942. doi: 10.1093/bioinformatics/btx330.

Abstract

Motivation: High-throughput sequencing machines can process many samples in a single run. For Illumina systems, sequencing reads are barcoded with an additional DNA tag that is contained in the respective sequencing adapters. The recognition of barcode and adapter sequences is hence commonly needed for the analysis of next-generation sequencing data. Flexbar performs demultiplexing based on barcodes and adapter trimming for such data. The massive amounts of data generated on modern sequencing machines demand that this preprocessing is done as efficiently as possible.

Results: We present Flexbar 3.0, the successor of the popular program Flexbar. It now employs twofold parallelism: multi-threading and, additionally, SIMD vectorization. Both types of parallelism are used to speed up the computation of pairwise sequence alignments, which are used for the detection of barcodes and adapters. Furthermore, new features were included to cover a wide range of applications. We evaluated the performance of Flexbar based on a simulated sequencing dataset. Our program outcompetes other tools in terms of speed and is among the best tools in the presented quality benchmark.
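To make the idea of detecting an adapter by comparing it against a read concrete, the following is a minimal sketch in Python. It is not Flexbar's implementation: it is scalar and single-threaded, and it uses a simplified ungapped overlap comparison that counts mismatches only, instead of a full pairwise alignment. The function names, the parameters `max_error_rate` and `min_overlap`, and the toy sequences are all assumptions made for illustration.

```python
def find_adapter(read: str, adapter: str,
                 max_error_rate: float = 0.1, min_overlap: int = 3) -> int:
    """Return the leftmost read position where the adapter starts, or -1.

    Simplified, ungapped overlap check (illustrative assumption): at each
    start position, the adapter (or the prefix of it that still fits before
    the 3' end of the read) is compared base by base, and the position is
    accepted if the mismatch count stays within the allowed error rate.
    """
    for start in range(len(read) - min_overlap + 1):
        # Length of the adapter prefix that overlaps the read at this offset.
        overlap = min(len(adapter), len(read) - start)
        window = read[start:start + overlap]
        mismatches = sum(1 for a, b in zip(window, adapter) if a != b)
        if mismatches <= int(max_error_rate * overlap):
            return start
    return -1


def trim_adapter(read: str, adapter: str, **kwargs) -> str:
    """Remove the adapter and everything after it, if an adapter is found."""
    pos = find_adapter(read, adapter, **kwargs)
    return read if pos < 0 else read[:pos]


if __name__ == "__main__":
    adapter = "AGATTGGAAG"            # toy adapter, not a real Illumina sequence
    read = "CCCCCCCCCC" + adapter     # toy read with the adapter at the 3' end
    print(trim_adapter(read, adapter))
```

Each read is processed independently in this sketch, which is what makes this kind of preprocessing amenable to multi-threading over reads, while the per-position comparisons are the sort of regular work that vectorization targets.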

Availability and implementation: https://github.com/seqan/flexbar.

Contact: johannes.roehr@fu-berlin.de or knut.reinert@fu-berlin.de.

MeSH terms

  • Animals
  • Caenorhabditis elegans / genetics
  • Genome, Helminth
  • High-Throughput Nucleotide Sequencing / methods*
  • Sequence Alignment / methods
  • Sequence Analysis, DNA / methods
  • Software*