Trowel: a fast and accurate error correction module for Illumina sequencing reads

Bioinformatics. 2014 Nov 15;30(22):3264-5. doi: 10.1093/bioinformatics/btu513. Epub 2014 Jul 29.

Abstract

Motivation: The ability to accurately read the order of nucleotides in DNA and RNA is fundamental for modern biology. Errors in next-generation sequencing can lead to many artifacts, from erroneous genome assemblies to mistaken inferences about RNA editing. Uneven coverage in datasets also contributes to false corrections.

Result: We introduce Trowel, a massively parallelized and highly efficient error correction module for Illumina read data. Trowel both corrects erroneous base calls and boosts base qualities based on the k-mer spectrum. With high-quality k-mers and relevant base information, Trowel achieves high accuracy for different short read sequencing applications.The latency in the data path has been significantly reduced because of efficient data access and data structures. In performance evaluations, Trowel was highly competitive with other tools regardless of coverage, genome size read length and fragment size.

Availability and implementation: Trowel is written in C++ and is provided under the General Public License v3.0 (GPLv3). It is available at http://trowel-ec.sourceforge.net.

Contact: euncheon.lim@tue.mpg.de or weigel@tue.mpg.de

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • High-Throughput Nucleotide Sequencing / methods*
  • Software*