Using a priori knowledge to align sequencing reads to their exact genomic position

René Böttcher; Ronny Amberg; F P Ruzius; V Guryev; Wim F J Verhaegh; Peter Beyerlein; P J van der Zaag

doi:10.1093/nar/gks393

Nucleic Acids Res. 2012 Sep;40(16):e125. doi: 10.1093/nar/gks393. Epub 2012 May 11.

Authors

René Böttcher¹, Ronny Amberg, F P Ruzius, V Guryev, Wim F J Verhaegh, Peter Beyerlein, P J van der Zaag

Affiliation

¹ Philips Research Laboratories, High Tech Campus 11, 5656 AE Eindhoven, The Netherlands.

Abstract

We investigate the use of a priori knowledge in the alignment of targeted sequencing data by means of computational experiments. By adapting the Needleman-Wunsch algorithm to incorporate the genomic position information available from targeted capture, we demonstrate that reads can be aligned to just the target region of interest. When direct string comparison is used in addition, alignment is up to a factor of 8 faster than with the fastest conventional aligner (Bowtie). This results in a total alignment time of around 7 min for approximately 56 million captured reads. For conventional aligners such as Bowtie, BWA or MAQ, alignment to just the target region is not feasible: experiments show that it leads to 88% additional SNP calls, the vast majority of which (≈ 92%) are false positives.
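The two ingredients named in the abstract, direct string comparison and Needleman-Wunsch alignment restricted to the target region, can be illustrated with a minimal sketch. The abstract does not specify how the position information enters the algorithm, so everything below is an assumption made for illustration: the function names, the `expected_offset` parameter (standing in for the position known from the capture design), the scoring values, and the choice to align against a window of the same length as the read. It is not the authors' implementation.

```python
def needleman_wunsch(read, ref, match=1, mismatch=-1, gap=-2):
    """Plain global alignment; returns (score, aligned_read, aligned_ref).

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    n, m = len(read), len(ref)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(a, b):
        return match if a == b else mismatch

    # Fill the dynamic-programming matrix.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(read[i - 1], ref[j - 1]),
                score[i - 1][j] + gap,   # gap in the reference
                score[i][j - 1] + gap,   # gap in the read
            )

    # Trace back from the bottom-right corner, preferring the diagonal.
    aligned_read, aligned_ref = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                score[i][j] == score[i - 1][j - 1] + sub(read[i - 1], ref[j - 1])):
            aligned_read.append(read[i - 1])
            aligned_ref.append(ref[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_read.append(read[i - 1])
            aligned_ref.append("-")
            i -= 1
        else:
            aligned_read.append("-")
            aligned_ref.append(ref[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(aligned_read)), "".join(reversed(aligned_ref))


def align_to_target(read, target, expected_offset, match=1):
    """Hypothetical helper: align a read at its expected position in the target.

    First try a direct string comparison; fall back to Needleman-Wunsch
    against the same window only when the read differs from the reference.
    """
    window = target[expected_offset:expected_offset + len(read)]
    if window == read:
        # Fast path: no dynamic programming needed for a perfect match.
        return ("exact", len(read) * match, read, window)
    return ("nw",) + needleman_wunsch(read, window, match=match)
```

In this sketch, calling `align_to_target(read, target, expected_offset)` takes the fast path for every read that matches the reference exactly at its expected position, so the quadratic-time alignment is only paid for reads that contain a difference. A realistic version would likely pad the window to leave room for insertions and deletions near the read ends; that refinement is omitted here.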

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Genomics / methods*
Polymorphism, Single Nucleotide
Sequence Alignment / methods*
Sequence Analysis, DNA*