A bit-parallel dynamic programming algorithm suitable for DNA sequence alignment

J Bioinform Comput Biol. 2012 Aug;10(4):1250002. doi: 10.1142/S0219720012500023. Epub 2012 Jun 22.

Abstract

Myers' elegant and powerful bit-parallel dynamic programming algorithm for approximate string matching requires that the query length not exceed the word size of the computer, typically 64 bits. We propose a modification of Myers' algorithm that restricts not the query length but the maximum number of mismatches (substitutions, insertions, or deletions), which must be less than half of the word size. The time complexity is O(m log |Σ|), where m is the query length and |Σ| is the size of the alphabet Σ. The modified algorithm is therefore particularly suited to sequences over a small alphabet, such as DNA sequences. In particular, it is useful for quickly extending a large number of seed alignments against a reference genome for high-throughput short-read data produced by next-generation DNA sequencers.
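
For orientation, the following is a minimal sketch of Myers' original bit-parallel algorithm, the baseline that the abstract refers to, in the form commonly given in the literature. It is not the modification proposed in this paper. The function name `myers_search` and its return format are assumptions made for illustration. Python integers are unbounded, so an explicit mask emulates a machine word of m bits; in a fixed-width language the query would have to fit in one word, which is the restriction the abstract describes.

```python
def myers_search(pattern, text, k):
    """Report (end_index, distance) for every text position where some
    substring ending there matches `pattern` with edit distance <= k.

    Illustrative sketch of Myers' bit-vector algorithm; names are hypothetical.
    """
    m = len(pattern)
    if m == 0:
        return []

    # Peq[c]: bit i is set when pattern[i] == c.
    peq = {}
    for i, c in enumerate(pattern):
        peq[c] = peq.get(c, 0) | (1 << i)

    mask = (1 << m) - 1      # emulates an m-bit machine word
    high = 1 << (m - 1)      # bit for the last pattern row
    pv, mv = mask, 0         # vertical deltas: +1 everywhere in column 0
    score = m                # distance in the last row of the DP column
    hits = []

    for j, c in enumerate(text):
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask   # horizontal delta +1
        mh = pv & xh                    # horizontal delta -1

        # Update the score of the last row from its horizontal delta.
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1

        # Shift in a 0: the top row is all zeros, so a match may start anywhere.
        ph = (ph << 1) & mask
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv

        if score <= k:
            hits.append((j, score))
    return hits
```

For example, `myers_search("ACGT", "GGACGTGG", 0)` reports a single exact occurrence ending at index 5. Each text character costs a constant number of word operations, which is where the speed of the bit-parallel approach comes from.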

MeSH terms

  • Algorithms*
  • Base Sequence*
  • Computational Biology
  • DNA / chemistry*
  • Genome
  • Sequence Alignment
  • Sequence Analysis, DNA

Substances

  • DNA