SOPanG: online text searching over a pan-genome

Bioinformatics. 2018 Dec 15;34(24):4290-4292. doi: 10.1093/bioinformatics/bty506.

Abstract

Motivation: The many thousands of high-quality genomes now available imply a shift from single-genome to pan-genomic analyses. A basic algorithmic building block for such a scenario is online search over a collection of similar texts, a problem with surprisingly few solutions presented so far.

Results: We present SOPanG, a simple tool for exact pattern matching over an elastic-degenerate string, a recently proposed simplified model for the pan-genome. Thanks to bit-parallelism, it achieves pattern matching speeds above 400 MB/s, more than an order of magnitude higher than that of other software.
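
To make the idea concrete, the following is a minimal sketch of bit-parallel exact matching over an elastic-degenerate string. It is not SOPanG's code: the abstract does not say which bit-parallel scheme the tool uses, so the sketch assumes the classic Shift-And automaton, and the function name ed_search and the list-of-lists text representation are hypothetical. Each segment of the text is a list of alternative variants (an empty string stands for a deletion), and the function returns the indices of the segments in which an occurrence of the pattern ends.

```python
from typing import Dict, List


def ed_search(segments: List[List[str]], pattern: str) -> List[int]:
    """Return indices of segments in which an occurrence of `pattern` ends.

    `segments` is an assumed representation of an elastic-degenerate string:
    one list of alternative variants per segment, "" meaning an empty variant.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    # Shift-And masks: bit j of masks[c] is set iff pattern[j] == c.
    masks: Dict[str, int] = {}
    for j, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << j)
    accept = 1 << (m - 1)

    hits: List[int] = []
    state_in = 0  # pattern prefixes still alive at the end of the previous segment
    for i, variants in enumerate(segments):
        state_out = 0
        found = False
        for variant in variants:
            # Every variant starts from the same incoming state.
            state = state_in
            for c in variant:
                # Extend alive prefixes by one character; "| 1" lets a new
                # occurrence begin at this position.
                state = ((state << 1) | 1) & masks.get(c, 0)
                if state & accept:
                    found = True
            # An empty variant passes the incoming state through unchanged.
            state_out |= state
        if found:
            hits.append(i)
        state_in = state_out
    return hits


if __name__ == "__main__":
    text = [["AC"], ["G", "T", ""], ["CA"]]
    for p in ("CTC", "CC", "GT", "AC"):
        print(p, ed_search(text, p))
```

In this toy text, "CTC" and "CC" both end in the last segment (the latter through the empty variant), whereas "GT" is never reported because G and T are alternatives within one segment and cannot occur consecutively. The states of all variants of a segment are merged with a bitwise OR, so the work per text character is a constant number of word operations as long as the pattern fits in a machine word; Python's unbounded integers hide that limit here.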

Availability and implementation: SOPanG is freely available at https://github.com/MrAlexSee/sopang.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Genome* / genetics
  • Genomics* / methods
  • Information Storage and Retrieval
  • Internet
  • Software* / standards