The khmer software package: enabling efficient nucleotide sequence analysis

F1000Res. 2015 Sep 25:4:900. doi: 10.12688/f1000research.6924.1. eCollection 2015.

Abstract

The khmer package is a freely available software library for working efficiently with fixed-length DNA words, or k-mers. khmer provides implementations of a probabilistic k-mer counting data structure, a compressible De Bruijn graph representation, De Bruijn graph partitioning, and digital normalization. khmer is implemented in C++ and Python, and is available under the BSD license at https://github.com/dib-lab/khmer/.
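
To make two of the ideas above concrete, the following is a minimal pure-Python sketch of probabilistic k-mer counting and of digital normalization. It does not use khmer's API and is not khmer's implementation. It assumes a Count-Min Sketch as the counting structure and a median k-mer abundance rule for deciding which reads to keep; the names CountMinSketch, kmers, and digital_normalization, and the parameters width, depth, k, and cutoff, are hypothetical and chosen only for illustration.

```python
import hashlib
from statistics import median


class CountMinSketch:
    """Approximate counter: counts may be overestimated, never underestimated."""

    def __init__(self, width=10007, depth=4):
        # Illustrative sizes only; real data sets need far larger tables.
        self.width = width
        self.depth = depth
        self.tables = [[0] * width for _ in range(depth)]

    def _slots(self, kmer):
        # One independent hash per table, obtained by salting with the table index.
        for i in range(self.depth):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=bytes([i])
            ).digest()
            yield i, int.from_bytes(digest, "big") % self.width

    def add(self, kmer):
        for i, j in self._slots(kmer):
            self.tables[i][j] += 1

    def count(self, kmer):
        # Collisions can only inflate a slot, so the minimum is the best estimate.
        return min(self.tables[i][j] for i, j in self._slots(kmer))


def kmers(seq, k):
    """Return every overlapping word of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def digital_normalization(reads, k=4, cutoff=2, sketch=None):
    """Keep a read only while its median k-mer count is below cutoff."""
    if sketch is None:
        sketch = CountMinSketch()
    kept = []
    for read in reads:  # single pass, so this works on a stream of reads
        words = kmers(read, k)
        if not words:
            continue  # read shorter than k
        if median(sketch.count(w) for w in words) < cutoff:
            kept.append(read)
            for w in words:
                sketch.add(w)
    return kept


if __name__ == "__main__":
    reads = ["ACGTTGCA"] * 10
    print(digital_normalization(reads, k=4, cutoff=2))
```

Because the sketch only ever overestimates, redundant reads are discarded at or slightly before the chosen cutoff, and memory use is fixed by width and depth rather than by the number of distinct k-mers.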

Keywords: bioinformatics; DNA sequencing analysis; k-mer; khmer; kmer; low-memory; online; streaming