Taxonomic analysis of metagenomic data with kASA

Nucleic Acids Res. 2021 Jul 9;49(12):e68. doi: 10.1093/nar/gkab200.

Abstract

The taxonomic analysis of sequencing data has become important in many areas of the life sciences. However, currently available tools for that purpose either consume large amounts of RAM or yield results of insufficient quality and robustness. Here, we present kASA, a k-mer based tool capable of identifying and profiling metagenomic DNA or protein sequences with high computational efficiency and a user-definable memory footprint. We ensure both high sensitivity and precision by using an amino acid-like encoding of k-mers together with a range of k values. Custom algorithms and data structures optimized for external memory storage enable a full-scale taxonomic analysis without compromise on laptops, desktops, and high-performance computing clusters (HPCC).
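
The idea of pairing an amino acid-like encoding with several values of k can be illustrated with a minimal Python sketch. This is not kASA's implementation: the standard genetic code stands in here for the tool's encoding, which may differ, and the function names `translate` and `kmers_over_range` are hypothetical, chosen only for this illustration.

```python
from itertools import product

# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# Assumption: this table stands in for the "amino acid-like encoding";
# the encoding actually used by kASA may differ.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate DNA into amino acid letters, starting at the given frame.

    Codons containing unknown characters (e.g. N) are mapped to 'X'.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def kmers_over_range(sequence, k_min, k_max):
    """Return a dict mapping each k in [k_min, k_max] to its list of k-mers."""
    return {
        k: [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
        for k in range(k_min, k_max + 1)
    }


if __name__ == "__main__":
    protein = translate("ATGGCCAAA")
    print(protein)
    print(kmers_over_range(protein, 2, 3))
```

In this sketch, short k-mers are more tolerant of mismatches while longer ones are more specific, which is one plausible reading of how a range of k values could support both sensitivity and precision.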

MeSH terms

  • Algorithms
  • Metagenomics / methods*
  • Sequence Analysis, DNA / methods
  • Sequence Analysis, Protein / methods