MerCat2: a versatile k-mer counter and diversity estimator for database-independent property analysis obtained from omics data

Bioinform Adv. 2024 Apr 24;4(1):vbae061. doi: 10.1093/bioadv/vbae061. eCollection 2024.

Abstract

Motivation: MerCat2 ("Mer-Catenate2") is a versatile, parallel, scalable, and modular software package for robust property analysis of features in omics data. Taking raw reads from massively parallel sequencing, assembled contigs, or protein sequences from any platform as input, MerCat2 counts k-mers of any length k and produces feature abundance count tables, quality control reports, protein feature metrics, and graphical representations (i.e. principal component analysis (PCA) plots).
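
To illustrate what k-mer counting into a feature abundance table involves, here is a minimal Python sketch. It is not MerCat2's implementation: the helper names count_kmers and abundance_table, the toy sequences, and the dictionary-based table layout are all assumptions made for illustration, and a real tool would add file parsing, quality filtering, and parallelism.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping k-mer in one sequence (hypothetical helper)."""
    seq = sequence.upper()
    # A sequence shorter than k yields an empty range, hence no k-mers.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def abundance_table(samples, k):
    """Build a k-mer by sample count table from {sample_name: [sequences]}.

    Returns {kmer: [count in each sample, in the order of `samples`]}.
    """
    per_sample = {}
    for name, seqs in samples.items():
        total = Counter()
        for s in seqs:
            total.update(count_kmers(s, k))
        per_sample[name] = total
    # Union of all k-mers seen in any sample, sorted for a stable row order.
    kmers = sorted(set().union(*per_sample.values()))
    return {kmer: [per_sample[name][kmer] for name in samples] for kmer in kmers}


# Toy input (assumed data, not from the paper).
samples = {"sample_A": ["ATGCATGC"], "sample_B": ["ATGCCC", "GGGG"]}
table = abundance_table(samples, 3)
```

Each row of such a table is one k-mer and each column one sample, which is the kind of matrix a downstream PCA would take as input.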

Results: MerCat2 allows for direct analysis of data properties in a database-independent manner that initializes all data, a capability that other profilers and assembly-based methods lack. MerCat2 is an integrated tool for illuminating omics data within a sample, enabling rapid cross-examination and comparison.

Availability and implementation: MerCat2 is written in Python and distributed under a BSD-3 license. The source code of MerCat2 is freely available at https://github.com/raw-lab/mercat2. MerCat2 is compatible with Python 3 on Mac OS X and Linux. MerCat2 can also be easily installed using bioconda: mamba create -n mercat2 -c conda-forge -c bioconda mercat2.