genozip: a fast and efficient compression tool for VCF files

Bioinformatics. 2020 Jul 1;36(13):4091-4092. doi: 10.1093/bioinformatics/btaa290.

Abstract

Motivation: genozip is a new lossless compression tool for Variant Call Format (VCF) files. By applying field-specific algorithms and fully utilizing the available computational hardware, genozip achieves the highest compression ratios amongst existing lossless compression tools known to the authors, at speeds comparable with the fastest multi-threaded compressors.

Availability and implementation: genozip is freely available to non-commercial users. It can be installed via conda-forge or Docker Hub, or downloaded from github.com/divonlan/genozip.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Data Compression*
  • Genomics
  • High-Throughput Nucleotide Sequencing*
  • Software