Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies

Genome Biol. 2020 Sep 14;21(1):245. doi: 10.1186/s13059-020-02134-9.

Abstract

Recent long-read assemblies often exceed the quality and completeness of available reference genomes, making validation challenging. Here we present Merqury, a novel tool for reference-free assembly evaluation based on efficient k-mer set operations. By comparing k-mers in a de novo assembly to those found in unassembled high-accuracy reads, Merqury estimates base-level accuracy and completeness. For trios, Merqury can also evaluate haplotype-specific accuracy, completeness, phase block continuity, and switch errors. Multiple visualizations, such as k-mer spectrum plots, can be generated for evaluation. We demonstrate on both human and plant genomes that Merqury is a fast and robust method for assembly validation.
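
The k-mer comparison described above can be illustrated with a toy sketch. This is not Merqury's implementation, which the abstract does not detail. It assumes distinct canonical k-mers held in Python sets, treats every read k-mer as trustworthy (a real tool would filter out low-count k-mers likely to be sequencing errors), and converts the fraction of assembly k-mers supported by reads into a Phred-scaled consensus quality value, one common way to express base-level accuracy. The function names and example sequences are hypothetical.

```python
import math

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of an uppercase ACGT string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def canonical_kmers(seq, k):
    """Set of canonical k-mers (lexicographic min of k-mer and its reverse complement)."""
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip k-mers containing N or other symbols
            kmers.add(min(kmer, revcomp(kmer)))
    return kmers


def evaluate(assembly_seqs, read_seqs, k):
    """Return (qv, completeness) from simple k-mer set operations.

    Assumptions of this sketch: distinct k-mers only (no multiplicities),
    and all read k-mers are treated as reliable.
    """
    asm = set().union(*(canonical_kmers(s, k) for s in assembly_seqs))
    reads = set().union(*(canonical_kmers(s, k) for s in read_seqs))
    if not asm or not reads:
        raise ValueError("no k-mers found; check sequences and k")

    # Assembly k-mers absent from the reads are taken as evidence of errors.
    asm_only = asm - reads
    shared_fraction = 1.0 - len(asm_only) / len(asm)

    # Probability that a single base is correct, given that a k-mer is
    # only supported when all k of its bases are correct.
    base_accuracy = shared_fraction ** (1.0 / k)
    error_rate = 1.0 - base_accuracy
    qv = math.inf if error_rate == 0 else -10.0 * math.log10(error_rate)

    # Completeness: fraction of read k-mers recovered in the assembly.
    completeness = len(reads & asm) / len(reads)
    return qv, completeness


if __name__ == "__main__":
    reads = ["ACGTTGCAAGGCTA"]
    perfect_assembly = ["ACGTTGCAAGGCTA"]
    flawed_assembly = ["ACGTTGCTAGGCTA"]  # one substituted base
    print(evaluate(perfect_assembly, reads, k=5))
    print(evaluate(flawed_assembly, reads, k=5))
```

With real data, k would be much larger than 5 and k-mer counting would be done with a dedicated counter rather than in-memory Python sets; the toy value is used here only so that the example sequences stay short.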

Keywords: Assembly validation; Benchmarking; Genome assembly; Haplotype phasing; K-mers; Trio binning.

Publication types

  • Research Support, N.I.H., Intramural
  • Validation Study

MeSH terms

  • Arabidopsis
  • Genome, Human
  • Genome, Plant
  • Genomics / methods*
  • Humans
  • Software*