metaFlye: scalable long-read metagenome assembly using repeat graphs

Nat Methods. 2020 Nov;17(11):1103-1110. doi: 10.1038/s41592-020-00971-x. Epub 2020 Oct 5.

Abstract

Long-read sequencing technologies have substantially improved the assemblies of many isolate bacterial genomes as compared to fragmented short-read assemblies. However, assembling complex metagenomic datasets remains difficult even for state-of-the-art long-read assemblers. Here we present metaFlye, which addresses important long-read metagenomic assembly challenges, such as uneven bacterial composition and intra-species heterogeneity. First, we benchmarked metaFlye using simulated and mock bacterial communities and show that it consistently produces assemblies with better completeness and contiguity than state-of-the-art long-read assemblers. Second, we performed long-read sequencing of the sheep microbiome and applied metaFlye to reconstruct 63 complete or nearly complete bacterial genomes within single contigs. Finally, we show that long-read assembly of human microbiomes enables the discovery of full-length biosynthetic gene clusters that encode biomedically important natural products.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Benchmarking
  • Gastrointestinal Microbiome / genetics
  • Genome, Bacterial / genetics*
  • Genome, Human / genetics*
  • Humans
  • Metagenome / genetics*
  • Metagenomics / methods*
  • Microbiota / genetics*
  • Sequence Analysis, DNA / methods
  • Sheep
  • Software
  • Species Specificity