Embracing Metagenomic Complexity with a Genome-Free Approach

mSystems. 2021 Aug 31;6(4):e0081621. doi: 10.1128/mSystems.00816-21. Epub 2021 Aug 17.

Abstract

A central paradigm in microbiome data analysis, which we term the genome-centric paradigm, is that a linear (non-branching) DNA sequence is the ideal representation of a microbial genome. This representation is natural, as microbes indeed have non-branching genomes. Tremendous discoveries in microbiology were made under this paradigm, but is it always optimal for microbiome research? In this Commentary, we claim that the realization of this paradigm in metagenomic assembly, a fundamental step in the "metagenomics analysis pipeline," suboptimally models the extensive genomic variability present in the microbiome. We outline our efforts to address these issues with a "genome-free" approach that eschews linear genomic representations in favor of a pan-metagenomic graph.

Keywords: assembly; genomics; metagenomics; microbiome.