Incorporating genome-based phylogeny and functional similarity into diversity assessments helps to resolve a global collection of human gut metagenomes

Environ Microbiol. 2022 Sep;24(9):3966-3984. doi: 10.1111/1462-2920.15910. Epub 2022 Jan 31.

Abstract

Tree-based diversity measures incorporate phylogenetic or functional relatedness into comparisons of microbial communities. This can improve the identification of explanatory factors compared to tree-agnostic diversity measures. However, applying tree-based diversity measures to metagenome data is more challenging than applying them to single-locus sequencing data (e.g. the 16S rRNA gene). Utilizing the Genome Taxonomy Database for species-level metagenome profiling allows for functional diversity measures based on genomic content or traits inferred from it. Still, it is unclear how metagenome-based assessments of microbiome diversity benefit from incorporating phylogeny or function into measures of diversity. We assessed this by calculating phylogeny-based, function-based and tree-agnostic diversity measures for a large, global collection of human gut metagenomes comprising 30 studies and 2943 samples. We found that tree-based measures explained phenotypic variation (e.g. westernization, disease status and gender) as well as or better than tree-agnostic measures. Ecophylogenetic and functional diversity measures provided unique insight into how microbiome diversity was partitioned by phenotype. Tree-based measures greatly improved machine learning model performance for predicting westernization, disease status and gender, relative to models trained solely on tree-agnostic measures. Our findings illustrate the usefulness of tree- and function-based measures for metagenomic assessments of microbial diversity, which is a fundamental component of microbiome science.
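The distinction between tree-based and tree-agnostic measures can be illustrated with a minimal sketch. It compares species richness (tree-agnostic) with Faith's phylogenetic diversity (one widely used tree-based measure) on a toy tree. The abstract does not state which specific measures were used, so the choice of Faith's PD, the tree, the taxon names and the abundances are all assumptions made purely for illustration.

```python
# Hypothetical toy tree: each node maps to (parent, branch_length).
# The root has no parent. Names and lengths are invented for illustration.
TREE = {
    "root": (None, 0.0),
    "A": ("root", 1.0),
    "B": ("root", 1.0),
    "sp1": ("A", 0.1),
    "sp2": ("A", 0.1),
    "sp3": ("B", 0.5),
}


def richness(abundances):
    """Tree-agnostic: count the taxa observed in the sample."""
    return sum(1 for count in abundances.values() if count > 0)


def faith_pd(abundances, tree):
    """Tree-based: total branch length spanned by observed taxa and the root."""
    visited = set()
    total = 0.0
    for taxon, count in abundances.items():
        if count <= 0:
            continue
        node = taxon
        # Walk towards the root, counting each branch only once.
        while node is not None and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


# Two samples with the same richness but different relatedness.
sample_close = {"sp1": 10, "sp2": 5, "sp3": 0}    # two close relatives
sample_distant = {"sp1": 10, "sp2": 0, "sp3": 5}  # two distant relatives

for name, sample in [("close", sample_close), ("distant", sample_distant)]:
    print(name, richness(sample), round(faith_pd(sample, TREE), 3))
```

Both toy samples contain two taxa, so richness cannot separate them, whereas the tree-based measure is larger for the sample whose members are more distantly related. This is the kind of additional signal that a tree-based measure can contribute beyond a tree-agnostic one.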

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Metagenome*
  • Metagenomics
  • Microbiota*
  • Phylogeny
  • RNA, Ribosomal, 16S / genetics

Substances

  • RNA, Ribosomal, 16S