A standardized archaeal taxonomy for the Genome Taxonomy Database

Nat Microbiol. 2021 Jul;6(7):946-959. doi: 10.1038/s41564-021-00918-8. Epub 2021 Jun 21.

Abstract

The accrual of genomic data from both cultured and uncultured microorganisms provides new opportunities to develop systematic taxonomies based on evolutionary relationships. Previously, we established a bacterial taxonomy through the Genome Taxonomy Database. Here, we propose a standardized archaeal taxonomy that is derived from a 122-concatenated-protein phylogeny that resolves polyphyletic groups and normalizes ranks based on relative evolutionary divergence. The resulting archaeal taxonomy, which forms part of the Genome Taxonomy Database, is stable for a range of phylogenetic variables including marker gene selection, inference methods, corrections for rate heterogeneity and compositional bias, tree rooting scenarios and expansion of the genome database. Rank normalization is shown to robustly correct for substitution rates varying up to 30-fold using simulated datasets. Taxonomic curation follows the rules of the International Code of Nomenclature of Prokaryotes while taking into account proposals to formally recognize the rank of phylum and to use genome sequences as type material. This taxonomy is based on 2,392 archaeal genomes, 93.3% of which required one or more changes to their existing taxonomy, mainly owing to incomplete classification. We identify 16 archaeal phyla and reclassify 3 major monophyletic units from the former Euryarchaeota and one phylum that unites the Thaumarchaeota-Aigarchaeota-Crenarchaeota-Korarchaeota (TACK) superphylum into a single phylum.
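The rank normalization mentioned above can be illustrated with a small sketch. The abstract does not state the formula for relative evolutionary divergence (RED), so the code assumes the definition used in the earlier GTDB bacterial taxonomy work: the root has RED 0, every leaf has RED 1, and an internal node n with parent p gets RED(n) = RED(p) + (d / u) * (1 - RED(p)), where d is the branch length from p to n and u is the mean distance from p to the leaves below n. The names `Node`, `leaf_distances`, and `assign_red` and the toy tree are hypothetical and are not taken from the paper or from any GTDB software.

```python
from typing import List, Optional


class Node:
    """Minimal rooted-tree node (hypothetical helper, not a GTDB API)."""

    def __init__(self, name: str, branch_length: float = 0.0,
                 children: Optional[List["Node"]] = None):
        self.name = name
        self.branch_length = branch_length  # length of the branch to the parent
        self.children = children or []
        self.red = 0.0


def leaf_distances(node: Node) -> List[float]:
    """Return the distance from `node` to each leaf beneath it."""
    if not node.children:
        return [0.0]
    dists = []
    for child in node.children:
        dists.extend(child.branch_length + d for d in leaf_distances(child))
    return dists


def assign_red(node: Node, parent_red: Optional[float] = None) -> None:
    """Assign RED values top-down under the assumed definition."""
    if parent_red is None:
        node.red = 0.0          # root
    elif not node.children:
        node.red = 1.0          # extant taxon (leaf)
    else:
        d = node.branch_length
        below = leaf_distances(node)
        u = d + sum(below) / len(below)  # mean parent-to-leaf distance via this node
        # Guard against a degenerate clade with all-zero branch lengths.
        node.red = parent_red if u == 0 else parent_red + (d / u) * (1.0 - parent_red)
    for child in node.children:
        assign_red(child, node.red)


# Toy tree: two sister clades, one evolving three times faster than the other.
slow = Node("slow", 1.0, [Node("s1", 1.0), Node("s2", 1.0)])
fast = Node("fast", 3.0, [Node("f1", 3.0), Node("f2", 3.0)])
root = Node("root", 0.0, [slow, fast])
assign_red(root)
```

In this toy tree both clades receive the same RED value even though the branches of the fast clade are three times longer, which is the intuition behind using RED rather than raw branch length to place rank boundaries in lineages that evolve at different rates.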

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Archaea / classification*
  • Archaea / genetics
  • Databases, Genetic* / standards
  • Evolution, Molecular
  • Genome, Archaeal*
  • Genomics
  • Phylogeny
  • Reference Standards