Taxogenomics and Systematics of the Genus Pantoea

Front Microbiol. 2019 Oct 30:10:2463. doi: 10.3389/fmicb.2019.02463. eCollection 2019.

Abstract

Members of the genus Pantoea are Gram-negative bacteria isolated from various environments. Multilocus sequence analysis (MLSA) is routinely used to infer accurate phylogenies and to identify bacterial species and genera. Partial sequences of five housekeeping genes (fusA, gyrB, leuS, rpoB, and pyrG) were extracted from 206 draft or complete genomes of Pantoea strains publicly available in databases and analyzed together with the representative sequences of the 25 validly published Pantoea type strains to verify and assess their phylogenetic assignments. Of the 159 strains assigned to the species level, 11.3% of the non-type strains were incorrectly assigned at the species level. The highest proportion of misidentified strains was recorded in Pantoea vagans, with 8 of 15 strains (53.3%) inaccurately assigned at the species level. One probable reason for these misclassifications is the method previously used for strain identification. Forty-seven (22.8%) genome sequences were from strains identified at the genus level only (Pantoea sp.). A combination of MLSA, average nucleotide identities [ANI and MUMmer-based ANI (ANIm)], tetranucleotide usage pattern (TETRA), and genome-based DNA-DNA hybridization (gDDH) data was used to accurately assign 25 of the 47 strains to validly published Pantoea species, while 17 strains could be assigned as putative novel species within the genus Pantoea. Four genomes designated as Pantoea sp. were identified as Mixta calida. Positive and significant correlation coefficients were computed between MLSA and all of the whole-genome-derived indices proposed for species delimitation. gDDH exhibited the best correlation with MLSA, whereas TETRA showed the weakest. Accurate species-level identification is key to a better understanding of bacterial diversity and evolution. The MLSA scheme used here could be instrumental in determining the correct taxonomic status of newly sequenced Pantoea strains, especially non-type strains, before their genomes are deposited in public databases.
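
As an illustration of how several genome-based indices might be combined for species delimitation, the following minimal Python sketch checks one pairwise genome comparison against three cutoffs. The cutoffs (ANI of at least 95%, TETRA of at least 0.99, gDDH of at least 70%) are commonly cited values and are assumptions here, since the abstract does not state which thresholds were applied. The names GenomeComparison, supporting_indices, and classify are hypothetical, and the decision rule (all three indices must agree) is a simplification rather than the procedure used in the study.

```python
from dataclasses import dataclass

# Commonly cited species-level cutoffs. These are assumptions for illustration,
# not values taken from the abstract.
ANI_CUTOFF = 95.0    # average nucleotide identity, percent
TETRA_CUTOFF = 0.99  # tetranucleotide usage correlation coefficient
GDDH_CUTOFF = 70.0   # genome-based DNA-DNA hybridization, percent


@dataclass
class GenomeComparison:
    """Indices for one query genome compared against one type-strain genome."""
    ani: float
    tetra: float
    gddh: float


def supporting_indices(cmp: GenomeComparison) -> list[str]:
    """Return the names of the indices that meet their species-level cutoff."""
    support = []
    if cmp.ani >= ANI_CUTOFF:
        support.append("ANI")
    if cmp.tetra >= TETRA_CUTOFF:
        support.append("TETRA")
    if cmp.gddh >= GDDH_CUTOFF:
        support.append("gDDH")
    return support


def classify(cmp: GenomeComparison) -> str:
    """Simplified rule: call the pair only when all three indices agree."""
    n = len(supporting_indices(cmp))
    if n == 3:
        return "same species"
    if n == 0:
        return "different species"
    return "ambiguous"


if __name__ == "__main__":
    # Hypothetical index values, not results from the study.
    print(classify(GenomeComparison(ani=98.5, tetra=0.999, gddh=85.0)))
    print(classify(GenomeComparison(ani=88.0, tetra=0.950, gddh=35.0)))
```

In practice a strain left "ambiguous" by such a rule would need closer inspection, for example against the MLSA phylogeny, before being assigned to a species or proposed as novel.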

Keywords: average nucleotide identity; codon usage; phylogenomics; systematics; taxonomy; tetranucleotides.