Genomic Epidemiology and Strain Taxonomy of Corynebacterium diphtheriae

J Clin Microbiol. 2021 Nov 18;59(12):e0158121. doi: 10.1128/JCM.01581-21. Epub 2021 Sep 15.

Abstract

Corynebacterium diphtheriae is highly transmissible and can cause large diphtheria outbreaks where vaccination coverage is insufficient. Sporadic cases or small clusters are observed in high-vaccination settings. The phylogeography and short-timescale evolution of C. diphtheriae are not well understood, in part due to a lack of harmonized analytical approaches to genomic surveillance and strain tracking. We combined 1,305 genes with highly reproducible allele calls into a core genome multilocus sequence typing (cgMLST) scheme. We analyzed cgMLST gene diversity among 602 isolates from sporadic clinical cases, small clusters, or large outbreaks. We defined sublineages based on the phylogenetic structure within C. diphtheriae and strains based on the highest number of cgMLST mismatches within documented outbreaks. We performed time-scaled phylogenetic analyses of major sublineages. The cgMLST scheme showed a high allele call rate in C. diphtheriae and the closely related species C. belfantii and C. rouxii. We demonstrate its utility to delineate epidemiological case clusters and outbreaks using a 25-mismatch threshold and reveal a number of cryptic transmission chains, most of which are geographically restricted to one or a few adjacent countries. Subcultures of the vaccine strain PW8 differed by up to 20 cgMLST mismatches. Phylogenetic analyses revealed short-timescale evolutionary gain or loss of the diphtheria toxin and biovar-associated genes. We devised a genomic taxonomy of strains and deeper sublineages (defined using a 500-cgMLST-mismatch threshold), currently comprising 151 sublineages, only a few of which are geographically widespread based on current sampling. The cgMLST genotyping tool and nomenclature were made publicly accessible (https://bigsdb.pasteur.fr/diphtheria). Standardized genome-scale strain genotyping will help trace the transmission and geographic spread of C. diphtheriae. The unified genomic taxonomy of C. diphtheriae strains provides a common language for studies of ecology, evolution, and virulence heterogeneity among C. diphtheriae sublineages.
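
The threshold-based grouping described in the abstract can be illustrated with a minimal sketch. It assumes single-linkage grouping, an inclusive threshold (mismatches less than or equal to the cutoff), and that loci with a missing allele call are skipped when counting mismatches; the abstract does not state these details. The function names, the toy six-locus profiles, and the isolate labels are hypothetical and are not taken from the published tool.

```python
from itertools import combinations

# Thresholds quoted in the abstract; treating them as inclusive is an assumption.
STRAIN_THRESHOLD = 25
SUBLINEAGE_THRESHOLD = 500


def mismatches(a, b):
    """Count loci where both profiles have an allele call and the calls differ.

    Loci with a missing call (None) in either profile are skipped (assumption).
    """
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def single_linkage(profiles, threshold):
    """Group isolates connected by chains of pairs with <= threshold mismatches."""
    parent = {name: name for name in profiles}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(profiles, 2):
        if mismatches(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical toy profiles with 6 loci (the real scheme has 1,305).
toy_profiles = {
    "iso1": [1, 1, 1, 1, 1, 1],
    "iso2": [1, 1, 1, 1, 1, 2],
    "iso3": [1, 1, 1, 1, 2, 2],
    "iso4": [5, 5, 5, 5, 5, None],
}

# A threshold of 1 is used only because the toy profiles are so short.
toy_groups = single_linkage(toy_profiles, threshold=1)
```

With real 1,305-locus profiles, the same function could be called with `STRAIN_THRESHOLD` or `SUBLINEAGE_THRESHOLD`. Note that under single linkage, two isolates can share a group even when their direct distance exceeds the cutoff, as `iso1` and `iso3` do here through `iso2`.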

Keywords: cgMLST; diphtheria; epidemiology; genomic epidemiology; genomics; microevolution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Corynebacterium diphtheriae* / genetics
  • Diphtheria* / epidemiology
  • Diphtheria* / microbiology
  • Genome, Bacterial
  • Genomics
  • Humans
  • Multilocus Sequence Typing
  • Phylogeny