Population structure analysis and laboratory monitoring of Shigella by core-genome multilocus sequence typing

Nat Commun. 2022 Jan 27;13(1):551. doi: 10.1038/s41467-022-28121-1.

Abstract

The laboratory surveillance of bacillary dysentery is based on a standardised Shigella typing scheme that classifies Shigella strains into four serogroups and more than 50 serotypes on the basis of biochemical tests and lipopolysaccharide O-antigen serotyping. Real-time genomic surveillance of Shigella infections has been implemented in several countries, but without the use of a standardised typing scheme. Here, we study over 4000 reference strains and clinical isolates of Shigella, covering all serotypes, with both the current serotyping scheme and the standardised EnteroBase core-genome multilocus sequence typing scheme (cgMLST). The Shigella genomes are grouped into eight phylogenetically distinct clusters, within the E. coli species. The cgMLST hierarchical clustering (HC) analysis at different levels of resolution (HC2000 to HC400) recognises the natural population structure of Shigella. By contrast, the serotyping scheme is affected by horizontal gene transfer, leading to a conflation of genetically unrelated Shigella strains and a separation of genetically related strains. The use of this cgMLST scheme will facilitate the transition from traditional phenotypic typing to routine whole-genome sequencing for the laboratory surveillance of Shigella infections.
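To illustrate how threshold-based hierarchical clustering of cgMLST profiles can work in principle, the sketch below groups toy allelic profiles by single linkage: two isolates fall in the same cluster at level HCn if a chain of isolates connects them with no more than n allelic differences at each step. This is a minimal illustration under stated assumptions, not the EnteroBase implementation. The function names, the toy six-locus profiles, the small thresholds, and the convention that allele 0 denotes a missing call are all hypothetical.

```python
from itertools import combinations


def allelic_distance(a, b):
    """Count loci at which both profiles have a call and the calls differ.

    Assumption: allele 0 marks a missing call and is ignored.
    """
    return sum(1 for x, y in zip(a, b) if x and y and x != y)


def single_linkage_clusters(profiles, threshold):
    """Group isolates whose profiles are chained by links <= threshold."""
    names = list(profiles)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent pointers to the cluster representative,
        # shortening the path along the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for u, v in combinations(names, 2):
        if allelic_distance(profiles[u], profiles[v]) <= threshold:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[max(ru, rv)] = min(ru, rv)

    groups = {}
    for n in names:
        groups.setdefault(find(n), set()).add(n)
    return sorted(sorted(g) for g in groups.values())


# Toy profiles over six loci (real cgMLST schemes use thousands of loci).
profiles = {
    "A": [1, 1, 1, 1, 1, 1],
    "B": [1, 1, 1, 1, 1, 2],  # 1 difference from A
    "C": [1, 1, 1, 2, 2, 2],  # 3 differences from A, 2 from B
    "D": [3, 3, 3, 3, 3, 3],  # 6 differences from every other isolate
}

for level in (1, 2, 6):
    print(f"HC{level}:", single_linkage_clusters(profiles, level))
```

At the toy level 2, isolates A and C share a cluster even though they differ at three loci, because B links them; this chaining is characteristic of single linkage. Raising the threshold merges clusters into progressively broader groups, which is the sense in which different HC levels give different resolutions.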

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Disease Outbreaks
  • Escherichia coli
  • Genome, Bacterial*
  • Genotype
  • Humans
  • Molecular Epidemiology
  • Multigene Family
  • Multilocus Sequence Typing / methods*
  • Phylogeny
  • Shigella / classification*
  • Shigella / genetics*
  • Shigella / isolation & purification*
  • Whole Genome Sequencing