A Vibrio cholerae Core Genome Multilocus Sequence Typing Scheme To Facilitate the Epidemiological Study of Cholera

J Bacteriol. 2020 Nov 19;202(24):e00086-20. doi: 10.1128/JB.00086-20. Print 2020 Nov 19.

Abstract

Core genome multilocus sequence typing (cgMLST) has gained popularity in recent years in epidemiological research and subspecies-level classification. cgMLST retains the intuitive nature of traditional MLST but offers much greater resolution by utilizing significantly larger portions of the genome. Here, we introduce a cgMLST scheme for Vibrio cholerae, a bacterium abundant in marine and freshwater environments and the etiologic agent of cholera. A set of 2,443 core genes ubiquitous in V. cholerae were used to analyze a comprehensive data set of 1,262 clinical and environmental strains collected from 52 countries, including 65 newly sequenced genomes in this study. We established a sublineage threshold based on 133 allelic differences that creates clusters nearly identical to traditional MLST types, providing backwards compatibility to new cgMLST classifications. We also defined an outbreak threshold based on seven allelic differences that is capable of identifying strains from the same outbreak and closely related isolates that could give clues on outbreak origin. Using cgMLST, we confirmed the South Asian origin of modern epidemics and identified clustering affinity among sublineages of environmental isolates from the same geographic origin. Advantages of this method are highlighted by direct comparison with existing classification methods, such as MLST and single-nucleotide polymorphism-based methods. cgMLST outperforms all existing methods in terms of resolution, standardization, and ease of use. We anticipate this scheme will serve as a basis for a universally applicable and standardized classification system for V. cholerae research and epidemiological surveillance in the future. This cgMLST scheme is publicly available on PubMLST (https://pubmlst.org/vcholerae/).IMPORTANCE Toxigenic Vibrio cholerae isolates of the O1 and O139 serogroups are the causative agents of cholera, an acute diarrheal disease that plagued the world for centuries, if not millennia. Here, we introduce a core genome multilocus sequence typing scheme for V. cholerae Using this scheme, we have standardized the definition for subspecies-level classification, facilitating global collaboration in the surveillance of V. cholerae In addition, this typing scheme allows for quick identification of outbreak-related isolates that can guide subsequent analyses, serving as an important first step in epidemiological research. This scheme is also easily scalable to analyze thousands of isolates at various levels of resolution, making it an invaluable tool for large-scale ecological and evolutionary analyses.

Keywords: Vibrio cholerae; cgMLST; cholera; core genome; epidemiological surveillance; gene-by-gene approach; multilocus sequence typing; whole-genome sequencing.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Bacterial Typing Techniques / methods*
  • Cholera / epidemiology
  • Cholera / microbiology*
  • Epidemiologic Studies
  • Genome, Bacterial
  • Genotype
  • Humans
  • Multilocus Sequence Typing / methods*
  • Phylogeny
  • Polymorphism, Single Nucleotide
  • Vibrio cholerae / classification
  • Vibrio cholerae / genetics*
  • Vibrio cholerae / isolation & purification
  • Yemen / epidemiology