Rapid and Easy In Silico Serotyping of Escherichia coli Isolates by Use of Whole-Genome Sequencing Data

J Clin Microbiol. 2015 Aug;53(8):2410-26. doi: 10.1128/JCM.00008-15. Epub 2015 May 13.

Abstract

Accurate and rapid typing of pathogens is essential for effective surveillance and outbreak detection. Conventional serotyping of Escherichia coli is a delicate, laborious, time-consuming, and expensive procedure. With whole-genome sequencing (WGS) becoming cheaper, it has vast potential in routine typing and surveillance. The aim of this study was to establish a valid and publicly available tool for WGS-based in silico serotyping of E. coli applicable for routine typing and surveillance. A FASTA database of specific O-antigen processing system genes for O typing and flagellin genes for H typing was created as a component of the publicly available Web tools hosted by the Center for Genomic Epidemiology (CGE) (www.genomicepidemiology.org). All E. coli isolates available with WGS data and conventional serotype information were subjected to WGS-based serotyping employing this specific SerotypeFinder CGE tool. SerotypeFinder was evaluated on 682 E. coli genomes, 108 of which were sequenced for this study, where both the whole genome and the serotype were available. In total, 601 and 509 isolates were included for O and H typing, respectively. The O-antigen genes wzx, wzy, wzm, and wzt and the flagellin genes fliC, flkA, fllA, flmA, and flnA were detected in 569 and 508 genome sequences, respectively. SerotypeFinder for WGS-based O and H typing predicted 560 of 569 O types and 504 of 508 H types, consistent with conventional serotyping. In combination with other available WGS typing tools, E. coli serotyping can be performed solely from WGS data, providing faster and cheaper typing than current routine procedures and making WGS typing a superior alternative to conventional typing strategies.

Publication types

  • Evaluation Study

MeSH terms

  • Antigens, Bacterial / genetics*
  • Computational Biology / methods*
  • Computer Simulation
  • Costs and Cost Analysis
  • Escherichia coli / classification*
  • Escherichia coli / genetics*
  • Genome, Bacterial
  • Humans
  • O Antigens / genetics*
  • Sequence Analysis, DNA
  • Serotyping / methods*
  • Time Factors

Substances

  • Antigens, Bacterial
  • H antigen
  • O Antigens