Validation of Whole-Genome Sequencing for Identification and Characterization of Shiga Toxin-Producing Escherichia coli To Produce Standardized Data To Enable Data Sharing

J Clin Microbiol. 2018 Feb 22;56(3):e01388-17. doi: 10.1128/JCM.01388-17. Print 2018 Mar.

Abstract

Whole-genome sequencing (WGS) is rapidly becoming the method of choice for outbreak investigations and public health surveillance of microbial pathogens. The combination of improved cluster resolution and prediction of resistance and virulence phenotypes provided by a single tool is extremely advantageous. However, the data produced are complex, and standard bioinformatics pipelines are required to translate the output into easily interpreted epidemiologically relevant information for public health action. The main aim of this study was to validate the implementation of WGS at the Scottish Escherichia coli O157/STEC Reference Laboratory (SERL) using the Public Health England (PHE) bioinformatics pipeline to produce standardized data to enable interlaboratory comparison of results generated at two national reference laboratories. In addition, we evaluated the BioNumerics whole-genome multilocus sequence typing (wgMLST) and E. coli genotyping plug-in tools using the same data set. A panel of 150 well-characterized isolates of Shiga toxin-producing E. coli (STEC) that had been sequenced and analyzed at PHE using the PHE pipeline and database (SnapperDB) was assembled to provide identification and typing data, including serotype (O:H type), sequence type (ST), virulence genes (eae and Shiga toxin [stx] subtype), and a single-nucleotide polymorphism (SNP) address. To validate the implementation of sequencing at the SERL, DNA was reextracted from the isolates and sequenced and analyzed using the PHE pipeline, which had been installed at the SERL; the output was then compared with the PHE data. The results showed a very high correlation between the data, ranging from 93% to 100%, suggesting that the standardization of WGS between our reference laboratories is possible. We also found excellent correlation between the results obtained using the PHE pipeline and BioNumerics, except for the detection of stx2a and stx2c in strains that carry both subtypes.
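As an illustration of the interlaboratory comparison described in the abstract, the following minimal Python sketch computes per-field agreement between two sets of typing results for the same isolates. It is not the PHE pipeline or the BioNumerics tooling; the function names (`concordance`, `discordant`), the field names, and the four example records are all hypothetical and made up for the example, and they do not reproduce any data from the study. stx subtypes are stored as sets so that an isolate carrying more than one subtype (for example stx2a and stx2c) only counts as concordant when both laboratories report the full set.

```python
# Hypothetical sketch: per-field agreement between two laboratories' typing
# results for the same isolates. All names and records are illustrative only.

FIELDS = ("serotype", "st", "eae", "stx_subtype")


def concordance(reference, test, fields=FIELDS):
    """Return {field: percentage of shared isolates with identical calls}."""
    shared = sorted(set(reference) & set(test))
    if not shared:
        raise ValueError("no isolates in common between the two result sets")
    result = {}
    for field in fields:
        matches = sum(
            1 for iso in shared if reference[iso][field] == test[iso][field]
        )
        result[field] = 100.0 * matches / len(shared)
    return result


def discordant(reference, test, field):
    """Return the shared isolate IDs whose calls differ for one field."""
    shared = sorted(set(reference) & set(test))
    return [iso for iso in shared if reference[iso][field] != test[iso][field]]


# Made-up records for four isolates (not data from the study).
lab_a = {
    "iso1": {"serotype": "O157:H7", "st": 11, "eae": True,
             "stx_subtype": frozenset({"stx2a"})},
    "iso2": {"serotype": "O157:H7", "st": 11, "eae": True,
             "stx_subtype": frozenset({"stx2a", "stx2c"})},
    "iso3": {"serotype": "O26:H11", "st": 21, "eae": True,
             "stx_subtype": frozenset({"stx1a"})},
    "iso4": {"serotype": "O91:H14", "st": 33, "eae": False,
             "stx_subtype": frozenset({"stx1a"})},
}

# Second laboratory: identical except that only one of the two stx2
# subtypes is reported for iso2.
lab_b = {iso: dict(record) for iso, record in lab_a.items()}
lab_b["iso2"]["stx_subtype"] = frozenset({"stx2c"})

if __name__ == "__main__":
    for field, percent in concordance(lab_a, lab_b).items():
        print(f"{field}: {percent:.1f}% concordant")
    print("discordant stx calls:", discordant(lab_a, lab_b, "stx_subtype"))
```

Comparing subtype sets rather than single strings is a deliberate choice here: it makes a partial call on a strain carrying two subtypes show up as a discordance instead of being silently counted as a match.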

Keywords: Shiga toxin-producing Escherichia coli; whole-genome sequencing.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • DNA, Bacterial / genetics
  • Databases, Factual / standards*
  • England / epidemiology
  • Escherichia coli Infections / epidemiology
  • Escherichia coli Infections / microbiology*
  • Escherichia coli O157 / genetics
  • Genome, Bacterial / genetics*
  • Humans
  • Information Dissemination*
  • Molecular Epidemiology / standards*
  • Multilocus Sequence Typing
  • Serogroup
  • Shiga-Toxigenic Escherichia coli / genetics*
  • Shiga-Toxigenic Escherichia coli / isolation & purification
  • Whole Genome Sequencing / standards*

Substances

  • DNA, Bacterial