Identifying non-O157 Shiga toxin-producing Escherichia coli (STEC) using deep learning methods with hyperspectral microscope images

Spectrochim Acta A Mol Biomol Spectrosc. 2020 Jan 5;224:117386. doi: 10.1016/j.saa.2019.117386. Epub 2019 Jul 16.

Abstract

Non-O157 Shiga toxin-producing Escherichia coli (STEC) serogroups such as O26, O45, O103, O111, O121 and O145 often cause illness in people in the United States, and conventional identification of these "Big-Six" serogroups is complex. The label-free hyperspectral microscope imaging (HMI) method, which provides spectral "fingerprint" information of bacterial cells, was employed to classify serogroups at the cellular level. In spectral analysis, principal component analysis (PCA) and stacked auto-encoder (SAE) methods were used to extract principal spectral features for the classification task. Based on these features, multiple classifiers including linear discriminant analysis (LDA), support vector machine (SVM) and soft-max regression (SR) were evaluated. Datasets of different sizes were also tested in search of suitable classification models. SAE-based classification models performed better than PCA-based models, with classification accuracies of 93.5% (SAE-LDA), 94.9% (SAE-SVM) and 94.6% (SAE-SR). In contrast, the PCA-based methods PCA-LDA, PCA-SVM and PCA-SR achieved only 75.5%, 85.7% and 77.1%, respectively. The results also suggested that increasing the number of training samples has a positive effect on the classification models. With the larger dataset, the SAE-SR classification model ultimately performed better than the others, with an average accuracy of 94.9% in classifying STEC serogroups. Specifically, the O103 serogroup was classified with the highest accuracy of 97.4%, followed by O111 (96.5%), O26 (95.3%), O121 (95%), O145 (92.9%) and O45 (92.4%). Thus, HMI technology coupled with the SAE-SR classification model has potential for "Big-Six" identification.
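
For readers unfamiliar with soft-max regression (SR), the final classification stage named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the two-dimensional synthetic points stand in for SAE-encoded spectral features, only three classes are used instead of six serogroups, and every name, the learning rate, and the epoch count are hypothetical choices made for illustration. It also reports accuracy per class, in the same spirit as the per-serogroup figures above.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, biases, x):
    """Class probabilities for one feature vector x."""
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
              for w, b in zip(weights, biases)]
    return softmax(scores)


def train_softmax_regression(features, labels, n_classes, lr=0.1, epochs=500):
    """Full-batch gradient descent on the mean cross-entropy loss."""
    n, d = len(features), len(features[0])
    weights = [[0.0] * d for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        grad_w = [[0.0] * d for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, y in zip(features, labels):
            probs = predict_proba(weights, biases, x)
            for k in range(n_classes):
                # Gradient of cross-entropy w.r.t. score k: p_k - 1[k == y]
                err = probs[k] - (1.0 if k == y else 0.0)
                grad_b[k] += err
                for j in range(d):
                    grad_w[k][j] += err * x[j]
        for k in range(n_classes):
            biases[k] -= lr * grad_b[k] / n
            for j in range(d):
                weights[k][j] -= lr * grad_w[k][j] / n
    return weights, biases


def per_class_accuracy(weights, biases, features, labels, n_classes):
    """Fraction of samples of each true class that are predicted correctly."""
    correct = [0] * n_classes
    counts = [0] * n_classes
    for x, y in zip(features, labels):
        probs = predict_proba(weights, biases, x)
        predicted = probs.index(max(probs))
        counts[y] += 1
        correct[y] += int(predicted == y)
    return [c / n for c, n in zip(correct, counts)]


def make_toy_features(rng, centers, per_class, spread=0.3):
    """Synthetic stand-in for encoded spectral features (an assumption)."""
    features, labels = [], []
    for label, center in enumerate(centers):
        for _ in range(per_class):
            features.append([rng.gauss(c, spread) for c in center])
            labels.append(label)
    return features, labels


rng = random.Random(0)  # fixed seed so the toy run is repeatable
CENTERS = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]  # hypothetical class centers
train_x, train_y = make_toy_features(rng, CENTERS, per_class=30)
test_x, test_y = make_toy_features(rng, CENTERS, per_class=30)

weights, biases = train_softmax_regression(train_x, train_y, n_classes=3)
accuracies = per_class_accuracy(weights, biases, test_x, test_y, n_classes=3)
print("per-class accuracy:", accuracies)
print("mean accuracy:", sum(accuracies) / len(accuracies))
```

In a pipeline like the one described, the toy features would be replaced by the low-dimensional codes produced by PCA or by the SAE encoder, and the mean of the per-class accuracies would correspond to an unweighted average over serogroups.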

Keywords: Classification; Food safety; Foodborne bacteria; Machine learning; Optical method; Stacked auto-encoder.

MeSH terms

  • Algorithms
  • Bacterial Typing Techniques / methods*
  • Deep Learning*
  • Food Microbiology
  • Foodborne Diseases / microbiology
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • Microscopy / methods*
  • Optical Imaging / methods
  • Principal Component Analysis
  • Shiga-Toxigenic Escherichia coli* / chemistry
  • Shiga-Toxigenic Escherichia coli* / classification