Rapid identification of Salmonella serovars by using Raman spectroscopy and machine learning algorithm

Talanta. 2023 Feb 1:253:123807. doi: 10.1016/j.talanta.2022.123807. Epub 2022 Sep 8.

Abstract

Foodborne illness is a widespread and escalating public health problem worldwide, and foodborne Salmonella infection is one of the most common causes of human illness. Raman spectroscopy was used to acquire spectral data for the three most pathogenic Salmonella serotypes. Because machine learning offers high efficiency and accuracy, we chose a convolutional neural network (CNN), which is well suited to multi-class classification problems, for in-depth mining and analysis of the Raman spectral data. To optimize the instrument parameters, we compared three laser wavelengths: 532, 638, and 785 nm. The 532 nm wavelength was ultimately chosen as the most effective for detecting Salmonella. A pre-processing step is necessary to remove background-noise interference from the Raman spectrum. Our study compared the effects of five spectral pre-processing methods, including Savitzky-Golay smoothing (SG), Multivariate Scatter Correction (MSC), Standard Normal Variate (SNV), and Hilbert Transform (HT), on the predictive power of CNN models. Four machine learning evaluation metrics, accuracy (ACC), precision, recall, and F1-score, were used to evaluate model performance under the different pre-processing methods. SG combined with SNV was found to be the most accurate spectral pre-processing method for predicting Salmonella serotypes from Raman spectra, achieving an accuracy of 98.7% on the training set and over 98.5% on the test set with the CNN model, higher than the other methods compared. In conclusion, the results of this study demonstrate that Raman spectroscopy combined with a convolutional neural network model enables the rapid identification of three Salmonella serotypes at the single-cell level, and that the model has considerable potential for distinguishing between different serotypes of pathogenic bacteria and closely related bacterial species. This is vital to preventing outbreaks of foodborne illness and the spread of foodborne pathogens.
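
The SG-plus-SNV pre-processing named in the abstract can be illustrated with a minimal sketch. The abstract does not report the smoothing window length or polynomial order, so the 5-point quadratic window used here is an assumption, as are the function names `sg_smooth`, `snv`, and `preprocess` and the choice of the sample standard deviation in SNV. This is not the authors' implementation; in practice a library routine such as `scipy.signal.savgol_filter` would normally replace the hand-written smoother.

```python
from statistics import mean, stdev

# Savitzky-Golay convolution weights for a 5-point window and a
# quadratic fit. The window size is an assumption, not from the paper.
SG_COEFFS_5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def sg_smooth(spectrum, coeffs=SG_COEFFS_5):
    """Smooth one spectrum with a Savitzky-Golay filter (mirror-padded edges)."""
    values = list(spectrum)
    half = len(coeffs) // 2
    if len(values) < len(coeffs):
        raise ValueError("spectrum is shorter than the smoothing window")
    # Reflect the ends so the output has the same length as the input.
    left = values[half:0:-1]
    right = values[-2:-half - 2:-1]
    padded = left + values + right
    return [
        sum(c * padded[i + k] for k, c in enumerate(coeffs))
        for i in range(len(values))
    ]


def snv(spectrum):
    """Standard Normal Variate: centre and scale a single spectrum."""
    values = list(spectrum)
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (an assumed choice)
    if sigma == 0:
        raise ValueError("flat spectrum cannot be SNV-scaled")
    return [(v - mu) / sigma for v in values]


def preprocess(spectrum):
    """Apply SG smoothing followed by SNV, the combination the abstract favours."""
    return snv(sg_smooth(spectrum))


if __name__ == "__main__":
    # Made-up intensity values, for illustration only.
    raw = [1.0, 2.0, 4.0, 8.0, 5.0, 3.0, 2.0, 1.0]
    print(preprocess(raw))
```

Each spectrum is processed independently, so the same `preprocess` call could be applied to every training and test spectrum before it is passed to a classifier.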

Keywords: Machine learning; Convolutional neural network; Salmonella serovars; Identification; Raman spectroscopy.

MeSH terms

  • Humans
  • Machine Learning*
  • Salmonella
  • Spectrum Analysis, Raman*