Rapid discrimination of Bifidobacterium longum subspecies based on MALDI-TOF MS and machine learning

Front Microbiol. 2023 Dec 4:14:1297451. doi: 10.3389/fmicb.2023.1297451. eCollection 2023.

Abstract

Although MALDI-TOF mass spectrometry (MS) is widely known as a rapid and cost-effective reference method for identifying microorganisms, its commercial databases face limitations in accurately distinguishing specific subspecies of Bifidobacterium. This study aimed to explore the potential of MALDI-TOF MS protein profiles, coupled with prediction methods, to differentiate between Bifidobacterium longum subsp. infantis (B. infantis) and Bifidobacterium longum subsp. longum (B. longum). The investigation involved the analysis of mass spectra of 59 B. longum strains and 41 B. infantis strains, leading to the identification of five distinct biomarker peaks, specifically at m/z 2,929, 4,408, 5,381, 5,394, and 8,817, using Recurrent Feature Elimination (RFE). To facilate classification between B. longum and B. infantis based on the mass spectra, machine learning models were developed, employing algorithms such as logistic regression (LR), random forest (RF), and support vector machine (SVM). The evaluation of the mass spectrometry data showed that the RF model exhibited the highest performace, boasting an impressive AUC of 0.984. This model outperformed other algorithms in terms of accuracy and sensitivity. Furthermore, when employing a voting mechanism on multi-mass spectrometry data for strain identificaton, the RF model achieved the highest accuracy of 96.67%. The outcomes of this research hold the significant potential for commercial applications, enabling the rapid and precise discrimination of B. longum and B. infantis using MALDI-TOF MS in conjunction with machine learning. Additionally, the approach proposed in this study carries substantial implications across various industries, such as probiotics and pharmaceuticals, where the precise differentiation of specific subspecies is essential for product development and quality control.

Keywords: B. infantis; B. longum; Bifidobacterium longum subspecies; MALDI-TOF MS; identification; machine learning.

Grants and funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.