Fast and automated biomarker detection in breath samples with machine learning

PLoS One. 2022 Apr 12;17(4):e0265399. doi: 10.1371/journal.pone.0265399. eCollection 2022.

Abstract

Volatile organic compounds (VOCs) in human breath can reveal a wide spectrum of health conditions and can be used for fast, accurate and non-invasive diagnostics. Gas chromatography-mass spectrometry (GC-MS) is used to measure VOCs, but its application is limited by expert-driven data analysis that is time-consuming, subjective and may introduce errors. We propose a machine learning-based system for GC-MS data analysis that exploits the pattern-recognition ability of deep learning to learn and automatically detect VOCs directly from raw data, thus bypassing expert-led processing. We evaluate this new approach on clinical samples with four types of convolutional neural networks (CNNs): VGG16, VGG-like, densely connected and residual CNNs. The proposed machine learning methods outperformed the expert-led analysis, detecting a significantly higher number of VOCs in a fraction of the time while maintaining high specificity. These results suggest that the proposed approach can help the large-scale deployment of breath-based diagnosis by reducing time and cost, and by increasing accuracy and consistency.
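The comparison in the abstract rests on two quantities: the share of VOCs actually present that a method detects (sensitivity) and the share of absent VOCs that it correctly leaves unreported (specificity). The sketch below shows only how these standard metrics are computed. It is not the authors' evaluation code; the function name and the toy labels are hypothetical and carry no results from the study.

```python
def detection_metrics(truth, predicted):
    """Return (sensitivity, specificity) for paired boolean labels.

    truth[i]     -- True if the VOC is really present in candidate i
    predicted[i] -- True if the method reported the VOC in candidate i
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # present, detected
    fn = sum(1 for t, p in pairs if t and not p)      # present, missed
    tn = sum(1 for t, p in pairs if not t and not p)  # absent, not reported
    fp = sum(1 for t, p in pairs if not t and p)      # absent, falsely reported
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up toy labels for illustration only (not data from the paper).
truth  = [True, True, True, True, False, False, False, False]
expert = [True, True, False, False, False, False, False, False]
model  = [True, True, True, False, False, False, False, False]

print("expert:", detection_metrics(truth, expert))
print("model: ", detection_metrics(truth, model))
```

In this toy case the second method finds more of the VOCs that are present while reporting no false positives, which is the pattern the abstract describes as detecting more VOCs while maintaining high specificity.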

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers / analysis
  • Breath Tests* / methods
  • Gas Chromatography-Mass Spectrometry / methods
  • Humans
  • Machine Learning
  • Volatile Organic Compounds* / analysis

Substances

  • Biomarkers
  • Volatile Organic Compounds

Grants and funding

The study was partially funded by the EU H2020 TOXI-Triage Project #653409. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.