Biomarker selection and a prospective metabolite-based machine learning diagnostic for Lyme disease

Sci Rep. 2022 Jan 27;12(1):1478. doi: 10.1038/s41598-022-05451-0.

Abstract

We provide a pipeline for data preprocessing, biomarker selection, and classification of liquid chromatography-mass spectrometry (LCMS) serum samples to generate a prospective diagnostic test for Lyme disease. We use machine learning (ML) tools, including sparse support vector machines (SSVM), iterative feature removal (IFR), and k-fold feature ranking, to select several biomarkers and build a discriminant model for Lyme disease. Our model achieves a balanced success rate (BSR) of 98.13% on a sequestered test set of LCMS serum samples. The methodology is general and can be readily adapted to other LCMS or metabolomics data sets.
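
The abstract does not define the balanced success rate. A minimal sketch follows, assuming BSR means the average of the per-class success rates (per-class recall), which is the usual reading of the term. The function name `balanced_success_rate` and the toy labels are hypothetical illustrations and are not taken from the study or its data.

```python
def balanced_success_rate(y_true, y_pred):
    """Average of per-class recall over the classes present in y_true.

    Assumption: this is what "balanced success rate" refers to; the
    abstract itself does not spell out the formula.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    classes = sorted(set(y_true))
    if not classes:
        raise ValueError("y_true must not be empty")

    recalls = []
    for c in classes:
        # Indices of the samples whose true label is class c.
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Toy, made-up labels: 4 "lyme" samples and 6 "healthy" samples.
y_true = ["lyme"] * 4 + ["healthy"] * 6
y_pred = ["lyme", "lyme", "lyme", "healthy"] + ["healthy"] * 6

bsr = balanced_success_rate(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"BSR = {bsr:.3f}, plain accuracy = {accuracy:.3f}")
```

On these toy labels the per-class rates are 3/4 and 6/6, so the BSR is 0.875 while plain accuracy is 0.9. Averaging per class keeps the larger class from dominating the score, which matters when cases and controls are unevenly represented.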

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Biomarkers / blood
  • Biomarkers / metabolism
  • Case-Control Studies
  • Chromatography, High Pressure Liquid / methods
  • Datasets as Topic
  • Healthy Volunteers
  • Humans
  • Lyme Disease / blood
  • Lyme Disease / diagnosis*
  • Mass Spectrometry / methods
  • Metabolomics / methods*
  • Support Vector Machine

Substances

  • Biomarkers