Using Machine Learning and Silver Nanoparticle-Based Surface-Enhanced Raman Spectroscopy for Classification of Cardiovascular Disease Biomarkers

ACS Appl Nano Mater. 2023 Aug 22;6(17):15385-15396. doi: 10.1021/acsanm.3c01442. eCollection 2023 Sep 8.

Abstract

Characterizing complex biofluids using surface-enhanced Raman spectroscopy (SERS) coupled with machine learning (ML) has been proposed as a powerful tool for point-of-care detection of clinical disease. ML is well-suited to categorizing otherwise uninterpretable, patient-derived SERS spectra that contain a multitude of low-concentration, disease-specific molecular biomarkers among a dense spectral background of biological molecules. However, ML can generate false, non-generalizable models when data sets used for model training are inadequate. It is thus critical to determine how different SERS experimental methodologies and workflow parameters can potentially impact ML disease classification of clinical samples. In this study, a label-free, broadband, Ag nanoparticle-based SERS platform was coupled with ML to assess simulated clinical samples for cardiovascular disease (CVD), containing randomized combinations of five key CVD biomarkers at clinically relevant concentrations in serum. Raman spectra obtained at 532, 633, and 785 nm from up to 300 unique samples were classified into physiological and pathological categories using two standard ML models. Label-free SERS and ML could correctly classify randomized CVD samples with high accuracies of up to 90.0% at 532 nm using as few as 200 training samples. Spectra obtained at 532 nm produced the highest accuracies with no significant increase achieved using multiwavelength SERS. Sample preparation and measurement methodologies (e.g., different SERS substrate lots, sample volumes, sample sizes, and known variations in randomization and experimental handling) were shown to strongly influence the ML classification and could artificially increase classification accuracies by as much as 27%. This detailed investigation into the proper application of ML techniques for CVD classification can lead to improved data set acquisition required for the SERS community, such that ML on labeled and robust SERS data sets can be practically applied for future point-of-care testing in patients.
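
The abstract notes that factors such as SERS substrate lot can artificially inflate classification accuracy. One common safeguard, shown here only as an illustrative sketch and not as the authors' procedure, is to hold out entire lots when splitting data, so that a classifier cannot score well merely by recognizing lot-specific artifacts. The function name `split_by_lot`, the lot labels, and the 25% test fraction are all hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def split_by_lot(lot_ids, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) so that no lot appears in both sets.

    lot_ids: one lot label per sample (hypothetical metadata).
    """
    # Group sample indices by the lot they were measured on.
    by_lot = defaultdict(list)
    for i, lot in enumerate(lot_ids):
        by_lot[lot].append(i)

    lots = sorted(by_lot)
    if len(lots) < 2:
        raise ValueError("need at least two lots to hold one out")

    # Shuffle lots (not samples) reproducibly, then reserve whole lots for testing.
    random.Random(seed).shuffle(lots)
    n_test = min(len(lots) - 1, max(1, round(len(lots) * test_fraction)))

    test_idx = sorted(i for lot in lots[:n_test] for i in by_lot[lot])
    train_idx = sorted(i for lot in lots[n_test:] for i in by_lot[lot])
    return train_idx, test_idx


# Hypothetical example: 12 samples measured on 4 substrate lots.
lot_ids = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D", "D"]
train_idx, test_idx = split_by_lot(lot_ids)
train_lots = {lot_ids[i] for i in train_idx}
test_lots = {lot_ids[i] for i in test_idx}
```

Comparing accuracy under this lot-wise split against accuracy under a purely random sample-wise split gives a rough indication of how much of a model's apparent performance depends on lot-specific effects rather than on biomarker signal.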