Significance: Bio-Raman spectroscopy requires standardized data processing approaches so that spectral data acquired by different research groups, and with different systems, can be compared on an equal footing.
Aim: An open-sourced data processing software package was developed that implements algorithms for all steps required to isolate the inelastic scattering component from signals acquired using Raman spectroscopy devices. The package includes a novel morphological baseline removal technique (BubbleFill) that adapts better to complex baseline shapes than current gold-standard techniques. The package also incorporates a versatile tool for simulating spectroscopic data with varying Raman signal-to-background ratios, baselines of different morphologies, and varying levels of stochastic noise.
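To make the simulation idea concrete, the following is a minimal sketch of how a synthetic spectrum with a controllable signal-to-background ratio, baseline, and noise level could be generated. It is not the package's simulation tool: the function name `simulate_spectrum`, its parameters, the peak positions, the polynomial baseline, and the definition of the signal-to-background ratio as the ratio of the two maxima are all assumptions made for illustration.

```python
import numpy as np


def simulate_spectrum(n_points=800, sbr=0.1, noise_std=0.01, seed=0):
    """Return (wavenumber, raman, baseline, total) for one synthetic spectrum.

    Hypothetical helper: every name, peak position, and default value here
    is an illustrative assumption, not taken from the published package.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(400.0, 1800.0, n_points)  # wavenumber axis in cm^-1

    # Raman component: a sum of Lorentzian peaks with arbitrary centers/widths.
    centers = [1004.0, 1300.0, 1450.0, 1660.0]
    widths = [6.0, 15.0, 12.0, 18.0]
    raman = np.zeros_like(x)
    for c, w in zip(centers, widths):
        raman += 1.0 / (1.0 + ((x - c) / w) ** 2)

    # Baseline: a smooth low-order polynomial standing in for the
    # fluorescence background (one possible morphology among many).
    t = (x - x.min()) / (x.max() - x.min())
    baseline = 1.0 + 0.8 * t - 1.2 * t ** 2

    # Assumed SBR definition: max(raman) / max(baseline) == sbr.
    raman *= sbr * baseline.max() / raman.max()

    # Stochastic noise: zero-mean Gaussian with the requested standard deviation.
    noise = rng.normal(0.0, noise_std, size=x.shape)
    return x, raman, baseline, raman + baseline + noise
```

Because the Raman and baseline components are returned separately, the output of any baseline removal algorithm applied to the total signal can be compared directly against the known ground truth.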
Results: Applied to simulated data, the BubbleFill technique demonstrated superior baseline removal performance compared to standard algorithms, including iModPoly and MorphBR. The data processing workflow of the open-sourced package was validated on four independent in-human datasets, demonstrating that it leads to data compatibility across systems.
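For context on the polynomial family of comparison methods, the sketch below shows a simplified iterative polynomial baseline fit: a polynomial is fitted, the signal is clipped to the fit wherever it lies above it, and the fit is repeated until the clipping no longer changes the signal. It is neither BubbleFill nor the package's implementation, and it omits the refinements that distinguish iModPoly; the function name `iterative_poly_baseline` and its parameters are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def iterative_poly_baseline(y, order=3, max_iter=100, tol=1e-4):
    """Estimate a smooth baseline by repeated polynomial fitting and clipping.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    """
    y = np.asarray(y, dtype=float)
    t = np.linspace(-1.0, 1.0, y.size)  # rescaled axis keeps the fit well conditioned
    work = y.copy()
    fit = work.copy()
    for _ in range(max_iter):
        coeffs = P.polyfit(t, work, order)
        fit = P.polyval(t, coeffs)
        # Points above the fit are treated as peaks and pulled down onto it.
        clipped = np.minimum(work, fit)
        change = np.linalg.norm(work - clipped)
        work = clipped
        # Stop once clipping barely alters the working signal.
        if change <= tol * np.linalg.norm(y):
            break
    return fit
```

A single global polynomial of fixed order is the main limitation of this kind of approach: it cannot follow baselines whose curvature varies strongly along the spectral axis, which is the situation morphological techniques are meant to handle.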
Conclusions: A new open-sourced spectroscopic data pre-processing package was validated on simulated and real-world in-human data and is now available to researchers and clinicians for the development of new clinical applications using Raman spectroscopy.
Keywords: Raman spectroscopy; fluorescence; machine learning; open-sourced software; optics; tissue optics.
© 2023 The Authors.