Raman spectroscopy combined with partial least squares (PLS) based on hybrid spectral preprocessing and backward interval PLS (biPLS) for quantitative analysis of four PAHs in oil sludge

Spectrochim Acta A Mol Biomol Spectrosc. 2024 Apr 5:310:123953. doi: 10.1016/j.saa.2024.123953. Epub 2024 Jan 24.

Abstract

Polycyclic aromatic hydrocarbons (PAHs) contained in the large amounts of oily sludge produced in petroleum and petrochemical production have become one of the main environmental protection concerns in the industry. The accurate determination of PAHs is of great significance in petroleum geochemistry and environmental protection. In this study, Raman spectroscopy combined with partial least squares (PLS), based on different hybrid spectral preprocessing methods and variable selection strategies, is proposed for the quantitative analysis of phenanthrene (Phe), fluoranthene (Flt), fluorene (Flu) and naphthalene (Nap) in oily sludge. First, PAHs in oily sludge were extracted by solid-liquid extraction with methanol as the extractant, and Raman spectra of 21 oily sludge samples were collected with a portable Raman spectrometer. Second, the influence of first derivative (D1st), wavelet transform (WT) and their hybrid spectral preprocessing on the predictive performance of the PLS calibration model was examined. Third, backward interval partial least squares (biPLS) was used to optimize the input variables before and after the hybrid spectral preprocessing, and the influence of the order in which biPLS and the hybrid spectral preprocessing are applied on the predictive performance of the PLS calibration model was examined. Finally, the predictive performance of the PLS calibration model was optimized according to the results of leave-one-out cross-validation (LOOCV). The results show that, for Flt, the biPLS-D1st-WT-PLS calibration model, which uses biPLS first to select the characteristic variables and then applies hybrid spectral preprocessing to those variables, has the better prediction performance (determination coefficient of prediction (R2P) = 0.9987 and mean relative error of prediction (MREP) = 0.0606). For Phe, Flu and Nap, the WT-biPLS-PLS calibration model predicts better (R2P of 0.9995, 0.9996 and 0.9983, and MREP of 0.0426, 0.0719 and 0.0497, respectively). Overall, portable Raman spectroscopy combined with a PLS calibration model based on different hybrid spectral preprocessing and variable selection strategies achieved good prediction results for the quantitative analysis of the four PAHs in oily sludge. Selecting the characteristic variables of the original spectra first, and then applying hybrid spectral preprocessing to those variables, is a new strategy that offers a new idea for establishing quantitative analysis methods for PAHs in oily sludge.
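
The two figures of merit quoted above, R2P and MREP, and the leave-one-out splitting can be illustrated with a minimal Python sketch. The abstract does not state the exact formulas, so the definitions below are assumptions: the usual coefficient of determination and the mean of absolute relative errors. The function names are hypothetical, and the model that would be refitted on each LOOCV split (PLS with the chosen preprocessing and biPLS-selected variables) is not shown.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition of R2P)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(y_true, y_pred):
    """Mean of |predicted - reference| / |reference| (assumed definition of MREP)."""
    return sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def leave_one_out_splits(n_samples):
    """Yield (training indices, held-out index) pairs, one pair per sample."""
    for held_out in range(n_samples):
        train = [i for i in range(n_samples) if i != held_out]
        yield train, held_out


if __name__ == "__main__":
    # Made-up concentrations for illustration only; not data from the study.
    reference = [2.0, 4.0, 6.0]
    predicted = [2.2, 3.6, 6.0]
    print(r2_score(reference, predicted))
    print(mean_relative_error(reference, predicted))
    # With 21 samples, as in the study, LOOCV gives 21 train/test splits.
    print(sum(1 for _ in leave_one_out_splits(21)))
```

Under these assumed definitions, an MREP of 0.0606 would correspond to an average relative deviation of about 6% between predicted and reference concentrations.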

Keywords: Backward interval partial least squares (biPLS); Hybrid spectral preprocessing; PAHs; Partial least squares (PLS); Raman spectroscopy.