Optimized variable selection and machine learning models for olive oil quality assessment using portable near infrared spectroscopy

Spectrochim Acta A Mol Biomol Spectrosc. 2023 Dec 15:303:123213. doi: 10.1016/j.saa.2023.123213. Epub 2023 Jul 27.

Abstract

Olive oil is a key component of the Mediterranean diet, rich in antioxidants and beneficial monounsaturated fatty acids. As a result, high-quality olive oil is in great demand, and its price varies with its quality. Traditional chemical tests for assessing olive oil quality are expensive and time-consuming. To address these limitations, this study explores the use of near infrared spectroscopy (NIRS) to predict key quality parameters of olive oil: acidity, K232, and K270. To this end, a set of 200 olive oil samples was collected from various agricultural regions of Morocco, covering all three quality categories (extra virgin, virgin, and ordinary virgin). The findings have implications for reducing the time and cost of olive oil quality assessment. Reference values for the quality parameters were obtained by chemical analysis in accordance with international standards, while the spectra were acquired with a portable NIR spectrometer. Partial least squares regression (PLSR) was combined with various variable selection algorithms to relate wavelengths to the chemical data and predict the quality parameters accurately. Through this approach, the study aimed to enhance the efficiency and accuracy of olive oil quality assessment. The results show that NIRS combined with machine learning accurately predicted acidity: with iPLS methods for variable selection, the PLSR model reached a coefficient of determination R² = 0.94, a root mean square error RMSE = 0.32, and a ratio of performance to deviation RPD = 4.2 on the validation set. The use of variable selection methods also improved the quality of the prediction. For K232 and K270, NIRS showed moderate prediction performance, with R² between 0.60 and 0.75.
Overall, the results showed that acidity could be predicted with excellent accuracy, and K232 and K270 with moderate accuracy. Moreover, principal component analysis (PCA) made it possible to distinguish between the different quality groups of olive oil, and variable selection helped identify the wavelengths useful for predicting olive oil quality with a portable NIR spectrometer.
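
The three figures of merit quoted for the validation set (R², RMSE, and RPD) can be computed from reference values and model predictions as in the minimal sketch below. It is not the authors' code: the function name validation_metrics and the sample acidity values are hypothetical, and RPD is assumed here to be the sample standard deviation of the reference values divided by the RMSE, which is one common convention (some authors divide by the bias-corrected standard error of prediction instead).

```python
import math
from statistics import mean, stdev


def validation_metrics(y_true, y_pred):
    """Return (R2, RMSE, RPD) for reference values and model predictions.

    Assumption: RPD = sample standard deviation of y_true / RMSE.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    y_bar = mean(y_true)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("reference values are constant; R2 is undefined")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / len(y_true))
    # A perfect fit has zero error, so RPD is reported as infinite
    rpd = stdev(y_true) / rmse if rmse > 0 else math.inf
    return r2, rmse, rpd


if __name__ == "__main__":
    # Hypothetical acidity values (reference vs. predicted), for illustration only
    reference = [0.4, 0.8, 1.5, 2.1, 3.0]
    predicted = [0.5, 0.7, 1.6, 2.0, 2.8]
    r2, rmse, rpd = validation_metrics(reference, predicted)
    print(f"R2 = {r2:.2f}, RMSE = {rmse:.2f}, RPD = {rpd:.1f}")
```

Under this convention a larger RPD means the prediction error is small relative to the natural spread of the reference values, which is why it is reported alongside R² and RMSE.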

Keywords: Acidity; K232/K270; Machine learning; Near infrared spectroscopy; Olive oil quality; Variable selection.

MeSH terms

  • Agriculture
  • Antioxidants*
  • Least-Squares Analysis
  • Olive Oil / analysis
  • Spectroscopy, Near-Infrared* / methods

Substances

  • Olive Oil
  • Antioxidants