Rapid on-site identification of geographical origin and storage age of tangerine peel by near-infrared spectroscopy

Spectrochim Acta A Mol Biomol Spectrosc. 2022 Apr 15:271:120936. doi: 10.1016/j.saa.2022.120936. Epub 2022 Jan 29.

Abstract

The feasibility of identifying the geographical origin and storage age of tangerine peel was explored using a handheld near-infrared (NIR) spectrometer combined with machine learning. A handheld NIR spectrometer (900-1700 nm) was used to scan the outer surface of tangerine peel and collect the corresponding NIR diffuse reflectance spectra. Principal component analysis (PCA) combined with Mahalanobis distance was used to detect outliers. The accuracies of all models on the anomaly set were much lower than those on the calibration and test sets, indicating that the outliers were effectively identified. After removing the outliers, PCA was performed on tangerine peels from different origins, and on peels from the same origin with different storage ages, to initially explore their clustering characteristics. The results showed that tangerine peels from the same origin or of the same storage age showed a tendency to cluster, indicating that their spectral data shared a certain similarity, which laid the foundation for subsequent modeling and identification. However, quite a few samples of different origins or different storage ages overlapped and could not be distinguished from each other. To achieve qualitative identification of origin and storage age, Savitzky-Golay convolution smoothing with first derivative (SGFD) and standard normal variate (SNV) were used to preprocess the raw spectra. Random forest (RF), K-nearest neighbor (KNN) and linear discriminant analysis (LDA) were used to establish the discriminant models. The results showed that SGFD-LDA could accurately distinguish the origin and storage age of tangerine peel at the same time. The origin identification accuracy was 96.99%. The storage age identification accuracy was 100% for Guangdong tangerine peel and 97.15% for Sichuan tangerine peel. This indicated that near-infrared spectroscopy (NIRS) combined with machine learning can simultaneously and rapidly identify the origin and storage age of tangerine peel on site.
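
The outlier screening and spectral preprocessing steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the window length of 11, the polynomial order of 2, the three principal components, the distance cutoff of 3.0 and the synthetic spectra are all assumptions chosen for illustration, since the abstract does not report these settings.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def sg_first_derivative(spectra, window=11, polyorder=2):
    """Savitzky-Golay first derivative along the wavelength axis.

    A polynomial is fitted by least squares in each window and its slope at
    the window center is returned (per index step). Edge points that lack a
    full window are dropped. `window` and `polyorder` are assumed values.
    """
    half = window // 2
    offsets = np.arange(-half, half + 1)
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 1 of the pseudo-inverse holds the weights of the linear coefficient,
    # which equals the first derivative at the window center.
    weights = np.linalg.pinv(design)[1]
    # np.convolve flips its kernel, so reverse the weights to correlate.
    return np.apply_along_axis(
        lambda row: np.convolve(row, weights[::-1], mode="valid"), 1, spectra
    )


def pca_mahalanobis_outliers(spectra, n_components=3, threshold=3.0):
    """Flag spectra whose PCA scores lie far from the score centroid.

    Returns (distances, is_outlier). `n_components` and `threshold` are
    illustrative assumptions, not values from the paper.
    """
    centered = spectra - spectra.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    # PCA scores are uncorrelated, so the Mahalanobis distance reduces to a
    # Euclidean distance on scores scaled by their standard deviations.
    scaled = scores / scores.std(axis=0, ddof=1)
    distances = np.sqrt((scaled ** 2).sum(axis=1))
    return distances, distances > threshold


# Synthetic stand-in for real NIR data: 60 spectra x 100 wavelength points.
rng = np.random.default_rng(0)
wavelengths = np.linspace(900, 1700, 100)
baseline = np.exp(-(((wavelengths - 1450) / 120) ** 2))
spectra = baseline + 0.01 * rng.normal(size=(60, 100))
spectra[0] += 0.5  # one deliberately anomalous scan

# Screen outliers on the raw spectra, then preprocess the remaining ones.
distances, is_outlier = pca_mahalanobis_outliers(spectra)
clean = spectra[~is_outlier]
sgfd = sg_first_derivative(clean)  # SGFD-preprocessed spectra
snv_spectra = snv(clean)           # SNV-preprocessed spectra
```

Either preprocessed matrix could then be passed, together with origin or storage-age labels, to a discriminant model; scikit-learn's `LinearDiscriminantAnalysis` would be one possible choice for the LDA step.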

Keywords: Machine learning; Near-infrared spectroscopy; Origin; Storage age; Tangerine peel.

MeSH terms

  • Calibration
  • Discriminant Analysis
  • Geography
  • Principal Component Analysis
  • Spectroscopy, Near-Infrared* / methods