Near-Infrared Spectroscopy Coupled Chemometric Algorithms for Rapid Origin Identification and Lipid Content Detection of Pinus Koraiensis Seeds

Sensors (Basel). 2020 Aug 30;20(17):4905. doi: 10.3390/s20174905.

Abstract

Lipid content is an important indicator of the edible and breeding value of Pinus koraiensis seeds. Differences in origin affect the lipid content of the inner kernel, and neither origin nor lipid content can be judged from appearance or morphology. Traditional chemical methods are small-scale, time-consuming, labor-intensive, costly, and laboratory-dependent. In this study, near-infrared (NIR) spectroscopy combined with chemometrics was used to identify the origin and determine the lipid content of P. koraiensis seeds. Principal component analysis (PCA), wavelet transformation (WT), Monte Carlo (MC), and uninformative variable elimination (UVE) methods were used to process the spectral data, and the prediction models were established with partial least squares (PLS). Models were evaluated by R² for the calibration and prediction sets, root mean square error of cross-validation (RMSECV), and root mean square error of prediction (RMSEP). For origin identification, two-dimensional input data produced a faster and more accurate PLS model, with accuracies of 98.75% and 97.50% for the calibration and prediction sets, respectively. For lipid content, the WT-MC-UVE-PLS regression model gave the best predictions when the 'bior4.4' wavelet filter with Donoho thresholding was selected. The R² values for the calibration and prediction sets were 0.9485 and 0.9369, and the RMSECV and RMSEP were 0.0098 and 0.0390, respectively. NIR technology combined with chemometric algorithms can be used to characterize P. koraiensis seeds.
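The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the common definitions, RMSE = sqrt(mean((y - ŷ)²)) and R² = 1 - SS_res / SS_tot; the abstract does not state which R² definition was used (some chemometrics work reports the squared correlation coefficient instead). The function names and the sample values are hypothetical and are not data from the study. The same `rmse` function gives RMSEP when applied to a held-out prediction set and RMSECV when applied to pooled cross-validation predictions.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("reference values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical lipid fractions for illustration only (not from the paper).
reference = [0.60, 0.62, 0.58, 0.65, 0.61]
predicted = [0.61, 0.61, 0.59, 0.64, 0.62]

print("RMSEP:", rmse(reference, predicted))
print("R2:", r_squared(reference, predicted))
```

Reporting both metrics is useful because R² depends on the spread of the reference values, while RMSE is expressed in the units of the measured quantity.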

Keywords: NIR spectroscopy; Pinus koraiensis seeds; chemometric algorithms; feature selection; preprocessing.

MeSH terms

  • Algorithms
  • Calibration
  • Least-Squares Analysis
  • Lipids / analysis*
  • Pinus*
  • Plant Breeding
  • Seeds
  • Spectroscopy, Near-Infrared*

Substances

  • Lipids