Application for Identifying the Origin and Predicting the Physiologically Active Ingredient Contents of Gastrodia elata Blume Using Visible-Near-Infrared Spectroscopy Combined with Machine Learning

Foods. 2023 Nov 8;12(22):4061. doi: 10.3390/foods12224061.

Abstract

Gastrodia elata (G. elata) Blume is widely used as a health product with significant economic, medicinal, and ecological value. Because geographical origin, soil pH, and organic matter content vary, the contents of physiologically active ingredients in G. elata may differ between origins. Rapid methods for predicting the geographical origin and the contents of these ingredients are therefore important for the market. This paper proposes a method that combines visible-near-infrared (Vis-NIR) spectroscopy with machine learning. A variety of machine learning models were benchmarked against a one-dimensional convolutional neural network (1D-CNN) in terms of accuracy. For origin identification, the 1D-CNN achieved an F1 score of 1.0000, correctly identifying all 11 origins. For quantitative prediction, the 1D-CNN outperformed the other three algorithms. On the prediction set, the root mean square errors of prediction (RMSEP) for the eight physiologically active ingredient measures, namely GA, HA, PE, PB, PC, PA, GA + HA, and total, were 0.2881, 0.0871, 0.3387, 0.2485, 0.0761, 0.7027, 0.3664, and 1.2965, respectively. The corresponding coefficients of determination for prediction (Rp²) were 0.9278, 0.9321, 0.9433, 0.9094, 0.9454, 0.9282, 0.9173, and 0.9323, respectively. This study demonstrated that the 1D-CNN can describe non-linear relationships with high accuracy. The proposed combination of Vis-NIR spectroscopy with 1D-CNN models has significant potential in the quality evaluation of G. elata.
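
For readers unfamiliar with the two regression metrics reported above, the following is a minimal sketch of how RMSEP and the prediction-set coefficient of determination are conventionally computed. The function names and the sample values are illustrative assumptions and are not taken from the study; the abstract does not state the exact formulas or software the authors used.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a held-out prediction set."""
    n = len(y_true)
    squared_errors = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_errors / n)


def r2_prediction(y_true, y_pred):
    """Coefficient of determination on the prediction set (conventional definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and predicted contents (made-up numbers, not study data).
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(f"RMSEP = {rmsep(reference, predicted):.4f}")
print(f"Rp2   = {r2_prediction(reference, predicted):.4f}")
```

A lower RMSEP and an Rp² closer to 1 both indicate that predicted contents track the reference measurements more closely.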

Keywords: 1D-CNN; Gastrodia elata Blume; geographical origin; physiologically active ingredients; visible–near-infrared spectroscopy.