Development of PCA-MLP Model Based on Visible and Shortwave Near Infrared Spectroscopy for Authenticating Arabica Coffee Origins

Foods. 2023 May 24;12(11):2112. doi: 10.3390/foods12112112.

Abstract

Arabica coffee, one of Indonesia's economically important coffee commodities, is commonly subject to fraud through mislabeling and adulteration. In many studies, spectroscopic techniques have been combined with chemometric methods, such as principal component analysis (PCA) and discriminant analyses, far more widely than with machine learning models to address classification problems. In this study, spectroscopy combined with PCA and a machine learning algorithm (artificial neural network, ANN) was used to verify the authenticity of Arabica coffee collected from four geographical origins in Indonesia: Temanggung, Toraja, Gayo, and Kintamani. Spectra of pure green coffee were collected with visible-near infrared (Vis-NIR) and shortwave near infrared (SWNIR) spectrometers. Several preprocessing techniques were applied to extract precise information from the spectroscopic data. PCA first compressed the spectroscopic information into new variables, the principal component (PC) scores, which then served as inputs to the ANN model. Arabica coffee from the different origins was discriminated with a multilayer perceptron (MLP)-based ANN model. The accuracy attained ranged from 90% to 100% in the internal cross-validation, training, and testing sets, so the classification error did not exceed 10%. The MLP combined with PCA generalized well and proved suitable for verifying the origin of Arabica coffee.
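The workflow summarized in the abstract (preprocess the spectra, compress them with PCA, classify the PC scores with an MLP) can be illustrated with a minimal sketch. It is not the authors' implementation. The spectra are synthetic, and the scikit-learn pipeline, the standardization step standing in for the unspecified preprocessing, the number of principal components, the hidden-layer size, and the train/test split ratio are all assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 40 "spectra" per origin, 200 wavelengths.
# Each origin gets a Gaussian absorption band at a different position plus noise.
rng = np.random.default_rng(0)
origins = ["Temanggung", "Toraja", "Gayo", "Kintamani"]
n_per_class, n_wavelengths = 40, 200
wavelengths = np.linspace(0.0, 1.0, n_wavelengths)

X_parts, y_parts = [], []
for label, center in enumerate([0.2, 0.4, 0.6, 0.8]):
    band = np.exp(-((wavelengths - center) ** 2) / 0.01)
    noise = rng.normal(scale=0.05, size=(n_per_class, n_wavelengths))
    X_parts.append(band + noise)
    y_parts.append(np.full(n_per_class, label))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)

# Hold out a stratified testing set (30% is an assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Preprocessing -> PCA scores -> MLP classifier, chained in one pipeline so the
# scaler and PCA are fitted on training folds only during cross-validation.
model = make_pipeline(
    StandardScaler(),            # assumed stand-in for the paper's preprocessing
    PCA(n_components=5),         # assumed number of PCs fed to the network
    MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0),
)

# Internal cross-validation on the training set, then a final fit and test.
cv_accuracy = cross_val_score(model, X_train, y_train, cv=5).mean()
model.fit(X_train, y_train)
train_accuracy = model.score(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

print(f"cross-validation accuracy: {cv_accuracy:.2f}")
print(f"training accuracy:         {train_accuracy:.2f}")
print(f"testing accuracy:          {test_accuracy:.2f}")
```

Wrapping all three steps in a single pipeline keeps the PCA loadings from being estimated on samples that are later used for validation, which would otherwise inflate the reported cross-validation accuracy.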

Keywords: Arabica coffee; authentication; multilayer perceptron; principal component analysis; spectroscopy.