A data fusion model merging information from near infrared spectroscopy and X-ray fluorescence. Searching for atomic-molecular correlations to predict and characterize the composition of coffee blends

Food Chem. 2020 Apr 30;325:126953. doi: 10.1016/j.foodchem.2020.126953. Online ahead of print.

Abstract

This article develops and validates a multivariate model for quantifying Robusta-Arabica coffee blends by combining near infrared spectroscopy (NIRS) and total reflection X-ray fluorescence (TXRF). To this end, 80 coffee blends (0.0-33.0%) were formulated. NIR spectra were acquired in the wavenumber range 11100-4950 cm⁻¹, and 14 elements were determined by TXRF. Partial least squares (PLS) models were built using low- and mid-level data fusion. In addition, selection of predictive variables based on their importance indices (SVPII) improved the results. The best model reduced the number of variables from 1114 to 75 and the root mean square error of prediction (RMSEP) from 4.1% to 1.7%. SVPII selected NIR regions correlated with coffee components, together with the following elements: Ti, Mn, Fe, Cu, Zn, Br, Rb, Sr. The model interpretation took advantage of the fusion of atomic and molecular spectra to characterize the differences between these coffee varieties.
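
As an illustration only, low-level data fusion is commonly implemented by preprocessing each block separately and then concatenating the blocks sample by sample before regression. The sketch below is not the authors' procedure: the autoscaling step, the 1/sqrt(number of variables) block weighting, the function names (autoscale, low_level_fusion, rmsep) and the toy numbers are all assumptions made for this example. It also shows how a root mean square error of prediction, the figure of merit quoted in the abstract, is computed.

```python
import math
from statistics import mean, pstdev


def autoscale(block):
    """Mean-center each column of a samples-by-variables block and scale it to unit variance."""
    scaled_cols = []
    for col in zip(*block):
        mu = mean(col)
        sd = pstdev(col)
        # A constant column carries no information; map it to zeros.
        scaled_cols.append([(v - mu) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def low_level_fusion(nir, txrf):
    """Concatenate two autoscaled blocks row by row.

    Assumption of this sketch: each block is weighted by 1/sqrt(n_variables)
    so that the much wider NIR block does not swamp the TXRF block.
    """
    if len(nir) != len(txrf):
        raise ValueError("both blocks must describe the same samples")
    w_nir = 1.0 / math.sqrt(len(nir[0]))
    w_txrf = 1.0 / math.sqrt(len(txrf[0]))
    fused = []
    for a, b in zip(autoscale(nir), autoscale(txrf)):
        fused.append([w_nir * v for v in a] + [w_txrf * v for v in b])
    return fused


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a test set."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


# Made-up toy data: 3 samples, 3 "NIR" variables and 2 "TXRF" variables.
nir = [[0.10, 0.20, 0.30], [0.20, 0.10, 0.40], [0.30, 0.40, 0.50]]
txrf = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]]
fused = low_level_fusion(nir, txrf)  # 3 rows, 5 fused variables each
```

The fused matrix would then be passed to a regression method such as PLS. Mid-level fusion differs in that features extracted from each block (for example, scores or selected variables) are concatenated instead of the full preprocessed blocks.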

Keywords: Atomic spectroscopy; Coffee quality; Data fusion; PLS; Variable selection; Vibrational spectroscopy.