Semi-Supervised Deep Learning-Based Multi-component Spectral Calibration Modeling for UV-vis and Near-Infrared Spectroscopy without Information Loss

Anal Chem. 2023 Sep 12;95(36):13446-13455. doi: 10.1021/acs.analchem.3c01132. Epub 2023 Aug 28.

Abstract

Spectral analysis is an important method for characterizing and identifying chemical species. However, quantitative spectral analysis of multiple chemical properties in real-world samples remains challenging because spectral features are strongly correlated, noisy, and heavily overlapping. Here, we present NICEM, a new semi-supervised spectral calibration method based on information-lossless decoupling of spectral features. To separate and extract key latent features, the method uses the flow-based model non-linear independent component estimation (NICE) to learn the sample distribution. The reversible structure of the deep network transforms the spectral data, without information loss, into independent latent variables that obey a Gaussian distribution, which exposes the essential properties of the data and achieves a nonlinear decomposition of the features. Moreover, the association between the latent feature variables and the target attributes is evaluated with the maximum mutual information coefficient, so that the adverse effects of irrelevant information in the latent variable space are eliminated and key information is retained. Because the latent variables are independent in each dimension, NICEM makes it easier to establish an accurate semi-supervised multi-component calibration model, even for highly overlapping and complex spectral data. The applicability of the proposed spectral modeling method is demonstrated on three ultraviolet-visible and near-infrared spectral data sets covering 15 physical and chemical properties of diesel fuels, corn, and multi-metal ion solutions. Results show that NICEM achieves the highest coefficient of determination (R2) and significantly improves extrapolation compared with seven state-of-the-art methods. The proposed method is intuitive because it obviates complex feature engineering and prior knowledge, and it is a promising spectral calibration tool for quantitative analysis in other spectroscopy applications.
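
The reversible, information-preserving transformation described above can be illustrated with an additive coupling layer, the basic building block of NICE. The following is a minimal sketch in plain Python, not the authors' implementation: the function names (make_shift_fn, coupling_forward, coupling_inverse), the small fixed random network standing in for a trained one, and the example vector are all hypothetical. It shows only why such a layer loses no information: half of the input passes through unchanged, and the shift applied to the other half can be recomputed and subtracted exactly.

```python
import math
import random


def make_shift_fn(in_dim, out_dim, hidden=8, seed=0):
    """Build a small fixed random MLP m: R^in_dim -> R^out_dim.

    Hypothetical stand-in for the trained shift network of a coupling layer.
    """
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(hidden)]
    w2 = [[rng.gauss(0.0, 1.0) for _ in range(hidden)] for _ in range(out_dim)]

    def m(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        return [sum(w * hi for w, hi in zip(row, h)) for row in w2]

    return m


def coupling_forward(x, m):
    """Additive coupling: z1 = x1, z2 = x2 + m(x1)."""
    d = len(x) // 2
    x1, x2 = x[:d], x[d:]
    shift = m(x1)
    return x1 + [b + s for b, s in zip(x2, shift)]


def coupling_inverse(z, m):
    """Exact inverse: x1 = z1, x2 = z2 - m(z1)."""
    d = len(z) // 2
    z1, z2 = z[:d], z[d:]
    shift = m(z1)  # same shift as in the forward pass, because z1 == x1
    return z1 + [b - s for b, s in zip(z2, shift)]


# Toy "spectrum" with six channels (made-up values, for illustration only).
x = [0.3, -1.2, 0.7, 2.5, -0.4, 1.1]
m = make_shift_fn(in_dim=3, out_dim=3)

z = coupling_forward(x, m)
x_rec = coupling_inverse(z, m)

# Reconstruction error is at floating-point rounding level.
print(max(abs(a - b) for a, b in zip(x, x_rec)))
```

Because the first half is copied and the second half is only shifted, the Jacobian of this layer is triangular with ones on the diagonal, so its determinant is 1 and the mapping is volume-preserving. A full NICE model stacks several such layers with alternating partitions; the latent-variable selection and calibration steps of NICEM are not covered by this sketch.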