Spectral Encoder to Extract the Features of Near-Infrared Spectra for Multivariate Calibration

Chaoshu Duan; Xuyang Liu; Wensheng Cai; Xueguang Shao

doi:10.1021/acs.jcim.2c00786

J Chem Inf Model. 2022 Aug 22;62(16):3695-3703. doi: 10.1021/acs.jcim.2c00786. Epub 2022 Aug 2.

Authors

Chaoshu Duan ¹ ², Xuyang Liu ¹ ², Wensheng Cai ¹ ², Xueguang Shao ¹ ²

Affiliations

¹ Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin 300071, China.
² Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China.

PMID: 35916486
DOI: 10.1021/acs.jcim.2c00786

Abstract

An autoencoder architecture was adopted for near-infrared (NIR) spectral analysis by extracting the common features in the spectra. Three autoencoder-based networks with different purposes were constructed. First, a spectral encoder was established by training the network with a set of spectra as the input. The features of the spectra can be encoded by the nodes in the bottleneck layer, which in turn can be used to build a sparse and robust model. Second, taking the spectra of one instrument as the input and those of another instrument as the reference output, the common features in both sets of spectra can be obtained in the bottleneck layer. Therefore, in the prediction step, the spectral features of the second instrument can be predicted by taking the reverse of the decoder as the encoder. Furthermore, transfer learning was used to build the model for the spectra of more instruments by fine-tuning the trained network. NIR datasets of plant, wheat, and pharmaceutical tablets measured on multiple instruments were used to test the method. The multiple linear regression (MLR) model built with the encoded features was found to have a similar or slightly better prediction performance compared with the partial least-squares (PLS) model.
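
The first network in the abstract (an autoencoder whose bottleneck nodes serve as regressors for an MLR model) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the synthetic two-band "spectra", the single tanh hidden layer, the bottleneck size of 3, the learning rate, and the epoch count are all assumptions chosen only to keep the example small and self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for NIR spectra (assumption): two Gaussian bands
# mixed by two random "concentrations", plus a little noise.
n_samples, n_wavelengths, n_bottleneck = 60, 50, 3
grid = np.linspace(0.0, 1.0, n_wavelengths)
bands = np.stack([np.exp(-((grid - c) ** 2) / 0.01) for c in (0.3, 0.7)])
conc = rng.uniform(0.0, 1.0, size=(n_samples, 2))
X = conc @ bands + 0.01 * rng.standard_normal((n_samples, n_wavelengths))
y = conc[:, 0]            # property to be calibrated
Xc = X - X.mean(axis=0)   # mean-centered spectra

# Autoencoder parameters: tanh encoder, linear decoder (assumed architecture).
W1 = 0.1 * rng.standard_normal((n_wavelengths, n_bottleneck))
b1 = np.zeros(n_bottleneck)
W2 = 0.1 * rng.standard_normal((n_bottleneck, n_wavelengths))
b2 = np.zeros(n_wavelengths)

def encode(spectra):
    """Map spectra to the bottleneck-layer features."""
    return np.tanh(spectra @ W1 + b1)

def reconstruction_loss():
    """Mean squared error between input and reconstructed spectra."""
    return float(np.mean((encode(Xc) @ W2 + b2 - Xc) ** 2))

initial_loss = reconstruction_loss()

# Full-batch gradient descent on the reconstruction error.
lr, n_epochs = 0.5, 3000
for _ in range(n_epochs):
    H = encode(Xc)
    G = 2.0 * (H @ W2 + b2 - Xc) / Xc.size   # dLoss / dReconstruction
    dZ = (G @ W2.T) * (1.0 - H ** 2)         # back-propagate through tanh
    W2 -= lr * (H.T @ G)
    b2 -= lr * G.sum(axis=0)
    W1 -= lr * (Xc.T @ dZ)
    b1 -= lr * dZ.sum(axis=0)

final_loss = reconstruction_loss()

# MLR on the encoded features (ordinary least squares with an intercept).
features = encode(Xc)
A = np.column_stack([np.ones(n_samples), features])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_pred = A @ coef
rmse = float(np.sqrt(np.mean((y_pred - y) ** 2)))

print("reconstruction loss:", initial_loss, "->", final_loss)
print("MLR training RMSE on encoded features:", rmse)
```

In this sketch the regression uses only the few bottleneck features rather than all wavelengths, which is the sense in which the resulting model is sparse. A real calibration would evaluate the error on held-out samples, not on the training set as done here.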

Publication types

Review
Research Support, Non-U.S. Gov't

MeSH terms

Calibration
Least-Squares Analysis
Spectroscopy, Near-Infrared* / methods
Tablets

Substances

Tablets