A hybrid variable selection method combining Fisher's linear discriminant combined population analysis and an improved binary cuckoo search algorithm

Anal Methods. 2024 Feb 15;16(7):1021-1033. doi: 10.1039/d3ay01942j.

Abstract

This paper proposes a novel hybrid variable selection method for building near-infrared (NIR) spectroscopy calibration models for composition measurement in industrial processes. Variable selection uses a double-layer structure that combines Fisher's linear discriminant combined population analysis (FCPA) with an improved binary cuckoo search algorithm (IBCS). In the first layer, the Fisher classifier combined with model population analysis coarsely locates the variable intervals that contain the useful variables, even when strong multicollinearity exists among the spectral variables. In the second layer, opposition-based learning (OBL) and jumping genes (JG) are introduced into the binary cuckoo search algorithm for the fine selection of key variables, which avoids losing excellent solutions to randomness and becoming trapped in local optima. Different variable selection methods were applied to beer, corn, and diesel fuel datasets, and partial least squares (PLS) was used to build calibration models predicting the original extract concentration of beer, the protein and starch content of corn, and the boiling point of diesel fuel, respectively. The results showed that the proposed PLS modeling method based on FCPA-IBCS achieves higher fitting accuracy and smaller prediction errors than the other variable selection methods tested.
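The abstract does not give the details of the OBL and JG operators, so the following Python sketch only illustrates the general ideas on binary variable masks and is not the authors' implementation. It assumes that the "opposite" of a binary mask is its bitwise complement, that the jumping gene is a cut-and-paste move of a contiguous segment, and that the fitness is the in-sample RMSE of ordinary least squares on the selected columns (a simple stand-in for a cross-validated PLS error). All function names are hypothetical.

```python
import numpy as np

def opposite(mask):
    """Assumed binary OBL: the opposite solution flips every bit."""
    return 1 - mask

def jumping_gene(mask, length, rng):
    """Assumed cut-and-paste JG: move a contiguous segment elsewhere."""
    n = mask.size
    start = rng.integers(0, n - length + 1)
    segment = mask[start:start + length]
    rest = np.concatenate([mask[:start], mask[start + length:]])
    insert_at = rng.integers(0, rest.size + 1)
    return np.concatenate([rest[:insert_at], segment, rest[insert_at:]])

def fitness(mask, X, y):
    """RMSE of least squares on the selected columns (stand-in for PLS)."""
    if mask.sum() == 0:
        return np.inf  # an empty selection cannot build a model
    Xs = np.column_stack([np.ones(len(y)), X[:, mask.astype(bool)]])
    coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    return float(np.sqrt(np.mean((y - Xs @ coef) ** 2)))

def obl_init(pop_size, n_vars, X, y, rng):
    """Draw random masks, add their opposites, keep the best pop_size."""
    pop = rng.integers(0, 2, size=(pop_size, n_vars))
    candidates = np.vstack([pop, opposite(pop)])
    scores = np.array([fitness(m, X, y) for m in candidates])
    keep = np.argsort(scores)[:pop_size]
    return candidates[keep], scores[keep]
```

Evaluating each random mask together with its opposite doubles the candidates examined at initialization, while the segment move changes which variables are selected without changing how many. In a real calibration setting the fitness would normally be a cross-validated error, since in-sample RMSE tends to favor larger variable subsets.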