A Monte Carlo resampling based multiple feature-spaces ensemble (MFE) strategy for consistency-enhanced spectral variable selection

Anal Chim Acta. 2023 Oct 23:1279:341782. doi: 10.1016/j.aca.2023.341782. Epub 2023 Sep 8.

Abstract

Background: Variable selection has gained significant attention as a means of enhancing spectroscopic calibration performance. However, existing methods have two main limitations. First, the selection results are sensitive to the choice of training samples, which suggests that the selected variables may not be truly relevant. Second, the number of selected variables is still too large in some situations, and modelling with too many predictors can lead to over-fitting. To address these challenges, we propose and implement a novel multiple feature-spaces ensemble (MFE) strategy that uses the least absolute shrinkage and selection operator (LASSO) method.

Results: The MFE strategy combines the advantages of LASSO regression and an ensemble strategy, enabling more robust identification of key variables. We evaluated the approach through extensive experiments on publicly available datasets. The results show both more consistent variable selection and better prediction performance than benchmark methods.
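
To make the idea of pairing LASSO with Monte Carlo resampling concrete, the following is a minimal sketch and not the MFE algorithm itself, whose details the abstract does not give. It repeatedly fits a LASSO model on random subsets of the training samples and records how often each variable receives a non-zero coefficient. The synthetic data, the penalty `alpha`, the subset fraction, the number of runs, the 0.9 threshold, and the helper names `lasso_cd` and `selection_frequency` are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, alpha, n_iter=100):
    """Minimise (1/2n)||y - Xb||^2 + alpha*||b||_1 by cyclic coordinate descent.

    Assumes X and y are already centred (no intercept is fitted).
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-column mean of squares
    resid = y.copy()                    # residual y - X @ beta for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, alpha) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)  # keep residual in sync
            beta[j] = new
    return beta


def selection_frequency(X, y, alpha=0.1, n_runs=30, frac=0.8, seed=0):
    """Fraction of Monte Carlo subsamples in which each variable is selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = int(frac * n)
    counts = np.zeros(p)
    for _ in range(n_runs):
        idx = rng.choice(n, size=m, replace=False)  # random training subset
        Xs = X[idx] - X[idx].mean(axis=0)
        ys = y[idx] - y[idx].mean()
        counts += lasso_cd(Xs, ys, alpha) != 0.0
    return counts / n_runs


# Synthetic stand-in for spectra: 60 samples, 30 variables,
# with only variables 3 and 17 carrying signal (illustrative assumption).
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 30))
y = 2.0 * X[:, 3] - 1.5 * X[:, 17] + 0.1 * rng.normal(size=60)

freq = selection_frequency(X, y)
stable = np.flatnonzero(freq >= 0.9)  # variables selected in most runs
print("stable variables:", stable)
```

Variables that carry real signal should be selected in nearly every subsample, while variables chosen only by chance should have low selection frequencies. Thresholding the frequencies is one simple way to reduce the dependence on any single choice of training samples.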

Significance: The MFE strategy provides a comprehensive framework for variable importance analysis, leading to robust and consistent variable selection. Furthermore, the improved consistency in variable selection contributes to more robust and accurate prediction in spectroscopic calibration.
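
One common way to quantify the consistency of variable selection is the mean pairwise Jaccard index between the variable sets selected in different runs. The abstract does not state which consistency measure the study uses, so the metric, the function names, and the toy variable sets below are assumptions for illustration only.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections of variable indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections are treated as identical
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(selections):
    """Average Jaccard index over all pairs of selection runs."""
    pairs = list(combinations(selections, 2))
    if not pairs:
        raise ValueError("need at least two selections")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical selected-variable sets from three runs of two methods
runs_unstable = [{1, 5, 9}, {2, 5, 11}, {5, 7, 9}]
runs_stable = [{5, 9, 12}, {5, 9, 12}, {5, 9}]

score_unstable = mean_pairwise_jaccard(runs_unstable)
score_stable = mean_pairwise_jaccard(runs_stable)
print(score_unstable, score_stable)
```

A score near 1 means nearly the same variables are chosen regardless of which training samples are drawn. A score near 0 indicates the sensitivity to training samples described in the Background.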

Keywords: Chemometrics; Consistency evolution; Ensemble; LASSO; Multiple feature-spaces; Variable selection.