Elimination of the uninformative calibration sample subset in the modified UVE (Uninformative Variable Elimination)-PLS (Partial Least Squares) method

Anal Sci. 2001 Feb;17(2):319-22. doi: 10.2116/analsci.17.319.

Abstract

To increase the predictive ability of a PLS (Partial Least Squares) model, we have developed a new algorithm that eliminates uninformative samples, i.e., samples that contribute little to the model, from the calibration data set. In the first stage of the proposed algorithm, uninformative wavelength (or independent) variables are eliminated by the modified UVE (Uninformative Variable Elimination)-PLS method that we reported previously. Then, if the prediction error of the ith sample (1 ≤ i ≤ n) is larger than 3σ, that sample is eliminated as uninformative, where n is the total number of calibration samples and σ is the standard deviation calculated from the other n - 1 samples. Calculating σ in a leave-one-out manner enhances the ability to identify uninformative samples. Because both uninformative wavelength variables and uninformative samples are eliminated, the final PLS model can be constructed precisely. To demonstrate the usefulness of the algorithm, we applied it to two kinds of mid-infrared spectral data sets.
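
The sample-elimination step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the prediction errors are obtained or exactly how σ is computed, so the sketch assumes the per-sample errors are already available (for example from cross-validation of the PLS model built after variable elimination), assumes σ is the sample standard deviation of the other n - 1 errors, and compares the absolute error with 3σ. The function name `flag_uninformative_samples` and the example error values are hypothetical.

```python
from statistics import stdev


def flag_uninformative_samples(errors, k=3.0):
    """Return the indices of samples whose |prediction error| exceeds k * sigma_i.

    sigma_i is the sample standard deviation of the other n - 1 errors
    (leave-one-out), which is an assumption about the paper's definition.
    """
    errors = list(errors)
    n = len(errors)
    if n < 3:
        # stdev needs at least two values, so n - 1 must be >= 2.
        raise ValueError("need at least 3 samples")
    flagged = []
    for i in range(n):
        # Exclude sample i so that its own error cannot inflate sigma.
        others = errors[:i] + errors[i + 1:]
        sigma_i = stdev(others)
        if abs(errors[i]) > k * sigma_i:
            flagged.append(i)
    return flagged


# Hypothetical prediction errors for 8 calibration samples.
errors = [0.1, -0.2, 0.15, -0.05, 0.1, 2.0, -0.1, 0.05]
print(flag_uninformative_samples(errors))  # [5]
```

With these made-up values, the standard deviation of all eight errors is about 0.71, so the sample with error 2.0 would fall just inside a 3σ limit of about 2.14 and would not be flagged. Excluding that sample from its own σ gives about 0.13, and the sample is flagged. This is one way to read the statement that the leave-one-out calculation enhances the identification of uninformative samples.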