Procrustes Cross-Validation of short datasets in PCA context

Talanta. 2021 May 1:226:122104. doi: 10.1016/j.talanta.2021.122104. Epub 2021 Jan 15.

Abstract

We propose a new tool, Procrustes cross-validation, as an alternative to conventional cross-validation for short datasets in which every sample is important and therefore cannot be removed, as the leave-one-out cross-validation procedure requires. The advantages of the new approach are demonstrated on two real-world examples: the first contains discrete variables (chemical profiles), and the second is based on continuous data (spectra). The method is implemented in R and Matlab as a small procedure that any analyst can easily use.

Keywords: Designed data; Procrustes cross-validation; Small data.
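
For orientation, the conventional leave-one-out procedure that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the baseline only, not the Procrustes procedure, whose details the abstract does not give. The function name `loo_pca_residuals`, the mean-centering step, and the synthetic data are assumptions made for this example.

```python
import numpy as np


def loo_pca_residuals(X, n_components):
    """Squared orthogonal distance of each sample to a PCA model built without it.

    Hypothetical helper for illustration; assumes mean-centered PCA via SVD.
    """
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]
    q = np.empty(n_samples)
    for i in range(n_samples):
        # Calibration set: every row except the i-th one.
        X_cal = np.delete(X, i, axis=0)
        mean = X_cal.mean(axis=0)
        # Loadings are the leading right singular vectors of the centered data.
        _, _, Vt = np.linalg.svd(X_cal - mean, full_matrices=False)
        P = Vt[:n_components].T
        # Project the left-out sample and keep the part the model cannot explain.
        x = X[i] - mean
        residual = x - P @ (P.T @ x)
        q[i] = residual @ residual
    return q


# Synthetic "short" dataset (assumption): 8 samples, 5 variables,
# rank-2 structure plus a small amount of noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2)) @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(8, 5))
q = loo_pca_residuals(X, n_components=2)
print(q)
```

With only eight samples, every local model here is fitted on seven of them, which is the sample-removal step the abstract describes as problematic for short datasets.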