Randomised SIMPLISMA: Using a dictionary of initial estimates for spectral unmixing in the framework of chemical imaging

Talanta. 2020 Sep 1:217:121024. doi: 10.1016/j.talanta.2020.121024. Epub 2020 Apr 18.

Abstract

Hyperspectral imaging allows analytical chemists to investigate increasingly complex samples using Multivariate Curve Resolution - Alternating Least Squares (MCR-ALS) and other signal unmixing techniques, but not without difficulty. One of the principal challenges in this kind of analysis is estimating the correct chemical rank of the dataset, that is, the total number of pure compounds present in the chemical system. Although various algorithms address rank evaluation, the method most often used for this task is quite simple: it relies on inspecting the eigenvalues generated by Principal Component Analysis (PCA). This method has shown some potential for rank evaluation, but it remains difficult to apply to large and complex datasets or when the signal-to-noise ratio is low. In this paper, we introduce a new method, which we call Randomised SIMPLISMA, based on the SIMPLe-to-use Interactive Self-modeling Mixture Analysis (SIMPLISMA) algorithm. The main idea is to draw random selections of spectra from the initial dataset and to apply the SIMPLISMA approach to each of them. At the end of this step, all the spectra selected by SIMPLISMA are examined using PCA, where clusters may appear and can be exploited for the tasks of interest. In particular, we highlight that this approach can make rank estimation and the generation of initial estimates easier. Datasets of varying complexity, acquired with various spectroscopic techniques, are explored to evaluate the potential of the approach.
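
The procedure outlined in the abstract (random subsets, SIMPLISMA on each subset, PCA on the pooled selections) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the purity criterion is a simplified SIMPLISMA-style variant applied to spectra (rows), and all function names, the `offset` parameter, the subset size, the number of draws and the synthetic data generator are hypothetical choices made for this example.

```python
import numpy as np


def simplisma_pure_spectra(D, n_pure, offset=0.05):
    """Simplified SIMPLISMA-style selection of the n_pure 'purest' rows of D.

    Assumption: purity = std / (mean + offset * max(mean)), weighted by the
    squared residual left after projecting each unit-length spectrum onto the
    spectra already selected. For ranking candidates this is equivalent, up to
    a constant factor, to a Gram-determinant weight.
    """
    mu = D.mean(axis=1)
    sigma = D.std(axis=1)
    purity = sigma / (mu + offset * mu.max())
    norms = np.linalg.norm(D, axis=1, keepdims=True)
    Z = D / np.maximum(norms, 1e-12)  # unit-length spectra
    weight = np.ones(D.shape[0])
    selected = []
    for _ in range(n_pure):
        selected.append(int(np.argmax(purity * weight)))
        # Orthonormal basis of the span of the spectra selected so far.
        Q, _r = np.linalg.qr(Z[selected].T)
        residual = Z - (Z @ Q) @ Q.T
        weight = np.sum(residual ** 2, axis=1)
    return selected


def randomised_simplisma(D, n_pure, subset_size, n_draws, seed=0):
    """Apply the selection above to random subsets; pool the selected indices."""
    rng = np.random.default_rng(seed)
    picked = []
    for _ in range(n_draws):
        subset = rng.choice(D.shape[0], size=subset_size, replace=False)
        local = simplisma_pure_spectra(D[subset], n_pure)
        picked.extend(subset[local])  # map back to indices in the full dataset
    return np.array(picked)


def pca_scores(X, n_components=2):
    """PCA scores of mean-centred X, computed via SVD."""
    Xc = X - X.mean(axis=0)
    U, s, _vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]


def make_synthetic_mixture(n_spectra=300, n_channels=80, seed=0):
    """Hypothetical test data: noisy mixtures of three Gaussian-shaped spectra."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_channels)
    centres = np.array([0.25, 0.5, 0.75])
    S = np.exp(-((x[None, :] - centres[:, None]) ** 2) / (2 * 0.05 ** 2))
    C = rng.random((n_spectra, 3)) ** 3  # skewed, non-negative contributions
    noise = 0.01 * rng.standard_normal((n_spectra, n_channels))
    return C @ S + noise


if __name__ == "__main__":
    D = make_synthetic_mixture()
    idx = randomised_simplisma(D, n_pure=3, subset_size=40, n_draws=20)
    scores = pca_scores(D[idx], n_components=2)
    print(scores.shape)
```

In this sketch, `n_pure` is deliberately a user choice for each draw. The point made in the abstract is that clusters appearing in the PCA scores of the pooled spectra may suggest the rank, and representative spectra from each cluster could then serve as initial estimates for MCR-ALS.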

Keywords: Big dataset; Hyperspectral imaging; Initial estimates; MCR-ALS; Rank estimation; SIMPLISMA.