Self-organising maps for the exploration and classification of thin-layer chromatograms

Talanta. 2021 Oct 1:233:122460. doi: 10.1016/j.talanta.2021.122460. Epub 2021 May 13.

Abstract

Thin-layer chromatography (TLC) allows the swift analysis of larger sample sets in almost any laboratory. The obtained chromatograms are patterns of coloured zones that are conveniently evaluated and classified by visual inspection. This manual approach reaches its limit when several dozen or a few hundred samples need to be evaluated. Methods to classify TLCs automatically and objectively have been explored but without a definitive conclusion; established methods, such as principal component analysis, suffer from the variability of the data, while contemporary omics methods were constructed for the analysis of large numbers of highly resolved analyses. Self-organizing maps (SOMs) are an algorithm for unsupervised learning that reduces higher dimensional datasets to a two-dimensional map, locating similar samples close to each other. SOMs tolerate small variations between samples of the same type. We investigated the capability of SOMs for the evaluation of TLCs with two sample sets. With the first one (495 analyses of essential oils), it was confirmed that SOMs arrange the same type of sample in a common region. The obtained multi-class maps were used to classify a test set and to explore the causes for the few misclassifications (<3%). With the second test set (50 extracts of experimental wheats), the effects of a greater variability within substance classes were explored. With SOMs, it was possible to single out the exceptional samples that warranted a more detailed investigation. In addition, the SOM quality control index method was tested. It proved to be considerably stricter than the classification with a SOM of all samples. When this method was unable to classify a sample correctly, it would flag the sample for inspection, as it gave either multiple assignments or none at all.
The combination of SOMs and TLC - two accessible analytical tools - can be most useful for the unsupervised classification of samples by TLC, and to identify samples that stand out from a set and are therefore worth the investment in additional analyses with more complex or expensive methods.
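
The SOM principle summarised in the abstract (reduction to a two-dimensional map on which similar samples lie close together) can be illustrated with a minimal sketch. This is not the authors' implementation: the 3 x 3 grid, the linearly decaying learning rate and neighbourhood radius, and the two synthetic six-point "chromatogram" profiles are all assumptions chosen only to keep the example small.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return (row, col) of the unit whose weight vector is closest to x."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(samples, rows, cols, epochs=100, lr0=0.5, seed=0):
    """Train a small rectangular SOM (illustrative settings, not from the paper)."""
    rng = random.Random(seed)
    sigma0 = max(rows, cols) / 2.0
    # Initialise every unit with a copy of a randomly chosen sample.
    weights = [[list(rng.choice(samples)) for _ in range(cols)] for _ in range(rows)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay                  # learning rate shrinks over time
        sigma = max(sigma0 * decay, 0.5)  # so does the neighbourhood radius
        order = list(samples)
        rng.shuffle(order)
        for x in order:
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Units near the winner on the grid are pulled more strongly.
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for i, xi in enumerate(x):
                        w[i] += lr * h * (xi - w[i])
    return weights


# Hypothetical intensity profiles along the migration distance for two sample types.
rng = random.Random(1)
proto_a = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
proto_b = [0.0, 1.0, 0.0, 1.0, 0.0, 0.0]


def noisy(profile):
    """Add small uniform noise to mimic run-to-run variation."""
    return [v + rng.uniform(-0.05, 0.05) for v in profile]


class_a = [noisy(proto_a) for _ in range(10)]
class_b = [noisy(proto_b) for _ in range(10)]

weights = train_som(class_a + class_b, rows=3, cols=3)
bmus_a = {best_matching_unit(weights, x) for x in class_a}
bmus_b = {best_matching_unit(weights, x) for x in class_b}
print("class A units:", sorted(bmus_a))
print("class B units:", sorted(bmus_b))
```

In this toy setting the two sample types are expected to occupy separate map units despite the added noise, which mirrors the tolerance to small variations described above. Real chromatogram data would need preprocessing (for example, alignment and normalisation) that is outside the scope of this sketch.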

Keywords: Anthocyanins; Essential oils; Principal component analysis; SOMQC Index; Wheat.

MeSH terms

  • Algorithms*
  • Magnetic Resonance Imaging
  • Neural Networks, Computer*
  • Principal Component Analysis
  • Quality Control