Multivariate analytics of chromatographic data: Visual computing based on moving window factor models

J Chromatogr B Analyt Technol Biomed Life Sci. 2018 Aug 15:1092:179-190. doi: 10.1016/j.jchromb.2018.06.010. Epub 2018 Jun 5.

Abstract

Chromatography is one of the most versatile unit operations in the biotechnological industry. Regulatory initiatives such as Process Analytical Technology and Quality by Design have led to the implementation of new chromatographic devices, which represent an almost inexhaustible source of data. However, analyzing large datasets is complicated, and significant amounts of information remain hidden in big data. Here we present a new, top-down approach for the systematic analysis of chromatographic datasets. The goal of this approach is to analyze the dataset as a whole, starting with the most important, global information. The workflow should highlight interesting regions (outliers, drifts, data inconsistencies) and help to localize those regions within a multidimensional space in a straightforward way. Moving window factor models were used to extract the most important information, focusing on the differences between samples. The prototype was implemented as an interactive visualization tool for the explorative analysis of complex datasets. We found that the tool makes it convenient to localize variances in a multidimensional dataset and makes it possible to differentiate between explainable and unexplainable variance. Starting with one global difference descriptor per sample, the analysis ends with highly resolved, time-dependent difference descriptor values, intended as a starting point for the detailed analysis of the underlying raw data.
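The abstract does not state which factor model or which difference descriptor was used, so the following is only a minimal sketch of the general idea under stated assumptions: the chromatograms are taken to be time-aligned rows of a matrix, a principal component model (via SVD) is fitted in each window, and the squared score distance and squared residual stand in for the explainable and unexplainable parts of each sample's deviation. The function name, parameters, and synthetic data are hypothetical and not taken from the paper.

```python
import numpy as np


def moving_window_descriptors(X, window=20, step=10, n_components=2):
    """Per-sample, per-window difference descriptors (illustrative only).

    X is assumed to be an (n_samples, n_timepoints) array of aligned signals.
    Returns window start indices plus two (n_samples, n_windows) arrays:
    the part of each sample's squared deviation captured by the local
    factor model, and the part left in the residual.
    """
    n_samples, n_time = X.shape
    starts = np.arange(0, n_time - window + 1, step)
    explained, residual = [], []
    for s in starts:
        W = X[:, s:s + window]
        Wc = W - W.mean(axis=0)              # deviation from the mean sample
        U, S, Vt = np.linalg.svd(Wc, full_matrices=False)
        k = min(n_components, len(S))
        scores = U[:, :k] * S[:k]            # sample scores on k local factors
        recon = scores @ Vt[:k]              # model reconstruction
        explained.append((scores ** 2).sum(axis=1))
        residual.append(((Wc - recon) ** 2).sum(axis=1))
    return starts, np.column_stack(explained), np.column_stack(residual)


# Synthetic data (assumption): 12 similar runs, one with an extra small peak.
rng = np.random.default_rng(0)
t = np.arange(200)
amplitudes = 1.0 + 0.02 * rng.standard_normal(12)
X = amplitudes[:, None] * np.exp(-((t - 80) ** 2) / (2 * 8.0 ** 2))
X += 0.01 * rng.standard_normal(X.shape)
X[7] += 0.5 * np.exp(-((t - 150) ** 2) / (2 * 4.0 ** 2))  # planted anomaly

window = 20
starts, explained, residual = moving_window_descriptors(X, window=window)

# Top-down reading: one global value per sample, then the local profile.
local = explained + residual
global_descriptor = local.sum(axis=1)
suspect = int(np.argmax(global_descriptor))
suspect_window_start = int(starts[np.argmax(local[suspect])])
print(suspect, suspect_window_start)
```

Summing the local values gives one global number per sample for a first overview; the per-window profile of a flagged sample then points to the time region worth inspecting in the raw data. Other factor models or distance measures could be substituted without changing this overall structure.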

Keywords: Chromatography; Data analysis; Outlier detection; Visualization.

MeSH terms

  • Algorithms
  • Chromatography*
  • Data Interpretation, Statistical*
  • Databases, Factual
  • Multivariate Analysis*