Theoretical modeling and machine learning-based data processing workflows in comprehensive two-dimensional gas chromatography - A review

J Chromatogr A. 2023 Nov 22:1711:464467. doi: 10.1016/j.chroma.2023.464467. Epub 2023 Oct 19.

Abstract

In recent years, comprehensive two-dimensional gas chromatography (GC × GC) has been gaining prominence as a preferred method for the analysis of complex samples due to its higher peak capacity and resolving power compared to conventional gas chromatography (GC). Nonetheless, to fully benefit from the capabilities of GC × GC, a holistic approach to method development and data processing is essential for a successful and informative analysis. Method development enables the fine-tuning of the chromatographic separation, resulting in high-quality data. While generating such data is pivotal, it does not guarantee that meaningful information will be extracted from it. To this end, the first part of this manuscript reviews the importance of theoretical modeling in optimizing the separation conditions, ultimately improving the quality of the chromatographic separation. Multiple theoretical modeling approaches are discussed, with a special focus on thermodynamic-based modeling. The second part of this review highlights the importance of establishing robust data processing workflows, with a special emphasis on the use of advanced data processing tools such as machine learning (ML) algorithms. Three widely used ML algorithms are discussed: Random Forest (RF), Support Vector Machine (SVM), and Partial Least Squares-Discriminant Analysis (PLS-DA), highlighting their role in discovery-based analysis.
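The peak capacity advantage mentioned above can be illustrated with a rough sketch. It assumes the textbook approximation that a one-dimensional peak capacity is about 1 plus the separation window divided by the mean peak width, and the ideal product rule for two fully orthogonal dimensions. The function names and all numeric values are hypothetical and are not taken from the review; real GC × GC separations fall short of this ideal because of imperfect orthogonality and undersampling of the first dimension by the modulator.

```python
def peak_capacity_1d(separation_window_s: float, mean_peak_width_s: float) -> float:
    """Approximate 1D peak capacity: peaks of the given base width that fit in the window."""
    if separation_window_s <= 0 or mean_peak_width_s <= 0:
        raise ValueError("window and peak width must be positive")
    return 1.0 + separation_window_s / mean_peak_width_s


def peak_capacity_2d_ideal(n_first: float, n_second: float) -> float:
    """Ideal 2D peak capacity under the product rule (an upper bound, not a measured value)."""
    return n_first * n_second


# Hypothetical example values (assumptions for illustration only):
# first dimension: 60 min separation window, 12 s wide peaks
# second dimension: 4 s modulation period, 0.25 s wide peaks
first_dim = peak_capacity_1d(separation_window_s=3600.0, mean_peak_width_s=12.0)
second_dim = peak_capacity_1d(separation_window_s=4.0, mean_peak_width_s=0.25)
ideal_2d = peak_capacity_2d_ideal(first_dim, second_dim)

print(f"1D peak capacity (first dimension):  {first_dim:.0f}")
print(f"1D peak capacity (second dimension): {second_dim:.0f}")
print(f"Ideal GC x GC peak capacity:         {ideal_2d:.0f}")
```

Under these assumed numbers, the ideal two-dimensional value is more than an order of magnitude larger than the first-dimension value alone, which is the kind of gain the comparison with conventional GC refers to.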

Keywords: Comprehensive two-dimensional gas chromatography; Data processing; Machine Learning; Method development; Modeling.

Publication types

  • Review

MeSH terms

  • Algorithms*
  • Chromatography, Gas / methods
  • Support Vector Machine*
  • Thermodynamics
  • Workflow