Comparative multiple quantitative structure-retention relationships modeling of gas chromatographic retention time of essential oils using multiple linear regression, principal component regression, and partial least squares techniques

J Chromatogr A. 2009 Jul 3;1216(27):5302-12. doi: 10.1016/j.chroma.2009.05.016. Epub 2009 May 15.

Abstract

Quantitative structure-retention relationship (QSRR) models were built for a data set of 96 essential oils and used to predict their gas chromatographic (GC) retention times (t(R)). Multiple linear regression (MLR), principal component regression (PCR), and partial least squares (PLS) were applied to build different QSRR models using 13 nonzero E-state indices and 56 descriptors calculated with TSAR software, and the three chemometric methods were compared for evaluating the GC t(R) values of essential oils. Of the models built on the whole data set, the one derived from MLR (model M2) showed the best predictive power (r(2)=0.9689 and q(2)=0.9631). The whole data set was then split into a training set of 72 compounds and a test set of 24 compounds. Among the models based on the training set, the one derived from MLR offered the highest r(2) (0.9756) and q(2) (0.9693). The best training-set model obtained from PLS not only showed good internal predictive power (r(2)=0.9703 and q(2)=0.9633) but also offered the highest external predictive power (R(2)=0.9588 and q(2)(ext)=0.9572). The results showed that two E-state indices (sssCH and sOH) and five molecular connectivity indices ((1)chi(B), (2)chi(p), (3)chi(C), (4)chi(C), and (6)chi(p)) are closely related to the GC t(R) values of essential oils. The applicability domain of the QSRR models was defined by control leverage values (h*), and the models can be used to predict unknown compounds that fall within this domain.
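The statistics named in the abstract (fitted r(2), cross-validated q(2), and a leverage threshold h*) can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a single hypothetical descriptor, synthetic values, leave-one-out cross-validation for q(2), and the commonly used convention h* = 3(p+1)/n for p descriptors and n training compounds, none of which is stated in the abstract.

```python
# Illustrative sketch only: synthetic data, one hypothetical descriptor,
# ordinary least squares. Not the data or models from the paper.

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def r_squared(x, y):
    """Fitted coefficient of determination: 1 - SS_res / SS_tot."""
    a, b = fit_line(x, y)
    my = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def q_squared_loo(x, y):
    """Leave-one-out q2: 1 - PRESS / SS_tot (assumed definition)."""
    my = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        # Refit without compound i, then predict compound i.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a + b * x[i])) ** 2
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - press / ss_tot

def leverages(x_train, x_query):
    """Leverage of each query point relative to a one-descriptor training set."""
    n = len(x_train)
    mx = sum(x_train) / n
    sxx = sum((xi - mx) ** 2 for xi in x_train)
    return [1 / n + (xq - mx) ** 2 / sxx for xq in x_query]

def warning_leverage(n, p):
    """Threshold h* = 3(p + 1) / n (a common convention, assumed here)."""
    return 3 * (p + 1) / n

# Synthetic descriptor values and retention times (hypothetical).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

r2 = r_squared(x, y)
q2 = q_squared_loo(x, y)
h_star = warning_leverage(n=len(x), p=1)
h_train = leverages(x, x)
h_new = leverages(x, [20.0])  # a query far outside the training range

print(f"r2={r2:.4f} q2={q2:.4f} h*={h_star:.2f}")
print("query inside domain:", h_new[0] <= h_star)
```

In this sketch a compound whose leverage exceeds h* is treated as outside the applicability domain, so its prediction would be an extrapolation. With several descriptors the same quantities come from the diagonal of the hat matrix X(X'X)^-1X' rather than from the one-variable formula used here.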

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromatography, Gas / instrumentation*
  • Chromatography, Gas / statistics & numerical data*
  • Least-Squares Analysis
  • Linear Models
  • Models, Chemical
  • Models, Statistical*
  • Oils, Volatile / chemistry*
  • Principal Component Analysis

Substances

  • Oils, Volatile