Simultaneous classification of multiple classes in NMR metabolomics and vibrational spectroscopy using interval-based classification methods: iECVA vs iPLS-DA

Anal Chim Acta. 2018 Aug 27:1021:20-27. doi: 10.1016/j.aca.2018.03.020. Epub 2018 Mar 29.

Abstract

Interval-based chemometric algorithms have proven to be very powerful for spectral alignment, spectral regression and spectral classification. The interval-based methods may not only improve performance, but also reduce model complexity and enhance spectral interpretation. Extended Canonical Variate Analysis (ECVA) is a powerful method for multiple-group classification of multivariate data and can easily be extended to an interval approach, iECVA. This study outlines the iECVA method and compares its performance to interval Partial Least Squares Discriminant Analysis (iPLS-DA) on three spectroscopic datasets from Nuclear Magnetic Resonance (NMR), Near Infrared (NIR) and Infrared (IR) spectroscopy, respectively. The results invariably show that the interval-based classification methods greatly enhance the interpretability of the models by identifying important spectral regions, which facilitates interpretation and biomarker discovery. Although the results for the two methods are similar regarding the number of misclassifications and the important regions identified, the model complexity of PLS-DA proved to be consistently lower than that of ECVA. The Matlab source codes for both iECVA and iPLS-DA are made freely available at www.models.life.ku.dk.
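To illustrate the interval idea described in the abstract, the following is a minimal Python sketch of an interval PLS-DA search, assuming NumPy is available. It is not the authors' Matlab code: the function names (`pls_fit`, `cv_errors`, `interval_plsda`, `make_demo_data`), the equal-width intervals, the interleaved 5-fold cross-validation and the synthetic data are all assumptions made for illustration, and iECVA is not shown. The spectrum is split into intervals, one PLS-DA model is cross-validated per interval, and the misclassification counts are compared with those of a global model built on the full spectrum.

```python
import numpy as np

def pls_fit(X, Y, n_components):
    """Fit a PLS2 regression (illustrative NIPALS-style variant)."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xr, Yr = X - x_mean, Y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Weight vector: dominant left singular vector of X'Y.
        u, _, _ = np.linalg.svd(Xr.T @ Yr, full_matrices=False)
        w = u[:, 0]
        t = Xr @ w                      # scores
        tt = t @ t
        if tt < 1e-12:                  # nothing left to model
            break
        p, q = Xr.T @ t / tt, Yr.T @ t / tt   # X and Y loadings
        Xr = Xr - np.outer(t, p)        # deflate X
        Yr = Yr - np.outer(t, q)        # deflate Y
        W.append(w); P.append(p); Q.append(q)
    if not W:
        return np.zeros((X.shape[1], Y.shape[1])), x_mean, y_mean
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    B = W @ np.linalg.solve(P.T @ W, Q.T)     # regression coefficients
    return B, x_mean, y_mean

def cv_errors(X, labels, n_components, n_folds=5):
    """Count cross-validated misclassifications of a PLS-DA model."""
    classes = np.unique(labels)
    Y = (labels[:, None] == classes[None, :]).astype(float)  # dummy coding
    fold = np.arange(len(labels)) % n_folds   # interleaved folds (assumed)
    errors = 0
    for k in range(n_folds):
        train, test = fold != k, fold == k
        B, xm, ym = pls_fit(X[train], Y[train], n_components)
        # Assign each test sample to the class with the largest prediction.
        pred = classes[np.argmax((X[test] - xm) @ B + ym, axis=1)]
        errors += int(np.sum(pred != labels[test]))
    return errors

def interval_plsda(X, labels, n_intervals, n_components=2):
    """Cross-validate one PLS-DA model per interval plus a global model."""
    edges = np.linspace(0, X.shape[1], n_intervals + 1).astype(int)
    results = []
    for start, stop in zip(edges[:-1], edges[1:]):
        a = min(n_components, int(stop - start))
        err = cv_errors(X[:, start:stop], labels, a)
        results.append((int(start), int(stop), err))
    return results, cv_errors(X, labels, n_components)

def make_demo_data(seed=0):
    """Synthetic 'spectra': 3 classes, class signal only in variables 40-59."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(3), 20)
    X = rng.normal(size=(60, 100))
    X[labels == 1, 40:50] += 3.0
    X[labels == 2, 50:60] += 3.0
    return X, labels

X, labels = make_demo_data()
results, global_errors = interval_plsda(X, labels, n_intervals=5)
for start, stop, err in results:
    print(f"variables {start}-{stop - 1}: {err} misclassified")
print(f"global model: {global_errors} misclassified")
```

In this toy setting the interval containing the class-specific signal should stand out with the fewest misclassifications, which mirrors how interval models point to the spectral regions that matter for interpretation. Real spectra would additionally call for preprocessing and a careful choice of the number of components per interval.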

Keywords: Biomarkers; Classification; ECVA; Interpretation; Interval; PLS-DA.

MeSH terms

  • Analysis of Variance
  • Discriminant Analysis
  • Least-Squares Analysis
  • Metabolomics*
  • Nuclear Magnetic Resonance, Biomolecular*
  • Spectrophotometry, Infrared