Multivariate analysis of NMR-based metabolomic data

NMR Biomed. 2022 Feb;35(2):e4638. doi: 10.1002/nbm.4638. Epub 2021 Nov 5.

Abstract

Nuclear magnetic resonance (NMR) spectroscopy allows simultaneous detection of a wide range of metabolites and lipids. Because metabolites act together in complex metabolic networks, they are often highly correlated, and the most biological insight is gained from methods that take this correlation into account. For this reason, latent-variable-based methods, such as principal component analysis (PCA) and partial least-squares discriminant analysis (PLS-DA), are widely used in metabolomic studies. However, with the increasing availability of larger population cohorts, and a shift from the analysis of spectral data to the use of quantified metabolite levels, both more traditional statistical approaches and alternative machine learning methods have become more widely used. This review aims to provide an overview of the current state-of-the-art multivariate methods for the analysis of NMR-based metabolomic data, as well as alternative methods, highlighting their strengths and limitations.
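As an illustration of why latent-variable methods suit correlated metabolite data, the following minimal sketch applies principal component analysis to simulated data. It is not taken from the review: the data are synthetic, the function names `simulate_correlated_metabolites` and `pca_explained_variance` are hypothetical, and the assumption that six metabolites are driven by a single shared "pathway activity" factor is chosen only to make the effect visible. PCA is computed here with a singular value decomposition of the autoscaled data matrix.

```python
import numpy as np

def simulate_correlated_metabolites(n_samples=200, n_metabolites=6, seed=0):
    """Simulate metabolite levels driven by one shared latent factor (assumption)."""
    rng = np.random.default_rng(seed)
    # One hidden "pathway activity" value per sample.
    pathway_activity = rng.normal(size=(n_samples, 1))
    # Each metabolite responds to the pathway with a slightly different strength.
    loadings = rng.uniform(0.8, 1.2, size=(1, n_metabolites))
    # Independent measurement noise for every metabolite.
    noise = 0.3 * rng.normal(size=(n_samples, n_metabolites))
    return pathway_activity @ loadings + noise

def pca_explained_variance(X):
    """Return the explained-variance ratio and loadings of a PCA on autoscaled X."""
    Xc = X - X.mean(axis=0)              # mean-center each metabolite
    Xs = Xc / Xc.std(axis=0, ddof=1)     # scale each metabolite to unit variance
    _, s, vt = np.linalg.svd(Xs, full_matrices=False)
    var = s**2 / (X.shape[0] - 1)        # variance captured by each component
    return var / var.sum(), vt           # rows of vt are the component loadings

X = simulate_correlated_metabolites()
ratio, components = pca_explained_variance(X)
print("Explained variance ratio per component:", np.round(ratio, 3))
```

Under these assumptions, the first component captures most of the total variance, because the six simulated metabolites largely carry the same underlying signal. Analyzing each metabolite separately would treat that shared signal as six independent findings.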

Keywords: ASCA; PCA; PLS-DA; clustering; deep learning; machine learning; validation.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Cluster Analysis
  • Deep Learning
  • Least-Squares Analysis
  • Linear Models
  • Logistic Models
  • Magnetic Resonance Spectroscopy / methods*
  • Metabolomics / methods*
  • Principal Component Analysis
  • Proportional Hazards Models