Preprocessing of NMR metabolomics data

Scand J Clin Lab Invest. 2015 May;75(3):193-203. doi: 10.3109/00365513.2014.1003593.

Abstract

Metabolomics involves the large-scale analysis of metabolites and thus provides information about the cellular processes in a biological sample. Regardless of the analytical technique used, metabolomics studies acquire a vast amount of data, resulting in complex datasets with large numbers of variables. Such data require multivariate statistical analysis for proper biological interpretation. Before multivariate analysis, the data must be preprocessed to remove unwanted variation such as instrumental or experimental artifacts. This review outlines the steps in the preprocessing of NMR metabolomics data and describes some of the methods used to perform them. Because different preprocessing methods may produce different results, it is important to have an appropriate pipeline for selecting the optimal combination of methods in the preprocessing workflow.

Keywords: Baseline correction; NMR spectroscopy; biological marker; metabolomics; multivariate analysis; normalization; peak alignment; scaling; statistical data interpretation; variable selection.
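
As an illustration of two of the preprocessing steps named in the keywords (normalization and scaling), the following is a minimal sketch, not the review's own method. It assumes that spectra are already binned into a table with one row per sample and one column per variable. It uses total-area normalization and Pareto scaling as example choices; the abstract does not say which methods the review recommends. The function names and the toy data are hypothetical.

```python
import math
from statistics import mean, stdev


def total_area_normalize(spectra):
    """Divide each spectrum (row) by its total intensity so every row sums to 1."""
    normalized = []
    for row in spectra:
        total = sum(row)
        if total == 0:
            raise ValueError("a spectrum with zero total intensity cannot be normalized")
        normalized.append([value / total for value in row])
    return normalized


def pareto_scale(spectra):
    """Mean-center each variable (column) and divide by the square root of its
    sample standard deviation (Pareto scaling)."""
    scaled_columns = []
    for column in zip(*spectra):
        center = mean(column)
        sd = stdev(column)
        # A constant column has no variation; leave it at zero instead of dividing by zero.
        divisor = math.sqrt(sd) if sd > 0 else 1.0
        scaled_columns.append([(value - center) / divisor for value in column])
    # Transpose back to one row per sample.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical toy data: 3 spectra x 4 bins (assumed, for illustration only).
spectra = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # same profile as the first sample, twice as concentrated
    [1.0, 1.0, 1.0, 1.0],
]
normalized = total_area_normalize(spectra)
scaled = pareto_scale(normalized)
```

In this sketch the first two samples become identical after normalization, which shows how a pure dilution difference is removed before scaling. Replacing either step with a different method changes the table passed on to multivariate analysis, which is why the choice of preprocessing pipeline matters.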

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Algorithms
  • Data Interpretation, Statistical
  • Humans
  • Magnetic Resonance Spectroscopy / methods*
  • Magnetic Resonance Spectroscopy / standards
  • Metabolome*
  • Metabolomics
  • Multivariate Analysis
  • Reference Standards