Metabolite profile analysis: from raw data to regression and classification

Physiol Plant. 2008 Feb;132(2):150-61. doi: 10.1111/j.1399-3054.2007.01006.x.

Abstract

Successful metabolic profile analysis will aid in the fundamental understanding of physiology. Here, we present a possible analysis workflow. First, we describe the procedure for transforming raw data into a data matrix containing relative metabolite levels for each sample. Because experimental issues in the technical equipment mean that the levels of some metabolites cannot be determined for all samples, and because different experiments often need to be compared, missing value estimation and normalization are presented as helpful preprocessing steps. Regression methods are presented in this review as tools to relate metabolite levels to other physiological properties such as biomass and gene expression. As the number of measured metabolites often exceeds the number of samples, dimensionality reduction methods are required. Two of these methods are discussed in detail in this review. Throughout this article, practical examples illustrating the application of these methods are given. We focus on uncovering the relationship between metabolism and growth-related properties.
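The workflow outlined in the abstract (missing value estimation, normalization, dimensionality reduction, regression) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the review: column-mean imputation, autoscaling to unit variance, principal component analysis via SVD, and principal component regression as the link to a response such as biomass. The abstract does not name the two dimensionality reduction methods it covers; PCA is chosen only because it appears among the MeSH terms. All function names are hypothetical.

```python
import numpy as np


def impute_column_mean(X):
    """Fill missing metabolite levels (NaN) with the mean of that metabolite.

    Assumption: column-mean imputation is only one simple option; the review
    may recommend other estimators.
    """
    X = np.asarray(X, dtype=float).copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X


def autoscale(X):
    """Center each metabolite and scale it to unit sample variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns at zero after centering
    return (X - mean) / std


def pca_scores(X, n_components):
    """Return PCA scores (samples x k) and loadings (metabolites x k) via SVD."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    loadings = Vt[:n_components].T
    return scores, loadings


def principal_component_regression(X, y, n_components):
    """Regress y on the first k principal components of X.

    Returns coefficients in the original metabolite space and the intercept
    (the mean of y), so predictions for centered X are X @ coef + intercept.
    """
    scores, loadings = pca_scores(X, n_components)
    y_mean = y.mean()
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    return loadings @ gamma, y_mean


if __name__ == "__main__":
    # Synthetic data only: 12 samples, 40 metabolites (more variables than samples).
    rng = np.random.default_rng(0)
    raw = rng.lognormal(size=(12, 40))
    raw[rng.random(raw.shape) < 0.05] = np.nan  # simulate undetected metabolites
    biomass = rng.normal(size=12)               # placeholder response variable

    X = autoscale(impute_column_mean(raw))
    coef, intercept = principal_component_regression(X, biomass, n_components=3)
    predicted = X @ coef + intercept
    print(predicted.shape)
```

Reducing the 40 metabolite columns to a few components before fitting is what keeps the regression well-posed when samples are scarcer than variables. In practice the number of components would be chosen by cross-validation rather than fixed at 3 as in this sketch.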

Publication types

  • Review

MeSH terms

  • Arabidopsis / genetics
  • Arabidopsis / growth & development
  • Arabidopsis / metabolism
  • Biomass
  • Computational Biology / methods
  • Electronic Data Processing / methods
  • Gene Expression Profiling
  • Plant Development
  • Plants / genetics*
  • Plants / metabolism*
  • Principal Component Analysis
  • Regression Analysis