Normalization method for metabolomics data using optimal selection of multiple internal standards

BMC Bioinformatics. 2007 Mar 15:8:93. doi: 10.1186/1471-2105-8-93.

Abstract

Background: Success of metabolomics as the phenotyping platform largely depends on its ability to detect various sources of biological variability. Removal of platform-specific sources of variability such as systematic error is therefore one of the foremost priorities in data preprocessing. However, chemical diversity of molecular species included in typical metabolic profiling experiments leads to different responses to variations in experimental conditions, making normalization a very demanding task.

Results: With the aim to remove unwanted systematic variation, we present an approach that utilizes variability information from multiple internal standard compounds to find optimal normalization factor for each individual molecular species detected by metabolomics approach (NOMIS). We demonstrate the method on mouse liver lipidomic profiles using Ultra Performance Liquid Chromatography coupled to high resolution mass spectrometry, and compare its performance to two commonly utilized normalization methods: normalization by l2 norm and by retention time region specific standard compound profiles. The NOMIS method proved superior in its ability to reduce the effect of systematic error across the full spectrum of metabolite peaks. We also demonstrate that the method can be used to select best combinations of standard compounds for normalization.

Conclusion: Depending on experiment design and biological matrix, the NOMIS method is applicable either as a one-step normalization method or as a two-step method where the normalization parameters, influenced by variabilities of internal standard compounds and their correlation to metabolites, are first calculated from a study conducted in repeatability conditions. The method can also be used in analytical development of metabolomics methods by helping to select best combinations of standard compounds for a particular biological matrix and analytical platform.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Cell Physiological Phenomena*
  • Computer Simulation
  • Databases, Protein / standards*
  • Gene Expression / physiology*
  • Gene Expression Profiling / methods*
  • Gene Expression Profiling / standards*
  • Models, Biological*
  • Proteome / metabolism*
  • Reference Values

Substances

  • Proteome