Computational strategies for metabolite identification in metabolomics

Bioanalysis. 2009 Dec;1(9):1579-96. doi: 10.4155/bio.09.138.

Abstract

Most metabolomic data are characterized by complex spectra or chromatograms containing hundreds of peaks or features. While there are many methods for aligning or comparing these spectral features, there are few approaches for actually identifying which peaks correspond to which compounds. Indeed, one of the biggest unmet needs in metabolomics is compound identification. This review describes some of the newly emerging computational strategies used to aid the identification of metabolites in biofluid mixtures analyzed by NMR and MS. The most successful compound-identification strategies typically involve matching spectral features of the unknown compound(s) to curated spectral databases of reference compounds. This approach is known as the identification of 'known unknowns'. However, the identification of truly novel compounds (the 'unknown unknowns') is particularly challenging and requires applying computer-aided structure elucidation methods to the purified compound. The strengths and limitations of these approaches as applied to different analytical technologies (GC-MS, LC-MS and NMR) will be discussed, as will prospects for improving existing strategies.
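
The database-matching idea for 'known unknowns' can be illustrated with a minimal sketch. It is not taken from the review: it assumes mass spectra given as (m/z, intensity) pairs, fixed-width m/z binning, and cosine similarity as the match score, which is one common choice among several. The function names, the bin width, and the two-entry toy library are hypothetical and exist only for illustration.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (assumed preprocessing)."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(a, b):
    """Cosine of the angle between two binned spectra; 0.0 if either is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_peaks, library, bin_width=1.0):
    """Score the query against every reference spectrum, best match first."""
    query = bin_spectrum(query_peaks, bin_width)
    scores = [
        (name, cosine_similarity(query, bin_spectrum(peaks, bin_width)))
        for name, peaks in library.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy reference library: names and peaks are made up.
TOY_LIBRARY = {
    "compound_A": [(41.0, 20.0), (69.0, 100.0), (87.0, 45.0)],
    "compound_B": [(43.0, 100.0), (58.0, 30.0), (71.0, 10.0)],
}

# Hypothetical query spectrum of an unidentified peak.
query = [(41.1, 18.0), (69.2, 95.0), (87.0, 50.0)]
ranked = rank_candidates(query, TOY_LIBRARY)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

A high score only indicates that the query resembles a reference entry under these assumptions; in practice, additional evidence such as retention time or accurate mass is usually weighed before a compound is reported as identified.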

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Chromatography, Liquid / methods*
  • Gas Chromatography-Mass Spectrometry / methods*
  • Humans
  • Image Processing, Computer-Assisted*
  • Magnetic Resonance Spectroscopy / methods*
  • Metabolomics / methods*