A modular computational framework for automated peak extraction from ion mobility spectra

BMC Bioinformatics. 2014 Jan 22:15:25. doi: 10.1186/1471-2105-15-25.

Abstract

Background: An ion mobility (IM) spectrometer coupled with a multi-capillary column (MCC) measures volatile organic compounds (VOCs) in the air or in exhaled breath. This technique is utilized in several biotechnological and medical applications. Each peak in an MCC/IM measurement represents a certain compound, which may be known or unknown. For clustering and classification of measurements, the raw data matrix must be reduced to a set of peaks. Each peak is described by its coordinates (retention time in the MCC and reduced inverse ion mobility) and shape (signal intensity, further shape parameters). This fundamental step is referred to as peak extraction. It is the basis for identifying discriminating peaks, and hence putative biomarkers, between two classes of measurements, such as a healthy control group and a group of patients with a confirmed disease. Current state-of-the-art peak extraction methods require human interaction, such as hand-picking approximate peak locations, assisted by a visualization of the data matrix. In a high-throughput context, however, it is preferable to have robust methods for fully automated peak extraction.

Results: We introduce PEAX, a modular framework for automated peak extraction. The framework consists of several steps in a pipeline architecture. Each step performs a specific sub-task and can be instantiated by different methods implemented as modules. We provide open-source software for the framework and several modules for each step. Additionally, an interface that allows easy extension by a new module is provided. Combining the modules in all reasonable ways leads to a large number of peak extraction methods. We evaluate all combinations using intrinsic error measures and by comparing the resulting peak sets with an expert-picked one.

Conclusions: Our software PEAX is able to automatically extract peaks from MCC/IM measurements within a few seconds. The automatically obtained results keep up with the results provided by current state-of-the-art peak extraction methods. This opens a high-throughput context for the MCC/IM application field. Our software is available at http://www.rahmannlab.de/research/ims.

MeSH terms

  • Biomarkers / analysis
  • Breath Tests
  • Case-Control Studies
  • Computational Biology / methods*
  • Humans
  • Ions / analysis
  • Signal Processing, Computer-Assisted*
  • Software*
  • Spectrum Analysis / methods*
  • Volatile Organic Compounds / analysis*

Substances

  • Biomarkers
  • Ions
  • Volatile Organic Compounds