SIMA: simultaneous multiple alignment of LC/MS peak lists

Bioinformatics. 2011 Apr 1;27(7):987-93. doi: 10.1093/bioinformatics/btr051. Epub 2011 Feb 3.

Abstract

Motivation: Alignment of multiple liquid chromatography/mass spectrometry (LC/MS) experiments has become a necessity because studies rely on biological and technical repeats. Due to limits in sampling frequency and poor reproducibility of retention times, current LC systems suffer from missing observations and non-linear distortions of the retention times across runs. Existing approaches for peak correspondence estimation focus almost exclusively on solving the pairwise alignment problem, yielding straightforward but suboptimal results for multiple alignment problems.

Results: We propose SIMA, a novel automated procedure for alignment of peak lists from multiple LC/MS runs. SIMA combines hierarchical pairwise correspondence estimation with simultaneous alignment and global retention time correction. It employs a tailored multidimensional kernel function and a procedure based on maximum likelihood estimation to find the retention time distortion function that best fits the observed data. SIMA does not require a dedicated reference spectrum, is robust to outliers, needs only two intuitive parameters and naturally incorporates incomplete correspondence information. In a comparison with seven alternative methods on four different datasets, we show that SIMA yields competitive or superior performance on real-world data.
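
To make the idea of pairwise correspondence estimation followed by retention time correction more concrete, the following is a minimal Python sketch. It is not the SIMA algorithm. It assumes peaks are plain (m/z, retention time) tuples, uses a greedy one-to-one matching with two hypothetical tolerance parameters (mz_tol and rt_tol), and replaces the kernel function and maximum likelihood fit of a non-linear distortion described above with a simple median offset. All function and parameter names are illustrative assumptions.

```python
from statistics import median


def match_peaks(run_a, run_b, mz_tol, rt_tol):
    """Greedy one-to-one matching of (mz, rt) peaks between two runs.

    Illustrative only: candidate pairs inside both tolerances are scored by
    a scaled squared distance and accepted from best to worst. Peaks without
    a partner stay unmatched, which models missing observations.
    """
    candidates = []
    for i, (mz_a, rt_a) in enumerate(run_a):
        for j, (mz_b, rt_b) in enumerate(run_b):
            d_mz = abs(mz_a - mz_b) / mz_tol
            d_rt = abs(rt_a - rt_b) / rt_tol
            if d_mz <= 1.0 and d_rt <= 1.0:
                candidates.append((d_mz ** 2 + d_rt ** 2, i, j))
    candidates.sort()

    used_a, used_b, pairs = set(), set(), []
    for _, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            pairs.append((i, j))
    return sorted(pairs)


def median_rt_shift(run_a, run_b, pairs):
    """Constant retention time offset (run_b minus run_a) from matched pairs.

    Assumption: a single median offset stands in for the non-linear
    distortion function; the median limits the influence of outliers.
    """
    return median(run_b[j][1] - run_a[i][1] for i, j in pairs)


# Made-up example peak lists: (m/z, retention time in minutes).
run_a = [(400.20, 10.0), (512.30, 20.0), (630.10, 30.0), (701.40, 40.0)]
run_b = [(400.21, 10.6), (512.29, 20.5), (701.41, 40.4), (888.80, 55.0)]

pairs = match_peaks(run_a, run_b, mz_tol=0.05, rt_tol=1.0)
shift = median_rt_shift(run_a, run_b, pairs)
print(pairs, shift)
```

In this toy input, the third peak of run_a and the fourth peak of run_b have no counterpart and are left unmatched. Extending such a pairwise step to many runs, for example by merging runs hierarchically and then correcting retention times globally, is where a method like the one described above goes beyond this sketch.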

Availability: A C++ implementation of the SIMA algorithm is available from http://hci.iwr.uni-heidelberg.de/MIP/Software.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Chromatography, Liquid / methods*
  • Mass Spectrometry / methods*