Graph-based peak alignment algorithms for multiple liquid chromatography-mass spectrometry datasets

Bioinformatics. 2013 Oct 1;29(19):2469-76. doi: 10.1093/bioinformatics/btt435. Epub 2013 Jul 30.

Abstract

Liquid chromatography coupled to mass spectrometry (LC-MS) is the dominant technological platform for proteomics. An LC-MS analysis of a complex biological sample can be visualized as a 'map' of which the positional coordinates are the mass-to-charge ratio (m/z) and chromatographic retention time (RT) of the chemical species profiled. Label-free quantitative proteomics requires the alignment and comparison of multiple LC-MS maps to ascertain the reproducibility of experiments or reveal proteome changes under different conditions. The main challenge in this task lies in correcting inevitable RT shifts. Similar, but not identical, LC instruments and settings can cause peptides to elute at very different times and sometimes in a different order, violating the assumptions of many state-of-the-art alignment tools. To meet this challenge, we developed LWBMatch, a new algorithm based on weighted bipartite matching. Unlike existing tools, which search for accurate warping functions to correct RT shifts, we directly seek a peak-to-peak mapping by maximizing a global similarity function between two LC-MS maps. For alignment tasks with large RT shifts (>500 s), an approximate warping function is determined by locally weighted scatterplot smoothing of potential matched features, detected using a novel voting scheme based on co-elution. For validation, we defined the ground truth for alignment success based on tandem mass spectrometry identifications from sequence searching. We showed that our method outperforms several existing tools in terms of precision and recall, and is capable of aligning maps from different instruments and settings.
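The peak-to-peak mapping described above can be illustrated with a minimal sketch of maximum-weight bipartite matching between two small feature lists. This is not the LWBMatch implementation. The similarity function, the tolerances (`mz_tol`, `rt_tol`), and the function names (`similarity`, `hungarian_min`, `align`) are all assumptions made for illustration, and the sketch omits the LOWESS-based correction of large RT shifts. Each feature is a hypothetical `(m/z, RT in seconds)` pair.

```python
def similarity(a, b, mz_tol=0.05, rt_tol=60.0):
    """Hypothetical similarity in [0, 1]; 0 means the pair is not allowed."""
    d_mz, d_rt = abs(a[0] - b[0]), abs(a[1] - b[1])
    if d_mz > mz_tol or d_rt > rt_tol:
        return 0.0
    return (1.0 - d_mz / mz_tol) * (1.0 - d_rt / rt_tol)


def hungarian_min(cost):
    """Minimum-cost assignment on a square matrix (Hungarian algorithm, O(n^3)).

    Returns a list of (row, column) pairs covering every row.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (n + 1)   # column potentials
    p = [0] * (n + 1)     # p[j] = row currently assigned to column j (1-based)
    way = [0] * (n + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip assignments along the augmenting path back to the root.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, n + 1)]


def align(map_a, map_b, mz_tol=0.05, rt_tol=60.0):
    """Return index pairs (i, j) that maximize the summed similarity."""
    k = max(len(map_a), len(map_b))
    # Pad to a square matrix; padded or disallowed cells score 0.
    score = [[0.0] * k for _ in range(k)]
    for i, a in enumerate(map_a):
        for j, b in enumerate(map_b):
            score[i][j] = similarity(a, b, mz_tol, rt_tol)
    # Maximizing total similarity equals minimizing its negation.
    cost = [[-s for s in row] for row in score]
    pairs = hungarian_min(cost)
    # Keep only real, allowed matches (positive similarity).
    return sorted((i, j) for i, j in pairs if score[i][j] > 0.0)


if __name__ == "__main__":
    run_1 = [(500.25, 100.0), (600.30, 200.0), (700.40, 300.0)]
    run_2 = [(500.26, 110.0), (700.41, 315.0), (600.30, 190.0), (800.00, 50.0)]
    print(align(run_1, run_2))
```

Because the matching is optimized globally rather than by following a monotonic warping function, the sketch can pair features whose elution order differs between the two runs, which is the situation the abstract highlights. A hard RT tolerance like the one assumed here would not by itself handle shifts larger than the tolerance.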

Availability: Available at https://sourceforge.net/projects/rt-alignment/.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Cell Line, Tumor
  • Chromatography, Liquid* / instrumentation
  • Chromatography, Liquid* / methods
  • Databases, Protein
  • Humans
  • Mass Spectrometry* / instrumentation
  • Mass Spectrometry* / methods
  • Proteome / chemistry
  • Proteomics / methods*
  • Reproducibility of Results

Substances

  • Proteome