Efficient reduction of candidate matches in peptide spectrum library searching using the top k most intense peaks

J Proteome Res. 2014 Sep 5;13(9):4175-83. doi: 10.1021/pr401269z. Epub 2014 Jul 24.

Abstract

Spectral library searching is a popular approach for MS/MS-based peptide identification. Because the size of spectral libraries continues to grow, the performance of searching algorithms is an important issue. This technical note introduces a strategy based on a minimum shared peak count between two spectra to reduce the set of admissible candidate spectra when issuing a query. A theoretical validation through time complexity analysis and an experimental validation based on an implementation of the candidate reduction strategy show that the approach can reduce the set of candidate spectra by at least an order of magnitude, resulting in a significant improvement in search speed. Meanwhile, more than 99% of the positive search results are retained. This efficient strategy to drastically improve the speed of spectral library searching with a negligible loss of sensitivity can be applied to any current spectral library search tool, irrespective of the employed similarity metric.
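
The shared-peak filter described above could be sketched roughly as follows. This is an illustration only, not the authors' implementation: the abstract does not specify how peaks are matched or indexed, so the unit-width m/z binning, the inverted index, and the names `top_k_bins`, `build_index`, `candidates`, `k`, `min_shared`, and `bin_width` are all assumptions made for this example.

```python
from collections import Counter, defaultdict

# A spectrum is assumed here to be a list of (mz, intensity) tuples.

def top_k_bins(peaks, k, bin_width=1.0):
    """Return the set of m/z bins holding the k most intense peaks."""
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:k]
    # Assumption: two peaks "match" when they fall into the same m/z bin.
    return {int(mz / bin_width) for mz, _ in strongest}

def build_index(library, k, bin_width=1.0):
    """Map each m/z bin to the IDs of library spectra with a top-k peak there."""
    index = defaultdict(set)
    for spec_id, peaks in library.items():
        for b in top_k_bins(peaks, k, bin_width):
            index[b].add(spec_id)
    return index

def candidates(query_peaks, index, k, min_shared, bin_width=1.0):
    """Return IDs of library spectra sharing at least min_shared top-k bins."""
    counts = Counter()
    for b in top_k_bins(query_peaks, k, bin_width):
        for spec_id in index.get(b, ()):
            counts[spec_id] += 1
    return {spec_id for spec_id, n in counts.items() if n >= min_shared}

# Toy data, invented purely for illustration.
library = {
    "A": [(100.2, 50), (200.1, 80), (300.4, 10), (400.0, 5)],
    "B": [(150.3, 90), (250.7, 70), (300.1, 60), (500.0, 1)],
}
query = [(100.4, 40), (200.3, 90), (350.0, 30), (300.2, 5)]

index = build_index(library, k=3)
survivors = candidates(query, index, k=3, min_shared=2)
```

Only the spectra in `survivors` would then be scored with the full similarity metric, which is why such a filter is independent of the metric used. Fixed bins can miss matching peaks that straddle a bin boundary; a tolerance-based comparison would avoid that, at some extra cost.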

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Data Mining
  • Databases, Protein*
  • Humans
  • Peptide Library*
  • Proteins
  • Proteomics / methods*
  • Software*
  • Yeasts

Substances

  • Peptide Library
  • Proteins