Multi-Reference Spectral Library Yields Almost Complete Coverage of Heterogeneous LC-MS/MS Data Sets

J Proteome Res. 2019 Apr 5;18(4):1553-1566. doi: 10.1021/acs.jproteome.8b00819. Epub 2019 Mar 8.

Abstract

Spectral libraries play a central role in the analysis of data-independent-acquisition (DIA) proteomics experiments. A main assumption in current spectral library tools is that a single characteristic intensity pattern (CIP) suffices to describe the fragmentation of a peptide in a particular charge state (peptide charge pair). However, we find that this is often not the case. We carry out a systematic evaluation of spectral variability over public repositories and in-house data sets. We show that spectral variability is widespread and partly occurs under fixed experimental conditions. Using clustering of preprocessed spectra, we derive a limited number of multiple characteristic intensity patterns (MCIPs) for each peptide charge pair, which allow almost complete coverage of our heterogeneous data set without affecting the false discovery rate. We show that an MCIP library derived from public repositories performs in most cases similarly to a "custom-made" spectral library acquired under the same experimental conditions as the query spectra. We apply the MCIP approach to a DIA data set and observe a significant increase in peptide recognition. We propose the MCIP approach as an easy-to-implement addition to current spectral library search engines and as a new way to utilize the data stored in spectral repositories.
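
The idea of keeping several intensity patterns per peptide charge pair, rather than one consensus pattern, can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm, the similarity measure, or the preprocessing, so everything below is an assumption made for illustration: spectra are reduced to intensity vectors over a fixed list of fragment ions, similarity is the cosine of unit-normalized vectors, and clusters are formed greedily with a hypothetical threshold of 0.9. The function names (`derive_patterns`, `best_match`) are likewise hypothetical and do not correspond to any published tool.

```python
import math


def normalize(intensities):
    """Scale an intensity vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in intensities))
    if norm == 0:
        raise ValueError("spectrum has no signal")
    return [x / norm for x in intensities]


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def derive_patterns(spectra, threshold=0.9):
    """Greedily cluster spectra of one peptide charge pair.

    Each spectrum joins the most similar existing cluster if the cosine
    similarity to that cluster's centroid is at least `threshold`;
    otherwise it starts a new cluster. The returned centroids play the
    role of multiple characteristic intensity patterns in this sketch.
    """
    clusters = []   # members (unit vectors) of each cluster
    centroids = []  # unit-length mean of each cluster
    for spectrum in spectra:
        unit = normalize(spectrum)
        best_index, best_sim = None, threshold
        for i, centroid in enumerate(centroids):
            sim = cosine(unit, centroid)
            if sim >= best_sim:
                best_index, best_sim = i, sim
        if best_index is None:
            clusters.append([unit])
            centroids.append(unit)
        else:
            members = clusters[best_index]
            members.append(unit)
            mean = [sum(col) / len(members) for col in zip(*members)]
            centroids[best_index] = normalize(mean)
    return centroids


def best_match(query, patterns):
    """Score a query spectrum by its best cosine match over all patterns."""
    unit = normalize(query)
    return max(cosine(unit, p) for p in patterns)


# Toy intensities over three fragment ions: two distinct fragmentation modes.
observed = [[10, 1, 0], [9, 1, 0], [0, 1, 10], [0, 2, 9]]
multi = derive_patterns(observed, threshold=0.9)    # several patterns
single = derive_patterns(observed, threshold=-1.0)  # forces one consensus
query = [0, 1, 9]
print(len(multi), len(single))
print(best_match(query, multi) > best_match(query, single))
```

In this toy case the forced single consensus averages two unlike fragmentation modes and matches the query poorly, while the multi-pattern set retains a centroid close to it. Real data would also need the false discovery rate control that the abstract mentions, which this sketch does not attempt.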

Keywords: SWATH-MS; algorithms; bioinformatics; data evaluation; peptide fragmentation; spectral libraries; tandem mass spectrometry.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Chromatography, Liquid*
  • Databases, Protein*
  • Peptide Fragments / chemistry
  • Peptide Fragments / genetics
  • Peptide Library*
  • Proteomics / methods*
  • Tandem Mass Spectrometry*

Substances

  • Peptide Fragments
  • Peptide Library