Can We Boost N-Glycopeptide Identification Confidence? Smart Collision Energy Choice Taking into Account Structure and Search Engine

J Am Soc Mass Spectrom. 2024 Feb 7;35(2):333-343. doi: 10.1021/jasms.3c00375. Epub 2024 Jan 29.

Abstract

High confidence and reproducibility are still challenges in bottom-up mass spectrometric N-glycopeptide identification. The collision energy used in the MS/MS measurements and the database search engine used to identify the species are perhaps the two most decisive factors. We investigated how the structural features of N-glycopeptides and the choice of search engine influence the optimal collision energy, that is, the energy delivering the highest identification confidence. We carried out LC-MS/MS measurements using a series of collision energies on a large set of N-glycopeptides in which both the glycan and the peptide part were varied, and studied the behavior of Byonic, pGlyco, and GlycoQuest scores. We found that search engines show a range of behaviors from peptide-centric to glycan-centric, which is already apparent in the dependence of the optimal collision energy on m/z. Using classical statistical and machine learning methods, we revealed that peptide hydrophobicity, glycan and peptide masses, and the number of mobile protons also have a significant, search-engine-dependent influence, in contrast to a series of other parameters we probed. We envisioned an MS/MS workflow that makes a smart collision energy choice based on features available online, such as hydrophobicity (described by retention time) and glycan mass (potentially available from a scout MS/MS). Our assessment suggests that this workflow can lead to a significant gain (up to 100%) in identification confidence, particularly for low-scoring hits close to the filtering limit, which has the potential to enhance the reproducibility of N-glycopeptide analyses. Data are available via MassIVE (MSV000093110).
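As an illustration only, the envisioned workflow could be sketched as a per-search-engine linear model that maps online features (precursor m/z, retention time as a hydrophobicity proxy, and glycan mass from a scout MS/MS) to a collision energy. Everything in the sketch below is an assumption made for demonstration: the linear form, the engine labels, the function and class names, the allowed energy range, and above all the coefficients, whose signs and magnitudes are placeholders and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineModel:
    """Hypothetical linear model for the optimal collision energy of one search engine."""
    intercept: float       # baseline collision energy (arbitrary units)
    per_mz: float          # change per unit of precursor m/z
    per_rt_min: float      # change per minute of retention time (hydrophobicity proxy)
    per_glycan_kda: float  # change per kDa of glycan mass


# Placeholder coefficients for illustration only; NOT fitted values from the study.
MODELS = {
    "peptide_centric": EngineModel(30.0, 0.004, 0.05, -1.0),
    "glycan_centric": EngineModel(22.0, 0.002, 0.02, -2.0),
}


def choose_collision_energy(engine, mz, rt_min, glycan_mass_da, lo=15.0, hi=45.0):
    """Predict a collision energy from online features and clamp it to [lo, hi]."""
    model = MODELS[engine]
    energy = (
        model.intercept
        + model.per_mz * mz
        + model.per_rt_min * rt_min
        + model.per_glycan_kda * glycan_mass_da / 1000.0
    )
    # Keep the setting inside an (assumed) instrument-safe range.
    return min(hi, max(lo, energy))


if __name__ == "__main__":
    for engine in MODELS:
        ce = choose_collision_energy(engine, mz=1000.0, rt_min=40.0, glycan_mass_da=2000.0)
        print(f"{engine}: {ce:.1f}")
```

In a real setting the coefficients would have to be fitted per search engine on measured energy-dependence data, for example with the general linear model or lasso regression named in the keywords.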

Keywords: N-glycosylation; bottom-up proteomics; collision energy optimization; general linear model; glycan structure; identification score; lasso regression; search engine; tandem mass spectrometry.

MeSH terms

  • Chromatography, Liquid
  • Glycopeptides* / chemistry
  • Peptides
  • Polysaccharides / analysis
  • Reproducibility of Results
  • Search Engine*
  • Tandem Mass Spectrometry / methods

Substances

  • Glycopeptides
  • Peptides
  • Polysaccharides