Database search engines and target database features impinge upon the identification of post-translationally cis-spliced peptides in HLA class I immunopeptidomes

Proteomics. 2022 May;22(10):e2100226. doi: 10.1002/pmic.202100226. Epub 2022 Mar 3.

Abstract

Unconventional epitopes presented by HLA class I complexes are emerging targets for T cell-targeted immunotherapies. Their identification by mass spectrometry (MS) required the development of novel methods to cope with the large number of theoretical candidates. Methods to identify post-translationally spliced peptides have led to a broad range of outcomes. Here, we investigated the impact of three common database search engines (Mascot, Mascot+Percolator, and PEAKS DB) as the final identification step, as well as the features of the target database, on the ability to correctly identify non-spliced and cis-spliced peptides. We used ground truth datasets measured by MS to benchmark the methods' performance and extended the analysis to HLA class I immunopeptidomes. PEAKS DB showed better precision and recall of cis-spliced peptides, and a larger number of identified peptides in HLA class I immunopeptidomes, than the other search engine strategies. The better performance of PEAKS DB appears to result from better discrimination between target and decoy hits, and hence a more robust FDR estimation, and seems independent of the peptide and spectrum features investigated here.
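The abstract's evaluation terms (target-decoy FDR, precision, recall) can be made concrete with a minimal Python sketch. It assumes the common decoys/targets estimator for FDR at a score threshold; the abstract does not state which estimator each search engine uses. The `PSM` class, function names, and toy peptide labels are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """Hypothetical peptide-spectrum match record (not a search-engine API)."""
    peptide: str
    score: float
    is_decoy: bool


def estimate_fdr(psms, threshold):
    """Assumed estimator: accepted decoy hits divided by accepted target hits."""
    accepted = [p for p in psms if p.score >= threshold]
    targets = sum(1 for p in accepted if not p.is_decoy)
    decoys = len(accepted) - targets
    return decoys / targets if targets else 0.0


def precision_recall(identified, truth):
    """Precision and recall of identified peptides against a ground-truth set."""
    identified, truth = set(identified), set(truth)
    true_positives = len(identified & truth)
    precision = true_positives / len(identified) if identified else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy data, invented for illustration only.
psms = [
    PSM("A", 10.0, False),
    PSM("B", 9.0, False),
    PSM("C", 8.0, True),
    PSM("D", 7.0, False),
    PSM("E", 3.0, True),
]
fdr_at_7 = estimate_fdr(psms, 7.0)
precision, recall = precision_recall(["A", "B", "X"], ["A", "B", "C", "D"])
```

In this sketch, a search engine that separates target and decoy scores more cleanly accepts fewer decoys above any given threshold, so more target hits pass at the same estimated FDR.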

Keywords: HLA; Mascot; PEAKS; immunopeptidome; peptide splicing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Epitopes
  • Mass Spectrometry
  • Peptides* / chemistry
  • Search Engine*
  • Software

Substances

  • Epitopes
  • Peptides