Database search engines and target database features impinge upon the identification of post-translationally cis-spliced peptides in HLA class I immunopeptidomes

Michele Mishto; Yehor Horokhovskyi; John A Cormican; Xiaoping Yang; Steven Lynham; Henning Urlaub; Juliane Liepe

doi:10.1002/pmic.202100226

Proteomics. 2022 May;22(10):e2100226. doi: 10.1002/pmic.202100226. Epub 2022 Mar 3.

Authors

Michele Mishto¹,², Yehor Horokhovskyi³, John A Cormican³, Xiaoping Yang⁴, Steven Lynham⁴, Henning Urlaub³,⁵, Juliane Liepe³

Affiliations

¹ Centre for Inflammation Biology and Cancer Immunology (CIBCI) & Peter Gorer Department of Immunobiology, King's College London, London, UK.
² Francis Crick Institute, London, UK.
³ Max-Planck-Institute for Multidisciplinary Sciences, Göttingen, Germany.
⁴ Proteomics Core Facility, James Black Centre, King's College, London, UK.
⁵ Institute of Clinical Chemistry, University Medical Center Göttingen, Göttingen, Germany.

Abstract

Unconventional epitopes presented by HLA class I complexes are emerging targets for T cell-targeted immunotherapies. Their identification by mass spectrometry (MS) required the development of novel methods to cope with the large number of theoretical candidates. Methods to identify post-translationally spliced peptides have led to a broad range of outcomes. Here we investigated how three common database search engine strategies (Mascot, Mascot+Percolator, and PEAKS DB), used as the final identification step, and the features of the target database affect the ability to correctly identify non-spliced and cis-spliced peptides. We used ground truth datasets measured by MS to benchmark the methods' performance and extended the analysis to HLA class I immunopeptidomes. PEAKS DB showed better precision and recall for cis-spliced peptides, and identified a larger number of peptides in HLA class I immunopeptidomes, than the other search engine strategies. The better performance of PEAKS DB appears to result from better discrimination between target and decoy hits, and hence a more robust FDR estimation, and seems independent of the peptide and spectrum features investigated here.
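The two quantities the abstract relies on, precision/recall against a ground truth and FDR estimated from target and decoy hits, can be illustrated with a minimal sketch. It is not the scoring used by Mascot, Percolator, or PEAKS DB. The function names, the peptide strings, and the simple decoys/targets estimator are assumptions made here for illustration only.

```python
def precision_recall(identified, ground_truth):
    """Precision and recall of identified peptides against a known ground truth."""
    identified = set(identified)
    ground_truth = set(ground_truth)
    true_positives = len(identified & ground_truth)
    precision = true_positives / len(identified) if identified else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


def target_decoy_fdr(psms, threshold):
    """Estimate FDR at a score threshold as (#decoy hits) / (#target hits).

    `psms` is a list of (score, is_decoy) pairs. This simple ratio is one
    common convention and is assumed here, not taken from the paper.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


# Hypothetical peptide sequences, for illustration only.
precision, recall = precision_recall(
    identified={"PEPTIDEA", "PEPTIDEB", "PEPTIDEC", "PEPTIDED"},
    ground_truth={"PEPTIDEA", "PEPTIDEB", "PEPTIDEE"},
)

# Hypothetical (score, is_decoy) pairs.
fdr = target_decoy_fdr(
    [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (3.0, True)],
    threshold=7.0,
)
```

In this picture, a search engine that separates target and decoy score distributions more cleanly reaches a given FDR at a lower threshold, which is consistent with the abstract's explanation for the larger number of identifications.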

Keywords: HLA; Mascot; PEAKS; immunopeptidome; peptide splicing.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Epitopes
Mass Spectrometry
Peptides* / chemistry
Search Engine*
Software

Substances

Epitopes
Peptides
