Machine learning-based peptide-spectrum match rescoring opens up the immunopeptidome

Proteomics. 2024 Apr;24(8):e2300336. doi: 10.1002/pmic.202300336. Epub 2023 Nov 27.

Abstract

Immunopeptidomics is a key technology in the discovery of targets for immunotherapy and vaccine development. However, identifying immunopeptides remains challenging due to their non-tryptic nature, which results in distinct spectral characteristics. Moreover, the absence of strict digestion rules leads to extensive search spaces, further amplified by the incorporation of somatic mutations, pathogen genomes, unannotated open reading frames, and post-translational modifications. This inflation in search space leads to an increase in random high-scoring matches, resulting in fewer identifications at a given false discovery rate. Peptide-spectrum match rescoring has emerged as a machine learning-based solution to address challenges in mass spectrometry-based immunopeptidomics data analysis. It involves post-processing unfiltered spectrum annotations to better distinguish between correct and incorrect peptide-spectrum matches. Recently, features based on predicted peptidoform properties, including fragment ion intensities, retention time, and collisional cross section, have been used to improve the accuracy and sensitivity of immunopeptide identification. In this review, we describe the diverse bioinformatics pipelines that are currently available for peptide-spectrum match rescoring and discuss how they can be used for the analysis of immunopeptidomics data. Finally, we provide insights into current and future machine learning solutions to boost immunopeptide identification.
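The link between score separation and identifications at a fixed false discovery rate can be illustrated with a minimal sketch. It assumes a target-decoy estimate of the false discovery rate (decoys divided by targets above a score threshold), which the abstract does not specify. The function names, the toy scores, and the 1% threshold are hypothetical choices for illustration only; this is not the method of any particular rescoring pipeline.

```python
from typing import List, Tuple

# A PSM is represented here as (score, is_decoy); this layout is an assumption.
PSM = Tuple[float, bool]


def q_values(psms: List[PSM]) -> List[float]:
    """Return one q-value per PSM, in input order, using decoys/targets as FDR."""
    # Rank PSMs from best to worst score.
    order = sorted(range(len(psms)), key=lambda i: psms[i][0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for i in order:
        if psms[i][1]:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR if the threshold were set at this PSM's score.
        fdrs.append(decoys / max(targets, 1))
    # q-value: lowest FDR at this threshold or any more permissive one.
    q = [0.0] * len(psms)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        q[order[rank]] = running_min
    return q


def count_accepted(psms: List[PSM], threshold: float = 0.01) -> int:
    """Count target PSMs whose q-value is at or below the threshold."""
    q = q_values(psms)
    return sum(1 for (_, is_decoy), qv in zip(psms, q) if not is_decoy and qv <= threshold)


# Toy data: the same six PSMs scored before and after a hypothetical rescoring step.
before = [(9.0, False), (8.0, False), (7.0, True), (6.0, False), (5.0, False), (4.0, True)]
after = [(9.0, False), (8.0, False), (7.0, False), (6.0, False), (3.0, True), (2.0, True)]

print(count_accepted(before), count_accepted(after))
```

In this toy example a decoy that outscores two target matches limits the number of targets accepted at the threshold. Once the rescored values push both decoys below all targets, more targets pass the same threshold, which is the effect rescoring aims for.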

Keywords: data analysis; immunopeptidomics; machine learning; mass spectrometry.

Publication types

  • Review

MeSH terms

  • Machine Learning
  • Mass Spectrometry / methods
  • Peptides* / chemistry
  • Protein Processing, Post-Translational
  • Proteomics* / methods

Substances

  • Peptides