pDeepXL: MS/MS Spectrum Prediction for Cross-Linked Peptide Pairs by Deep Learning

J Proteome Res. 2021 May 7;20(5):2570-2582. doi: 10.1021/acs.jproteome.0c01004. Epub 2021 Apr 6.

Abstract

In cross-linking mass spectrometry, the identification of cross-linked peptide pairs relies heavily on the ability of a database search engine to measure the similarity between experimental and theoretical MS/MS spectra. However, the lack of accurate ion intensities in theoretical spectra impairs the performance of search engines, particularly at the proteome scale. Here we introduce pDeepXL, a deep neural network that predicts MS/MS spectra of cross-linked peptide pairs. To train pDeepXL, we used transfer learning because it facilitates training with the limited benchmark data available for cross-linked peptide pairs. Test results on more than ten data sets showed that pDeepXL accurately predicted the spectra of both noncleavable DSS/BS3/Leiker cross-linked peptide pairs (>80% of predicted spectra have Pearson's r values higher than 0.9) and cleavable DSSO/DSBU cross-linked peptide pairs (>75% of predicted spectra have Pearson's r values higher than 0.9). pDeepXL also achieved accurate prediction on unseen data sets using an online fine-tuning technique. Lastly, integrating pDeepXL into a database search engine increased the number of identified cross-link spectra by 18% on average.
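The accuracy figures above are stated as the share of predicted spectra whose Pearson's r against the experimental spectrum exceeds 0.9. The following is a minimal sketch of how such a summary could be computed, not the pDeepXL implementation. It assumes each spectrum has already been reduced to a vector of fragment-ion intensities aligned ion by ion with its counterpart; the function names `pearson_r` and `fraction_above` are hypothetical and chosen only for illustration.

```python
import numpy as np


def pearson_r(predicted, experimental):
    """Pearson correlation between two aligned intensity vectors.

    Assumption: both vectors list intensities for the same fragment
    ions in the same order (hypothetical preprocessing, not shown).
    """
    p = np.asarray(predicted, dtype=float)
    e = np.asarray(experimental, dtype=float)
    if p.shape != e.shape:
        raise ValueError("intensity vectors must have the same shape")
    pc = p - p.mean()
    ec = e - e.mean()
    denom = np.sqrt((pc ** 2).sum() * (ec ** 2).sum())
    if denom == 0.0:
        # A constant vector has no defined correlation; report 0 here.
        return 0.0
    return float((pc * ec).sum() / denom)


def fraction_above(pairs, threshold=0.9):
    """Share of (predicted, experimental) pairs with r above threshold."""
    if not pairs:
        raise ValueError("at least one spectrum pair is required")
    rs = [pearson_r(p, e) for p, e in pairs]
    return sum(r > threshold for r in rs) / len(rs)


if __name__ == "__main__":
    # Toy intensities for illustration only; these are not real spectra.
    toy_pairs = [
        ([0.1, 0.8, 0.3, 0.0], [0.2, 0.9, 0.25, 0.05]),
        ([0.5, 0.1, 0.4, 0.7], [0.1, 0.6, 0.5, 0.2]),
    ]
    print(fraction_above(toy_pairs, threshold=0.9))
```

Under this reading, a claim such as ">80% of predicted spectra have Pearson's r values higher than 0.9" corresponds to `fraction_above(pairs, 0.9) > 0.8` over a test set.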

Keywords: cross-linking mass spectrometry; deep learning; online fine-tuning; spectrum prediction; transfer learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Deep Learning*
  • Neural Networks, Computer
  • Peptides
  • Proteome
  • Tandem Mass Spectrometry*

Substances

  • Peptides
  • Proteome