PTMProphet: Fast and Accurate Mass Modification Localization for the Trans-Proteomic Pipeline

J Proteome Res. 2019 Dec 6;18(12):4262-4272. doi: 10.1021/acs.jproteome.9b00205. Epub 2019 Jul 22.

Abstract

Spectral matching sequence database search engines commonly used in mass spectrometry-based proteomics experiments excel at identifying peptide sequence ions, including sequence ions that may carry post-translational modifications (PTMs), but most do not provide confidence metrics for the exact localization of those PTMs when several possible sites are available. Localization is absolutely required for downstream molecular cell biology analysis of PTM function in vitro and in vivo. Therefore, we developed PTMProphet, a free and open-source software tool integrated into the Trans-Proteomic Pipeline, which reanalyzes identified spectra from any search engine for which pepXML output is available and provides localization confidence to enable appropriate further characterization of biological events. Localization of any type of mass modification (e.g., phosphorylation) is supported. PTMProphet applies Bayesian mixture models to compute probabilities for each site/peptide spectrum match where a PTM has been identified. These probabilities can be combined to compute a global false localization rate at any threshold to guide downstream analysis. We describe the PTMProphet tool and its underlying algorithms, and demonstrate its performance on ground-truth synthetic peptide reference data sets (one previously published small data set and one new larger data set) and on a previously published phosphoenriched data set where the correct sites of modification are unknown. Data have been deposited to ProteomeXchange with identifier PXD013210.
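
As an illustration of how per-site probabilities can be combined into a global false localization rate (FLR) at a threshold, the following minimal Python sketch uses the common probability-based estimate: among sites accepted at or above the threshold, the expected fraction of wrong localizations is the mean of (1 - p). This is an assumption made for illustration, not PTMProphet's actual implementation; the function names false_localization_rate and lowest_threshold_for_flr are hypothetical, and the sketch assumes the input probabilities are well calibrated.

```python
from typing import Iterable, List, Optional


def false_localization_rate(site_probs: Iterable[float], threshold: float) -> float:
    """Estimate the FLR among sites with probability >= threshold.

    Assumption: each value is a calibrated probability that the site
    assignment is correct, so (1 - p) is its expected chance of being wrong.
    """
    accepted: List[float] = [p for p in site_probs if p >= threshold]
    if not accepted:
        return 0.0  # nothing accepted, so no expected false localizations
    expected_wrong = sum(1.0 - p for p in accepted)
    return expected_wrong / len(accepted)


def lowest_threshold_for_flr(site_probs: Iterable[float],
                             target_flr: float) -> Optional[float]:
    """Return the lowest observed probability that, used as a threshold,
    keeps the estimated FLR at or below target_flr (None if none does)."""
    probs = list(site_probs)
    best: Optional[float] = None
    # Lowering the threshold only admits less confident sites, so the
    # estimated FLR never decreases; stop at the first violation.
    for t in sorted(set(probs), reverse=True):
        if false_localization_rate(probs, t) <= target_flr:
            best = t
        else:
            break
    return best


if __name__ == "__main__":
    # Made-up probabilities for illustration only, not data from the paper.
    example = [0.99, 0.95, 0.80, 0.60]
    print(false_localization_rate(example, 0.9))
    print(lowest_threshold_for_flr(example, 0.05))
```

In this sketch, lowering the threshold accepts more sites at the cost of a higher estimated FLR, so a downstream analysis would pick the lowest threshold whose estimate stays within its tolerance.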

Keywords: PTM localization; PTMProphet; PTMs; TPP; mass spectrometry; protein phosphorylation; proteomics; reference data set analysis.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Databases, Protein
  • Humans
  • Phosphopeptides / metabolism
  • Protein Processing, Post-Translational*
  • Proteomics / methods*
  • Software*
  • User-Computer Interface

Substances

  • Phosphopeptides