Peak Identification and Quantification by Proteomic Mass Spectrogram Decomposition

J Proteome Res. 2021 May 7;20(5):2291-2298. doi: 10.1021/acs.jproteome.0c00819. Epub 2021 Mar 4.

Abstract

Recent advances in liquid chromatography/mass spectrometry (LC/MS) technology have improved the sensitivity, resolution, and speed of proteome analysis, increasing the demand for more sophisticated algorithms to interpret complex mass spectrograms. Here, we propose a novel statistical method, proteomic mass spectrogram decomposition (ProtMSD), for joint identification and quantification of peptides and proteins. Given the proteomic mass spectrogram and, as a dictionary, the reference mass spectra of all possible peptide ions associated with proteins, ProtMSD estimates the chromatograms of those peptide ions under a group sparsity constraint, without the careful preprocessing that conventional methods rely on (e.g., thresholding and peak picking). We show that the method is significantly improved by using protein-peptide hierarchical relationships, isotopic distribution profiles, reference retention times of peptide ions, and prelearned mass spectra of noise. We examined the method under the concepts of database search, library search, and match-between-runs. ProtMSD showed excellent agreement with Mascot/Skyline for an Escherichia coli proteome sample (3277 peptide ions, 94.79%; 493 proteins, 98.21%) and with match-between-runs by MaxQuant for a yeast proteome sample (4460 peptide ions, 103%; 588 proteins, 101%). This is the first attempt to use a matrix decomposition technique as a tool for LC/MS-based proteome identification and quantification.
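
The core idea above, estimating peptide-ion chromatograms from a spectrogram and a fixed dictionary of reference spectra under a group sparsity constraint, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a squared-error objective with a group-lasso penalty, 0.5 * ||X - W H||^2 + lam * sum_g ||H_g||, solved by heuristic multiplicative updates that keep H non-negative. The names `decompose`, `group_norms`, `lam`, and the toy data are all hypothetical, and the sketch omits the isotopic profiles, retention-time references, and noise spectra mentioned in the abstract.

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def group_norms(H, groups):
    """Frobenius norm of the chromatogram rows belonging to each protein."""
    return {name: math.sqrt(sum(h * h for k in rows for h in H[k]))
            for name, rows in groups.items()}

def decompose(X, W, groups, lam=0.1, n_iter=500, eps=1e-12):
    """Estimate non-negative H with X ~= W H while W stays fixed.

    X: spectrogram (m/z bins x time), W: dictionary (m/z bins x peptide ions),
    groups: hypothetical mapping protein name -> row indices of H.
    Assumed objective: 0.5*||X - W H||^2 + lam * sum over proteins of ||H_g||.
    """
    Wt = transpose(W)
    WtX = matmul(Wt, X)
    WtW = matmul(Wt, W)
    K, T = len(WtX), len(WtX[0])
    H = [[1.0] * T for _ in range(K)]  # positive start keeps H non-negative
    protein_of = {k: name for name, rows in groups.items() for k in rows}
    for _ in range(n_iter):
        WtWH = matmul(WtW, H)
        norms = group_norms(H, groups)
        for k in range(K):
            g = norms[protein_of[k]] + eps
            for t in range(T):
                # Ratio of the negative to the positive part of the gradient;
                # the lam term shrinks all peptide ions of a protein together.
                denom = WtWH[k][t] + lam * H[k][t] / g + eps
                H[k][t] *= WtX[k][t] / denom
    return H

# Toy data (assumed): 5 m/z bins, 4 peptide ions, 3 time points.
W = [[1.0, 0.0, 0.0, 0.0],
     [0.5, 1.0, 0.0, 0.0],
     [0.0, 0.5, 1.0, 0.0],
     [0.0, 0.0, 0.5, 1.0],
     [0.0, 0.0, 0.0, 0.5]]
H_true = [[0.0, 2.0, 1.0],   # peptide ions 0 and 1 belong to protA (present)
          [1.0, 3.0, 0.0],
          [0.0, 0.0, 0.0],   # peptide ions 2 and 3 belong to protB (absent)
          [0.0, 0.0, 0.0]]
groups = {"protA": [0, 1], "protB": [2, 3]}

X = matmul(W, H_true)
H = decompose(X, W, groups)
scores = group_norms(H, groups)
print(scores)
```

In this sketch the per-protein norm of the estimated chromatograms acts as a joint identification and quantification score: a protein whose peptide ions are not supported by the spectrogram is driven toward zero as a group, even where its reference spectra overlap with those of a protein that is present.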

Keywords: bioinformatics; machine learning; mass spectrogram; match-between-runs; matrix decomposition; protein identification and quantification; shotgun proteomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromatography, Liquid
  • Mass Spectrometry
  • Peptides
  • Proteome*
  • Proteomics*

Substances

  • Peptides
  • Proteome