An Adaptive Pipeline To Maximize Isobaric Tagging Data in Large-Scale MS-Based Proteomics

J Proteome Res. 2018 Jun 1;17(6):2165-2173. doi: 10.1021/acs.jproteome.8b00110. Epub 2018 May 4.

Abstract

Isobaric tagging is the method of choice in mass-spectrometry-based proteomics for comparing several conditions at a time. Despite its multiplexing capabilities, drawbacks appear when multiple experiments are merged for comparison in studies with large sample sizes, because missing values arise from the stochastic nature of the data-dependent acquisition mode. A second, indirect cause of data incompleteness may be the typical proteomic data-processing workflow, which first identifies proteins in individual experiments and then quantifies only those identified proteins, leaving a large number of unmatched spectra with quantitative information unexploited. Inspired by untargeted metabolomic and label-free proteomic workflows, we developed a quantification-driven bioinformatic pipeline, Quantify then Identify (QtI), that optimizes the processing of isobaric tandem mass tag (TMT) data from large-scale studies. This pipeline includes innovative features, such as peak filtering with a self-adaptive preprocessing pipeline optimization method, Peptide Match Rescue, and Optimized Post-Translational Modification. QtI outperforms a classical benchmark workflow in terms of quantification and identification rates, significantly reducing missing data while preserving unmatched features for quantitative comparison. The number of unexploited tandem mass spectra was reduced by 77% and 62% for two human data sets, cerebrospinal fluid and plasma, respectively.
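
The difference between the two processing orders described in the abstract can be illustrated with a toy sketch. This is not the published QtI implementation: the `Spectrum` class, the `has_quant_signal` criterion, the function names, and the four example spectra are all hypothetical, and features such as peak filtering, Peptide Match Rescue, and Optimized Post-Translational Modification are not modeled. The sketch only shows why filtering on identification first can discard spectra that carry usable reporter-ion signal.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Spectrum:
    """Hypothetical MS/MS record: reporter intensities plus an optional match."""
    scan_id: str
    reporter_intensities: Sequence[float]  # one value per TMT channel
    peptide: Optional[str] = None          # None when no peptide was matched


def has_quant_signal(spectrum: Spectrum) -> bool:
    """Assumed criterion: every reporter channel carries a positive signal."""
    values = spectrum.reporter_intensities
    return len(values) > 0 and all(v > 0 for v in values)


def identify_then_quantify(spectra: Sequence[Spectrum]) -> List[Spectrum]:
    """Classical order: only identified spectra proceed to quantification."""
    return [s for s in spectra if s.peptide is not None and has_quant_signal(s)]


def quantify_then_identify(spectra: Sequence[Spectrum]) -> List[Spectrum]:
    """Quantification-first order: keep every quantifiable spectrum, matched or not."""
    return [s for s in spectra if has_quant_signal(s)]


def unexploited_fraction(spectra: Sequence[Spectrum],
                         retained: Sequence[Spectrum]) -> float:
    """Share of quantifiable spectra that a workflow leaves unused."""
    quantifiable = [s for s in spectra if has_quant_signal(s)]
    if not quantifiable:
        return 0.0
    kept = {s.scan_id for s in retained}
    dropped = sum(1 for s in quantifiable if s.scan_id not in kept)
    return dropped / len(quantifiable)


# Invented example data, not taken from the study.
toy_spectra = [
    Spectrum("scan_1", (120.0, 98.0, 110.0), peptide="PEPTIDEK"),
    Spectrum("scan_2", (45.0, 50.0, 61.0)),                   # quantifiable, unmatched
    Spectrum("scan_3", (0.0, 12.0, 9.0), peptide="SAMPLER"),  # one empty channel
    Spectrum("scan_4", (300.0, 280.0, 310.0)),                # quantifiable, unmatched
]

classical = identify_then_quantify(toy_spectra)
quant_first = quantify_then_identify(toy_spectra)

print(f"classical unexploited fraction:   {unexploited_fraction(toy_spectra, classical):.2f}")
print(f"quant-first unexploited fraction: {unexploited_fraction(toy_spectra, quant_first):.2f}")
```

In this toy data the unmatched spectra `scan_2` and `scan_4` survive only in the quantification-first order, where they remain available for quantitative comparison and for a later identification attempt.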

Keywords: algorithms; bioinformatics; biomarkers; discovery; isobaric tagging; machine learning; protein identification; quantification; tandem mass spectrometry; tandem mass tag.

MeSH terms

  • Algorithms
  • Cerebrospinal Fluid / chemistry
  • Computational Biology
  • Datasets as Topic
  • Humans
  • Plasma / chemistry
  • Protein Processing, Post-Translational
  • Proteomics / methods*
  • Staining and Labeling / methods*
  • Tandem Mass Spectrometry / methods*
  • Workflow*