Uncertainty-Aware Protein-Level Quantification and Differential Expression Analysis of Proteomics Data with seaMass

Methods Mol Biol. 2023:2426:141-162. doi: 10.1007/978-1-0716-1967-4_8.

Abstract

seaMass is an R package for protein-level quantification, normalization, and differential expression analysis of proteomics mass spectrometry data after peptide identification, protein grouping, and feature-level quantification. Using the concept of a blocked experimental design, seaMass can analyze all common discovery proteomics paradigms, including label-free (e.g., Waters Progenesis input), SILAC (e.g., MaxQuant input), isotope labelling (e.g., SCIEX ProteinPilot iTRAQ and Thermo ProteomeDiscoverer TMT input), and data-independent acquisition (e.g., OpenSWATH-PyProphet input), and can scale to studies with hundreds of assays or more. By utilizing hierarchical Bayesian modelling, seaMass assesses the quantification reliability of each feature and peptide across assays so that only those in consensus strongly influence the resulting protein group quantification. Similarly, unexplained variation in each individual assay is captured, providing both a metric for quality control and automatic down-weighting of suspect assays. To achieve this, each protein group-level quantification output by seaMass is accompanied by the standard deviation of its posterior uncertainty. Moreover, seaMass integrates a flexible differential expression analysis subsystem with false discovery rate control, based on the popular MCMCglmm package for Bayesian mixed-effects modelling, and also provides uncertainty-aware principal components analysis. We describe how to use seaMass to perform an end-to-end analysis of a real dataset associated with a published clinical proteomics study.
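
As a rough illustration of why a standard deviation attached to each quantification allows automatic down-weighting, the sketch below combines several estimates by inverse-variance (precision) weighting. This is not seaMass's model, which is hierarchical Bayesian and implemented in R; the function name `precision_weighted_mean` and the example values are hypothetical and chosen only for illustration.

```python
import math


def precision_weighted_mean(estimates, sds):
    """Combine estimates by inverse-variance weighting.

    Illustrative only: a simple stand-in for the idea that estimates
    with larger uncertainty should influence the result less.
    Returns the weighted mean and its standard deviation.
    """
    if not estimates or len(estimates) != len(sds):
        raise ValueError("estimates and sds must be non-empty and equal in length")
    if any(sd <= 0 for sd in sds):
        raise ValueError("standard deviations must be positive")

    # Weight each estimate by its precision (1 / variance).
    weights = [1.0 / (sd * sd) for sd in sds]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total
    return mean, math.sqrt(1.0 / total)


# Hypothetical log2 ratios for one protein group in three assays;
# the third assay is an outlier but also carries a large uncertainty.
log2_ratios = [1.0, 1.2, 3.0]
posterior_sds = [0.1, 0.1, 1.0]

mean, sd = precision_weighted_mean(log2_ratios, posterior_sds)
unweighted = sum(log2_ratios) / len(log2_ratios)
print(f"weighted mean = {mean:.3f}, sd = {sd:.3f}, unweighted mean = {unweighted:.3f}")
```

In this toy case the uncertain third value pulls the weighted mean far less than it pulls the plain average, which is the qualitative behaviour the abstract describes for suspect features, peptides, and assays.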

Keywords: Bayesian modelling; Differential expression analysis; False discovery rate control; Protein quantification; Quantitative proteomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Peptides
  • Proteins*
  • Proteome / metabolism
  • Proteomics* / methods
  • Reproducibility of Results
  • Uncertainty

Substances

  • Proteins
  • Peptides
  • Proteome