A Bayesian Null Interval Hypothesis Test Controls False Discovery Rates and Improves Sensitivity in Label-Free Quantitative Proteomics

J Proteome Res. 2020 May 1;19(5):1975-1981. doi: 10.1021/acs.jproteome.9b00796. Epub 2020 Apr 14.

Abstract

Statistical significance tests are a common feature in quantitative proteomics workflows. The Student's t-test is widely used to compute the statistical significance of a protein's change between two groups of samples. However, the t-test's null hypothesis asserts that the difference in means between two groups is exactly zero, so the test often marks small but uninteresting fold-changes as statistically significant. Compensations to address this issue are widely used in quantitative proteomics, but we suggest that a replacement of the t-test with a Bayesian approach offers a better path forward. In this article, we describe a Bayesian hypothesis test in which the null hypothesis is an interval rather than a single point at zero; the width of the interval is estimated from population statistics. The improved sensitivity of the method substantially increases the number of truly changing proteins detected in two benchmark data sets (ProteomeXchange identifiers PXD005590 and PXD016470). The method has been implemented within FlashLFQ, an open-source software program that quantifies bottom-up proteomics search results obtained from any search tool. FlashLFQ is rapid, sensitive, and accurate and is available both as an easy-to-use graphical user interface (Windows) and as a command-line tool (Windows/Linux/OSX).
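The idea of an interval null hypothesis can be illustrated with a minimal Python sketch. It is not FlashLFQ's implementation, and it rests on simplifying assumptions that are not taken from the article: the posterior of the difference in means is approximated by a normal distribution centered on the observed difference (roughly a flat prior with a plug-in standard error), and the interval half-width `null_half_width` is a hypothetical parameter supplied by hand, whereas the article estimates the interval width from population statistics. The function name and the example intensities are likewise made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def null_interval_probability(group_a, group_b, null_half_width):
    """Approximate posterior probability that the true difference in means
    lies inside the null interval [-null_half_width, +null_half_width].

    Assumption (not from the article): normal approximation to the posterior.
    Each group needs at least two values and nonzero spread.
    """
    diff = mean(group_b) - mean(group_a)
    # Welch-style standard error of the difference in means
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    posterior = NormalDist(mu=diff, sigma=se)
    # Posterior mass that falls inside the null interval
    return posterior.cdf(null_half_width) - posterior.cdf(-null_half_width)


# Hypothetical log2 intensities for one protein in two conditions
control = [20.1, 20.3, 19.9, 20.2]
slightly_shifted = [20.2, 20.4, 20.0, 20.3]   # about +0.1 log2 units
strongly_shifted = [22.2, 22.4, 22.0, 22.3]   # about +2.1 log2 units

p_small = null_interval_probability(control, slightly_shifted, null_half_width=0.5)
p_large = null_interval_probability(control, strongly_shifted, null_half_width=0.5)
```

Under these assumptions, a small shift leaves most of the posterior mass inside the null interval, so the protein would not be called as changing, while a large shift leaves almost none. A point-null t-test has no interval to absorb small but uninteresting differences.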

Keywords: Bayesian hypothesis test; Bayesian statistics; label-free quantification; quantitative proteomics; software.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bayes Theorem
  • Humans
  • Proteins
  • Proteomics*
  • Software*
  • Workflow

Substances

  • Proteins