Finding the significant markers: statistical analysis of proteomic data

Methods Mol Biol. 2008:428:327-47. doi: 10.1007/978-1-59745-117-8_17.

Abstract

After separation by two-dimensional gel electrophoresis (2DE), the abundances of several hundred individual proteins can be quantified in a cell population or tissue sample. Both a good experimental setup and a valid statistical approach are essential to gain insight into the data and to draw correct conclusions. High-throughput 2DE proteomics yields large, complex datasets with a huge disproportion between the hundreds of variables and the restricted number of replicates. However, the most commonly used statistical tests were designed for a high number of replicates and a restricted number of variables. The proteomics community is inconsistent in its use of statistics. Two approaches to data analysis can be distinguished: exploratory data analysis and confirmatory data analysis. Currently, most proteomic data are analyzed with the emphasis on confirmatory analysis, and exploratory data analysis is not taken into account. This chapter gives an overview of the typical exploratory and confirmatory statistical tools available and suggests case-specific guidelines for a reliable statistical approach to 2DE analysis. Examples are given for an experimental setup based on classical staining methods as well as for the more advanced difference gel electrophoresis.
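To make the confirmatory side concrete, the following is a minimal sketch of a one-way analysis of variance (listed under the MeSH terms below) applied spot by spot. It is not taken from the chapter: the spot names, the log-abundance values, and the helper `one_way_anova_f` are hypothetical, and the example only illustrates how few replicates per group stand behind each of the many per-spot tests.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of replicate groups.

    `groups` is a list of lists; each inner list holds the replicate
    measurements of one experimental group for a single protein spot.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the replicates around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical log-abundances: spot -> [control replicates, treated replicates].
# A real 2DE experiment would have hundreds of spots and a similar handful
# of replicates per group.
spots = {
    "spot_001": [[10.0, 10.2, 9.8], [12.0, 12.2, 11.8]],
    "spot_002": [[8.0, 9.0, 10.0], [8.5, 9.5, 10.5]],
}

f_values = {name: one_way_anova_f(groups) for name, groups in spots.items()}

for name, f_stat in f_values.items():
    print(f"{name}: F = {f_stat:.3f}")
```

In this toy input the first spot has a large between-group difference relative to its replicate scatter and therefore a much larger F statistic than the second. With only three replicates per group, each F value rests on four within-group degrees of freedom, which illustrates the disproportion between variables and replicates described above.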

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Analysis of Variance
  • Biomarkers / analysis*
  • Data Interpretation, Statistical
  • Electrophoresis, Gel, Two-Dimensional / statistics & numerical data
  • Humans
  • Multivariate Analysis
  • Principal Component Analysis
  • Proteome / isolation & purification*
  • Proteomics / statistics & numerical data*

Substances

  • Biomarkers
  • Proteome