Correlation Patterns in Experimental Data Are Affected by Normalization Procedures: Consequences for Data Analysis and Network Inference

J Proteome Res. 2017 Feb 3;16(2):619-634. doi: 10.1021/acs.jproteome.6b00704. Epub 2016 Dec 15.

Abstract

Normalization is a fundamental step in data processing to account for the sample-to-sample variation observed in biological samples. However, data structure is affected by normalization. In this paper, we show how, and to what extent, the correlation structure is affected by the application of 11 different normalization procedures. We also discuss the consequences for data analysis and interpretation, including principal component analysis, partial least-squares discrimination, and the inference of metabolite-metabolite association networks.
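
The effect described above can be illustrated with a minimal simulation. This sketch is not taken from the paper: the simulated data, the dilution model, the variable names, and the choice of total-sum normalization (which may or may not be among the 11 procedures the authors evaluate) are all assumptions made for illustration. It draws mutually independent metabolite concentrations, multiplies each sample by a random dilution factor, and compares the average pairwise correlation before and after normalization.

```python
import numpy as np


def mean_offdiag_corr(data):
    """Average of the off-diagonal entries of the column correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    mask = ~np.eye(corr.shape[0], dtype=bool)
    return corr[mask].mean()


# Hypothetical simulation settings (not from the paper).
rng = np.random.default_rng(0)
n_samples, n_metabolites = 2000, 5

# Independent "true" concentrations: no real metabolite-metabolite association.
true_conc = rng.lognormal(mean=0.0, sigma=0.3, size=(n_samples, n_metabolites))

# Sample-to-sample variation modeled as one multiplicative dilution factor
# per sample (an assumed stand-in for, e.g., urine dilution).
dilution = rng.uniform(0.2, 1.0, size=n_samples)
observed = true_conc * dilution[:, None]

# Total-sum normalization: divide each sample by the sum of its intensities.
normalized = observed / observed.sum(axis=1, keepdims=True)

r_true = mean_offdiag_corr(true_conc)
r_observed = mean_offdiag_corr(observed)
r_normalized = mean_offdiag_corr(normalized)

print(f"independent concentrations: {r_true:+.3f}")
print(f"after dilution:             {r_observed:+.3f}")
print(f"after total-sum scaling:    {r_normalized:+.3f}")
```

Under these assumptions the shared dilution factor pushes all pairwise correlations upward, while total-sum normalization removes the dilution but forces each sample to sum to one, which biases the correlations downward (toward -1/(p-1) for p exchangeable variables). Neither pattern reflects a real association in the simulated data, so both would mislead PCA, PLS-DA, or network inference if taken at face value.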

Keywords: COVSCA; NMR; PCA; PLS-DA; covariance analysis; sample-to-sample variation; spurious correlation; urine dilution.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Least-Squares Analysis
  • Metabolome / genetics*
  • Principal Component Analysis*
  • Proteome / chemistry
  • Proteome / genetics
  • Proteome / standards*
  • Proteomics / standards
  • Proteomics / statistics & numerical data*
  • Swine
  • Urine / chemistry

Substances

  • Proteome