Inter-gene correlation on oligonucleotide arrays: how much does normalization matter?

Am J Pharmacogenomics. 2005;5(4):271-9. doi: 10.2165/00129785-200505040-00007.

Abstract

Background and objective: Normalization is a standard low-level preprocessing step in microarray data analysis, intended to minimize systematic technological variation and produce more reliable results. A variety of normalization approaches have been introduced and are widely applied. Normalization, however, remains controversial. The sensitivity of array results to normalization is an open question. No clear standard for comparing or judging normalization methods has yet emerged, and the effects of normalization on gene-to-gene co-expression are unclear.

Methods: In this investigation, we applied 1-, 2-, and N-quantile normalization to several publicly available microarray datasets quantified with either MAS 5.0 or dCHIP and evaluated the effect on gene-to-gene co-expression. We introduced a graphical method to explore trends in gene correlation.
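The abstract does not spell out the normalization algorithm, but the "N-quantile" case is commonly implemented as full quantile normalization, which forces every array to share the same empirical distribution. A minimal sketch of that idea (the function name and toy data are illustrative, not from the paper):

```python
import numpy as np

def quantile_normalize(x):
    """Full (N-quantile) normalization sketch.

    x : genes-by-arrays matrix of expression values.
    Every column (array) is forced onto a common reference
    distribution built by averaging order statistics across arrays.
    """
    # Rank of each value within its column
    order = np.argsort(x, axis=0)
    ranks = np.argsort(order, axis=0)
    # Reference distribution: mean of the k-th smallest value across arrays
    ref = np.sort(x, axis=0).mean(axis=1)
    # Replace each value with the reference value at its rank
    return ref[ranks]

# Toy example: two 4-gene arrays
x = np.array([[5., 4.],
              [2., 1.],
              [3., 4.],
              [4., 2.]])
xn = quantile_normalize(x)
```

After normalization the two columns have identical sorted values, which is the defining property of the full-quantile approach; 1- or 2-quantile variants would instead standardize only a small number of quantiles (e.g. the median).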

Results: We found clear differences in the distributions of gene dependencies across normalization methods. Increasing the number of standardized quantiles in the normalization reduced intensity-dependent trends in correlation in MAS 5.0 quantifications but not in dCHIP quantifications. Increasing the number of standardized quantiles did not markedly reduce the correlation of known overlapping targets with MAS 5.0.
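The paper's graphical method is not described in the abstract; one simple way to look for an intensity-dependent correlation trend of this kind is to sample gene pairs, compute each pair's correlation across arrays, and summarize correlation within bins of mean signal intensity (all names and the simulated data below are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in expression matrix: 500 genes x 20 arrays
expr = rng.normal(size=(500, 20)) + 0.5 * rng.normal(size=(500, 1))

# Sample random gene pairs; record each pair's correlation
# and its mean signal intensity
n_pairs = 2000
i = rng.integers(0, expr.shape[0], n_pairs)
j = rng.integers(0, expr.shape[0], n_pairs)
keep = i != j
i, j = i[keep], j[keep]

corr = np.array([np.corrcoef(expr[a], expr[b])[0, 1]
                 for a, b in zip(i, j)])
intensity = (expr[i].mean(axis=1) + expr[j].mean(axis=1)) / 2

# Bin pairs into intensity deciles and average correlation per bin;
# a flat profile suggests no intensity-dependent trend
bins = np.quantile(intensity, np.linspace(0, 1, 11))
idx = np.clip(np.digitize(intensity, bins) - 1, 0, 9)
profile = np.array([corr[idx == b].mean() for b in range(10)])
```

Plotting `profile` against the bin midpoints gives a trend curve; a systematic rise or fall with intensity is the kind of normalization-dependent artifact the study examines.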

Conclusions: Normalization plays an important role in the estimation of inter-gene dependency. Caution should be used when making inferences about gene-wise dependencies from microarrays until this source of variation is better understood.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Data Interpretation, Statistical
  • Databases, Genetic
  • Gene Expression
  • Oligonucleotide Array Sequence Analysis / standards*
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data