Evaluating different methods of microarray data normalization

André Fujita; João Ricardo Sato; Leonardo de Oliveira Rodrigues; Carlos Eduardo Ferreira; Mari Cleide Sogayar

doi:10.1186/1471-2105-7-469

BMC Bioinformatics. 2006 Oct 23;7:469. doi: 10.1186/1471-2105-7-469.

Authors

André Fujita¹, João Ricardo Sato, Leonardo de Oliveira Rodrigues, Carlos Eduardo Ferreira, Mari Cleide Sogayar

Affiliation

¹ Institute of Mathematics and Statistics, University of São Paulo, Rua do Matão, 1010, São Paulo, 05508-090 SP, Brazil. fujita@ime.usp.br

Abstract

Background: DNA hybridization microarray technologies make it possible to assess the expression levels of thousands to tens of thousands of genes simultaneously. Quantitative comparison of microarrays uncovers distinct patterns of gene expression, which define different cellular phenotypes or cellular responses to drugs. Because of technical biases, normalization of the intensity levels is a prerequisite to further statistical analysis. Choosing a suitable normalization approach can therefore be critical and deserves careful consideration.

Results: We considered three commonly used normalization approaches (Loess, Splines and Wavelets) and two non-parametric regression methods that had not yet been used for normalization (Kernel smoothing and Support Vector Regression). The methods were compared using artificial microarray data and benchmark studies. The results indicate that Support Vector Regression is the most robust to outliers and that Kernel smoothing is the worst-performing normalization technique, while no practical differences were observed between Loess, Splines and Wavelets.
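
As a rough illustration of what curve-based normalization means here, the sketch below fits a Nadaraya-Watson kernel smoother (one of the evaluated method families) to log-ratios M as a function of mean log-intensity A, then subtracts the fitted curve. The two-channel setup, the Gaussian kernel, the bandwidth of 0.5, the synthetic bias and the function names are all assumptions made for illustration; none of them is taken from the paper.

```python
import math
import random


def ma_values(red, green):
    """Return (A, M): mean log2 intensity and log2 ratio for each spot."""
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    return a, m


def kernel_smooth(x, y, x0, bandwidth):
    """Nadaraya-Watson estimate of y at x0 using a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


def normalize(red, green, bandwidth=0.5):
    """Subtract the fitted intensity-dependent curve from each log-ratio."""
    a, m = ma_values(red, green)
    curve = [kernel_smooth(a, m, ai, bandwidth) for ai in a]
    return [mi - ci for mi, ci in zip(m, curve)]


def mean_abs(values):
    return sum(abs(v) for v in values) / len(values)


# Synthetic spots with an assumed intensity-dependent bias plus noise.
rng = random.Random(0)
true_a = [rng.uniform(6.0, 14.0) for _ in range(300)]
true_m = [0.5 + 0.15 * (ai - 10.0) + rng.gauss(0.0, 0.1) for ai in true_a]
red = [2.0 ** (ai + mi / 2.0) for ai, mi in zip(true_a, true_m)]
green = [2.0 ** (ai - mi / 2.0) for ai, mi in zip(true_a, true_m)]

_, m_raw = ma_values(red, green)
m_norm = normalize(red, green)
print(f"mean |M| before: {mean_abs(m_raw):.3f}, after: {mean_abs(m_norm):.3f}")
```

Swapping `kernel_smooth` for another regression method (Loess, splines, wavelets or Support Vector Regression) changes only how the curve is estimated; the subtraction step stays the same.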

Conclusion: In light of our results, Support Vector Regression is favored for microarray normalization because it estimates the normalization curve more robustly than the other methods.
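
The abstract does not explain where this robustness comes from. One standard property of Support Vector Regression that is often cited in this context is its epsilon-insensitive loss, which ignores small residuals and grows only linearly for large ones, whereas a squared loss grows quadratically. The sketch below compares the two; the value epsilon = 0.1 and the sample residuals are illustrative assumptions, not settings from the paper.

```python
def squared_loss(residual):
    """Least-squares penalty: grows quadratically with the residual."""
    return residual ** 2


def epsilon_insensitive_loss(residual, epsilon=0.1):
    """SVR-style penalty: zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(residual) - epsilon)


# A small residual, a moderate one, and an outlier-sized one (assumed values).
for r in (0.05, 0.5, 5.0):
    print(r, squared_loss(r), epsilon_insensitive_loss(r))
```

Under the squared loss a single outlier-sized residual contributes far more to the fit than under the epsilon-insensitive loss, so it pulls the estimated curve less in the latter case.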

Publication types

Comparative Study
Research Support, Non-U.S. Gov't

MeSH terms

Animals
Cells, Cultured
Data Interpretation, Statistical
Evaluation Studies as Topic
Gene Expression Profiling*
Mice
Oligonucleotide Array Sequence Analysis*
Reference Values
Regression Analysis