An evaluation framework for statistical tests on microarray data

Michael Dondrup; Andrea T Hüser; Dominik Mertens; Alexander Goesmann

doi:10.1016/j.jbiotec.2009.01.009

J Biotechnol. 2009 Mar 10;140(1-2):18-26. doi: 10.1016/j.jbiotec.2009.01.009.

Authors

Michael Dondrup¹, Andrea T Hüser, Dominik Mertens, Alexander Goesmann

Affiliation

¹ Center for Biotechnology, Bielefeld University, Bielefeld, Germany. mdondrup@cebitec.uni-bielefeld.de

PMID: 19297690
DOI: 10.1016/j.jbiotec.2009.01.009

Abstract

Microarray analysis has become a popular and routine method in functional genomics. Such experiments typically involve a small number of replicates, which makes estimates of the sample variance unreliable. Microarrays have fostered the development of new statistical methods for analyzing data from experiments with small sample sizes. In this study, we tackle the problem of evaluating the performance of statistical tests for generating ranked gene lists from two-channel direct comparisons. We propose an evaluation method based on an oligonucleotide microarray with a large number of replicate spots, yielding a maximum of 400 replicates per gene. We apply Spearman's rank correlation coefficient to ranked gene lists generated by eight widely used microarray-specific test statistics, which are applied to small random samples. We show that variance-stabilizing methods such as Cyber-T, SAM, and LIMMA can be beneficial for very small sample sizes and that SAM and the t-test provide stronger control of the type I error rate than the other methods. Specifically, we report that for four replicates all methods reach a high to very high correlation with our reference standard.
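
The evaluation idea described in the abstract (rank genes with a test statistic computed on a small random sample, then compare that ranking with a reference ranking using Spearman's rank correlation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are simulated rather than taken from the study's array, an ordinary one-sample t-statistic stands in for the eight test statistics, the reference ranking is simply the same statistic computed on all 400 replicates, and the function names (`t_statistic`, `ranks`, `spearman`, `simulate`, `correlation_with_reference`) are hypothetical. It is not the authors' implementation.

```python
import random
import statistics


def t_statistic(values):
    """One-sample t-statistic for H0: mean log-ratio equals 0."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / n ** 0.5)


def ranks(scores):
    """Rank scores from 1 (smallest) upward; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.mean(rx), statistics.mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_genes=200, n_reps=400, seed=0):
    """Simulated log-ratios (assumption): one row of replicates per gene."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_genes):
        effect = rng.gauss(0, 1)      # hypothetical true log-ratio
        sd = rng.uniform(0.2, 1.0)    # hypothetical gene-specific noise
        data.append([rng.gauss(effect, sd) for _ in range(n_reps)])
    return data, rng


def correlation_with_reference(data, n_small, rng):
    """Compare a ranking from n_small random replicates with the full-data ranking."""
    reference = [abs(t_statistic(gene)) for gene in data]
    small = [abs(t_statistic(rng.sample(gene, n_small))) for gene in data]
    return spearman(reference, small)
```

Calling `correlation_with_reference(data, n, rng)` for increasing `n` (for example 2, 4, 8) on the output of `simulate()` gives one correlation per sample size. In a fuller analysis the draw would be repeated many times per sample size and the correlations averaged, and each test statistic of interest would replace `t_statistic` in turn.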

MeSH terms

Algorithms
Data Interpretation, Statistical
Microarray Analysis*
Reproducibility of Results
Statistics, Nonparametric*