Finding differentially expressed genes in two-channel DNA microarray datasets: how to increase reliability of data preprocessing

OMICS. 2008 Sep;12(3):171-82. doi: 10.1089/omi.2008.0032.

Abstract

The wide variety of preprocessing tools for two-channel expression microarray data makes it difficult to choose the most appropriate one for a given experimental setup. In our study, two independent two-channel in-house microarray experiments and a publicly available dataset were used to investigate how the choice of preprocessing methods (background correction, normalization, and calculation of the correlation between duplicate spots) influences the discovery of differentially expressed genes. We show that both the list of differentially expressed genes and the expression values of selected genes depend significantly on the preprocessing approach applied. The choice of normalization method had the greatest impact on the results. We propose a simple but efficient approach to increase the reliability of the results: two theoretically distinct normalization methods are applied to the same dataset, and the intersection of the resulting lists of differentially expressed genes is then used to obtain a more accurate estimate of the genes that were actually differentially expressed.
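The intersection step proposed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `consensus_de_genes`, the variable names, and the gene identifiers are hypothetical. It assumes that each normalization method has already produced its own list of differentially expressed genes.

```python
def consensus_de_genes(de_list_a, de_list_b):
    """Return the genes called differentially expressed under both methods.

    The order of de_list_a is preserved and duplicates are dropped.
    """
    in_b = set(de_list_b)
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return [gene for gene in dict.fromkeys(de_list_a) if gene in in_b]


# Hypothetical gene identifiers, for illustration only.
de_method_1 = ["geneA", "geneB", "geneC", "geneD"]  # list from normalization method 1
de_method_2 = ["geneB", "geneD", "geneE"]           # list from normalization method 2

consensus = consensus_de_genes(de_method_1, de_method_2)
print(consensus)
```

Only genes reported under both normalization methods are retained. The result is a shorter list, and one that depends less on any single method.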

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Databases, Genetic*
  • Gene Expression
  • Gene Expression Profiling / methods*
  • Mice
  • Oligonucleotide Array Sequence Analysis / methods*
  • Reproducibility of Results
  • Solanum tuberosum / genetics
  • Vitis / genetics