Gene filtering in the analysis of Illumina microarray experiments

Anyiawung Chiara Forcheh; Geert Verbeke; Adetayo Kasim; Dan Lin; Ziv Shkedy; Willem Talloen; Hinrich W H Göhlmann; Lieven Clement

doi:10.2202/1544-6115.1710

Stat Appl Genet Mol Biol. 2012 Jan 6;11(2):/j/sagmb.2012.11.issue-2/1544-6115.1710/1544-6115.1710.xml. doi: 10.2202/1544-6115.1710.

Authors

Anyiawung Chiara Forcheh¹, Geert Verbeke, Adetayo Kasim, Dan Lin, Ziv Shkedy, Willem Talloen, Hinrich W H Göhlmann, Lieven Clement

Affiliation

¹ Interuniversity Institute for Biostatistics and Statistical Bioinformatics, Katholieke Universiteit Leuven and Universiteit Hasselt.

PMID: 22499694
DOI: 10.2202/1544-6115.1710

Abstract

Illumina bead arrays are microarrays that contain a random number of technical replicates (beads) for every probe (bead type) within the same array. Typically, around 30 beads per bead type are placed at random positions on the array surface, which opens unique opportunities for quality control. Most preprocessing methods for Illumina bead arrays are ported from the Affymetrix microarray platform and ignore the availability of these technical replicates. The many beads of a given bead type on the same array, however, should be highly correlated; otherwise they merely measure noise, and the bead type can be removed from the downstream analysis. Hence, filtering bead types can be considered an important step of the preprocessing procedure for the Illumina platform. This paper proposes a filtering method for Illumina bead arrays that builds upon the mixed model framework. Bead types are called informative or non-informative (I/NI) based on a trade-off between within-array and between-array variability. The method is illustrated on a publicly available Illumina spike-in data set (Dunning et al., 2008), and we also show that filtering results in a more powerful analysis of differentially expressed genes.
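
The trade-off between within-array and between-array variability can be illustrated with a small sketch. The abstract does not state the paper's exact filtering criterion, so the code below is only one plausible reading: it estimates an intraclass correlation for a single bead type from a one-way random-effects ANOVA (a moment-based stand-in for a fitted mixed model) and compares it with a cut-off. The function names, the input layout (one list of log-intensities per array), and the threshold of 0.5 are all assumptions made for illustration.

```python
from statistics import fmean


def intraclass_correlation(arrays):
    """Estimate between-array variance / total variance for one bead type.

    `arrays` is a list with one entry per array; each entry is the list of
    bead-level (log) intensities observed for this bead type on that array.
    Arrays may hold different numbers of beads (unbalanced design).
    This ANOVA estimator is an illustrative assumption, not the paper's method.
    """
    k = len(arrays)
    if k < 2:
        raise ValueError("need at least two arrays")
    sizes = [len(a) for a in arrays]
    n_total = sum(sizes)
    if min(sizes) < 1 or n_total <= k:
        raise ValueError("need non-empty arrays and replicated beads")

    array_means = [fmean(a) for a in arrays]
    grand_mean = sum(sum(a) for a in arrays) / n_total

    # Between-array and within-array sums of squares.
    ss_between = sum(n * (m - grand_mean) ** 2
                     for n, m in zip(sizes, array_means))
    ss_within = sum((x - m) ** 2
                    for a, m in zip(arrays, array_means) for x in a)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective number of beads per array for unbalanced data.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    # Truncate the between-array variance estimate at zero.
    var_between = max(0.0, (ms_between - ms_within) / n0)
    total = var_between + ms_within
    return var_between / total if total > 0 else 0.0


def call_bead_type(arrays, threshold=0.5):
    """Return "I" (informative) or "NI"; the threshold is hypothetical."""
    return "I" if intraclass_correlation(arrays) >= threshold else "NI"


# Beads agree within each array and differ between arrays: signal.
signal = [[1.0, 1.1, 0.9], [3.0, 3.1, 2.9], [5.0, 5.1, 4.9]]
# Beads scatter as much within an array as between arrays: noise.
noise = [[1.0, 3.0, 5.0], [1.1, 3.1, 4.9], [0.9, 2.9, 5.1]]

print(call_bead_type(signal), call_bead_type(noise))
```

In this toy input the first bead type has nearly all of its variance between arrays and the second has nearly all of it within arrays, which is the contrast the I/NI call is meant to capture. A likelihood-based mixed model fit would replace the moment estimator in a real analysis.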

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Computational Biology / methods
Gene Expression Profiling / methods*
High-Throughput Nucleotide Sequencing
Models, Statistical*
Oligonucleotide Array Sequence Analysis*