Randomized probe selection algorithm for microarray design

J Theor Biol. 2007 Oct 7;248(3):512-21. doi: 10.1016/j.jtbi.2007.05.036. Epub 2007 Jun 11.

Abstract

DNA microarray technology, originally developed to measure gene expression levels, has become one of the most widely used tools in genomic studies. The crux of microarray design is selecting a unique probe that distinguishes a given genomic sequence from all other sequences. Because of its significance, probe selection has attracted considerable attention, and various probe selection algorithms have been developed in recent years. A good probe selection algorithm should produce a small number of candidate probes. Efficiency is also crucial because the data involved are usually huge. Most existing algorithms are not sufficiently selective and return quite a large number of probes. We propose a new direction for tackling the problem: an efficient algorithm based on randomization that selects a small set of probes, and we demonstrate that such a small set is sufficient to distinguish each sequence from all the other sequences. Based on the algorithm, we have developed the probe selection software RandPS, which runs efficiently in practice. The software is available on our website (http://www.csc.liv.ac.uk/~cindy/RandPS/RandPS.htm). We tested our algorithm on different genomes (Escherichia coli, Saccharomyces cerevisiae, etc.), and it efficiently outputs unique probes for most of the genes. The remaining genes can be identified by a combination of at most two probes.
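
The abstract does not spell out the randomized procedure, so the following Python sketch is only an illustration of the general idea of randomized probe selection, not the RandPS algorithm itself. It assumes a probe is a fixed-length substring of a gene and that "unique" means an exact substring match is absent from every other sequence; real probe design would also consider near matches, reverse complements, and hybridization conditions. The function names and the parameters `probe_len`, `trials`, and `seed` are hypothetical.

```python
import random


def find_unique_probe(target, others, probe_len=20, trials=200, seed=0):
    """Sample random substrings of `target` and return the first one that
    occurs in none of `others`. Returns None if every trial fails.

    Assumption: uniqueness is judged by exact substring matching only.
    """
    if len(target) < probe_len:
        return None
    rng = random.Random(seed)
    for _ in range(trials):
        start = rng.randrange(len(target) - probe_len + 1)
        candidate = target[start:start + probe_len]
        if all(candidate not in seq for seq in others):
            return candidate
    return None


def select_probes(genes, probe_len=20, trials=200, seed=0):
    """Map each gene name to one randomly found unique probe (or None)."""
    result = {}
    for name, seq in genes.items():
        others = [s for n, s in genes.items() if n != name]
        result[name] = find_unique_probe(seq, others, probe_len, trials, seed)
    return result
```

A gene for which this sketch returns `None` would correspond to the case the abstract handles with a combination of probes; that step is not sketched here.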

MeSH terms

  • Algorithms*
  • Animals
  • Arabidopsis / genetics
  • Base Sequence
  • Chromosomes, Human, Pair 1 / genetics
  • Chromosomes, Mammalian / genetics
  • DNA Probes / genetics*
  • Escherichia coli / genetics
  • Genes, Bacterial / genetics
  • Genes, Fungal / genetics
  • Humans
  • Mice
  • Neurospora crassa / genetics
  • Oligonucleotide Array Sequence Analysis / methods*
  • Random Allocation
  • Saccharomyces cerevisiae / genetics
  • Schizosaccharomyces / genetics
  • Software
  • Time Factors

Substances

  • DNA Probes