An a posteriori strategy for enhancing gene discovery in anonymous cDNA microarray experiments

Bioinformatics. 2004 Jul 22;20(11):1721-7. doi: 10.1093/bioinformatics/bth152. Epub 2004 Feb 26.

Abstract

Motivation: Because of the high cost of sequencing, the bulk of gene discovery is performed using anonymous cDNA microarrays. Although such arrays are easier and cheaper to construct and use than unigene and oligonucleotide arrays, their clones are present in proportion to the expression activity of the corresponding genes in the tissue being examined. The resulting redundancy carries over into any pool of potentially interesting, differentially expressed clones that a microarray experiment identifies for subsequent sequencing and investigation. An a posteriori sampling strategy is proposed to enhance gene discovery by reducing the impact of the redundancy in the identified pool.

Results: The proposed strategy exploits the fact that individual genes that are highly expressed in a tissue are more likely to be present as multiple spots in an anonymous library and, as a direct consequence, are also likely to give higher fluorescence intensity responses when present in a probe in a cDNA microarray experiment. Consequently, spots that respond with low intensities will have lower redundancy and so should be sequenced in preference to those with the highest intensities. The proposed method, which formalizes how the fluorescence intensity of a spot should be assessed, is validated using actual microarray data for which the sequences of all the clones in the identified pool had previously been determined. For such validations, the concept of a repeat plot is introduced. It is also used to visualize and examine different measures for characterizing fluorescence intensity. In addition, as confirmatory evidence, sequencing in order from the lowest to the highest intensities in a pool with all sequences known is compared graphically with sequencing the same pool in random order. The results establish that, in general, the opportunity for gene discovery is enhanced by avoiding the pooling of different biological libraries (because their construction will have involved different hybridization episodes) and by concentrating on the clones with lower fluorescence intensities.
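
The comparison described above (sequencing from lowest to highest intensity versus sequencing in random order) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy pool, and the simulated data model (log-normal expression levels, spots drawn in proportion to expression, multiplicative noise on intensity) are all assumptions made for illustration only. The abstract does not specify how intensity is measured, so a single number per spot is assumed here.

```python
import random


def discovery_curve(gene_ids):
    """Return the cumulative number of distinct genes seen after each clone."""
    seen = set()
    curve = []
    for gene in gene_ids:
        seen.add(gene)
        curve.append(len(seen))
    return curve


def low_to_high_order(pool):
    """Gene IDs of the pool, ordered from lowest to highest spot intensity."""
    return [gene for gene, _ in sorted(pool, key=lambda spot: spot[1])]


def random_order(pool, rng):
    """Gene IDs of the pool in a random sequencing order."""
    genes = [gene for gene, _ in pool]
    rng.shuffle(genes)
    return genes


def simulate_pool(n_genes=200, n_spots=300, seed=0):
    """Build a hypothetical pool of (gene_id, intensity) spots.

    Assumed model, not taken from the paper: expression levels are
    log-normal, spots are drawn in proportion to expression, and each
    spot's intensity is its gene's expression times log-normal noise.
    """
    rng = random.Random(seed)
    expression = [rng.lognormvariate(0.0, 1.5) for _ in range(n_genes)]
    genes = rng.choices(range(n_genes), weights=expression, k=n_spots)
    return [(g, expression[g] * rng.lognormvariate(0.0, 0.3)) for g in genes]


if __name__ == "__main__":
    pool = simulate_pool()
    ordered = discovery_curve(low_to_high_order(pool))
    shuffled = discovery_curve(random_order(pool, random.Random(1)))
    # Report distinct genes found after sequencing the first third of the pool.
    k = len(pool) // 3
    print(f"after {k} clones: low-to-high={ordered[k - 1]}, random={shuffled[k - 1]}")
```

Both curves necessarily end at the same value, the number of distinct genes in the pool; what differs is how quickly each order reaches it. Plotting the two curves against the number of clones sequenced gives a graphical comparison in the spirit of the one described in the abstract, under the assumed data model.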

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Algorithms*
  • Gene Expression Profiling / methods*
  • Genetic Variation
  • In Situ Hybridization, Fluorescence / methods*
  • Microscopy, Fluorescence / methods*
  • Models, Genetic*
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis / methods*
  • Reproducibility of Results
  • Sample Size
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods*