Characterization of mismatch and high-signal intensity probes associated with Affymetrix genechips

Bioinformatics. 2007 Aug 15;23(16):2088-95. doi: 10.1093/bioinformatics/btm306. Epub 2007 Jun 6.

Abstract

Motivation: For Affymetrix microarray platforms, gene expression is determined by computing the difference in signal intensities between perfect match (PM) and mismatch (MM) probesets. Although the use of PM is not controversial, MM probesets have been associated with variance and ultimately inaccurate gene expression calls. A principal focus of this study was to investigate the nature of the MM signal intensities and demonstrate their contribution to the experimental results.
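The PM minus MM difference described above can be illustrated with a deliberately naive sketch. The function names and intensity values are hypothetical, and the plain averaging step is an assumption made for illustration only; it is not the GCOS algorithm, which applies additional adjustments before summarizing a probeset.

```python
from statistics import mean


def probe_pair_differences(pm, mm):
    """Return PM - MM for each probe pair in one probeset."""
    if len(pm) != len(mm):
        raise ValueError("PM and MM lists must have the same length")
    return [p - m for p, m in zip(pm, mm)]


def naive_probeset_signal(pm, mm):
    """Average the PM - MM differences (illustrative only, not GCOS)."""
    return mean(probe_pair_differences(pm, mm))


# Made-up intensities for a four-pair probeset; the third pair has MM > PM.
pm = [1200.0, 950.0, 400.0, 300.0]
mm = [300.0, 200.0, 650.0, 100.0]

diffs = probe_pair_differences(pm, mm)
signal = naive_probeset_signal(pm, mm)
```

In this toy probeset the single MM > PM pair contributes a negative difference, which pulls the averaged signal down. That is the kind of effect the study examines.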

Results: While most MM intensities were likely associated with random noise, a subset of approximately 20% (99,485) of the MM probes displayed higher signal intensities than the corresponding PM probes (MM > PM) in a non-random fashion; 13,440 of these probes demonstrated exceptionally high 'outlier' intensities. About 15,938 PM probes also demonstrated exceptionally high outlier intensities consistently across all hybridizations. About 92% of the MM > PM probes had either a dThymidine (dT) or a dCytidine (dC) at the 13th position of the probe sequence. MM and PM probes displaying extremely high outlier intensities were rich in dC and low in dA at other nucleotide positions along the 25-mer probe sequence. Differentially expressed genes generated using Genechip Operating System (GCOS) or modified PM-only methods were also examined. Of the candidate genes identified by the PM-only method, 157 were designated by GCOS as absent across all datasets, and many others contained probes with MM > PM signal intensities. Our data suggest that subtracting MM intensity from PM signal can be a major source of error in the analysis, leading to fewer potentially biologically important candidate genes.
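As a rough sketch of the kind of tally reported above, the following flags probe pairs with MM > PM and counts the base at the 13th position of each flagged 25-mer. The record layout (`seq`, `pm`, `mm`), the function names, and the toy sequences and intensities are all assumptions for illustration; the abstract does not describe the authors' actual data format or code.

```python
from collections import Counter


def mm_exceeds_pm(probes):
    """Keep only the probe pairs whose MM intensity exceeds the PM intensity."""
    return [p for p in probes if p["mm"] > p["pm"]]


def base_at_position_13(seq):
    """Return the 13th base (index 12) of a 25-mer probe sequence."""
    if len(seq) != 25:
        raise ValueError("expected a 25-mer probe sequence")
    return seq[12]


def tally_middle_base(probes):
    """Count how often each base occurs at position 13."""
    return Counter(base_at_position_13(p["seq"]) for p in probes)


# Toy records with made-up sequences and intensities (not real probe data).
probes = [
    {"seq": "A" * 12 + "T" + "G" * 12, "pm": 500.0, "mm": 800.0},
    {"seq": "C" * 12 + "C" + "A" * 12, "pm": 300.0, "mm": 450.0},
    {"seq": "G" * 12 + "A" + "T" * 12, "pm": 900.0, "mm": 200.0},
    {"seq": "T" * 12 + "G" + "C" * 12, "pm": 700.0, "mm": 100.0},
]

flagged = mm_exceeds_pm(probes)
fraction = len(flagged) / len(probes)
tally = tally_middle_base(flagged)
```

On real data, `fraction` would correspond to the share of MM > PM probes, and `tally` to the distribution of bases at position 13 among them.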

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Base Pair Mismatch / genetics*
  • DNA Probes / genetics*
  • Equipment Design
  • Equipment Failure Analysis
  • Gene Expression Profiling / instrumentation*
  • Gene Expression Profiling / methods
  • In Situ Hybridization, Fluorescence / methods*
  • Oligonucleotide Array Sequence Analysis / instrumentation*
  • Oligonucleotide Array Sequence Analysis / methods
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods*

Substances

  • DNA Probes