A nitty-gritty aspect of correlation and network inference from gene expression data

Biol Direct. 2008 Aug 20:3:35. doi: 10.1186/1745-6150-3-35.

Abstract

Background: All currently available methods of network/association inference from microarray gene expression measurements implicitly assume that such measurements represent the actual expression levels of different genes within each cell included in the biological sample under study. Contrary to this common belief, modern microarray technology produces signals aggregated over a random number of individual cells, a "nitty-gritty" aspect of such arrays, thereby causing a random effect that distorts the correlation structure of intra-cellular gene expression levels.

Results: This paper provides a theoretical consideration of the random effect of signal aggregation and its implications for correlation analysis and network inference. An attempt is made to quantitatively assess the magnitude of this effect from real data. Some preliminary ideas are offered to mitigate the consequences of random signal aggregation in the analysis of gene expression data.

Conclusion: Resulting from the summation of expression intensities over a random number of individual cells, the observed signals may not adequately reflect the true dependence structure of intra-cellular gene expression levels needed as a source of information for network reconstruction. Whether the reported effect is extrime or not, the important point, is to reconize and incorporate such signal source for proper inference. The usefulness of inference on genetic regulatory structures from microarray data depends critically on the ability of investigators to overcome this obstacle in a scientifically sound way.

Reviewers: This article was reviewed by Byung Soo KIM, Jeanne Kowalski and Geoff McLachlan.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Computational Biology / methods
  • Computational Biology / statistics & numerical data
  • Gene Expression Profiling / methods
  • Gene Expression Profiling / statistics & numerical data*
  • Humans
  • Models, Genetic*
  • Oligonucleotide Array Sequence Analysis / methods
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data*
  • Statistics, Nonparametric