Domain-enhanced analysis of microarray data using GO annotations

Bioinformatics. 2007 May 15;23(10):1225-34. doi: 10.1093/bioinformatics/btm092. Epub 2007 Mar 22.

Abstract

Motivation: New biological systems technologies give scientists the ability to measure thousands of bio-molecules, including genes, proteins, lipids and metabolites. We use domain knowledge, e.g. the Gene Ontology, to guide the analysis of such data. Presenting results aggregated by domain, say at the molecular function level, makes them more interpretable to biological scientists than results presented at the gene level.

Results: We use a 'top-down' approach to domain aggregation: gene expressions are first combined and then tested for differentially expressed patterns. This contrasts with the more standard 'bottom-up' approach, in which genes are first tested individually and then aggregated by domain knowledge. The benefit is greater sensitivity for detecting signals. Our method, domain-enhanced analysis (DEA), is assessed and compared with other methods using simulation studies and analyses of two publicly available leukemia data sets.
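
The contrast between the two orderings can be illustrated with a small sketch. The abstract does not say how gene expressions are combined or which test is applied, so the sketch assumes a plain per-sample mean over the genes in one category and a Welch t statistic. All function names and the toy data are hypothetical, and this is not the DEA implementation itself.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic for two lists of numbers."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def top_down_t(expr, genes, group_a, group_b):
    """Aggregate first: average the category's genes per sample, then test once.

    Assumption: the mean is used as the combining step; the source does not
    specify the actual aggregation.
    """
    def category_score(sample):
        return mean(expr[g][sample] for g in genes)

    return welch_t([category_score(s) for s in group_a],
                   [category_score(s) for s in group_b])


def bottom_up_ts(expr, genes, group_a, group_b):
    """Test first: one t statistic per gene, left for later aggregation."""
    return {g: welch_t([expr[g][s] for s in group_a],
                       [expr[g][s] for s in group_b])
            for g in genes}


# Toy data (invented): three genes in one hypothetical GO category,
# samples 0-3 in group A and 4-7 in group B. Each gene is shifted by 0.5
# in group B, with gene-level noise that partly cancels when averaged.
expr = {
    "gene1": [1.0, -1.0, 0.5, -0.5, 1.5, -0.5, 1.0, 0.0],
    "gene2": [-1.0, 1.0, -0.5, 0.5, -0.5, 1.5, 0.0, 1.0],
    "gene3": [0.5, -0.5, 1.0, -1.0, 1.0, 0.0, 1.5, -0.5],
}
group_a, group_b = [0, 1, 2, 3], [4, 5, 6, 7]

t_category = top_down_t(expr, list(expr), group_a, group_b)
t_genes = bottom_up_ts(expr, list(expr), group_a, group_b)

print("top-down category t:", round(t_category, 2))
print("bottom-up gene t:", {g: round(t, 2) for g, t in t_genes.items()})
```

In this constructed example the category-level statistic is larger in magnitude than any single gene-level statistic, because averaging removes part of the gene-specific noise. That is the kind of sensitivity gain the abstract attributes to the top-down ordering, although real data need not behave this cleanly.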

Availability: Our DEA method uses functions available in R (http://www.r-project.org/) and SAS (http://www.sas.com/). The two experimental data sets used in our analysis are available in R as Bioconductor packages, 'ALL' and 'golubEsets' (http://www.bioconductor.org/).

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Burkitt Lymphoma / genetics*
  • Computational Biology*
  • Computer Simulation
  • Gene Expression Profiling
  • Humans
  • Leukemia-Lymphoma, Adult T-Cell / genetics*
  • Oligonucleotide Array Sequence Analysis*
  • Sensitivity and Specificity
  • Software*