Weighted set enrichment of gene expression data

BMC Syst Biol. 2013;7(Suppl 4):S10. doi: 10.1186/1752-0509-7-S4-S10. Epub 2013 Oct 23.

Abstract

Background: Sets of genes known to be associated with one another can be used to interpret microarray data. This gene set approach can reveal patterns of gene expression that may be more informative than the expression of individual genes. Statistical methods for gene set analysis fall into three main classes: over-representation analysis, functional class scoring, and pathway-topology-based methods.

Methods: We propose weighted hypergeometric and weighted chi-squared methods that rank the degree to which each gene participates in the enrichment. Each gene is assigned a weight equal to the absolute value of its log fold change raised to a given power; the power can be adjusted as needed. Datasets from the Gene Expression Omnibus are used to test the methods. The significantly enriched pathways are validated by searching the literature to determine their relevance to the dataset.
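
The weighting step can be illustrated with a minimal Python sketch. Only the weight rule (absolute log fold change raised to a power) comes from the abstract. The function names, the example inputs, and the `weighted_overlap_fraction` statistic are hypothetical illustrations; the abstract does not specify how the weights enter the weighted hypergeometric or chi-squared tests, so the sketch shows the standard unweighted hypergeometric tail only as a baseline.

```python
from math import comb


def gene_weights(log_fold_changes, power=1.0):
    """Weight each gene by |log fold change| ** power (rule stated in the abstract)."""
    return {gene: abs(lfc) ** power for gene, lfc in log_fold_changes.items()}


def hypergeom_upper_tail(k, K, n, N):
    """Standard (unweighted) over-representation p-value.

    P(X >= k) where X counts gene-set members among n selected genes,
    drawn from N genes of which K belong to the gene set.
    """
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def weighted_overlap_fraction(weights, selected, gene_set):
    """Hypothetical descriptive statistic, NOT the paper's test:
    share of the selected genes' total weight that falls inside the gene set."""
    total = sum(weights[g] for g in selected)
    inside = sum(weights[g] for g in selected if g in gene_set)
    return inside / total if total > 0 else 0.0


# Illustrative values only; these are not data from the study.
lfc = {"geneA": -2.0, "geneB": 0.5, "geneC": 1.0}
w = gene_weights(lfc, power=2)
share = weighted_overlap_fraction(w, {"geneA", "geneB"}, {"geneA", "geneC"})
```

Raising the power increases the influence of strongly changed genes relative to weakly changed ones, which is why the abstract treats it as an adjustable parameter.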

Results: Although the weighted methods detect fewer significantly enriched pathways, the pathways they report are potentially more relevant. We also compare the results of different enrichment methods on a set of microarray studies that all contain data from various rodent neuropathic pain models.

Discussion: Our method produces more consistent results than other methods when evaluated on similar datasets. It can also potentially detect relevant pathways that the standard methods do not identify. However, the lack of biological ground truth makes the method difficult to validate.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Chlamydophila pneumoniae / physiology
  • Computational Biology / methods*
  • Dendritic Cells / metabolism
  • Dendritic Cells / microbiology
  • Gene Expression Profiling / methods*
  • Neuralgia / genetics