Extracting the Strongest Signals from Omics Data: Differentially Expressed Pathways and Beyond

Methods Mol Biol. 2017:1613:125-159. doi: 10.1007/978-1-4939-7027-8_7.

Abstract

The analysis of gene sets (in the form of functionally related genes or pathways) has become the method of choice for extracting the strongest signals from omics data. The motivation for using gene sets instead of individual genes is two-fold. First, this approach incorporates pre-existing biological knowledge into the analysis and facilitates the interpretation of experimental results. Second, it employs a statistical hypothesis-testing framework. Here, we briefly review the main Gene Set Analysis (GSA) approaches for testing differential expression of gene sets, as well as several GSA approaches for testing statistical hypotheses beyond differential expression that allow additional biological information to be extracted from the data. We distinguish three major types of GSA approaches, which test for (1) differential expression (DE), (2) differential variability (DV), and (3) differential co-expression (DC) of gene sets between two phenotypes. We also present a comparative power analysis and Type I error rates for different approaches within each major type of GSA on simulated data. Our evaluation provides a concise guideline for selecting the GSA approaches that perform best under particular experimental settings. The value of the three major types of GSA approaches is illustrated with a real data example. When applied to the same data set, the three major types of GSA approaches yield complementary biological information.
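
To make the three hypothesis types concrete, the following is a minimal Python sketch, not the methods evaluated in the chapter. The function names (`de_stat`, `dv_stat`, `dc_stat`, `permutation_pvalue`) and the particular statistics are assumptions chosen for illustration: a sum of squared mean differences for DE, a sum of absolute log variance ratios for DV, and a sum of squared differences between pairwise correlations for DC. Each is paired with a sample-label permutation test, which corresponds to a self-contained null hypothesis. Rows are assumed to be samples and columns the genes of one gene set.

```python
import numpy as np


def de_stat(x, y):
    """Illustrative DE statistic: sum of squared per-gene mean differences."""
    return float(np.sum((x.mean(axis=0) - y.mean(axis=0)) ** 2))


def dv_stat(x, y):
    """Illustrative DV statistic: sum of absolute log ratios of per-gene variances."""
    ratio = x.var(axis=0, ddof=1) / y.var(axis=0, ddof=1)
    return float(np.sum(np.abs(np.log(ratio))))


def dc_stat(x, y):
    """Illustrative DC statistic: sum of squared differences between the
    pairwise gene-gene correlations of the two phenotypes."""
    diff = np.corrcoef(x, rowvar=False) - np.corrcoef(y, rowvar=False)
    upper = np.triu_indices_from(diff, k=1)  # each gene pair counted once
    return float(np.sum(diff[upper] ** 2))


def permutation_pvalue(x, y, stat, n_perm=1000, seed=0):
    """Permutation p-value obtained by shuffling sample (phenotype) labels.

    x, y : arrays of shape (n_samples, n_genes), one per phenotype.
    stat : function (x, y) -> float, larger meaning stronger evidence.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n_x = x.shape[0]
    observed = stat(x, y)
    hits = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        if stat(pooled[idx[:n_x]], pooled[idx[n_x:]]) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated gene set of 5 genes, 20 samples per phenotype (made-up data).
    rng = np.random.default_rng(42)
    control = rng.normal(size=(20, 5))
    case = rng.normal(size=(20, 5)) + 1.0  # mean shift only
    for name, stat in [("DE", de_stat), ("DV", dv_stat), ("DC", dc_stat)]:
        print(name, permutation_pvalue(control, case, stat, n_perm=500))
```

Because label permutation tests equality of the whole joint distribution, the DV and DC statistics in this sketch are not isolated from mean differences; centering or standardizing each phenotype beforehand is one possible way to address this, and is left out here for brevity.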

Keywords: Competitive; Differential co-expression; Differential expression; Differential variability; Gene set analysis approaches; Hypotheses testing; Omics data; Self-contained.

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology / methods*
  • Computer Simulation
  • Data Mining
  • Gene Expression
  • Gene Expression Profiling
  • Gene Regulatory Networks*
  • Genomics
  • Humans
  • Oligonucleotide Array Sequence Analysis
  • Phenotype