CoGA: An R Package to Identify Differentially Co-Expressed Gene Sets by Analyzing the Graph Spectra

PLoS One. 2015 Aug 27;10(8):e0135831. doi: 10.1371/journal.pone.0135831. eCollection 2015.

Abstract

Gene set analysis aims to identify predefined sets of functionally related genes that are differentially expressed between two conditions. Although gene set analysis has been very successful, because it incorporates biological knowledge about the gene sets and offers greater statistical power than gene-by-gene analyses, it does not take into account the correlation (association) structure among the genes. In this work, we present CoGA (Co-expression Graph Analyzer), an R package for the identification of groups of differentially associated genes between two phenotypes. The analysis is based on concepts of Information Theory applied to the spectral distributions of the gene co-expression graphs, such as the spectral entropy to measure the randomness of a graph structure and the Jensen-Shannon divergence to discriminate classes of graphs. The package also includes common measures to compare gene co-expression networks in terms of their structural properties, such as centrality, degree distribution, shortest path length, and clustering coefficient. Besides the structural analyses, CoGA also includes graphical interfaces for visual inspection of the networks, ranking of genes according to their "importance" in the network, and the standard differential expression analysis. We show by both simulation experiments and analyses of real data that the statistical tests performed by CoGA indeed control the rate of false positives and are able to identify differentially co-expressed gene sets that other methods failed to detect.
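To make the spectral quantities named above concrete, the following is a minimal sketch in Python (not R, and not CoGA's actual API) of one way to compute a Jensen-Shannon divergence between the spectral densities of two weighted co-expression graphs. Everything in it is an assumption made for illustration: the function names are hypothetical, the two toy adjacency matrices are invented, the Gaussian kernel bandwidth of 0.3 is arbitrary, and the Jacobi eigenvalue routine is used only to keep the example free of dependencies. The package itself may estimate the density and the divergence differently.

```python
import math

def symmetric_eigenvalues(matrix, tol=1e-12, max_sweeps=100):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def spectral_density(eigenvalues, grid, bandwidth):
    """Gaussian-kernel estimate of the eigenvalue density on `grid`."""
    norm = 1.0 / (len(eigenvalues) * bandwidth * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - ev) / bandwidth) ** 2) for ev in eigenvalues)
        for x in grid
    ]

def spectral_entropy(density, step):
    """Riemann-sum approximation of -integral(rho * log(rho))."""
    return -sum(d * math.log(d) for d in density if d > 0) * step

def js_divergence(adj1, adj2, bandwidth=0.3, points=400):
    """Jensen-Shannon divergence between the spectral densities of two graphs."""
    ev1, ev2 = symmetric_eigenvalues(adj1), symmetric_eigenvalues(adj2)
    lo = min(ev1[0], ev2[0]) - 4 * bandwidth
    hi = max(ev1[-1], ev2[-1]) + 4 * bandwidth
    step = (hi - lo) / (points - 1)
    grid = [lo + i * step for i in range(points)]
    d1 = spectral_density(ev1, grid, bandwidth)
    d2 = spectral_density(ev2, grid, bandwidth)
    mix = [(x + y) / 2 for x, y in zip(d1, d2)]
    return spectral_entropy(mix, step) - 0.5 * (
        spectral_entropy(d1, step) + spectral_entropy(d2, step)
    )

# Toy weighted adjacency matrices (invented for illustration only):
# a strongly co-expressed gene set and a weakly co-expressed one.
dense = [[0.0 if i == j else 0.9 for j in range(4)] for i in range(4)]
sparse = [
    [0.0, 0.2, 0.0, 0.2],
    [0.2, 0.0, 0.2, 0.0],
    [0.0, 0.2, 0.0, 0.2],
    [0.2, 0.0, 0.2, 0.0],
]

print("JS divergence:", js_divergence(dense, sparse))
```

In this sketch a divergence of zero means the two spectral densities coincide on the grid, and larger values indicate more dissimilar graph structures. Turning the divergence into a statistical test would additionally require a null distribution, for example one obtained by permuting the phenotype labels, which is not shown here.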

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Biomarkers, Tumor / genetics
  • Brain Neoplasms / genetics*
  • Computational Biology / methods*
  • Computer Graphics*
  • Gene Expression Profiling*
  • Gene Expression Regulation
  • Gene Regulatory Networks*
  • Humans
  • Models, Biological
  • Oligonucleotide Array Sequence Analysis / methods*

Substances

  • Biomarkers, Tumor

Grants and funding

This work was supported by São Paulo Research Foundation (http://www.fapesp.br/) grants 2014/09576-5 (AF), 2013/03447-6 (AF), 2011/50761-2 (AF), and 2012/25417-9 (SSS); National Council for Scientific and Technological Development (http://www.cnpq.br/) grants 304020/2013-3 (AF) and 473063/2013-1 (AF); Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (AF); and Núcleo de Apoio à Pesquisa, Pró-Reitoria de Pesquisa da Universidade de São Paulo (AF). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.