Comparing Statistical Tests for Differential Network Analysis of Gene Modules

Jaron Arbet; Yaxu Zhuang; Elizabeth Litkowski; Laura Saba; Katerina Kechris

doi:10.3389/fgene.2021.630215

Front Genet. 2021 May 19:12:630215. doi: 10.3389/fgene.2021.630215. eCollection 2021.

Authors

Jaron Arbet¹, Yaxu Zhuang¹, Elizabeth Litkowski², Laura Saba³, Katerina Kechris¹

Affiliations

¹ Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, United States.
² Department of Epidemiology, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, United States.
³ Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, United States.

Abstract

Genes often work together to perform complex biological processes, and "networks" provide a versatile framework for representing the interactions between multiple genes. Differential network analysis (DiNA) quantifies how this network structure differs between two or more groups/phenotypes (e.g., disease subjects and healthy controls), with the goal of determining whether differences in network structure can help explain differences between phenotypes. In this paper, we focus on gene co-expression networks, although in principle, the methods studied can be used for DiNA of other types of features (e.g., metabolome, epigenome, microbiome, proteome). Three common applications of DiNA involve (1) testing whether the connections to a single gene differ between groups, (2) testing whether the connection between a pair of genes differs between groups, or (3) testing whether the connections within a "module" (a subset of 3 or more genes) differ between groups. This article focuses on the third application, as there is a lack of studies comparing statistical methods for identifying differentially co-expressed modules (DCMs). Through extensive simulations, we compare several previously proposed test statistics and a new p-norm difference test (PND). We demonstrate that the true positive rate of the proposed PND test is competitive with and often higher than that of the other methods, while controlling the false positive rate. The R package discoMod (differentially co-expressed modules) implements the proposed method and provides a full pipeline for identifying DCMs: clustering tools to derive gene modules, tests to identify DCMs, and methods for visualizing the results.
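To make the module-level test concrete, here is a minimal Python sketch of one way a p-norm difference statistic could be computed. The abstract does not define the PND statistic, so the form used here is an assumption: the p-norm of the differences between the two groups' correlation matrices, taken over unique gene pairs, with a p-value from permuting group labels. The names `pnd_statistic` and `permutation_pvalue` are hypothetical and are not the discoMod API.

```python
import numpy as np


def pnd_statistic(x1, x2, p=2.0):
    """Assumed form of a p-norm difference between two correlation matrices.

    x1, x2: arrays of shape (samples, genes) holding the same module's genes
    for group 1 and group 2. Only unique gene pairs (upper triangle) are used.
    """
    c1 = np.corrcoef(x1, rowvar=False)
    c2 = np.corrcoef(x2, rowvar=False)
    upper = np.triu_indices(c1.shape[0], k=1)
    diff = np.abs(c1[upper] - c2[upper])
    return float(np.sum(diff ** p) ** (1.0 / p))


def permutation_pvalue(x1, x2, p=2.0, n_perm=200, seed=0):
    """Permutation p-value: shuffle group labels and recompute the statistic."""
    rng = np.random.default_rng(seed)
    observed = pnd_statistic(x1, x2, p)
    pooled = np.vstack([x1, x2])
    n1 = x1.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        stat = pnd_statistic(pooled[idx[:n1]], pooled[idx[n1:]], p)
        exceed += int(stat >= observed)
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Illustrative simulated module of 5 genes (made-up data, not from the paper):
# group 1 genes share a common factor, group 2 genes are independent.
rng = np.random.default_rng(1)
factor = rng.normal(size=(100, 1))
group1 = factor + 0.5 * rng.normal(size=(100, 5))
group2 = rng.normal(size=(100, 5))
stat, pval = permutation_pvalue(group1, group2, p=2.0)
```

In this sketch, a larger `p` puts more weight on the gene pairs whose correlations differ most, while `p = 1` weights all pairs equally; which choice works best in practice is a question the simulations in the paper address, not this example.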

Keywords: differential network analysis; differentially co-expressed modules; gene co-expression networks; networks; statistical inference.

Grants and funding