SigMod: an exact and efficient method to identify a strongly interconnected disease-associated module in a gene network

Bioinformatics. 2017 May 15;33(10):1536-1544. doi: 10.1093/bioinformatics/btx004.

Abstract

Motivation: Apart from single marker-based tests classically used in genome-wide association studies (GWAS), network-assisted analysis has become a promising approach to identify a set of genes associated with disease. To date, most network-assisted methods aim at finding genes connected in a background network, whatever the density or strength of their connections. This can hamper the findings as sparse connections are non-robust against noise from either the GWAS results or the network resource.

Results: We present SigMod, a novel and efficient method that integrates GWAS results with a gene network to identify a strongly interconnected gene module enriched in high association signals. Our method is formulated as a binary quadratic optimization problem, which can be solved exactly through graph min-cut algorithms. Compared to existing methods, SigMod has several desirable properties: (i) edge weights quantifying confidence of connections between genes are taken into account, (ii) the selection path can be computed rapidly, (iii) the identified gene module is strongly interconnected, hence includes genes of high functional relevance, and (iv) the method is robust against noise from either the GWAS results or the network resource. We applied SigMod to both simulated and real data. It was found to outperform state-of-the-art network-assisted methods in identifying disease-associated genes. When SigMod was applied to childhood-onset asthma GWAS results, it successfully identified a gene module enriched in consistently high association signals and made of functionally related genes that are biologically relevant for asthma.
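
To illustrate how a binary quadratic selection problem can be solved exactly with a min-cut, here is a minimal Python sketch. It is not the SigMod R package or its API. The abstract does not state the objective, so the sketch assumes a generic form: maximize the sum of (gene score - eta) over selected genes plus lam times the total weight of edges whose two endpoints are both selected, with non-negative edge weights. The names select_module, objective, lam and eta, and the toy scores and weights, are hypothetical. The max-flow routine is a plain Edmonds-Karp implementation chosen for brevity.

```python
from collections import defaultdict, deque

def objective(selected, scores, edges, lam, eta):
    """Assumed objective: node gains plus lam-weighted internal edge weights."""
    node_term = sum(scores[g] - eta for g in selected)
    edge_term = sum(w for (a, b), w in edges.items()
                    if a in selected and b in selected)
    return node_term + lam * edge_term

def select_module(scores, edges, lam, eta):
    """Maximize the assumed objective exactly via an s-t minimum cut.

    Uses u_i*u_j = (u_i + u_j - |u_i - u_j|) / 2 for binary u, so every
    pairwise reward becomes a per-node gain plus a cut penalty.
    Assumes edge weights >= 0 and every edge endpoint appears in scores.
    """
    SRC, SNK = "_source", "_sink"   # hypothetical sentinel node names
    EPS = 1e-12
    cap = defaultdict(dict)         # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)   # make sure the reverse arc exists

    degree = defaultdict(float)
    for (a, b), w in edges.items():
        degree[a] += w
        degree[b] += w
        add_edge(a, b, lam * w / 2)  # paid when a is selected and b is not
        add_edge(b, a, lam * w / 2)  # paid when b is selected and a is not

    for g, z in scores.items():
        gain = (z - eta) + lam * degree[g] / 2
        if gain > 0:
            add_edge(SRC, g, gain)   # cut (lost) if g is left out
        else:
            add_edge(g, SNK, -gain)  # cut (paid) if g is selected

    def bfs():
        parent = {SRC: None}
        queue = deque([SRC])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > EPS and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs()
        if SNK not in parent:
            # Genes still reachable from the source form the selected module.
            return {g for g in scores if g in parent}
        path, v = [], SNK
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push

# Toy data (made up): a tight triangle A-B-C with two weakly attached genes.
scores = {"A": 2.0, "B": 1.5, "C": 1.8, "D": 0.2, "E": 1.9}
edges = {("A", "B"): 0.9, ("B", "C"): 0.8, ("A", "C"): 0.7,
         ("C", "D"): 0.1, ("D", "E"): 0.1}
module = select_module(scores, edges, lam=1.0, eta=2.0)
print(sorted(module))
```

With these toy values the densely connected triangle A, B, C is selected even though none of its genes exceeds eta on its own, while the weakly attached genes D and E are left out. Under this assumed form, re-running the function over a range of eta values would trace out modules of different sizes.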

Availability and implementation: An R package SigMod is available at: https://github.com/YuanlongLiu/SigMod.

Contact: yuanlong.liu@inserm.fr.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms
  • Asthma / genetics
  • Computational Biology / methods*
  • Gene Regulatory Networks*
  • Genetic Predisposition to Disease
  • Genome-Wide Association Study / methods*
  • Humans
  • Polymorphism, Single Nucleotide*
  • Software*