Complex-disease networks of trait-associated single-nucleotide polymorphisms (SNPs) unveiled by information theory

J Am Med Inform Assoc. 2012 Mar-Apr;19(2):295-305. doi: 10.1136/amiajnl-2011-000482. Epub 2012 Jan 25.

Abstract

Objective: Thousands of complex-disease single-nucleotide polymorphisms (SNPs) have been discovered in genome-wide association studies (GWAS). However, these intragenic SNPs have not been collectively mined to unveil the genetic architecture between complex clinical traits. The authors hypothesize that biological annotations of host genes of trait-associated SNPs may reveal the biomolecular modularity across complex-disease traits and offer insights for drug repositioning.

Methods: Trait-to-polymorphism (SNPs) associations confirmed in GWAS were used. A novel method to quantify trait-trait similarity anchored in Gene Ontology annotations of human proteins and information theory was developed. The results were then validated with the shortest paths of physical protein interactions between biologically similar traits.
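The abstract does not spell out the similarity formula, so the following is only a minimal sketch of one common information-theoretic approach: information content (IC) from annotation frequencies, Lin's term similarity, and a best-match average between the annotation sets of two traits. The toy ontology, term names, annotation counts, and function names are all hypothetical and are not taken from the study.

```python
import math

# Hypothetical toy ontology: term -> set of parent terms (not real GO identifiers).
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "signaling": {"root"},
    "lipid_metabolism": {"metabolism"},
    "glucose_metabolism": {"metabolism"},
    "insulin_signaling": {"signaling"},
}

# Hypothetical number of proteins annotated directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "metabolism": 2,
    "signaling": 3,
    "lipid_metabolism": 4,
    "glucose_metabolism": 3,
    "insulin_signaling": 2,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log p(t), where p(t) counts annotations to t or its descendants."""
    cumulative = {t: 0 for t in PARENTS}
    for term, n in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            cumulative[anc] += n
    total = cumulative["root"]
    return {t: -math.log(c / total) for t, c in cumulative.items()}


IC = information_content()


def lin(t1, t2):
    """Lin similarity: 2 * IC(most informative common ancestor) / (IC(t1) + IC(t2))."""
    shared = max(IC[t] for t in ancestors(t1) & ancestors(t2))
    denom = IC[t1] + IC[t2]
    # Two root terms carry no information; treat their similarity as 0 here.
    return 2 * shared / denom if denom > 0 else 0.0


def trait_similarity(terms_a, terms_b):
    """Best-match average of Lin similarity between two sets of annotation terms."""
    best_a = [max(lin(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(lin(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))
```

In this sketch, two traits whose host genes share specific (rare) annotations score close to 1, while traits whose annotations meet only at the ontology root score 0. Judging which scores are "significant" would need a separate null model that is not shown here.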

Results: A network was constructed consisting of 280 significant intertrait similarities among 177 disease traits, which covered 1438 well-validated disease-associated SNPs. Thirty-nine percent of intertrait connections were confirmed by curators, and the following additional studies demonstrated the validity of a proportion of the remainder. On a phenotypic trait level, higher Gene Ontology similarity between proteins correlated with smaller 'shortest distance' in protein interaction networks of complexly inherited diseases (Spearman p<2.2×10⁻¹⁶). Further, 'cancer traits' were similar to one another, as were 'metabolic syndrome traits' (Fisher's exact test p=0.001 and 3.5×10⁻⁷, respectively).
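To illustrate the kind of enrichment test reported for the trait classes, here is a minimal sketch of a one-sided Fisher's exact test on a 2×2 table, using only the Python standard library. The table layout (trait pairs within a class versus other pairs, connected versus not connected in the network) and the function name are assumptions for illustration. The abstract does not state the contingency counts or whether the test was one- or two-sided.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Assumed (hypothetical) layout:
        a = within-class trait pairs that are connected
        b = within-class trait pairs that are not connected
        c = other trait pairs that are connected
        d = other trait pairs that are not connected
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b          # number of within-class pairs
    col1 = a + c          # number of connected pairs
    n = a + b + c + d     # all trait pairs
    denom = comb(n, col1)
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / denom
```

A small p-value from a call such as `fisher_exact_greater(a, b, c, d)` would indicate that connections occur among same-class pairs more often than the fixed margins alone would predict.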

Conclusion: An imputed disease network by information-anchored functional similarity from GWAS trait-associated SNPs is reported. It is also demonstrated that small shortest paths of protein interactions correlate with complex-disease function. Taken together, these findings provide the framework for investigating drug targets with unbiased functional biomolecular networks rather than worn-out single-gene and subjective canonical pathway approaches.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Disease / genetics*
  • Genome-Wide Association Study
  • Humans
  • Information Theory*
  • Models, Biological*
  • Phenotype
  • Polymorphism, Single Nucleotide*
  • Protein Interaction Maps