Information-theoretic metrics for visualizing gene-environment interactions

Am J Hum Genet. 2007 Nov;81(5):939-63. doi: 10.1086/521878. Epub 2007 Oct 3.

Abstract

The purpose of our work was to develop heuristics for visualizing and interpreting gene-environment interactions (GEIs) and to assess the dependence of candidate visualization metrics on biological and study-design factors. Two information-theoretic metrics, the k-way interaction information (KWII) and the total correlation information (TCI), were investigated. The effectiveness of the KWII and TCI to detect GEIs in a diverse range of simulated data sets and a Crohn disease data set was assessed. The sensitivity of the KWII and TCI spectra to biological and study-design variables was determined. Head-to-head comparisons with the relevance-chain, multifactor dimensionality reduction, and the pedigree disequilibrium test (PDT) methods were obtained. The KWII and TCI spectra, which are graphical summaries of the KWII and TCI for each subset of environmental and genotype variables, were found to detect each known GEI in the simulated data sets. The patterns in the KWII and TCI spectra were informative for factors such as case-control misassignment, locus heterogeneity, allele frequencies, and linkage disequilibrium. The KWII and TCI spectra were found to have excellent sensitivity for identifying the key disease-associated genetic variations in the Crohn disease data set. In head-to-head comparisons with the relevance-chain, multifactor dimensionality reduction, and PDT methods, the results from visual interpretation of the KWII and TCI spectra performed satisfactorily. The KWII and TCI are promising metrics for visualizing GEIs. They are capable of detecting interactions among numerous single-nucleotide polymorphisms and environmental variables for a diverse range of GEI models.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Chromosomes, Human, Pair 5 / genetics
  • Computer Simulation
  • Crohn Disease / genetics
  • Epistasis, Genetic*
  • Female
  • Gene Frequency
  • Humans
  • Information Theory*
  • Linkage Disequilibrium / genetics
  • Male
  • Models, Genetic*
  • Pedigree
  • Phenotype
  • Polymorphism, Single Nucleotide / genetics
  • Risk Factors
  • Sample Size