Identifying gene-environment interactions incorporating prior information

Stat Med. 2019 Apr 30;38(9):1620-1633. doi: 10.1002/sim.8064. Epub 2019 Jan 13.

Abstract

For many complex diseases, gene-environment (G-E) interactions have independent contributions beyond the main G and E effects. Despite extensive effort, it remains challenging to identify G-E interactions. With the long accumulation of experiments and data, many biomedical problems of common interest now have existing studies that are relevant and informative for identifying G-E interactions and/or main effects. In this study, our goal is to identify G-E interactions (as well as their corresponding main G effects) under a joint statistical modeling framework. Advancing beyond existing studies, a quasi-likelihood-based approach is developed to incorporate information mined from the existing literature. A penalization approach is adopted for identification and selection, and it respects the "main effects, interactions" hierarchical structure. Simulations show that significant improvement is observed when the existing information is of high quality. On the other hand, when the existing information is less informative, the proposed method still performs reasonably (and hence demonstrates a certain degree of "robustness"). The analysis of The Cancer Genome Atlas (TCGA) data on cutaneous melanoma and glioblastoma multiforme demonstrates the practical applicability of the proposed approach and also leads to sensible findings.
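
To make the "main effects, interactions" hierarchy concrete, the sketch below shows one common way a joint G-E design matrix is assembled and how the hierarchy constraint can be checked on a set of estimated coefficients. It is only an illustration under stated assumptions, not the authors' implementation: the function names `build_ge_design` and `respects_hierarchy`, the column ordering, and the tolerance are all hypothetical, and the quasi-likelihood and penalization steps described in the abstract are not reproduced here.

```python
import numpy as np


def build_ge_design(G, E):
    """Stack main E effects, main G effects, and all pairwise G x E products.

    G: (n, p) array of gene measurements; E: (n, q) array of environmental factors.
    Returns an (n, q + p + p*q) array. The column layout is an assumption
    made for this illustration.
    """
    n, p = G.shape
    q = E.shape[1]
    # Interaction block: for gene j and factor k, column j*q + k holds G_j * E_k.
    GE = (G[:, :, None] * E[:, None, :]).reshape(n, p * q)
    return np.hstack([E, G, GE])


def respects_hierarchy(gamma, eta, tol=1e-8):
    """Check that every gene with a nonzero interaction has a nonzero main effect.

    gamma: (p,) main G coefficients; eta: (p, q) G x E interaction coefficients.
    """
    gamma = np.asarray(gamma, dtype=float)
    eta = np.asarray(eta, dtype=float)
    has_interaction = (np.abs(eta) > tol).any(axis=1)
    has_main = np.abs(gamma) > tol
    # A violation is a gene with an interaction selected but no main effect.
    return bool(np.all(has_main | ~has_interaction))
```

For example, with three genes and two environmental factors, a fit that selects an interaction for the first gene satisfies the hierarchy only if that gene's main effect is also selected; `respects_hierarchy` returns `False` when the main effect is zero but the interaction is not.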

Keywords: G-E interaction; penalized joint analysis; prior information; quasi-likelihood.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Central Nervous System Neoplasms / genetics
  • Computer Simulation
  • Confounding Factors, Epidemiologic*
  • Gene-Environment Interaction
  • Glioblastoma / genetics
  • Humans
  • Likelihood Functions*
  • Melanoma / genetics
  • Multivariate Analysis
  • Skin Neoplasms / genetics