Composite kernel machine regression based on likelihood ratio test for joint testing of genetic and gene-environment interaction effect

Biometrics. 2019 Jun;75(2):625-637. doi: 10.1111/biom.13003. Epub 2019 Mar 30.

Abstract

Most common human diseases result from the combined effects of genes, environmental factors, and their interactions, so including gene-environment (GE) interactions can improve power in gene mapping studies. The standard strategy is to test the SNPs one by one, using a regression model that includes both the SNP effect and the GE interaction. However, the SNP-by-SNP approach has serious limitations, such as the inability to model epistatic SNP effects, biased estimation, and reduced power. Thus, in this article, we develop a kernel machine regression framework to model the overall genetic effect of a SNP-set, considering the possible GE interaction. Specifically, we use a composite kernel to specify the overall genetic effect via a nonparametric function, and we model additional covariates parametrically within the regression framework. The composite kernel is constructed as a weighted average of two kernels, one corresponding to the genetic main effect and one corresponding to the GE interaction effect. We propose a likelihood ratio test (LRT) and a restricted likelihood ratio test (RLRT) for statistical significance. We derive a Monte Carlo approach for the finite sample distributions of the LRT and RLRT statistics. Extensive simulations and real data analysis show that our proposed method has the correct type I error rate and can have higher power than score-based approaches in many situations.
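To make the composite kernel concrete, the following is a minimal sketch of a weighted average of a genetic main-effect kernel and a GE interaction kernel. The abstract does not state which kernels are used, so the choices here are assumptions for illustration only: a linear kernel on the genotypes for the main effect, and the elementwise product of the genotype kernel and an environment kernel for the interaction. The weight name `rho`, the helper functions, and the toy data are hypothetical.

```python
def linear_kernel(X):
    """Gram matrix with K[i][j] = inner product of rows i and j of X."""
    n = len(X)
    return [[sum(a * b for a, b in zip(X[i], X[j])) for j in range(n)]
            for i in range(n)]


def hadamard(A, B):
    """Elementwise product of two equally sized matrices."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def composite_kernel(K_main, K_inter, rho):
    """Weighted average rho * K_main + (1 - rho) * K_inter, with rho in [0, 1]."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return [[rho * a + (1.0 - rho) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(K_main, K_inter)]


# Toy data (hypothetical): 4 subjects, 3 SNPs coded 0/1/2, one environmental exposure.
G = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 0, 1]]
E = [[0.5], [-1.0], [1.5], [0.0]]

K_G = linear_kernel(G)                     # assumed genetic main-effect kernel
K_GE = hadamard(K_G, linear_kernel(E))     # assumed GE interaction kernel
K = composite_kernel(K_G, K_GE, rho=0.5)   # equal weight on both components
```

Setting `rho=1.0` recovers the main-effect kernel alone and `rho=0.0` the interaction kernel alone. Because a weighted average of two positive semidefinite matrices with weights in [0, 1] is itself positive semidefinite, the composite remains a valid kernel under these assumptions.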

Keywords: gene-environment interactions; kernel machine testing; likelihood ratio test; multiple variance components; spectral decomposition; unidentifiable conditions.

MeSH terms

  • Computer Simulation
  • Gene-Environment Interaction*
  • Humans
  • Likelihood Functions*
  • Models, Genetic*
  • Polymorphism, Single Nucleotide
  • Regression Analysis
  • Spatial Analysis*