Equivalence of kernel machine regression and kernel distance covariance for multidimensional phenotype association studies

Biometrics. 2015 Sep;71(3):812-20. doi: 10.1111/biom.12314. Epub 2015 May 1.

Abstract

Associating genetic markers with a multidimensional phenotype is an important yet challenging problem. In this work, we establish the equivalence between two popular methods: kernel-machine regression (KMR) and kernel distance covariance (KDC). KMR is a semiparametric regression framework that models covariate effects parametrically and genetic markers nonparametrically, while KDC represents a class of methods that includes distance covariance (DC) and the Hilbert-Schmidt independence criterion (HSIC), which are nonparametric tests of independence. We show that the equivalence between the score test of KMR and the KDC statistic under certain conditions can lead to a novel generalization of the KDC test that incorporates covariates. Our contributions are threefold: (1) establishing the equivalence between KMR and KDC; (2) showing that the principles of KMR can be applied to the interpretation of KDC; and (3) developing a broader class of KDC statistics, where the class members are statistics corresponding to different kernel combinations. Finally, we perform simulation studies and an analysis of real data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. The ADNI analysis suggests that SNPs of FLJ16124 exhibit pairwise interaction effects that are strongly correlated with changes in brain region volumes.
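To make the kind of statistic discussed above concrete, the following is a minimal sketch of an HSIC-type statistic with a permutation test, not the authors' implementation. It assumes the common biased HSIC form tr(KHLH)/n^2, where K is a kernel matrix on genotypes, L is a kernel matrix on phenotypes, and H is the centering matrix. The optional covariate adjustment, which replaces H with the projection onto the orthogonal complement of the covariate matrix X, is an illustrative assumption about how covariates could enter and is not taken from the abstract. All function names (linear_kernel, gaussian_kernel, kdc_statistic, permutation_pvalue) are hypothetical.

```python
import numpy as np


def linear_kernel(Z):
    """Linear kernel matrix Z Z^T for an (n, p) data matrix."""
    return Z @ Z.T


def gaussian_kernel(Z, bandwidth=1.0):
    """Gaussian kernel matrix; the bandwidth is a user-chosen assumption."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))


def kdc_statistic(K, L, X=None):
    """HSIC-type statistic tr(P K P L) / n^2.

    With X=None, P is the centering matrix H = I - 11^T/n, which gives the
    biased HSIC estimate. Passing a covariate matrix X (which should include
    an intercept column) swaps in the residual projection; this adjustment
    is an assumption made for illustration.
    """
    n = K.shape[0]
    if X is None:
        X = np.ones((n, 1))
    # Projection onto the orthogonal complement of the column space of X.
    P = np.eye(n) - X @ np.linalg.pinv(X)
    return np.trace(P @ K @ P @ L) / n ** 2


def permutation_pvalue(K, L, n_perm=999, seed=0):
    """Permutation p-value for the unadjusted statistic.

    Rows and columns of K are permuted jointly, which breaks the pairing
    between genotypes and phenotypes. Covariates are deliberately not
    handled here, because naive permutation ignores any dependence between
    covariates and genotypes.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    observed = kdc_statistic(K, L)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(n)
        if kdc_statistic(K[np.ix_(idx, idx)], L) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

Different choices of K and L (for example, a linear kernel on SNP dosages paired with a Gaussian kernel on brain region volumes) give different members of this family of statistics, which is one way to read the "different kernel combinations" mentioned in the abstract.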

Keywords: Confounding; Distance covariance; Hilbert-Schmidt independence criterion; Neuroimaging genomics; Permutation test.

Publication types

  • Comparative Study
  • Evaluation Study

MeSH terms

  • Alzheimer Disease / diagnosis
  • Alzheimer Disease / epidemiology*
  • Alzheimer Disease / genetics*
  • Analysis of Variance
  • Computer Simulation
  • Data Interpretation, Statistical
  • Genetic Association Studies / methods*
  • Genetic Markers / genetics
  • Genetic Predisposition to Disease / epidemiology
  • Genetic Predisposition to Disease / genetics
  • Models, Statistical*
  • Polymorphism, Single Nucleotide / genetics*
  • Prevalence
  • Regression Analysis*
  • Reproducibility of Results
  • Risk Factors
  • Sensitivity and Specificity

Substances

  • Genetic Markers