Exploration of empirical Bayes hierarchical modeling for the analysis of genome-wide association study data

Biostatistics. 2011 Jul;12(3):445-61. doi: 10.1093/biostatistics/kxq072. Epub 2011 Jan 20.

Abstract

In the analysis of genome-wide association (GWA) data, the aim is to detect statistical associations between single nucleotide polymorphisms (SNPs) and the disease or trait of interest. These SNPs, or the particular regions of the genome they implicate, are then considered for further study. We demonstrate through a comprehensive simulation study that including additional, biologically relevant information through a 2-level empirical Bayes hierarchical model framework offers a more robust method of detecting associated SNPs. The empirical Bayes approach is an objective means of analyzing the data that does not require subjectively chosen parameter values. This framework gives more stable effect estimates by reducing the variability of the usual effect estimates. We also demonstrate the consequences of including additional information that is not informative, and we examine power and false-positive rates. We apply the methodology to a number of GWA data sets with the inclusion of additional biological information. Our results agree with previous findings and, in the case of one data set (Crohn's disease), suggest an additional region of interest.
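
As a rough illustration of the kind of 2-level model the abstract describes, the sketch below shrinks per-SNP effect estimates toward a second-level mean built from prior biological covariates. It is not the authors' implementation. The normal-normal form, the method-of-moments estimate of the between-SNP variance, and the names eb_shrink, beta_hat, se, Z, and tau2 are all assumptions made for illustration.

```python
import numpy as np

def eb_shrink(beta_hat, se, Z):
    """Illustrative two-level normal-normal empirical Bayes shrinkage.

    Assumed model (hypothetical, not taken from the paper):
      level 1: beta_hat[i] ~ N(beta[i], se[i]^2)
      level 2: beta[i]     ~ N(Z[i] @ pi, tau2)
    Z is an (m, p) matrix of prior covariates with full column rank, m > p.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    v = np.asarray(se, dtype=float) ** 2
    Z = np.asarray(Z, dtype=float)
    m, p = Z.shape

    # Initial ordinary least squares fit of the second-level coefficients.
    pi_ols, *_ = np.linalg.lstsq(Z, beta_hat, rcond=None)
    resid = beta_hat - Z @ pi_ols

    # Leverages h_ii of the OLS hat matrix, from a reduced QR factorization.
    Q, _ = np.linalg.qr(Z)
    h = (Q ** 2).sum(axis=1)

    # Method of moments: E[RSS] = (m - p) * tau2 + sum(v * (1 - h)).
    tau2 = max(0.0, (resid @ resid - np.sum(v * (1.0 - h))) / (m - p))

    # Re-estimate pi by weighted least squares with weights 1 / (v + tau2).
    sw = np.sqrt(1.0 / (v + tau2))
    pi, *_ = np.linalg.lstsq(Z * sw[:, None], beta_hat * sw, rcond=None)

    # Shrinkage weight B in [0, 1]: noisier estimates move further toward
    # the second-level mean Z @ pi.
    B = v / (v + tau2)
    post_mean = B * (Z @ pi) + (1.0 - B) * beta_hat
    # Posterior variance, ignoring uncertainty in the estimates of pi and tau2.
    post_var = (1.0 - B) * v
    return post_mean, post_var, pi, tau2
```

Under these assumptions the posterior variance never exceeds the first-level variance se^2, which is one way to read the abstract's claim of more stable estimates. If the columns of Z carry no information, the fitted second-level mean reduces to roughly a common intercept, and the estimates are shrunk toward that instead.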

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arthritis, Rheumatoid / genetics
  • Bayes Theorem*
  • Case-Control Studies
  • Computer Simulation
  • Coronary Artery Disease / genetics
  • Crohn Disease / genetics
  • Diabetes Mellitus, Type 2 / genetics
  • Genome-Wide Association Study / methods*
  • Genotype
  • Humans
  • Models, Genetic*
  • Models, Statistical*
  • Phenotype
  • Polymorphism, Single Nucleotide