Instrumental variable model average with applications in Mendelian randomization

Stat Med. 2023 Aug 30;42(19):3547-3567. doi: 10.1002/sim.9819. Epub 2023 Jun 12.

Abstract

Mendelian randomization uses genetic variants as instrumental variables to examine the causal effect of a modifiable exposure on a trait in observational data. Using many instruments can improve estimation precision, but it can also introduce bias when the instruments are only weakly associated with the exposure. To overcome the difficulty of high dimensionality, we propose a model average estimator. In the first stage, different subsets of instruments (single nucleotide polymorphisms, SNPs) are used to predict the exposure, and the submodels' predictions are then weighted using penalization with common penalty functions such as the least absolute shrinkage and selection operator (LASSO), smoothly clipped absolute deviation (SCAD), and minimax concave penalty (MCP). In the second stage, the model-averaged predictions serve as the genetically predicted exposure for estimating the causal effect on the response. A further novelty of our model average estimator is that it allows both the number of submodels and the submodels' sizes to grow with the sample size. The practical performance of the estimator is examined in a series of numerical studies. We apply the proposed method to a real genetic dataset to investigate the relationship between stature and blood pressure.
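
The two-stage procedure described in the abstract can be illustrated with a minimal sketch on simulated data. This is not the authors' implementation. The sample size, SNP effect sizes, equal-size partition into four submodels, penalty level `lam`, and the hand-written coordinate-descent LASSO are all assumptions chosen for illustration, and only the LASSO penalty is shown (SCAD and MCP are omitted).

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Simulated data (all settings are illustrative assumptions) ---
n, p, true_beta = 2000, 20, 0.5
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # SNP genotypes coded 0/1/2
gamma = np.r_[np.full(10, 0.4), np.zeros(10)]        # only the first 10 SNPs are relevant
U = rng.normal(size=n)                               # unmeasured confounder
X = G @ gamma + U + rng.normal(size=n)               # exposure
Y = true_beta * X + U + rng.normal(size=n)           # response

# Center everything so that intercepts can be dropped.
Gc, Xc, Yc = G - G.mean(axis=0), X - X.mean(), Y - Y.mean()


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the LASSO update."""
    return np.sign(z) * max(abs(z) - lam, 0.0)


def lasso_cd(A, b, lam, n_iter=200):
    """Coordinate descent for (1/2n)||b - A w||^2 + lam * ||w||_1."""
    n_obs, k = A.shape
    w = np.zeros(k)
    col_sq = (A ** 2).sum(axis=0) / n_obs
    resid = b.copy()                      # current residual b - A w
    for _ in range(n_iter):
        for j in range(k):
            if col_sq[j] == 0.0:
                continue
            resid += A[:, j] * w[j]       # remove column j's contribution
            rho = A[:, j] @ resid / n_obs
            w[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= A[:, j] * w[j]       # add the updated contribution back
    return w


# --- Stage 1a: fit one least-squares submodel per SNP subset ---
subsets = np.array_split(np.arange(p), 4)            # hypothetical partition of SNPs
preds = np.column_stack([
    Gc[:, idx] @ np.linalg.lstsq(Gc[:, idx], Xc, rcond=None)[0]
    for idx in subsets
])

# --- Stage 1b: penalized weighting of the submodel predictions ---
lam = 0.01                                           # arbitrary penalty level
weights = lasso_cd(preds, Xc, lam)
x_hat = preds @ weights                              # model-averaged predicted exposure

# --- Stage 2: regress the response on the predicted exposure ---
beta_hat = (x_hat @ Yc) / (x_hat @ x_hat)

print("submodel weights:", np.round(weights, 3))
print("causal effect estimate:", round(float(beta_hat), 3))
```

In this sketch the penalty shrinks the weights of submodels built from irrelevant SNPs toward zero, so those submodels contribute little or nothing to the predicted exposure. Because the penalty also shrinks the retained weights slightly, the second-stage slope depends on the chosen `lam`; in practice the penalty level would be tuned rather than fixed.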

Keywords: causal inference; genetics; genome-wide association study; instrument variable; model average; penalty function; single nucleotide polymorphism.

Publication types

  • Observational Study

MeSH terms

  • Blood Pressure / genetics
  • Causality
  • Genetic Variation*
  • Genome-Wide Association Study
  • Humans
  • Mendelian Randomization Analysis* / methods
  • Phenotype
  • Polymorphism, Single Nucleotide