Instrumental variable model average with applications in Mendelian randomization

Loraine Liping Seng; Ching-Ti Liu; Jingli Wang; Jialiang Li

doi:10.1002/sim.9819

Stat Med. 2023 Aug 30;42(19):3547-3567. doi: 10.1002/sim.9819. Epub 2023 Jun 12.

Authors

Loraine Liping Seng¹ ², Ching-Ti Liu³ ⁴, Jingli Wang⁵, Jialiang Li¹ ²

Affiliations

¹ Department of Statistics and Data Science, National University of Singapore, Singapore.
² Duke-NUS Graduate Medical School, National University of Singapore, Singapore.
³ Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA.
⁴ Department of Statistics, National Cheng Kung University, Tainan, Taiwan.
⁵ School of Statistics and Data Science, Nankai University, China.

PMID: 37476915
DOI: 10.1002/sim.9819

Abstract

Mendelian randomization is a technique that uses genetic variants as instruments to examine the causal effect of a modifiable exposure on a trait in an observational study. Using many instruments can improve estimation precision, but the estimate may suffer from bias when the instruments are only weakly associated with the exposure. To overcome the difficulty of high dimensionality, we propose a model average estimator that uses different subsets of instruments (single nucleotide polymorphisms, SNPs) to predict the exposure in the first stage and then weights the submodels' predictions through penalization with common penalty functions such as the least absolute shrinkage and selection operator (LASSO), smoothly clipped absolute deviation (SCAD), and minimax concave penalty (MCP). The model-averaged predictions are then used as a genetically predicted exposure to estimate the causal effect on the response in the second stage. A further novelty of our model average estimator is that it allows the number of submodels and the submodels' sizes to grow with the sample size. The practical performance of the estimator is examined in a series of numerical studies. We apply the proposed method to a real genetic dataset to investigate the relationship between stature and blood pressure.

Keywords: causal inference; genetics; genome-wide association study; instrument variable; model average; penalty function; single nucleotide polymorphism.
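
The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`lasso_cd`, `iv_model_average`), the disjoint SNP blocks used as submodels, the tuning value `lam`, the hand-written coordinate-descent LASSO solver, and the simulated data are all assumptions made for illustration. Only the LASSO penalty is shown; SCAD and MCP are omitted.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the LASSO update."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(P, x, lam, n_iter=200):
    """Coordinate descent for (1/2n)||x - P w||^2 + lam * ||w||_1 (illustrative solver)."""
    n, m = P.shape
    w = np.zeros(m)
    col_sq = (P ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(m):
            if col_sq[j] == 0.0:
                continue
            # Residual with submodel j's current contribution added back
            r = x - P @ w + P[:, j] * w[j]
            rho = P[:, j] @ r / n
            w[j] = soft_threshold(rho, lam) / col_sq[j]
    return w

def iv_model_average(G, x, y, subsets, lam=0.01):
    """Hypothetical sketch: submodel predictions, penalized weights, second-stage fit."""
    Gc = G - G.mean(axis=0)
    xc = x - x.mean()
    yc = y - y.mean()
    # First stage: predict the exposure from each subset of instruments by OLS
    preds = []
    for idx in subsets:
        coef, *_ = np.linalg.lstsq(Gc[:, idx], xc, rcond=None)
        preds.append(Gc[:, idx] @ coef)
    P = np.column_stack(preds)
    # Weight the submodels' predictions with an L1 penalty
    w = lasso_cd(P, xc, lam)
    x_hat = P @ w  # model-averaged, genetically predicted exposure
    # Second stage: regress the response on the predicted exposure
    beta = (x_hat @ yc) / (x_hat @ x_hat)
    return beta, w

# Simulated data (assumption): 20 SNPs, only the first 10 affect the exposure,
# and an unobserved confounder u affects both exposure and response.
rng = np.random.default_rng(0)
n, p, true_beta = 5000, 20, 0.5
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)
alpha = np.r_[np.full(10, 0.5), np.zeros(10)]
u = rng.normal(size=n)
x = G @ alpha + u + rng.normal(size=n)
y = true_beta * x + u + rng.normal(size=n)

# Submodels (assumption): four disjoint blocks of five SNPs each
subsets = [np.arange(k, k + 5) for k in range(0, p, 5)]
beta_hat, weights = iv_model_average(G, x, y, subsets)
print("estimated causal effect:", beta_hat)
print("submodel weights:", weights)
```

In this sketch the L1 penalty tends to shrink the weights of submodels built from uninformative SNPs toward zero, so the averaged prediction relies mainly on the informative blocks. Because the weights are shrunk, the second-stage slope is mildly sensitive to the choice of `lam`.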

Publication types

Observational Study

MeSH terms

Blood Pressure / genetics
Causality
Genetic Variation*
Genome-Wide Association Study
Humans
Mendelian Randomization Analysis* / methods
Phenotype
Polymorphism, Single Nucleotide