Distributed eQTL analysis with auxiliary information

J Stat Plan Inference. 2024 Jan:228:34-45. doi: 10.1016/j.jspi.2023.06.003. Epub 2023 Jun 28.

Abstract

Expression quantitative trait locus (eQTL) analysis is a useful tool to identify genetic loci that are associated with gene expression levels. Large collaborative efforts such as the Genotype-Tissue Expression (GTEx) project provide valuable resources for eQTL analysis in different tissues. Most existing methods, however, either focus on one tissue at a time, or analyze multiple tissues to identify eQTLs jointly present in multiple tissues. There is a lack of powerful methods to identify eQTLs in a target tissue while effectively borrowing strength from auxiliary tissues. In this paper, we propose a novel statistical framework to improve the eQTL detection efficacy in the tissue of interest with auxiliary information from other tissues. This framework can enhance the power of the hypothesis test for eQTL effects by incorporating shared and specific effects from multiple tissues into the test statistics. We also devise data-driven and distributed computing approaches for efficient implementation of eQTL detection when the number of tissues is large. Numerical studies in simulation demonstrate the efficacy of the proposed method, and the real data analysis of the GTEx example provides novel insights into eQTL findings in different tissues.

Keywords: Auxiliary information; Distributed computing; Multiple testing; eQTL analysis.