Sparse meta-analysis with high-dimensional data

Qianchuan He; Hao Helen Zhang; Christy L Avery; D Y Lin

doi:10.1093/biostatistics/kxv038

Biostatistics. 2016 Apr;17(2):205-20. doi: 10.1093/biostatistics/kxv038. Epub 2015 Sep 21.

Authors

Qianchuan He¹, Hao Helen Zhang², Christy L Avery³, D Y Lin⁴

Affiliations

¹ Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.
² Department of Mathematics, The University of Arizona, Tucson, AZ 85721, USA.
³ Department of Epidemiology, University of North Carolina, Chapel Hill, NC 27599, USA.
⁴ Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA. lin@bios.unc.edu

Abstract

Meta-analysis plays an important role in summarizing and synthesizing scientific evidence derived from multiple studies. With high-dimensional data, the incorporation of variable selection into meta-analysis improves model interpretation and prediction. Existing variable selection methods require direct access to raw data, which may not be available in practical situations. We propose a new approach, sparse meta-analysis (SMA), in which variable selection for meta-analysis is based solely on summary statistics and the effect sizes of each covariate are allowed to vary among studies. We show that the SMA enjoys the oracle property if the estimated covariance matrix of the parameter estimators from each study is available. We also show that our approach achieves selection consistency and estimation consistency even when summary statistics include only the variance estimators or no variance/covariance information at all. Simulation studies and applications to high-throughput genomics studies demonstrate the usefulness of our approach.

Keywords: Fixed-effects models; Genomics studies; Oracle property; Random-effects models; Variable selection; Within-group sparsity.

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Computer Simulation
Data Interpretation, Statistical*
Genome-Wide Association Study / methods*
Genomics / methods*
Humans
Meta-Analysis as Topic*
Models, Statistical*
