Multi-subgroup gene screening using semi-parametric hierarchical mixture models and the optimal discovery procedure: Application to a randomized clinical trial in multiple myeloma

Shigeyuki Matsui; Hisashi Noma; Pingping Qu; Yoshio Sakai; Kota Matsui; Christoph Heuck; John Crowley

doi:10.1111/biom.12716

Biometrics. 2018 Mar;74(1):313-320. doi: 10.1111/biom.12716. Epub 2017 May 12.

Authors

Shigeyuki Matsui¹, Hisashi Noma², Pingping Qu³, Yoshio Sakai⁴, Kota Matsui¹, Christoph Heuck⁵, John Crowley³

Affiliations

¹ Department of Biostatistics, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya, Aichi 466-8550, Japan.
² Department of Data Science, The Institute of Statistical Mathematics, Tachikawa, Tokyo, Japan.
³ Cancer Research And Biostatistics, Seattle, Washington, U.S.A.
⁴ Department of Gastroenterology, Kanazawa University Hospital, Kanazawa, Ishikawa, Japan.
⁵ The Myeloma Institute, University of Arkansas for Medical Sciences, Little Rock, Arkansas, U.S.A.

PMID: 28498490
DOI: 10.1111/biom.12716

Abstract

This article proposes an efficient approach to screening genes associated with a phenotypic variable of interest in genomic studies with subgroups. To capture and detect various association profiles across subgroups, we flexibly estimate the underlying effect size distribution across subgroups using a semi-parametric hierarchical mixture model for subgroup-specific summary statistics from independent subgroups. We then rank and select genes using an optimal discovery procedure based on the fitted model, while controlling the false discovery rate. The efficiency of the proposed approach, compared with an approach based on standard regression models with covariates representing subgroups, is demonstrated through application to a randomized clinical trial with microarray gene expression data in multiple myeloma and through a simulation experiment.

Keywords: Gene screening; Multiple subgroups; Optimal discovery procedure; Prognostic and predictive genes; Randomized clinical trials; Semi-parametric hierarchical mixture models.
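
The gene ranking and selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a fitted mixture model has already produced, for each gene, a posterior probability of being null, and it applies a generic Bayesian false discovery rate rule (select the largest top-ranked set whose average posterior null probability stays at or below a target level). The function name `select_by_bayesian_fdr`, the argument `null_prob`, and the example values are hypothetical; the semi-parametric model fit and the optimal discovery statistic itself are not shown.

```python
import numpy as np


def select_by_bayesian_fdr(null_prob, alpha=0.05):
    """Rank genes by posterior null probability and select a top set.

    null_prob : posterior probability that each gene is null (assumed to
                come from an already fitted mixture model; hypothetical input).
    alpha     : target false discovery rate.

    Returns the indices of the selected genes (most significant first) and
    the running FDR estimate along the full ranking.
    """
    null_prob = np.asarray(null_prob, dtype=float)
    # Smaller posterior null probability means stronger evidence of association.
    order = np.argsort(null_prob, kind="stable")
    # Estimated FDR of the top-k set is the mean null probability within it.
    running_fdr = np.cumsum(null_prob[order]) / np.arange(1, null_prob.size + 1)
    # The running mean of ascending values never decreases, so the genes
    # meeting the target always form a prefix of the ranking.
    passing = np.nonzero(running_fdr <= alpha)[0]
    n_selected = passing[-1] + 1 if passing.size else 0
    return order[:n_selected], running_fdr


# Made-up posterior null probabilities for five genes.
example_null_prob = [0.01, 0.9, 0.03, 0.5, 0.02]
selected, running_fdr = select_by_bayesian_fdr(example_null_prob, alpha=0.05)
print("selected gene indices:", selected.tolist())
```

In a multi-subgroup setting the ranking statistic would be computed from the subgroup-specific summary statistics jointly; the thresholding logic shown here stays the same once a per-gene posterior quantity is available.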

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Gene Expression Profiling
Genetic Testing*
Humans
Models, Statistical*
Multiple Myeloma / genetics
Oligonucleotide Array Sequence Analysis
Randomized Controlled Trials as Topic