Spike-and-slab type variable selection in the Cox proportional hazards model for high-dimensional features

J Appl Stat. 2021 Mar 4;49(9):2189-2207. doi: 10.1080/02664763.2021.1893285. eCollection 2022.

Abstract

In this paper, we develop a variable selection framework with the spike-and-slab prior distribution via the hazard function of the Cox model. Specifically, we consider the transformation of the score and information functions for the partial likelihood function evaluated at the given data from the parameter space into the space generated by the logarithm of the hazard ratio. Thereby, we reduce the nonlinear complexity of the estimation equation for the Cox model and allow the use of a wider variety of stable variable selection methods. Then, we use a stochastic variable search Gibbs sampling approach via the spike-and-slab prior distribution to obtain the sparsity structure of the covariates associated with the survival outcome. Additionally, we conduct numerical simulations to evaluate the finite-sample performance of our proposed method. Finally, we apply this novel framework to lung adenocarcinoma data to find important genes associated with decreased survival in subjects with the disease.
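
For readers unfamiliar with the quantities named above, the following is a minimal sketch of the standard score vector and observed information matrix of the Cox partial log-likelihood, evaluated at a given coefficient vector. It is not the authors' implementation and does not reproduce their transformation into the log hazard ratio space or their Gibbs sampler. The function name `cox_score_information`, its argument names, and the Breslow-style treatment of tied event times are assumptions made for illustration only.

```python
import math


def cox_score_information(beta, X, time, event):
    """Score and observed information of the Cox partial log-likelihood.

    Hypothetical helper for illustration (not from the paper).
    beta  : list of p coefficients
    X     : list of n rows, each a list of p covariate values
    time  : list of n follow-up times
    event : list of n indicators (1 = event observed, 0 = censored)
    Ties are handled Breslow-style: the risk set at an event time t_i
    is every subject j with time[j] >= t_i.
    """
    n, p = len(X), len(beta)
    # Linear predictor x_j' beta for every subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    # Subtract the maximum before exponentiating; the constant cancels
    # in every ratio below, so the result is unchanged.
    shift = max(eta)
    w = [math.exp(e - shift) for e in eta]

    score = [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    for i in range(n):
        if not event[i]:
            continue  # censored subjects contribute only through risk sets
        risk = [j for j in range(n) if time[j] >= time[i]]
        s0 = sum(w[j] for j in risk)
        # Risk-set weighted mean of the covariates.
        mean = [sum(w[j] * X[j][a] for j in risk) / s0 for a in range(p)]
        for a in range(p):
            score[a] += X[i][a] - mean[a]
            for b in range(p):
                # Risk-set weighted second moment minus outer product of means.
                second = sum(w[j] * X[j][a] * X[j][b] for j in risk) / s0
                info[a][b] += second - mean[a] * mean[b]
    return score, info


# Toy data: two subjects, one covariate, both events observed.
score, info = cox_score_information([0.0], [[1.0], [0.0]], [1.0, 2.0], [1, 1])
```

On this toy input the score is `[0.5]` and the information is `[[0.25]]`: only the first event time has a risk set with more than one subject, so it alone contributes. Plain Python lists are used to keep the sketch dependency-free; for high-dimensional features an array library would be the practical choice.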

Keywords: 62J05; 62N02; Bayesian modeling; Markov chain Monte Carlo; latent indicator; lung adenocarcinoma; score function; stochastic variable search.

Grants and funding

Wu and Ahn were supported by the National Institute of General Medical Sciences of the National Institutes of Health (NIH/NIGMS) under grant number P20 GM103650. Ahn was also supported by the Simons Foundation Award ID 714241. Yang was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF2021R1C1C1007023).