Variable selection via penalized generalized estimating equations for a marginal survival model

Stat Methods Med Res. 2020 Sep;29(9):2493-2506. doi: 10.1177/0962280220901728. Epub 2020 Jan 29.

Abstract

Clustered and multivariate survival times, such as times to recurrent events, commonly arise in biomedical and health research, and marginal survival models are often used to analyze such data. When a large number of predictors are available, variable selection becomes an important issue in fitting these models. We consider a marginal Cox's proportional hazards model. Under a sparsity assumption, we propose a penalized generalized estimating equation approach that selects important variables and estimates regression coefficients simultaneously in the marginal model. The proposed method explicitly models the correlation structure within clusters or correlated variables by using a prespecified working correlation matrix. The asymptotic properties of the penalized generalized estimating equation estimators are established, and the number of candidate covariates is allowed to increase at the same order as the number of clusters. We evaluate the performance of the proposed method through a simulation study and illustrate its application by analyzing two real datasets.
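To make two ingredients of a penalized generalized estimating equation concrete, the following is a minimal sketch and not the authors' implementation. It assumes an exchangeable structure for the prespecified working correlation matrix and a SCAD-type penalty; the abstract names neither choice, and the function names `exchangeable_corr` and `scad_derivative` are hypothetical.

```python
def exchangeable_corr(size, rho):
    """Exchangeable working correlation matrix (an assumed structure):
    1 on the diagonal, a common correlation rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(size)] for i in range(size)]


def scad_derivative(beta, lam, a=3.7):
    """Derivative of the SCAD penalty evaluated at |beta| (assumed penalty).

    Small coefficients receive the full penalty rate lam, mid-sized ones a
    linearly decreasing rate, and large ones no penalty at all.
    """
    b = abs(beta)
    if b <= lam:
        return lam
    if b < a * lam:
        return (a * lam - b) / (a - 1.0)
    return 0.0


if __name__ == "__main__":
    # Working correlation for a cluster with three correlated failure times.
    for row in exchangeable_corr(3, 0.5):
        print(row)

    # Penalty rate applied to a small, a moderate, and a large coefficient.
    for coef in (0.1, 1.0, 5.0):
        print(coef, scad_derivative(coef, lam=0.5))
```

In a penalized estimating equation of this general kind, the penalty derivative is subtracted from the score-type equation coefficient by coefficient, so a rate of zero for large coefficients leaves strong signals essentially unshrunk while small coefficients are pushed toward zero.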

Keywords: Clustered failure time; correlation structure; diverging number of predictors; generalized estimating equations; marginal Cox’s proportional hazards model; multivariate survival time.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Models, Statistical*
  • Proportional Hazards Models
  • Research Design*