Variable selection for joint models of multivariate skew-normal longitudinal and survival data

Stat Methods Med Res. 2023 Sep;32(9):1694-1710. doi: 10.1177/09622802231181767. Epub 2023 Jul 5.

Abstract

In recent years, many joint models of multivariate skew-normal longitudinal and survival data have been proposed to accommodate the non-normality of longitudinal outcomes, but existing work has not considered variable selection. This article investigates simultaneous parameter estimation and variable selection in joint modeling of longitudinal and survival data. A penalized splines technique is used to estimate the unknown log baseline hazard function, and the rectangle integral method is adopted to approximate the conditional survival function. A Monte Carlo expectation-maximization algorithm is developed to estimate the model parameters. Based on local linear approximations to the conditional expectation of the likelihood function and the penalty function, a one-step sparse estimation procedure is proposed to circumvent the computational challenge of optimizing the penalized conditional expectation of the likelihood function; the procedure is used to select significant covariates and trajectory functions and to identify departures from normality in the longitudinal data. A Bayesian information criterion based on the conditional expectation of the likelihood function is developed to select the optimal tuning parameter. Simulation studies and a real example from a clinical trial are used to illustrate the proposed methodologies.
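As an illustration of the rectangle integral step mentioned above, the following is a minimal sketch and not the authors' implementation. It approximates a survival function S(t) = exp(-∫₀ᵗ exp(g(s)) ds) from a log hazard g by summing rectangles. The function names, the midpoint placement of the rectangles, the number of rectangles, and the example hazard are all assumptions made for illustration; the abstract does not specify them, and the model in the article additionally conditions on random effects and covariates, which are omitted here.

```python
import math


def survival_rectangle(log_hazard, t, n_rect=200):
    """Approximate S(t) = exp(-integral_0^t exp(log_hazard(s)) ds).

    The integral is replaced by a sum of n_rect equal-width rectangles,
    each evaluated at its midpoint (an assumed choice of rectangle rule).
    """
    if t <= 0:
        return 1.0
    width = t / n_rect
    cum_hazard = 0.0
    for k in range(n_rect):
        midpoint = (k + 0.5) * width
        cum_hazard += math.exp(log_hazard(midpoint)) * width
    return math.exp(-cum_hazard)


def example_log_hazard(s):
    """Hypothetical log hazard with h(s) = 2s, so that S(t) = exp(-t^2)."""
    return math.log(2.0 * s)


approx = survival_rectangle(example_log_hazard, 1.5)
exact = math.exp(-1.5 ** 2)
print(abs(approx - exact) < 1e-9)
```

In this toy case the hazard is linear, so the midpoint rule reproduces the closed-form value up to floating-point error. With a spline-based log baseline hazard, the accuracy would depend on the number of rectangles.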

Keywords: Longitudinal data; Monte Carlo expectation-maximization algorithm; skew-normal distribution; survival data; variable selection.

MeSH terms

  • Algorithms*
  • Bayes Theorem
  • Computer Simulation
  • Likelihood Functions
  • Longitudinal Studies
  • Models, Statistical*
  • Monte Carlo Method