Estimation in multivariate t linear mixed models for longitudinal data with multiple outputs: Application to PBCseq data analysis

Biom J. 2022 Mar;64(3):539-556. doi: 10.1002/bimj.202000015. Epub 2021 Nov 25.

Abstract

In many biomedical studies and clinical trials, more than one response variable is measured repeatedly over time on the same subject. Such data are commonly analyzed with a multivariate linear mixed-effects longitudinal model. Longitudinal data, however, often include covariates that contribute little to modeling the response variables and can be eliminated from the analysis. In this paper, we consider the problem of simultaneous variable selection and estimation in a multivariate t linear mixed-effects model (MtLMM) for analyzing longitudinally measured multioutcome data. The motivation for this work comes from a cohort study of patients with primary biliary cirrhosis. The aim is to eliminate insignificant variables by applying the smoothly clipped absolute deviation (SCAD) penalty function in the MtLMM. The proposed penalized model offers robustness and the flexibility to accommodate fat tails. An expectation conditional maximization algorithm is employed to compute the maximum likelihood estimates of the parameters. Standard errors are calculated with an information-based method. The methodology is illustrated with an analysis of the Mayo Clinic Primary Biliary Cirrhosis sequential (PBCseq) data and with a simulation study. We found that drugs and sex can be eliminated from the PBCseq analysis and that the disease progresses over time.
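For readers unfamiliar with the SCAD penalty named in the abstract, the following is a minimal Python sketch of its standard scalar form as introduced by Fan and Li (2001). The function name scad_penalty, the arguments lam and a, and the default a = 3.7 are illustrative assumptions and are not taken from the paper, which applies the penalty inside the MtLMM likelihood rather than in isolation.

```python
def scad_penalty(theta, lam, a=3.7):
    """Scalar SCAD penalty (standard form; names here are hypothetical).

    theta : coefficient value
    lam   : tuning parameter, lam > 0
    a     : shape parameter, a > 2 (3.7 is a commonly used default)
    """
    if lam <= 0 or a <= 2:
        raise ValueError("require lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Near zero the penalty is linear, like the lasso.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that gradually relaxes the penalization rate.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients receive a constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    for value in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(value, scad_penalty(value, lam=1.0))
```

The three branches join continuously at |theta| = lam and |theta| = a * lam. The penalty is flat beyond a * lam, which is why SCAD can set small coefficients to zero while leaving large ones nearly unbiased.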

Keywords: SCAD penalty; heavy-tailed distribution; longitudinal data; multivariate mixed-effects model; penalized.

MeSH terms

  • Algorithms
  • Cohort Studies
  • Computer Simulation
  • Data Analysis*
  • Humans
  • Likelihood Functions
  • Linear Models
  • Liver Cirrhosis, Biliary* / genetics
  • Longitudinal Studies