Modeling of correlated data with informative cluster sizes: An evaluation of joint modeling and within-cluster resampling approaches

Stat Methods Med Res. 2017 Aug;26(4):1881-1895. doi: 10.1177/0962280215592268. Epub 2015 Jun 24.

Abstract

Joint modeling and within-cluster resampling are two approaches for analyzing correlated data with informative cluster sizes. Motivated by a developmental toxicity study, we examine the performance and validity of these two approaches in testing covariate effects in generalized linear mixed-effects models. We show that the joint modeling approach is robust to misspecification of the cluster size model, in terms of Type I and Type II errors, when the corresponding covariates are not included in the random effects structure; otherwise, statistical tests may be affected. We also evaluate the performance of the within-cluster resampling procedure and thoroughly investigate its validity in modeling correlated data with informative cluster sizes. We show that within-cluster resampling is a valid alternative to joint modeling for cluster-specific covariates, but that it is invalid for time-dependent covariates. The two methods are applied to a developmental toxicity study that investigated the effect of exposure to diethylene glycol dimethyl ether.
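
As a rough illustration of the within-cluster resampling idea named in the abstract, the Python sketch below repeatedly draws one observation per cluster, computes an estimate on each resampled data set, and averages the results. It is not the authors' implementation: the function name `wcr_mean` and the toy data are hypothetical, and the estimator is a simple marginal mean rather than the generalized linear mixed-effects models studied in the paper. The variance is taken, as is usual for this procedure, as the average per-resample variance minus the variance of the estimates across resamples.

```python
import numpy as np


def wcr_mean(clusters, n_resamples=1000, seed=0):
    """Within-cluster resampling estimate of a marginal mean (illustrative).

    clusters: list of 1-D sequences, one sequence of outcomes per cluster.
    Returns (estimate, variance_estimate).
    """
    rng = np.random.default_rng(seed)
    clusters = [np.asarray(c, dtype=float) for c in clusters]
    n = len(clusters)
    estimates = np.empty(n_resamples)
    variances = np.empty(n_resamples)
    for q in range(n_resamples):
        # Draw exactly one observation from each cluster, so the resampled
        # data set consists of n independent observations.
        sample = np.array([c[rng.integers(len(c))] for c in clusters])
        estimates[q] = sample.mean()
        # Usual variance estimate of a sample mean for independent data.
        variances[q] = sample.var(ddof=1) / n
    estimate = estimates.mean()
    # Average within-resample variance minus between-resample variance
    # (divisor n_resamples). This can be negative in small samples.
    variance = variances.mean() - estimates.var(ddof=0)
    return estimate, variance


# Hypothetical toy data: the large cluster has systematically higher outcomes,
# so cluster size is informative.
toy = [[1, 1, 1, 1], [0], [0]]
est, var = wcr_mean(toy, n_resamples=50)
pooled = np.mean(np.concatenate([np.asarray(c, dtype=float) for c in toy]))
print(est, var, pooled)
```

In this toy example every cluster contributes equally to the resampling estimate, whereas the pooled mean over all observations is pulled toward the large cluster.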

Keywords: Joint random effects model; hypothesis testing; nonignorable cluster size; power; within-cluster resampling.

MeSH terms

  • Animals
  • Cluster Analysis*
  • Ethylene Glycols / toxicity
  • Female
  • Linear Models*
  • Methyl Ethers / toxicity
  • Rabbits
  • Reproducibility of Results
  • Toxicity Tests

Substances

  • Ethylene Glycols
  • Methyl Ethers
  • diglyme