Unequal intra-group variance in trajectory classification

Stat Med. 2018 Dec 10;37(28):4155-4166. doi: 10.1002/sim.7921. Epub 2018 Aug 2.

Abstract

Classifying patients into groups according to longitudinal series of measurements (ie, trajectory classification) has become frequent in clinical research. Most classification models assume an equal intra-group variance across groups. This assumption is sometimes inappropriate because measurements in diseased subjects are often more heterogeneous than in healthy ones. We developed a new classification model for trajectories that allows unequal intra-group variance across groups and evaluated its impact on classification using simulations and a clinical study. The classification and typical trajectories were estimated using the classification Expectation Maximization (EM) algorithm to maximize the classification likelihood, the log-likelihood being profiled during the Maximization (M) step of the algorithm. The simulations showed that assuming equal intra-group variance resulted in a high misclassification rate (up to 50%) when the real intra-group variances were different. This rate was greatly reduced by allowing intra-group variances to differ. When the real intra-group variances were equal, the two models gave similar classifications, except when the total sample size and the number of repeated measurements were small. In a randomized trial that compared the effect of low vs standard cyclosporine A dose on creatinine levels after cardiac transplantation, the classification model with unequal intra-group variance led to more meaningful groups than the model with equal intra-group variance and showed distinct benefits of the low dose. In conclusion, we recommend the use of a classification model for trajectories that allows for unequal intra-group variance across groups, except when the number of repeated measurements and total sample size are small.
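
The alternation between a classification step and a maximization step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes independent Gaussian errors, a free mean at each time point as the "typical trajectory", closed-form M-step updates rather than a profiled log-likelihood, no mixing proportions, and an initial partition obtained by ranking subjects on their mean level. The function name `cem_trajectories` and all its parameters are hypothetical.

```python
import numpy as np

def cem_trajectories(y, n_groups, equal_variance=False, n_iter=100, floor=1e-8):
    """Classification EM sketch; y has shape (n_subjects, n_times)."""
    n, t = y.shape
    # Assumed initialization: rank subjects by mean level, cut into equal-size bins.
    order = np.argsort(y.mean(axis=1))
    labels = np.empty(n, dtype=int)
    labels[order] = np.arange(n) * n_groups // n
    means = np.zeros((n_groups, t))
    variances = np.ones(n_groups)
    for _ in range(n_iter):
        # M-step: typical trajectory and intra-group variance of each group.
        for k in range(n_groups):
            members = y[labels == k]
            if len(members) == 0:
                continue  # keep the previous estimates for an empty group
            means[k] = members.mean(axis=0)
            variances[k] = max(((members - means[k]) ** 2).mean(), floor)
        if equal_variance:
            # Constrained model: a single variance pooled over all groups.
            pooled = ((y - means[labels]) ** 2).mean()
            variances[:] = max(pooled, floor)
        # C-step: assign each subject to the group with the highest log-density.
        sq = ((y[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        loglik = -0.5 * (t * np.log(2 * np.pi * variances) + sq / variances)
        new_labels = loglik.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable
        labels = new_labels
    return labels, means, variances
```

Calling `cem_trajectories(y, n_groups=2)` fits the unequal-variance model, and passing `equal_variance=True` fits the constrained one, so the two partitions can be compared on the same data. The variance floor is an ad hoc guard against the degenerate solutions that group-specific variances can produce when a group shrinks to very few subjects.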

Keywords: ECM algorithm; classification; heterogeneity; intra-group variance; longitudinal measure; trajectories.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biological Variation, Population*
  • Biomarkers
  • Classification
  • Data Interpretation, Statistical*
  • Humans
  • Likelihood Functions
  • Models, Statistical
  • Normal Distribution
  • Sample Size
  • Treatment Outcome*

Substances

  • Biomarkers