Repeated measures discriminant analysis using multivariate generalized estimation equations

Stat Methods Med Res. 2022 Apr;31(4):646-657. doi: 10.1177/09622802211032705. Epub 2021 Dec 13.

Abstract

Discriminant analysis procedures that assume parsimonious covariance and/or mean structures have been proposed for distinguishing between two or more populations in multivariate repeated measures designs. However, these procedures rely on the assumption of multivariate normality, which is not tenable in multivariate repeated measures designs characterized by binary, ordinal, or mixed types of response distributions. This study investigates the accuracy of repeated measures discriminant analysis (RMDA) based on the multivariate generalized estimating equations (GEE) framework for classification in multivariate repeated measures designs with the same or different types of responses repeatedly measured over time. Monte Carlo methods were used to compare the accuracy of RMDA procedures based on GEE with that of RMDA based on maximum likelihood estimators (MLE) under diverse simulation conditions, which included the number of repeated measurement occasions, number of responses, sample size, correlation structure, and type of response distribution. RMDA based on GEE exhibited higher average classification accuracy than RMDA based on MLE, especially for multivariate non-normal distributions. Three repeatedly measured responses, namely severity of epilepsy, current number of anti-epileptic drugs, and parent-reported quality of life in children with epilepsy, were used to demonstrate the application of these procedures.

Keywords: Discriminant analysis; classification; generalized estimating equation; multivariate non-normal distribution; multivariate repeated measures data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Child
  • Computer Simulation
  • Discriminant Analysis
  • Humans
  • Models, Statistical*
  • Monte Carlo Method
  • Quality of Life*
  • Sample Size