Error-rate estimation in discriminant analysis of non-linear longitudinal data: A comparison of resampling methods

Stat Methods Med Res. 2018 Apr;27(4):1153-1167. doi: 10.1177/0962280216656246. Epub 2016 Jul 8.

Abstract

Consider longitudinal observations across different subjects such that the underlying distribution is determined by a non-linear mixed-effects model. In this context, we estimate the misclassification error rate for allocating future subjects using cross-validation, bootstrap algorithms (parametric bootstrap, leave-one-out, .632 and .632+), and bootstrap cross-validation (which combines the first two approaches), and conduct a numerical study to compare the performance of the different methods. The simulation and comparisons in this study are motivated by real observations from a pregnancy study in which one of the main objectives is to predict normal versus abnormal pregnancy outcomes based on information gathered at early stages. Since in this type of study it is not uncommon to have insufficient data to simultaneously solve the classification problem and estimate the misclassification error rate, we pay special attention to situations in which only a small sample is available. We discuss how the sample size may affect the variability and bias of the misclassification error rate estimates, and examine conditions under which these estimates perform reasonably well.
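For readers unfamiliar with the estimators named above, the following is a minimal sketch of the leave-one-out bootstrap and .632 error-rate estimates in their standard textbook form. It is not the authors' procedure: it assumes each subject is summarized by a fixed-length feature vector (rather than a fitted non-linear mixed-effects curve), uses a simple nearest-centroid rule as a stand-in classifier, and runs on synthetic two-class data. All function names, the sample size, and the number of bootstrap replicates are hypothetical choices made for illustration.

```python
import numpy as np

def fit_centroids(X, y):
    """Stand-in classifier (assumption): store one mean vector per class."""
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_centroids(model, X):
    """Allocate each row of X to the class with the nearest centroid."""
    classes, centroids = model
    dist2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dist2.argmin(axis=1)]

def bootstrap_632(X, y, n_boot=200, seed=0):
    """Return (apparent, leave-one-out bootstrap, .632) error estimates."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_classes = np.unique(y).size

    # Apparent (resubstitution) error: train and test on the same subjects.
    err_app = np.mean(predict_centroids(fit_centroids(X, y), X) != y)

    # Leave-one-out bootstrap: each subject is scored only by classifiers
    # trained on bootstrap samples that do not contain that subject.
    miss = np.zeros(n)
    count = np.zeros(n)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample subjects with replacement
        if np.unique(y[idx]).size < n_classes:  # skip samples missing a class
            continue
        out = np.setdiff1d(np.arange(n), idx)   # subjects left out of this sample
        if out.size == 0:
            continue
        pred = predict_centroids(fit_centroids(X[idx], y[idx]), X[out])
        miss[out] += (pred != y[out])
        count[out] += 1
    seen = count > 0
    err_loo = np.mean(miss[seen] / count[seen])

    # .632 estimator: weighted average of the optimistic apparent error
    # and the pessimistic leave-one-out bootstrap error.
    err_632 = 0.368 * err_app + 0.632 * err_loo
    return err_app, err_loo, err_632

# Synthetic small-sample example (hypothetical data, 15 subjects per class).
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(0.0, 1.0, size=(15, 2)),
               data_rng.normal(1.5, 1.0, size=(15, 2))])
y = np.repeat([0, 1], 15)
err_app, err_loo, err_632 = bootstrap_632(X, y)
print(f"apparent={err_app:.3f}  loo-bootstrap={err_loo:.3f}  .632={err_632:.3f}")
```

Resampling is done at the subject level here, which is the natural unit when each subject contributes a whole longitudinal profile. The .632+ variant additionally adjusts the weight using an estimate of the no-information error rate and is not shown in this sketch.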

Keywords: Parametric bootstrap; bootstrap .632 and .632+; classification error rate; cross-validation bootstrap; leave-one-out bootstrap; longitudinal data; mixed-effects models; non-linear models.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Bias*
  • Biomedical Research / statistics & numerical data
  • Discriminant Analysis*
  • Female
  • Humans
  • Longitudinal Studies*
  • Nonlinear Dynamics
  • Pregnancy
  • Pregnancy Outcome
  • Sampling Studies*
  • Young Adult