Error-rate estimation in discriminant analysis of non-linear longitudinal data: A comparison of resampling methods

Rolando de la Cruz; Claudio Fuentes; Cristian Meza; Vicente Núñez-Antón

doi:10.1177/0962280216656246

Stat Methods Med Res. 2018 Apr;27(4):1153-1167. doi: 10.1177/0962280216656246. Epub 2016 Jul 8.

Authors

Rolando de la Cruz¹, Claudio Fuentes², Cristian Meza³, Vicente Núñez-Antón⁴

Affiliations

¹ Institute of Statistics, Pontificia Universidad Católica de Valparaíso, Chile.
² Department of Statistics, Oregon State University, Corvallis, OR, USA.
³ CIMFAV, Facultad de Ingeniería, Universidad de Valparaíso, Valparaíso, Chile.
⁴ Department of Econometrics and Statistics (A.E.III), University of the Basque Country UPV/EHU, Bilbao, Spain.

PMID: 27405324
DOI: 10.1177/0962280216656246

Abstract

Consider longitudinal observations across different subjects such that the underlying distribution is determined by a non-linear mixed-effects model. In this context, we look at the misclassification error rate for allocating future subjects using cross-validation, bootstrap algorithms (parametric bootstrap, leave-one-out, .632 and .632+), and bootstrap cross-validation (which combines the first two approaches), and conduct a numerical study to compare the performance of the different methods. The simulation and comparisons in this study are motivated by real observations from a pregnancy study in which one of the main objectives is to predict normal versus abnormal pregnancy outcomes based on information gathered at early stages. Since in this type of study it is not uncommon to have insufficient data to simultaneously solve the classification problem and estimate the misclassification error rate, we pay special attention to situations in which only a small sample size is available. We discuss how the misclassification error rate estimates may be affected by the sample size in terms of variability and bias, and examine conditions under which the misclassification error rate estimates perform reasonably well.
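To make two of the bootstrap estimators named above concrete, the following is a minimal sketch of the leave-one-out bootstrap and the .632 estimator in their standard textbook form, where the .632 estimate is 0.368 times the apparent (resubstitution) error plus 0.632 times the leave-one-out bootstrap error. It is not the authors' implementation: the nearest-centroid classifier, the function names (fit_centroids, predict, bootstrap_632) and the rule of skipping resamples that miss a class are all assumptions made for illustration, standing in for the discriminant rule based on a non-linear mixed-effects model that the study considers.

```python
import numpy as np


def fit_centroids(X, y):
    """Hypothetical stand-in classifier: one mean vector per class."""
    labels = np.unique(y)
    centroids = np.array([X[y == k].mean(axis=0) for k in labels])
    return labels, centroids


def predict(model, X):
    """Assign each row of X to the class with the nearest centroid."""
    labels, centroids = model
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return labels[dist.argmin(axis=1)]


def bootstrap_632(X, y, n_boot=200, seed=0):
    """Return (apparent error, leave-one-out bootstrap error, .632 estimate)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_classes = np.unique(y).size

    # Apparent error: train and test on the same full sample.
    err_app = np.mean(predict(fit_centroids(X, y), X) != y)

    miss = np.zeros(n)   # misclassifications per subject when left out
    count = np.zeros(n)  # number of resamples in which the subject was left out
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # sample subjects with replacement
        out = np.setdiff1d(np.arange(n), idx)     # subjects not in this resample
        # Assumption: skip resamples with no left-out subjects or a missing class.
        if out.size == 0 or np.unique(y[idx]).size < n_classes:
            continue
        pred = predict(fit_centroids(X[idx], y[idx]), X[out])
        miss[out] += pred != y[out]
        count[out] += 1

    # Average each subject's left-out error rate, then average over subjects.
    seen = count > 0
    err_loo = np.mean(miss[seen] / count[seen])

    err_632 = 0.368 * err_app + 0.632 * err_loo
    return err_app, err_loo, err_632


if __name__ == "__main__":
    # Synthetic two-class data, purely illustrative (not the pregnancy data).
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 1, (15, 2)), rng.normal(2, 1, (15, 2))])
    y = np.array([0] * 15 + [1] * 15)
    print(bootstrap_632(X, y))
```

In a longitudinal setting the resampling unit would be the subject rather than the individual measurement, which is why the sketch treats each row as one subject's feature vector. With small samples the apparent error tends to be optimistic and the leave-one-out bootstrap error pessimistic, which is the trade-off the .632 weighting is meant to balance.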

Keywords: Parametric bootstrap; bootstrap .632 and .632+; classification error rate; cross-validation bootstrap; leave-one-out bootstrap; longitudinal data; mixed-effects models; non-linear models.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Adult
Bias*
Biomedical Research / statistics & numerical data
Discriminant Analysis*
Female
Humans
Longitudinal Studies*
Nonlinear Dynamics
Pregnancy
Pregnancy Outcome
Sampling Studies*
Young Adult