Estimating error rates in the classification of paired organs

Alexander Brenning; Berthold Lausen

doi:10.1002/sim.3310

Stat Med. 2008 Sep 30;27(22):4515-31. doi: 10.1002/sim.3310.

Authors

Alexander Brenning¹, Berthold Lausen

Affiliation

¹ Institut für Medizininformatik, Biometrie und Epidemiologie, Universität Erlangen, Waldstr. 6, 91054 Erlangen, Germany.

PMID: 18465836
DOI: 10.1002/sim.3310

Abstract

Clinical data from paired organs present a dependence structure that has to be considered when drawing statistical inferences or evaluating classification rules with resampling-based techniques (bootstrap, cross-validation). We introduce a paired cross-validation approach for the estimation of misclassification error rates in the classification of data from paired organs. The dependence structure of the sample is honored by subject-level cross-validation. Theoretical considerations as well as a case-control study on glaucoma diagnosis and a simulation study show that the variance of the paired cross-validation estimator is considerably lower than that of traditional cross-validation error estimation based on one randomly selected eye per subject. The actual variance reduction is mainly controlled by the contribution of differential misclassification between both eyes to the overall error rate. By contrast, 'ad hoc' cross-validation that ignores the autocorrelation of paired organs leads to biased error estimates. Using the double-bagging technique, we also show that classification accuracy can be improved by using information from both eyes in training machine-learning classifiers. In glaucoma detection, the reduction in misclassification error rates achieved by training on data from both eyes is equivalent to an increase in the sample size by one-third to one-half, which is an important achievement in clinical studies.
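
The subject-level splitting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `subject_level_folds` and `paired_cv_splits`, the toy data layout (six subjects with two eyes each), and the choice of three folds are all assumptions made for illustration. The sketch only shows how both eyes of a subject can be kept in the same fold, so that no subject contributes to training and test data at the same time; it does not reproduce the paired error estimator or the double-bagging classifier.

```python
import random


def subject_level_folds(subject_ids, k, seed=0):
    """Assign each subject (not each eye) to one of k folds.

    Returns one fold index per observation, so both eyes of a
    subject always share the same fold. (Hypothetical helper.)
    """
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    # Deal the shuffled subjects into folds round-robin.
    fold_of = {s: i % k for i, s in enumerate(subjects)}
    return [fold_of[s] for s in subject_ids]


def paired_cv_splits(subject_ids, k, seed=0):
    """Yield (train_indices, test_indices) pairs split at subject level."""
    folds = subject_level_folds(subject_ids, k, seed)
    for f in range(k):
        train = [i for i, g in enumerate(folds) if g != f]
        test = [i for i, g in enumerate(folds) if g == f]
        yield train, test


# Toy data (assumed): 6 subjects, each contributing a left and a right eye.
subject_ids = [s for s in range(6) for _eye in ("left", "right")]
splits = list(paired_cv_splits(subject_ids, k=3))

for train, test in splits:
    train_subjects = {subject_ids[i] for i in train}
    test_subjects = {subject_ids[i] for i in test}
    # No subject appears on both sides of a split.
    assert train_subjects.isdisjoint(test_subjects)
```

An 'ad hoc' split, by contrast, would shuffle the individual eyes rather than the subjects, so one eye of a subject could land in the training set while the fellow eye is used for testing.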

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Bias
Case-Control Studies
Computer Simulation
Data Interpretation, Statistical*
Diagnostic Errors*
Glaucoma / diagnosis
Humans
Middle Aged
Registries
Reproducibility of Results
Sensitivity and Specificity