On the bias in the AUC variance estimate

Pattern Recognit Lett. 2024 Feb:178:62-68. doi: 10.1016/j.patrec.2023.12.012. Epub 2023 Dec 27.

Abstract

The area under the Receiver Operating Characteristic (ROC) curve (AUC) is a standard metric for quantifying and comparing binary classifiers. A popular approach to estimating AUCs and their associated variability (the variance of a single AUC or the full covariance matrix of multiple correlated AUCs) is the one proposed by DeLong et al. [1], which is based on the Mann-Whitney two-sample U-statistic. The bias of a variance estimator is an important factor in applications such as hypothesis testing and the construction of confidence intervals: a negatively biased variance estimator may lead to incorrect conclusions, whereas a positive bias is conservative and hence preferable. In this work, we show that the (co-)variance estimate in DeLong's approach is always positively biased. More specifically, the difference between the expectation of the estimated covariance matrix and the true covariance matrix is positive semi-definite. This bias is non-negligible when the sample size is small and quickly diminishes as the sample size increases. Our method relies on constructing, from the AUC kernel, a random variable whose (co-)variance matrix coincides with the bias, thereby establishing the claim. We also discuss alternative approaches to AUC variance estimation that may reduce the bias.
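For readers unfamiliar with the estimator under discussion, the following is a minimal sketch of the commonly described structural-components form of DeLong's variance estimate for a single AUC. It is not taken from the paper: the function name `delong_auc_variance` and its arguments are hypothetical, and the code assumes the usual Mann-Whitney kernel (1 if a positive score exceeds a negative score, 0.5 for a tie, 0 otherwise) and at least two samples per class.

```python
import numpy as np


def delong_auc_variance(pos_scores, neg_scores):
    """Return (auc, variance_estimate) for one classifier.

    Illustrative sketch only; the name and signature are assumptions,
    not an API from the paper or from any library.
    """
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    m, n = pos.size, neg.size
    if m < 2 or n < 2:
        raise ValueError("need at least two samples in each class")

    # Mann-Whitney kernel evaluated on every (positive, negative) pair.
    diff = pos[:, None] - neg[None, :]
    kernel = (diff > 0).astype(float) + 0.5 * (diff == 0)

    # The AUC estimate is the mean of the kernel over all pairs.
    auc = kernel.mean()

    # Structural components: per-sample averages of the kernel.
    v10 = kernel.mean(axis=1)  # one value per positive sample
    v01 = kernel.mean(axis=0)  # one value per negative sample

    # Variance estimate: sample variances of the components,
    # each scaled by the size of its own class.
    variance = v10.var(ddof=1) / m + v01.var(ddof=1) / n
    return auc, variance


auc, variance = delong_auc_variance([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(auc, variance)
```

With only three samples per class, as in this toy call, the estimate falls in the small-sample regime where the abstract states the bias is non-negligible.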

Keywords: ANOVA; H-decomposition; area under the ROC curve (AUC); binary classification; jackknife; receiver operating characteristic (ROC); structural components.