Comparing multi-class classifier performance by multi-class ROC analysis: A nonparametric approach

Neurocomputing (Amst). 2024 May 28:583:127520. doi: 10.1016/j.neucom.2024.127520. Epub 2024 Mar 6.

Abstract

The area under the Receiver Operating Characteristic (ROC) curve (AUC) is a standard metric for quantifying and comparing binary classifiers. Real-world applications often require classification into multiple (more than two) classes. For multi-class classifiers that produce class membership scores, a popular multi-class AUC (MAUC) variant is to average the pairwise AUC values [1]. Because the pairwise AUCs are correlated in complicated ways, the variance of MAUC is often estimated numerically using resampling techniques. This work generalizes DeLong's nonparametric approach for binary AUC analysis [2] to MAUC. We first derive the closed-form expression of the covariance matrix of the pairwise AUCs within a single MAUC. Then, by dropping higher-order terms, we obtain an approximate covariance matrix with a compact matrix-factorization form, which serves as the basis for variance estimation of a single MAUC. We further extend this approach to estimate the covariance of correlated MAUCs that arise from multiple competing classifiers. For the special case of binary correlated AUCs, our results coincide with those of DeLong. Our numerical studies confirm the accuracy of the variance and covariance estimates. We provide the source code of the proposed covariance estimation of correlated MAUCs on GitHub (https://tinyurl.com/euj6wvsz) so that machine learning and statistical analysis packages can readily adopt it to quantify and compare multi-class classifiers.
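To make the two building blocks named above concrete, the following is a minimal sketch, not the authors' released code. It assumes MAUC is the unweighted mean over unordered class pairs of the two directed pairwise AUCs (a common reading of "average the pairwise AUC values"), and it shows the DeLong-style variance estimate only for a single binary AUC. It does not implement the proposed MAUC covariance estimator. The function names (`pairwise_auc`, `mauc`, `delong_variance`) and the data layout (integer labels 0..C-1 indexing the score columns) are assumptions made for illustration.

```python
from itertools import combinations


def pairwise_auc(pos_scores, neg_scores):
    """Mann-Whitney estimate of P(pos > neg); ties count as 1/2."""
    total = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                total += 1.0
            elif p == q:
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


def mauc(scores, labels):
    """Mean of pairwise AUCs over all unordered class pairs.

    Assumed layout: scores[k][c] is the membership score of sample k
    for class c, and labels[k] is the true class index of sample k.
    """
    classes = sorted(set(labels))
    pair_values = []
    for i, j in combinations(classes, 2):
        # A(i|j): rank class-i samples against class-j samples by the class-i score.
        a_ij = pairwise_auc(
            [s[i] for s, y in zip(scores, labels) if y == i],
            [s[i] for s, y in zip(scores, labels) if y == j],
        )
        # A(j|i): the same pair, ranked by the class-j score.
        a_ji = pairwise_auc(
            [s[j] for s, y in zip(scores, labels) if y == j],
            [s[j] for s, y in zip(scores, labels) if y == i],
        )
        pair_values.append(0.5 * (a_ij + a_ji))
    return sum(pair_values) / len(pair_values)


def delong_variance(pos_scores, neg_scores):
    """DeLong variance estimate for one binary AUC (needs >= 2 samples per class)."""
    m, n = len(pos_scores), len(neg_scores)
    auc = pairwise_auc(pos_scores, neg_scores)
    # Structural components: per-sample contribution to the AUC.
    v10 = [pairwise_auc([p], neg_scores) for p in pos_scores]
    v01 = [pairwise_auc(pos_scores, [q]) for q in neg_scores]
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)
    return s10 / m + s01 / n


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_scores = [
        [0.8, 0.1, 0.1], [0.7, 0.2, 0.1],
        [0.2, 0.6, 0.2], [0.1, 0.7, 0.2],
        [0.1, 0.2, 0.7], [0.3, 0.3, 0.4],
    ]
    toy_labels = [0, 0, 1, 1, 2, 2]
    print("MAUC:", mauc(toy_scores, toy_labels))
    print("Binary AUC variance:", delong_variance([0.9, 0.4], [0.5, 0.1]))
```

The pure-Python double loops keep the sketch free of dependencies at the cost of quadratic time per class pair; a rank-based computation would be preferable for large samples.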

Keywords: U-statistics; area under the ROC curve (AUC); jackknife; multi-class AUC; multi-class classification; receiver operating characteristic (ROC).