The Effect of Class Imbalance on Precision-Recall Curves

Christopher K I Williams

doi:10.1162/neco_a_01362

Neural Comput. 2021 Apr 1;33(4):853-857. doi: 10.1162/neco_a_01362.

Author

Christopher K I Williams¹

Affiliation

¹ School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, U.K. ckiw@inf.ed.ac.uk.

PMID: 33513323
DOI: 10.1162/neco_a_01362

Abstract

In this note, I study how the precision of a binary classifier depends on the ratio r of positive to negative cases in the test set, as well as on the classifier's true-positive and false-positive rates. This relationship, which seems not to be well known, allows prediction of how the precision-recall curve will change with r. It also allows prediction of how Fβ and the precision gain and recall gain measures of Flach and Kull (2015) vary with r.
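
The abstract does not state the relationship explicitly. A minimal sketch follows, assuming the standard definitions: with P positives and N negatives in the test set, r = P/N, TP = TPR·P and FP = FPR·N, so precision = TP/(TP + FP) = r·TPR/(r·TPR + FPR). The function names and the example operating point (TPR = 0.8, FPR = 0.1) are illustrative assumptions, not taken from the note.

```python
def precision_from_rates(tpr: float, fpr: float, r: float) -> float:
    """Precision from the true-positive rate, false-positive rate, and
    the ratio r = (number of positives) / (number of negatives).

    Follows from precision = TP / (TP + FP) with TP = tpr * P and
    FP = fpr * N, after dividing numerator and denominator by N.
    """
    denom = r * tpr + fpr
    if denom == 0:
        raise ValueError("precision is undefined when nothing is predicted positive")
    return r * tpr / denom


def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score; returns 0.0 when both inputs are zero."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return 0.0 if denom == 0 else (1 + b2) * precision * recall / denom


# Hypothetical operating point: one fixed threshold, so TPR (recall) and FPR
# stay the same while only the class ratio r of the test set changes.
tpr, fpr = 0.8, 0.1
for r in (1.0, 0.1, 0.01):
    p = precision_from_rates(tpr, fpr, r)
    print(f"r={r:<5} precision={p:.3f} F1={f_beta(p, tpr):.3f}")
```

At this fixed operating point, recall stays at 0.8 while precision falls from about 0.889 at r = 1 to about 0.444 at r = 0.1 and about 0.074 at r = 0.01. Each point on the precision-recall curve therefore moves straight down as positives become rarer.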