On the meaning of the weighted alternative free-response operating characteristic figure of merit

Med Phys. 2016 May;43(5):2548. doi: 10.1118/1.4947125.

Abstract

Purpose: The free-response receiver operating characteristic (FROC) method is being increasingly used to evaluate observer performance in search tasks. Data analysis requires definition of a figure of merit (FOM) quantifying performance. While a number of FOMs have been proposed, the recommended one, namely, the weighted alternative FROC (wAFROC) FOM, is not well understood. The aim of this work is to clarify the meaning of this FOM by relating it to the empirical area under a proposed wAFROC curve.

Methods: The wAFROC FOM is defined in terms of a quasi-Wilcoxon statistic that involves weights, assigned to each lesion, that encode its clinical importance. A new wAFROC curve is proposed, the y-axis of which incorporates the weights, giving more credit for marking clinically important lesions, while the x-axis is identical to that of the AFROC curve. An expression is derived relating the area under the empirical wAFROC curve to the wAFROC FOM. Examples with small numbers of cases are presented, showing how AFROC and wAFROC curves are affected by correct and incorrect decisions and how the corresponding FOMs credit or penalize these decisions. The wAFROC, AFROC, and inferred ROC FOMs were applied to three clinical data sets involving multiple-reader FROC interpretations in different modalities.
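The abstract does not give the quasi-Wilcoxon statistic explicitly. As a hedged sketch, one common way of writing it is shown below. All notation here is an assumption introduced for illustration, not taken from the abstract: K_1 non-diseased and K_2 diseased cases, x_{k_1} the highest-rated false-positive mark on non-diseased case k_1, z_{k_2 l} the rating of lesion l on diseased case k_2, and W_{k_2 l} the lesion weights, which sum to one within each diseased case. Unmarked cases and unmarked lesions are assigned a rating of minus infinity.

```latex
\theta_{\mathrm{wAFROC}}
  = \frac{1}{K_1 K_2}
    \sum_{k_1=1}^{K_1} \sum_{k_2=1}^{K_2} \sum_{l=1}^{L_{k_2}}
    W_{k_2 l}\, \psi\!\left(x_{k_1},\, z_{k_2 l}\right),
\qquad
\sum_{l=1}^{L_{k_2}} W_{k_2 l} = 1,

\psi(x, z) =
  \begin{cases}
    1,           & z > x, \\
    \tfrac{1}{2}, & z = x, \\
    0,           & z < x.
  \end{cases}

% Operating point of the empirical wAFROC curve at threshold \zeta
\mathrm{FPF}(\zeta)  = \frac{1}{K_1} \sum_{k_1=1}^{K_1} I\!\left(x_{k_1} \ge \zeta\right),
\qquad
\mathrm{wLLF}(\zeta) = \frac{1}{K_2} \sum_{k_2=1}^{K_2} \sum_{l=1}^{L_{k_2}}
                       W_{k_2 l}\, I\!\left(z_{k_2 l} \ge \zeta\right).
```

Under these assumptions, setting every weight on a diseased case to 1/L_{k_2} gives each diseased case the same total contribution regardless of how many lesions it contains, which is the sense in which the weighting equalizes the importance of diseased cases.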

Results: It is shown analytically that the area under the empirical wAFROC curve equals the wAFROC FOM. This theorem is the FROC analog of a well-known theorem developed in 1975 for ROC analysis, which gave meaning to a Wilcoxon-statistic-based ROC FOM. A similar equivalence holds between the area under the empirical AFROC curve and the AFROC FOM. The examples show explicitly that the wAFROC FOM gives equal importance to all diseased cases, regardless of the number of lesions, a desirable statistical property not shared by the AFROC FOM. Applications to the clinical data sets show that the wAFROC FOM yields results comparable to those obtained with the AFROC FOM.
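The equivalence can be checked numerically on a toy data set. The following is a minimal sketch, not the authors' software: the function names, the data layout, and the ratings and weights are all hypothetical. It assumes that each non-diseased case contributes its highest-rated false-positive mark, that unmarked cases and lesions are rated minus infinity, and that the lesion weights sum to one within each diseased case. It computes the quasi-Wilcoxon sum directly and, separately, the trapezoidal area under the empirical wAFROC curve extended by a straight line to (1, 1).

```python
NEG_INF = float("-inf")  # rating assigned to unmarked cases and unmarked lesions


def psi(x, z):
    """Wilcoxon kernel: 1 if the lesion rating z exceeds x, 0.5 if tied, else 0."""
    if z > x:
        return 1.0
    if z == x:
        return 0.5
    return 0.0


def wafroc_fom(fp, lesion_ratings, lesion_weights):
    """Quasi-Wilcoxon wAFROC FOM (assumed form; see the lead-in)."""
    total = 0.0
    for x in fp:
        for ratings, weights in zip(lesion_ratings, lesion_weights):
            for z, w in zip(ratings, weights):
                total += w * psi(x, z)
    return total / (len(fp) * len(lesion_ratings))


def wafroc_curve(fp, lesion_ratings, lesion_weights):
    """Empirical wAFROC operating points, from (0, 0) to (1, 1)."""
    thresholds = {x for x in fp if x > NEG_INF}
    thresholds |= {z for ratings in lesion_ratings for z in ratings if z > NEG_INF}
    points = [(0.0, 0.0)]
    for t in sorted(thresholds, reverse=True):
        fpf = sum(x >= t for x in fp) / len(fp)
        wllf = sum(
            w
            for ratings, weights in zip(lesion_ratings, lesion_weights)
            for z, w in zip(ratings, weights)
            if z >= t
        ) / len(lesion_ratings)
        points.append((fpf, wllf))
    points.append((1.0, 1.0))  # straight-line extension to the upper-right corner
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical data: 4 non-diseased cases, 3 diseased cases with 2, 1 and 3 lesions.
fp = [NEG_INF, 1.0, NEG_INF, 2.0]
lesion_ratings = [[3.0, NEG_INF], [2.0], [NEG_INF, 1.0, 4.0]]
lesion_weights = [[0.6, 0.4], [1.0], [0.2, 0.3, 0.5]]

fom = wafroc_fom(fp, lesion_ratings, lesion_weights)
auc = trapezoid_area(wafroc_curve(fp, lesion_ratings, lesion_weights))
print(f"quasi-Wilcoxon FOM = {fom:.6f}, area under empirical curve = {auc:.6f}")
```

For this toy data set the two numbers agree to within floating-point rounding, which is what the theorem predicts. Ties are handled by the trapezoidal rule on the curve side and by the value 0.5 in the kernel on the statistic side.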

Conclusions: The equivalence theorem gives meaning to the weighted AFROC FOM: it is identical to the empirical area under the weighted AFROC curve.

MeSH terms

  • Algorithms
  • Area Under Curve
  • Breast / diagnostic imaging
  • Breast Diseases / diagnostic imaging
  • Calcinosis / diagnostic imaging
  • Computer Simulation
  • Data Interpretation, Statistical
  • Datasets as Topic
  • Humans
  • Mammography / instrumentation
  • Mammography / methods
  • Models, Anatomic
  • Models, Statistical*
  • Phantoms, Imaging
  • Positron-Emission Tomography / instrumentation
  • Positron-Emission Tomography / methods
  • ROC Curve*
  • Software
  • Tomography, X-Ray Computed / instrumentation
  • Tomography, X-Ray Computed / methods