Comparative statistical properties of expected utility and area under the ROC curve for laboratory studies of observer performance in screening mammography

Craig K Abbey; Brandon D Gallas; John M Boone; Loren T Niklason; Lubomir M Hadjiiski; Berkman Sahiner; Frank W Samuelson

doi:10.1016/j.acra.2013.12.011

Acad Radiol. 2014 Apr;21(4):481-90. doi: 10.1016/j.acra.2013.12.011.

Authors

Craig K Abbey¹, Brandon D Gallas², John M Boone³, Loren T Niklason⁴, Lubomir M Hadjiiski⁵, Berkman Sahiner², Frank W Samuelson²

Affiliations

¹ Department of Psychological and Brain Sciences, University of California, Santa Barbara, CA 93106. Electronic address: abbey@psych.ucsb.edu.
² US Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Silver Spring, MD.
³ Department of Radiology, UC Davis Medical Center, Sacramento, CA.
⁴ Hologic Inc., Bedford, MA.
⁵ Department of Radiology, University of Michigan Comprehensive Cancer Center, Ann Arbor, MI.

Abstract

Rationale and objectives: Our objective is to determine whether expected utility (EU) and the area under the receiver operator characteristic curve (AUC) are consistent with one another as endpoints of observer performance studies in mammography. The two measures summarize receiver operator characteristic performance somewhat differently. We compare these two study endpoints at the level of individual reader effects, statistical inference, and components of variance across readers and cases.

Materials and methods: We reanalyze three previously published laboratory observer performance studies that investigate various x-ray breast imaging modalities using EU and AUC. The EU measure is based on recent estimates of relative utility for screening mammography.
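The abstract does not state how the two endpoints are computed, so the following is only a minimal sketch for a single reader and modality. It assumes the common utility formulation in which EU is the maximum of TPF − β·FPF over the operating points of the empirical ROC curve, with β a slope set by disease prevalence and relative utility. The function names and the value of `beta` in the usage note are hypothetical placeholders, not values or code from the study.

```python
import numpy as np


def empirical_roc(neg, pos):
    """Empirical ROC points (FPF, TPF) from reader scores.

    neg: scores for non-cancer cases, pos: scores for cancer cases.
    A case is called positive when its score is >= the threshold.
    """
    neg = np.asarray(neg, dtype=float)
    pos = np.asarray(pos, dtype=float)
    # Sweep thresholds from strictest to most lenient.
    thresholds = np.unique(np.concatenate([neg, pos]))[::-1]
    # Prepend the (0, 0) point; the most lenient threshold gives (1, 1).
    fpf = np.array([0.0] + [(neg >= t).mean() for t in thresholds])
    tpf = np.array([0.0] + [(pos >= t).mean() for t in thresholds])
    return fpf, tpf


def auc(neg, pos):
    """Trapezoidal area under the empirical ROC curve."""
    fpf, tpf = empirical_roc(neg, pos)
    return float(np.sum(np.diff(fpf) * (tpf[1:] + tpf[:-1]) / 2.0))


def expected_utility(neg, pos, beta):
    """Assumed EU form: max over operating points of TPF - beta * FPF."""
    fpf, tpf = empirical_roc(neg, pos)
    return float(np.max(tpf - beta * fpf))
```

For example, `auc([1, 2, 3, 4], [3, 4, 5, 6])` and `expected_utility([1, 2, 3, 4], [3, 4, 5, 6], beta=1.0)` give one AUC and one EU value for a hypothetical reader. Repeating this per reader and modality would yield the paired values whose correlation the Results describe.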

Results: The AUC and EU measures are correlated across readers, both for individual modalities (r = 0.93) and for differences between modalities (r = 0.94 to 0.98). Multi-reader multi-case inference for modality effects is very similar under the two measures, with significant results (P < .05) in exactly the same conditions. Power analyses show mixed results across studies, with a small average increase in power for EU that corresponds to approximately a 7% reduction in the number of readers. Despite a large number of crossing receiver operator characteristic curves (59% of readers), modality effects only rarely have opposite signs for EU and AUC (6%).

Conclusions: We do not find any evidence of systematic differences between EU and AUC in screening mammography observer studies. Thus, when utility approaches are viable (i.e., an appropriate value of relative utility exists), practical effects such as statistical efficiency may be used to choose study endpoints.

Keywords: Expected utility; area under the ROC curve; observer performance studies.

Publication types

Case Reports
Comparative Study
Meta-Analysis
Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Breast Neoplasms / diagnostic imaging*
Clinical Competence / statistics & numerical data*
Clinical Laboratory Techniques / statistics & numerical data
Data Interpretation, Statistical
Female
Humans
Mammography / statistics & numerical data*
Mass Screening / statistics & numerical data*
Observer Variation
ROC Curve*
Radiographic Image Interpretation, Computer-Assisted / methods*
Reproducibility of Results
Sensitivity and Specificity
