Reader studies for validation of CAD systems

Neural Netw. 2008 Mar-Apr;21(2-3):387-97. doi: 10.1016/j.neunet.2007.12.013. Epub 2007 Dec 23.

Abstract

Evaluation of computational intelligence (CI) systems designed to improve the performance of a human operator is complicated by the need to include the effect of human variability. In this paper, we consider human (reader) variability in the context of medical imaging computer-assisted diagnosis (CAD) systems, and we outline how to compare the detection performance of readers with and without CAD. An effective and statistically powerful comparison can be accomplished with a receiver operating characteristic (ROC) experiment, summarized by the reader-averaged area under the ROC curve (AUC). The comparison requires sophisticated yet well-developed methods for multi-reader multi-case (MRMC) variance analysis. MRMC variance analysis accounts for random readers, random cases, and correlations in the experiment. In this paper, we extend the methods available for estimating this variability. Specifically, we present a method that can treat arbitrary study designs. Most existing methods treat only the fully crossed study design, where every reader reads every case in both experimental conditions. We demonstrate our method with a computer simulation, and we assess the statistical power of a variety of study designs.
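
To make the MRMC setup concrete, the sketch below simulates a fully crossed reader study (every reader scores every case, with and without CAD), computes the reader-averaged empirical AUC in each condition, and uses Monte Carlo replication over whole studies to estimate the variability of the AUC difference. This is only an illustration under a simplified Roe-Metz-style linear score model; the function names (empirical_auc, one_study), all variance parameters, and the rough z-test power approximation are assumptions of this sketch, not the simulation design used in the paper.

```python
# Minimal MRMC sketch, assuming a simplified Roe-Metz-style score model.
# All parameter values below are illustrative, not the authors' settings.
import numpy as np

rng = np.random.default_rng(0)

def empirical_auc(neg, pos):
    # Mann-Whitney estimate of AUC: P(pos > neg) + 0.5 * P(pos == neg)
    d = pos[:, None] - neg[None, :]
    return (d > 0).mean() + 0.5 * (d == 0).mean()

def one_study(n_readers=5, n0=50, n1=50, sep=(1.0, 1.5)):
    # One fully crossed study: every reader scores every case in both
    # conditions (sep[0] = without CAD, sep[1] = with CAD). Case effects
    # and reader skill are shared across conditions, which induces the
    # correlations that MRMC variance analysis must account for.
    c0 = rng.normal(size=n0)                       # non-diseased case effects
    c1 = rng.normal(size=n1)                       # diseased case effects
    skill = rng.normal(scale=0.3, size=n_readers)  # random reader effects
    aucs = []
    for s in sep:
        per_reader = []
        for j in range(n_readers):
            neg = c0 + rng.normal(scale=0.5, size=n0)  # reader-by-case noise
            pos = s + skill[j] + c1 + rng.normal(scale=0.5, size=n1)
            per_reader.append(empirical_auc(neg, pos))
        aucs.append(np.mean(per_reader))           # reader-averaged AUC
    return aucs

# Monte Carlo over whole studies: readers AND cases are redrawn each time,
# so the spread of delta-AUC reflects MRMC variability, not case sampling alone.
deltas = []
for _ in range(2000):
    auc_without, auc_with = one_study()
    deltas.append(auc_with - auc_without)
deltas = np.asarray(deltas)
se = deltas.std(ddof=1)
print(f"mean delta-AUC = {deltas.mean():.3f}, MRMC SE = {se:.3f}")
# Rough power of a two-sided z-test at alpha = 0.05 for this effect size:
print(f"approx. power = {np.mean(np.abs(deltas) / se > 1.96):.2f}")
```

Two design points in the sketch mirror the abstract: sharing the case effects and reader skill across the two conditions is what creates the correlations that an MRMC analysis must model, and redrawing both readers and cases on every replicate is what makes both sources of variability random. Other study designs (e.g., splitting the case set among readers rather than fully crossing) would change only which loops share which effects, which is how the relative statistical power of designs could be compared in a simulation of this kind.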

MeSH terms

  • Biomedical Research*
  • Computer Simulation
  • Diagnosis, Computer-Assisted*
  • Diagnostic Imaging
  • Disease
  • Humans
  • Monte Carlo Method
  • Neural Networks, Computer*
  • ROC Curve*
  • Reproducibility of Results
  • Species Specificity