CNN as model observer in a liver lesion detection task for x-ray computed tomography: A phantom study

Med Phys. 2018 Oct;45(10):4439-4447. doi: 10.1002/mp.13151. Epub 2018 Sep 18.

Abstract

Purpose: To evaluate anthropomorphic model observers based on neural networks as predictors of a human observer's performance.

Methods: To simulate liver lesions, a phantom with contrast targets (acrylic spheres of varying diameters, +30 HU) was scanned repeatedly on a computed tomography scanner. The images were labeled with confidence ratings collected in a reader study on a liver lesion detection task and were used to build several anthropomorphic model observers. Models were trained on images reconstructed with iterative reconstruction and evaluated on images reconstructed with filtered backprojection. A neural network based on softmax regression (SR-MO) and convolutional neural networks (CNN-MO) were used to predict the performance of a human observer and were compared with a channelized Hotelling observer with Gabor channels and internal channel noise (CHOi). Model observers were evaluated by receiver operating characteristic (ROC) curve analysis and compared with the results of the reader study. Two strategies were used to train the SR-MO and CNN-MO: (A) building a separate model for each lesion size; (B) building one model that was applied to lesions of all sizes.
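The ROC analysis mentioned above can be illustrated with a minimal sketch that computes a nonparametric area under the ROC curve (AUC) from confidence ratings. The abstract does not state which figure of merit or estimator the study used, so the Mann-Whitney estimator, the function name `auc_from_ratings`, the 1-5 rating scale, and the example ratings are all assumptions made for illustration, not data or methods from the study.

```python
import numpy as np

def auc_from_ratings(ratings_signal, ratings_noise):
    """Nonparametric AUC (Mann-Whitney estimator).

    Returns P(signal rating > noise rating) + 0.5 * P(tie),
    estimated over all signal/noise pairs.
    """
    s = np.asarray(ratings_signal, dtype=float)[:, None]  # column vector
    n = np.asarray(ratings_noise, dtype=float)[None, :]   # row vector
    # Broadcasting compares every signal rating with every noise rating.
    return float(np.mean((s > n) + 0.5 * (s == n)))

# Hypothetical confidence ratings on a 1-5 scale (not data from the study).
signal_present = [5, 4, 4, 3, 5, 2]
signal_absent = [1, 2, 3, 2, 1, 4]

auc = auc_from_ratings(signal_present, signal_absent)
print(f"AUC = {auc:.3f}")
```

The same function could be applied to the ratings of a human reader and to the output scores of a model observer, so that both are summarized on the same scale before they are compared.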

Results: All tested model observers and the human observer were highly correlated at each lesion size and dose level. With strategy A, Pearson's product-moment correlation coefficient r was 0.926 [95% confidence interval (CI): 0.679-0.985] for SR-MO and 0.979 (95% CI: 0.902-0.996) for CNN-MO. With strategy B, r was 0.860 (95% CI: 0.454-0.970) for SR-MO and 0.918 (95% CI: 0.651-0.983) for CNN-MO. For CHOi, r was 0.945 (95% CI: 0.755-0.989). With strategy A, the mean absolute percentage difference (MAPD) between the model observers and the human observer was 3.7% for SR-MO and 1.2% for CNN-MO. With strategy B, the MAPD was 3.7% for SR-MO and 3.0% for CNN-MO. For CHOi, the MAPD was 2.2%.
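The two agreement measures reported above can be sketched as follows. The abstract does not give the exact formulas, so this sketch assumes that the confidence interval for r comes from the Fisher z-transformation and that the MAPD is the mean of |model - human| / human expressed in percent. The function names and the performance values are hypothetical and are not data from the study.

```python
import numpy as np

def pearson_with_ci(x, y, z_crit=1.959964):
    """Pearson r with an approximate 95% CI via the Fisher z-transformation.

    Assumes n > 3 and |r| < 1; z_crit is the two-sided 95% normal quantile.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = float(np.corrcoef(x, y)[0, 1])
    z = np.arctanh(r)                      # Fisher z-transform of r
    half = z_crit / np.sqrt(x.size - 3)    # half-width on the z scale
    return r, float(np.tanh(z - half)), float(np.tanh(z + half))

def mapd(model, human):
    """Mean absolute percentage difference, relative to the human observer (assumed definition)."""
    model = np.asarray(model, dtype=float)
    human = np.asarray(human, dtype=float)
    return float(100.0 * np.mean(np.abs(model - human) / human))

# Hypothetical per-condition performance values (e.g. one per lesion size and dose level).
human_perf = [0.70, 0.78, 0.85, 0.90, 0.95, 0.97]
model_perf = [0.72, 0.77, 0.87, 0.89, 0.96, 0.96]

r, lo, hi = pearson_with_ci(model_perf, human_perf)
print(f"r = {r:.3f} (95% CI: {lo:.3f}-{hi:.3f}), MAPD = {mapd(model_perf, human_perf):.1f}%")
```

With only a handful of conditions, the Fisher interval is wide even for a high r, which is consistent with the broad confidence intervals reported above.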

Conclusion: Convolutional neural network model observers can accurately predict the performance of a human observer for all lesion sizes and dose levels in the evaluated signal detection task.

Keywords: CNN; computed tomography; image quality; machine learning; model observer; neural network.

MeSH terms

  • Image Processing, Computer-Assisted / methods*
  • Liver Neoplasms / diagnostic imaging*
  • Neural Networks, Computer*
  • Phantoms, Imaging*
  • ROC Curve
  • Tomography, X-Ray Computed / instrumentation*