Extravalidation and reproducibility results of a commercial deep learning-based automatic detection algorithm for pulmonary nodules on chest radiographs at tertiary hospital

J Med Imaging Radiat Oncol. 2021 Feb;65(1):15-22. doi: 10.1111/1754-9485.13105. Epub 2020 Oct 8.

Abstract

Introduction: To extra validate and evaluate the reproducibility of a commercial deep convolutional neural network (DCNN) algorithm for detecting pulmonary nodules on chest radiographs (CRs), and to compare its performance with that of radiologists.

Methods: This retrospective study enrolled 434 CRs (normal to abnormal ratio, 246:188) from 378 patients who visited a tertiary hospital. DCNN performance was compared with that of two radiology residents and two thoracic radiologists. Abnormality assessment (using the area under the receiver operating characteristic curve (AUROC)) and nodule detection (using jackknife alternative free-response ROC (JAFROC) analysis) were compared among three groups (DCNN only, radiologists without DCNN and radiologists with DCNN). A subset of 56 paired cases, each with two CRs taken within a 7-day period, was assessed for intraobserver reproducibility using the intraclass correlation coefficient. Characteristics of pulmonary nodules independently associated with detection by DCNN were assessed by multiple logistic regression analysis.
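As an illustration of the AUROC metric named in the Methods, the sketch below computes it from per-image abnormality scores using the rank-based (Mann-Whitney) definition. It is a minimal sketch, not the study's analysis code: the function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study data.

```python
def auroc(labels, scores):
    """Rank-based AUROC: the probability that a randomly chosen abnormal
    case receives a higher score than a randomly chosen normal case.
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one abnormal case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = abnormal CR, 0 = normal CR); not study data.
labels = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.90]
toy_auc = auroc(labels, scores)
print(f"toy AUROC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a dataset of a few hundred radiographs; a sort-based implementation would be preferable for much larger sets.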

Results: The AUROCs for abnormality detection in the three groups were 0.87, 0.93 and 0.96, respectively (P < 0.05), whereas the JAFROC values for nodule detection were 0.926, 0.929 and 0.964. Reproducibility in the three groups was 0.80, 0.67 and 0.80, indicating an increase for radiologists using DCNN (P < 0.05). Nodules detected by DCNN were more often solid, round-shaped, well marginated, not masked and laterally located (P < 0.05).
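The reproducibility values above are intraclass correlation coefficients computed on paired CRs. The abstract does not state which ICC form was used, so the sketch below assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1); the function name `icc_2_1` and the toy scores are hypothetical and are not study data.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per case and one column per
    repeated reading (for example, the first and second CR of a pair).
    Assumption: two-way random effects, absolute agreement, single measure."""
    n = len(ratings)          # number of cases
    k = len(ratings[0])       # number of readings per case
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-case mean square
    msc = ss_cols / (k - 1)               # between-reading mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical abnormality scores for five paired CRs; not study data.
paired_scores = [
    [0.91, 0.88],
    [0.12, 0.20],
    [0.55, 0.40],
    [0.78, 0.81],
    [0.05, 0.09],
]
toy_icc = icc_2_1(paired_scores)
print(f"toy ICC(2,1) = {toy_icc:.2f}")
```

Other ICC forms (one-way, or consistency rather than absolute agreement) can give different values on the same data, so a comparison against the reported 0.80, 0.67 and 0.80 would require knowing the form used in the study.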

Conclusions: Extra validation of the DCNN showed high ROC performance, and radiologists' performance improved significantly when they used the DCNN. Reproducibility of the DCNN alone showed good agreement, and agreement improved from moderate to good for radiologists using the DCNN.

Keywords: algorithms; chest radiography; convolutional neural networks; deep learning.

MeSH terms

  • Algorithms
  • Deep Learning*
  • Humans
  • Radiography, Thoracic
  • Reproducibility of Results
  • Retrospective Studies
  • Tertiary Care Centers