Performance of an Artificial Intelligence-Based Platform Against Clinical Radiology Reports for the Evaluation of Noncontrast Chest CT

Acad Radiol. 2022 Feb:29 Suppl 2:S108-S117. doi: 10.1016/j.acra.2021.02.007. Epub 2021 Mar 10.

Abstract

Rationale and objectives: Research on the implementation of artificial intelligence (AI) in radiology workflows and its impact on reports remains scarce. In this study, we aim to assess whether an AI platform would perform better than clinical radiology reports in evaluating noncontrast chest computed tomography (CT) scans.

Materials and methods: Consecutive patients who had undergone noncontrast chest CT were retrospectively identified. The radiology reports were reviewed in a binary fashion for reporting of pulmonary lesions, pulmonary emphysema, aortic dilatation, coronary artery calcifications (CAC), and vertebral compression fractures (VCF). CT scans were then processed using an AI platform. The reports' findings and the AI results were subsequently compared against a consensus read by two board-certified radiologists, which served as the reference standard.
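The binary comparison described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `sensitivity_specificity` and the ten-case toy labels are hypothetical and are not study data. Each finding is coded as present (True) or absent (False) for the reader under test (report or AI) and for the consensus reference.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    predicted: Sequence[bool], reference: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of binary calls against a reference read."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    # Count the four cells of the 2x2 table.
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)
    tn = sum(1 for p, r in zip(predicted, reference) if not p and not r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both positive and negative cases")

    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy labels for one finding in ten patients (NOT study data).
reference = [True, True, True, True, False, False, False, False, False, False]
report = [True, True, True, False, False, False, False, False, False, False]
ai = [True, True, True, True, True, False, False, False, False, False]

report_sens, report_spec = sensitivity_specificity(report, reference)
ai_sens, ai_spec = sensitivity_specificity(ai, reference)
print(f"report: sensitivity={report_sens:.2f}, specificity={report_spec:.2f}")
print(f"AI:     sensitivity={ai_sens:.2f}, specificity={ai_spec:.2f}")
```

In this toy case the report misses one true finding (lower sensitivity, perfect specificity) while the AI adds one false positive (perfect sensitivity, lower specificity), which is the kind of trade-off the study quantifies per finding.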

Results: A total of 100 patients (mean age: 64.2 ± 14.8 years; 57% males) were included in this study. Aortic segmentation and calcium quantification failed to be processed by AI in 2 and 3 cases, respectively. AI showed superior diagnostic performance in identifying aortic dilatation (AI: sensitivity: 96.3%, specificity: 81.4%, AUC: 0.89) vs (Reports: sensitivity: 25.9%, specificity: 100%, AUC: 0.63), p < 0.001; and CAC (AI: sensitivity: 89.8%, specificity: 100%, AUC: 0.95) vs (Reports: sensitivity: 75.4%, specificity: 94.9%, AUC: 0.85), p = 0.005. Reports had better performance than AI in identifying pulmonary lesions (Reports: sensitivity: 97.6%, specificity: 100%, AUC: 0.99) vs (AI: sensitivity: 92.8%, specificity: 82.4%, AUC: 0.88), p = 0.024; and VCF (Reports: sensitivity: 100%, specificity: 100%, AUC: 1.0) vs (AI: sensitivity: 100%, specificity: 63.7%, AUC: 0.82), p < 0.001. A comparable diagnostic performance was noted in identifying pulmonary emphysema on AI (sensitivity: 80.6%, specificity: 66.7%, AUC: 0.74) and reports (sensitivity: 74.2%, specificity: 97.1%, AUC: 0.86), p = 0.064.
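For a binary (present/absent) rating, the ROC curve has a single operating point, and the area under it reduces to (sensitivity + specificity) / 2. The abstract does not state how the AUC values were computed, so the sketch below only checks that the reported figures are numerically consistent with that identity. The helper name `single_point_auc` and the dictionary layout are assumptions made for illustration; the numbers are copied from the Results above.

```python
# (finding, reader) -> (sensitivity %, specificity %, reported AUC)
reported = {
    ("aortic dilatation", "AI"): (96.3, 81.4, 0.89),
    ("aortic dilatation", "Reports"): (25.9, 100.0, 0.63),
    ("CAC", "AI"): (89.8, 100.0, 0.95),
    ("CAC", "Reports"): (75.4, 94.9, 0.85),
    ("pulmonary lesions", "Reports"): (97.6, 100.0, 0.99),
    ("pulmonary lesions", "AI"): (92.8, 82.4, 0.88),
    ("VCF", "Reports"): (100.0, 100.0, 1.0),
    ("VCF", "AI"): (100.0, 63.7, 0.82),
    ("pulmonary emphysema", "AI"): (80.6, 66.7, 0.74),
    ("pulmonary emphysema", "Reports"): (74.2, 97.1, 0.86),
}


def single_point_auc(sensitivity_pct: float, specificity_pct: float) -> float:
    """AUC of a one-point ROC curve: the mean of sensitivity and specificity."""
    return (sensitivity_pct + specificity_pct) / 200.0


# Allow for rounding of the reported AUC to two decimals.
TOLERANCE = 0.005 + 1e-9

for (finding, reader), (sens, spec, auc) in reported.items():
    derived = single_point_auc(sens, spec)
    status = "consistent" if abs(derived - auc) <= TOLERANCE else "differs"
    print(f"{finding:20s} {reader:8s} derived={derived:.4f} reported={auc:.2f} {status}")
```

Under this assumption, the AUC carries no information beyond the sensitivity and specificity pair, so differences between AI and reports are best read directly from those two values.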

Conclusion: Our results demonstrate that incorporating AI support platforms into radiology workflows can provide significant added value to clinical radiology reporting.

Keywords: Artificial intelligence; Computed tomography; Radiology reports; Deep learning; Diagnostic performance.

MeSH terms

  • Aged
  • Artificial Intelligence
  • Female
  • Fractures, Compression*
  • Humans
  • Male
  • Middle Aged
  • Radiology*
  • Retrospective Studies
  • Spinal Fractures*
  • Tomography, X-Ray Computed / methods