Use of artificial intelligence in triaging of chest radiographs to reduce radiologists' workload

Eur Radiol. 2024 Feb;34(2):1094-1103. doi: 10.1007/s00330-023-10124-1. Epub 2023 Aug 24.

Abstract

Objectives: To evaluate whether triaging based on a deep learning-based detection algorithm (DLD) can reduce the outpatient chest radiograph interpretation workload while maintaining noninferior sensitivity.

Methods: This retrospective study included patients who underwent initial chest radiography at the outpatient clinic between June 1 and June 30, 2017. Readers interpreted the radiographs with and without a commercially available DLD that detects nine radiologic findings (atelectasis, calcification, cardiomegaly, consolidation, fibrosis, nodules, pneumothorax, pleural effusion, and pneumoperitoneum). The reading order was determined in a randomized, crossover manner. The radiographs were classified as negative or positive examinations. In a 50% worklist reduction scenario, radiographs were sorted in descending order of probability score: the lower half was regarded as negative, while the remaining half was read by radiologists with DLD. The primary analysis evaluated whether the sensitivity of the simulated 50% worklist reduction was noninferior to that of radiologists reading all radiographs, with a noninferiority margin of 5%. Specificities were compared using McNemar's test.
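The worklist reduction scenario described above can be illustrated with a minimal Python sketch. It is not the study's code: the function names (`triage_calls`, `sensitivity_specificity`), the input layout, and the handling of ties and odd-sized worklists are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def triage_calls(scores: Sequence[float],
                 reader_calls: Sequence[bool],
                 reduction: float = 0.5) -> List[bool]:
    """Return the final positive/negative call per exam after triage.

    scores       : hypothetical per-exam DLD probability scores
    reader_calls : hypothetical reader-with-DLD calls (True = positive)
    reduction    : fraction of the worklist removed from human reading
    """
    n = len(scores)
    n_removed = int(n * reduction)  # assumption: round down for odd n
    # Exam indices in descending score order; the tail has the lowest scores.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    removed = set(order[n - n_removed:])
    # Removed exams are regarded as negative; the rest keep the reader's call.
    return [False if i in removed else bool(reader_calls[i]) for i in range(n)]


def sensitivity_specificity(calls: Sequence[bool],
                            truth: Sequence[bool]) -> Tuple[float, float]:
    """Sensitivity and specificity of calls against a reference standard."""
    tp = sum(1 for c, t in zip(calls, truth) if c and t)
    tn = sum(1 for c, t in zip(calls, truth) if not c and not t)
    n_pos = sum(1 for t in truth if t)
    n_neg = len(truth) - n_pos
    return tp / n_pos, tn / n_neg
```

In this sketch a false-positive reader call on a low-scoring exam is overridden by the triage step, which is one way specificity could rise; a true-positive exam that falls in the removed half would be missed, which is why sensitivity is the primary endpoint.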

Results: The study included 1964 patients (median age [interquartile range], 55 years [40-67 years]). The sensitivity was 82.6% (195 of 236; 95% CI: 77.5%, 87.3%) when readers interpreted all chest radiographs without DLD and 83.5% (197 of 236; 95% CI: 78.8%, 88.1%) in the 50% worklist reduction scenario. The difference in sensitivity was 0.8% (95% CI: -3.8%, 5.5%), establishing noninferiority of 50% worklist reduction (p = 0.01). The specificity increased from 86.7% (1498 of 1728) to 90.4% (1562 of 1728) (p < 0.001) with DLD-based triage.
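The reported proportions can be reproduced from the counts given above, as in the short Python sketch below. The helper names (`pct`, `is_noninferior`) are hypothetical, and the decision rule shown (lower confidence bound of the difference above the negative margin) is the conventional one, assumed here because the abstract does not state how the interval or p value was computed. The confidence bound itself is taken from the abstract, not recomputed.

```python
def pct(numerator: int, denominator: int) -> float:
    """Proportion as a percentage, rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


def is_noninferior(ci_lower_pct: float, margin_pct: float) -> bool:
    """Conventional rule (assumed): lower CI bound must exceed -margin."""
    return ci_lower_pct > -margin_pct


# Counts as reported in the Results.
n_positive, n_negative = 236, 1728

sens_all = pct(195, n_positive)         # all radiographs read, without DLD
sens_triage = pct(197, n_positive)      # 50% worklist reduction scenario
spec_all = pct(1498, n_negative)
spec_triage = pct(1562, n_negative)
sens_diff = pct(197 - 195, n_positive)  # difference in sensitivity

# Reported lower bound of the 95% CI for the difference, margin of 5%.
noninferior = is_noninferior(ci_lower_pct=-3.8, margin_pct=5.0)
```

The positive and negative counts sum to the 1964 patients included, and the reported lower bound of -3.8% lies above the -5% margin, consistent with the noninferiority conclusion.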

Conclusion: Deep learning-based triaging may substantially reduce workload without lowering sensitivity while improving specificity.

Clinical relevance statement: Substantial workload reduction without lowering sensitivity was feasible using deep learning-based triaging of outpatient chest radiographs; however, the legal responsibility for incorrect diagnoses based on AI-standalone interpretation remains an issue that should be defined before clinical implementation.

Key points:

  • A 50% workload reduction simulation using a deep learning-based detection algorithm maintained noninferior sensitivity while improving specificity.
  • The CT recommendation rate decreased significantly in the disease-negative patients, whereas it increased slightly in the disease-positive group without statistical significance.
  • In the exploratory analysis, the noninferiority of sensitivity was maintained up to a 70% workload reduction; the difference in sensitivity was 0%.

Keywords: Deep learning; Radiography, thoracic; Triage.

MeSH terms

  • Adult
  • Aged
  • Artificial Intelligence*
  • Deep Learning*
  • Humans
  • Middle Aged
  • Radiography
  • Radiography, Thoracic
  • Radiologists
  • Retrospective Studies
  • Sensitivity and Specificity
  • Triage
  • Workload