Commercially Available Chest Radiograph AI Tools for Detecting Airspace Disease, Pneumothorax, and Pleural Effusion

Louis Lind Plesner; Felix C Müller; Mathias W Brejnebøl; Lene C Laustrup; Finn Rasmussen; Olav W Nielsen; Mikael Boesen; Michael Brun Andersen

doi:10.1148/radiol.231236

Radiology. 2023 Sep;308(3):e231236. doi: 10.1148/radiol.231236.

Authors

Louis Lind Plesner¹, Felix C Müller¹, Mathias W Brejnebøl¹, Lene C Laustrup¹, Finn Rasmussen¹, Olav W Nielsen¹, Mikael Boesen^#¹, Michael Brun Andersen^#¹

Affiliation

¹ From the Department of Radiology, Herlev and Gentofte Hospital, Borgmester Ib Juuls Vej 1, Herlev, Copenhagen 2730, Denmark (L.L.P., F.C.M., M.W.B., L.C.L., M.B.A.); Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark (L.L.P., M.W.B., O.W.N., M.B., M.B.A.); Radiological Artificial Intelligence Testcenter, RAIT.dk, Capital Region of Denmark (L.L.P., F.C.M., M.W.B., M.B., M.B.A.); Departments of Radiology (M.W.B., M.B.) and Cardiology (O.W.N.), Bispebjerg and Frederiksberg Hospital, Copenhagen, Denmark; and Department of Radiology, Aarhus University Hospital, Aarhus, Denmark (F.R.).

^# Contributed equally.

PMID: 37750768

Abstract

Background: Commercially available artificial intelligence (AI) tools can assist radiologists in interpreting chest radiographs, but their real-life diagnostic accuracy remains unclear.

Purpose: To evaluate the diagnostic accuracy of four commercially available AI tools for detection of airspace disease, pneumothorax, and pleural effusion on chest radiographs.

Materials and Methods: This retrospective study included consecutive adult patients who underwent chest radiography at one of four Danish hospitals in January 2020. Two thoracic radiologists (or three, in cases of disagreement) who had access to all previous and future imaging labeled chest radiographs independently for the reference standard. Area under the receiver operating characteristic curve, sensitivity, and specificity were calculated. Sensitivity and specificity were additionally stratified according to the severity of findings, number of findings on chest radiographs, and radiographic projection. The χ² and McNemar tests were used for comparisons.

Results: The data set comprised 2040 patients (median age, 72 years [IQR, 58-81 years]; 1033 female), of whom 669 (32.8%) had target findings. The AI tools demonstrated areas under the receiver operating characteristic curve ranging 0.83-0.88 for airspace disease, 0.89-0.97 for pneumothorax, and 0.94-0.97 for pleural effusion. Sensitivities ranged 72%-91% for airspace disease, 63%-90% for pneumothorax, and 62%-95% for pleural effusion. Negative predictive values ranged 92%-100% for all target findings. In airspace disease, pneumothorax, and pleural effusion, specificity was high for chest radiographs with normal or single findings (range, 85%-96%, 99%-100%, and 95%-100%, respectively) and markedly lower for chest radiographs with four or more findings (range, 27%-69%, 96%-99%, 65%-92%, respectively) (P < .001). AI sensitivity was lower for vague airspace disease (range, 33%-61%) and small pneumothorax or pleural effusion (range, 9%-94%) compared with larger findings (range, 81%-100%; P value range, > .99 to < .001).

Conclusion: Current-generation AI tools showed moderate to high sensitivity for detecting airspace disease, pneumothorax, and pleural effusion on chest radiographs. However, they produced more false-positive findings than radiology reports, and their performance decreased for smaller-sized target findings and when multiple findings were present.

© RSNA, 2023

Supplemental material is available for this article. See also the editorial by Yanagawa and Tomiyama in this issue.
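
The accuracy measures named in the abstract (sensitivity, specificity, negative predictive value, area under the receiver operating characteristic curve, and the McNemar test) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names are hypothetical, any counts passed in are made-up illustrations rather than study data, and the exact binomial form of the McNemar test is an assumption, since the abstract does not say which variant was used.

```python
from math import comb


def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table of AI output vs reference standard."""
    return {
        "sensitivity": tp / (tp + fn),  # share of diseased cases flagged
        "specificity": tn / (tn + fp),  # share of non-diseased cases cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


def auc_from_scores(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value for paired binary readings.

    b and c are the discordant pair counts: cases where only reader A
    was correct and cases where only reader B was correct.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Probability of a split at least this lopsided under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the functions.
    print(diagnostic_metrics(tp=80, fp=30, fn=20, tn=870))
    print(auc_from_scores([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
    print(mcnemar_exact(1, 9))
```

Because negative predictive value depends on prevalence as well as on sensitivity and specificity, a tool can report a high negative predictive value for an uncommon finding even when its sensitivity for that finding is modest.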

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Adult
Aged
Artificial Intelligence
Deep Learning*
Female
Humans
Pleural Effusion* / diagnostic imaging
Pneumothorax* / diagnostic imaging
Radiography, Thoracic / methods
Retrospective Studies
Sensitivity and Specificity