Development and Validation of a Deep Learning-Based Automated Detection Algorithm for Major Thoracic Diseases on Chest Radiographs

Eui Jin Hwang; Sunggyun Park; Kwang-Nam Jin; Jung Im Kim; So Young Choi; Jong Hyuk Lee; Jin Mo Goo; Jaehong Aum; Jae-Joon Yim; Julien G Cohen; Gilbert R Ferretti; Chang Min Park; DLAD Development and Evaluation Group

doi:10.1001/jamanetworkopen.2019.1095

JAMA Netw Open. 2019 Mar 1;2(3):e191095. doi: 10.1001/jamanetworkopen.2019.1095.

Affiliations

¹ Department of Radiology, Seoul National University College of Medicine, Seoul, South Korea.
² Lunit Inc, Seoul, South Korea.
³ Department of Radiology, Seoul National University Boramae Medical Center, Seoul, South Korea.
⁴ Department of Radiology, Kyung Hee University Hospital at Gangdong, Kyung Hee University College of Medicine, Seoul, South Korea.
⁵ Department of Radiology, Eulji University Medical Center, College of Medicine, Seoul, South Korea.
⁶ Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Seoul National University College of Medicine, Seoul, South Korea.
⁷ Pôle Imagerie, Centre Hospitalier Universitaire de Grenoble, La Tronche, France.

Abstract

Importance: Interpreting chest radiographs is challenging and error-prone, and it requires expert readers. An automated system that can accurately classify chest radiographs may help streamline the clinical workflow.

Objectives: To develop a deep learning-based algorithm that can classify chest radiographs as normal or as abnormal for major thoracic diseases, including pulmonary malignant neoplasm, active tuberculosis, pneumonia, and pneumothorax, and to validate the algorithm's performance using independent data sets.

Design, setting, and participants: This diagnostic study developed a deep learning-based algorithm using single-center data collected between November 1, 2016, and January 31, 2017. The algorithm was externally validated with multicenter data collected between May 1 and July 31, 2018. A total of 54 221 chest radiographs with normal findings from 47 917 individuals (21 556 men and 26 361 women; mean [SD] age, 51 [16] years) and 35 613 chest radiographs with abnormal findings from 14 102 individuals (8373 men and 5729 women; mean [SD] age, 62 [15] years) were used to develop the algorithm. A total of 486 chest radiographs with normal results and 529 with abnormal results (1 from each participant; 628 men and 387 women; mean [SD] age, 53 [18] years) from 5 institutions were used for external validation. Fifteen physicians, including nonradiology physicians, board-certified radiologists, and thoracic radiologists, participated in observer performance testing. Data were analyzed in August 2018.
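
The cohort counts reported above can be cross-checked with simple arithmetic. The following is a minimal sketch: the numbers are transcribed from the abstract, while the dictionary layout, key names, and the helper `sex_counts_match` are illustrative assumptions and not the authors' data format.

```python
# Counts transcribed from the abstract. The structure and names below are
# hypothetical and exist only to make the arithmetic explicit.
development = {
    "normal": {"radiographs": 54221, "individuals": 47917, "men": 21556, "women": 26361},
    "abnormal": {"radiographs": 35613, "individuals": 14102, "men": 8373, "women": 5729},
}
validation = {"normal": 486, "abnormal": 529, "men": 628, "women": 387}


def sex_counts_match(group):
    """Return True when men + women equals the reported number of individuals."""
    return group["men"] + group["women"] == group["individuals"]


# Total radiographs used for development (normal + abnormal).
development_total = sum(g["radiographs"] for g in development.values())

# External validation used 1 radiograph per participant, so the number of
# radiographs should equal the number of participants (men + women).
validation_total = validation["normal"] + validation["abnormal"]
validation_participants = validation["men"] + validation["women"]

print(all(sex_counts_match(g) for g in development.values()))
print(development_total)
print(validation_total, validation_participants)
```

With these counts, the development set totals 89 834 radiographs, and the external validation set totals 1015 radiographs from 1015 participants.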

Exposures: Deep learning-based algorithm.

Main outcomes and measures: Image-wise classification performances measured by area under the receiver operating characteristic curve; lesion-wise localization performances measured by area under the alternative free-response receiver operating characteristic curve.
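
For readers unfamiliar with the image-wise metric, the area under the receiver operating characteristic curve equals the probability that a randomly chosen abnormal image receives a higher score than a randomly chosen normal image, with ties counted as one-half. The sketch below computes it directly from that definition. The function name `auroc` and the toy scores are assumptions for illustration; this is not the authors' evaluation code, and it does not cover the lesion-wise alternative free-response analysis.

```python
def auroc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    scores: algorithm outputs, higher meaning more likely abnormal.
    labels: 1 for abnormal images, 0 for normal images.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one abnormal and one normal image")

    # Count abnormal/normal pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical scores): one abnormal image scores below one normal image.
toy_scores = [0.9, 0.2, 0.8, 0.1]
toy_labels = [1, 1, 0, 0]
print(auroc(toy_scores, toy_labels))
```

In the toy example, 3 of the 4 abnormal/normal pairs are ranked correctly, so the function returns 0.75. The pairwise loop is quadratic in the number of images; it is written for clarity rather than speed.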

Results: The algorithm demonstrated a median (range) area under the curve of 0.979 (0.973-1.000) for image-wise classification and 0.972 (0.923-0.985) for lesion-wise localization. It performed significantly better than all 3 physician groups in both image-wise classification (0.983 vs 0.814-0.932; all P < .005) and lesion-wise localization (0.985 vs 0.781-0.907; all P < .001). With the assistance of the algorithm, all 3 physician groups improved significantly in both image-wise classification (0.814-0.932 to 0.904-0.958; all P < .005) and lesion-wise localization (0.781-0.907 to 0.873-0.938; all P < .001).

Conclusions and relevance: The algorithm consistently outperformed physicians, including thoracic radiologists, in the discrimination of chest radiographs with major thoracic diseases, demonstrating its potential to improve the quality and efficiency of clinical practice.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Adult
Aged
Algorithms*
Deep Learning*
Female
Humans
Male
Middle Aged
Radiographic Image Interpretation, Computer-Assisted / methods*
Radiography, Thoracic / methods*
Reproducibility of Results
Thoracic Diseases / diagnostic imaging*