Diagnostic performance of artificial intelligence model for pneumonia from chest radiography

TaeWoo Kwon; Sang Pyo Lee; Dongmin Kim; Jinseong Jang; Myungjae Lee; Shin Uk Kang; Heejin Kim; Keunyoung Oh; Jinhee On; Young Jae Kim; So Jeong Yun; Kwang Nam Jin; Eun Young Kim; Kwang Gi Kim

doi:10.1371/journal.pone.0249399

PLoS One. 2021 Apr 15;16(4):e0249399. doi: 10.1371/journal.pone.0249399. eCollection 2021.

Authors

TaeWoo Kwon¹, Sang Pyo Lee², Dongmin Kim¹, Jinseong Jang¹, Myungjae Lee¹, Shin Uk Kang¹, Heejin Kim³, Keunyoung Oh³, Jinhee On³, Young Jae Kim⁴, So Jeong Yun⁴, Kwang Nam Jin⁵, Eun Young Kim⁶, Kwang Gi Kim⁴

Affiliations

¹ JLK, Incorporated, Eonju-ro, Gangnam-gu, Seoul, South Korea.
² Department of Internal Medicine, Gil Medical Center, Gachon University College of Medicine, Incheon, South Korea.
³ Korea National Tuberculosis Association (KNTA), Seoul, South Korea.
⁴ Department of Biomedical Engineering, Gachon University College of Medicine, Incheon, South Korea.
⁵ Department of Radiology, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, Seoul, South Korea.
⁶ Department of Radiology, Gil Medical Center, Gachon University College of Medicine, Incheon, South Korea.

Abstract

Objective: The chest X-ray (CXR) is the most readily available and common imaging modality for the assessment of pneumonia. However, detecting pneumonia from chest radiography is a challenging task, even for experienced radiologists. An artificial intelligence (AI) model might help to diagnose pneumonia from CXR more quickly and accurately. We aimed to develop an AI model for detecting pneumonia from CXR images and to evaluate its diagnostic performance with an external dataset.

Methods: To train the pneumonia model, a total of 157,016 CXR images from the National Institutes of Health (NIH) and the Korean National Tuberculosis Association (KNTA) were used (normal vs. pneumonia = 120,722 vs. 36,294). An ensemble model of two DenseNet-based neural networks classified each CXR image as pneumonia or not. To test the accuracy of the models, a separate external dataset of pneumonia CXR images (n = 212) from a tertiary university hospital (Gachon University Gil Medical Center [GUGMC], Incheon, South Korea) was used; the diagnosis of pneumonia was based on both the chest CT findings and clinical information, and performance was evaluated using the area under the receiver operating characteristic curve (AUC). Moreover, we tested the change in the AI probability score for pneumonia using follow-up CXR images (7 days after the diagnosis of pneumonia, n = 100).
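The evaluation described in the Methods can be illustrated with a minimal sketch. It assumes the ensemble simply averages the two models' probability scores; the abstract does not state the combination rule, so this is an assumption. The function names and the toy labels and scores below are hypothetical and are not study data. AUC is computed in its rank-based form, as the probability that a randomly chosen pneumonia image receives a higher score than a randomly chosen normal image.

```python
from typing import List, Sequence


def ensemble_mean(p_a: Sequence[float], p_b: Sequence[float]) -> List[float]:
    """Average two models' pneumonia probabilities (assumed combination rule)."""
    return [(a + b) / 2.0 for a, b in zip(p_a, p_b)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = pneumonia, 0 = normal.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.4, 0.8, 0.5, 0.2, 0.1]
model_b = [0.6, 0.9, 0.3, 0.2, 0.4, 0.1]
ensemble = ensemble_mean(model_a, model_b)

auc_a = roc_auc(labels, model_a)
auc_b = roc_auc(labels, model_b)
auc_ensemble = roc_auc(labels, ensemble)
```

In this toy case each model misranks one pair while the averaged scores rank every pair correctly, which shows how an ensemble can exceed both of its members. It does not reproduce the reported AUC values.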

Results: With a threshold of 0.5 applied to the models' probability scores for pneumonia, two models (models 1 and 4), which used different pre-processing parameters for the histogram equalization distribution, showed the best AUC performances of 0.973 and 0.960, respectively. As expected, the ensemble of these two models achieved an AUC of 0.983, outperforming each individual classification model. Furthermore, the change in the AI probability score for pneumonia on CXR taken at the 7-day follow-up differed significantly between improved and aggravated cases (Δ = -0.06 ± 0.14 vs. 0.06 ± 0.09, for 85 improved cases and 15 aggravated cases, respectively, P = 0.001).
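The follow-up comparison can be sketched as follows. This assumes Δ is the follow-up score minus the baseline score, which is consistent with the negative mean reported for improved cases but is not stated explicitly in the abstract. The function names and the scores below are hypothetical and are not study data, and no significance test is reproduced here because the abstract does not name the one used.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def score_changes(baseline: Sequence[float], follow_up: Sequence[float]) -> List[float]:
    """Per-patient change in AI probability score (follow-up minus baseline, assumed)."""
    return [f - b for b, f in zip(baseline, follow_up)]


def summarize(deltas: Sequence[float]) -> Tuple[float, float]:
    """Return the mean and sample standard deviation of the score changes."""
    return mean(deltas), stdev(deltas)


# Hypothetical scores for patients who improved clinically.
improved_base = [0.90, 0.80, 0.70, 0.95]
improved_follow = [0.80, 0.75, 0.50, 0.90]

# Hypothetical scores for patients whose pneumonia was aggravated.
aggravated_base = [0.60, 0.70, 0.80]
aggravated_follow = [0.70, 0.75, 0.90]

improved_mean, improved_sd = summarize(score_changes(improved_base, improved_follow))
aggravated_mean, aggravated_sd = summarize(score_changes(aggravated_base, aggravated_follow))
```

A negative mean change in the improved group and a positive one in the aggravated group mirror the direction of the reported result. A group comparison on real data would also need an appropriate statistical test.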

Conclusions: The ensemble model, which combined two different classification models for pneumonia, achieved an AUC of 0.983 on an external test dataset from a completely different data source. Furthermore, AI probability scores showed significant changes between cases of different clinical prognosis, which suggests that the model could increase the efficiency and performance of CXR reading at the diagnosis and follow-up evaluation of pneumonia.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Adult
Aged
Area Under Curve
Artificial Intelligence*
Female
Humans
Male
Middle Aged
Pneumonia / diagnosis*
ROC Curve
Tertiary Care Centers
Thorax / diagnostic imaging*
Tomography, X-Ray Computed

Grants and funding

This research was supported by grants from Gachon University to KGK (GCU 2018-0669) and Korea Research-Driven Hospital to EYK (Grant No. 2018-5287). The funder has no commercial interest in it. JLK Inc. provided support in the form of salaries for authors TK, DK, JJ, ML, and SUK. The specific roles of these authors are articulated in the ‘author contributions’ section. JLK Inc. was involved in data collection, but had no role in study design, analysis, decision to publish, or preparation of the manuscript.