Lung pneumonia severity scoring in chest X-ray images using transformers

Bouthaina Slika; Fadi Dornaika; Hamid Merdji; Karim Hammoudi

doi:10.1007/s11517-024-03066-3

Med Biol Eng Comput. 2024 Apr 9. doi: 10.1007/s11517-024-03066-3. Online ahead of print.

Authors

Bouthaina Slika¹ ² ³, Fadi Dornaika⁴ ⁵, Hamid Merdji⁶ ⁷, Karim Hammoudi⁸ ⁹

Affiliations

¹ University of the Basque Country UPV/EHU, San Sebastian, Spain.
² Lebanese International University, Beirut, Lebanon.
³ Beirut International University, Beirut, Lebanon.
⁴ University of the Basque Country UPV/EHU, San Sebastian, Spain. fadi.dornaika@ehu.eus.
⁵ IKERBASQUE, Basque Foundation for Science, Bilbao, Spain. fadi.dornaika@ehu.eus.
⁶ INSERM, UMR 1260, Regenerative Nanomedicine (RNM), CRBS, University of Strasbourg, Strasbourg, France.
⁷ Hôpital Universitaire de Strasbourg, Strasbourg, France.
⁸ Université de Haute-Alsace IRIMAS, Mulhouse, France.
⁹ University of Strasbourg, Strasbourg, France.

PMID: 38589723
DOI: 10.1007/s11517-024-03066-3

Abstract

Creating robust and adaptable methods for diagnosing lung pneumonia and assessing its severity from chest X-rays (CXR) requires access to well-curated, extensive datasets. Many current severity quantification approaches require resource-intensive training for optimal results. Healthcare practitioners need efficient computational tools to swiftly identify COVID-19 cases and predict the severity of the condition. In this research, we introduce a novel image augmentation scheme as well as a neural network model based on Vision Transformers (ViT) with a small number of trainable parameters for quantifying the severity of COVID-19 and other lung diseases. Our method, named Vision Transformer Regressor Infection Prediction (ViTReg-IP), combines a ViT architecture with a regression head. To assess the model's adaptability, we evaluate its performance on diverse chest radiograph datasets from various open sources and conduct a comparative analysis against several competing deep learning methods. When the CXRs from the RALO dataset were used in training, our method achieved a minimum Mean Absolute Error (MAE) of 0.569 and 0.512 and a maximum Pearson Correlation Coefficient (PC) of 0.923 and 0.855 for the geographic extent score and the lung opacity score, respectively. The experimental results reveal that our model delivers exceptional performance in severity quantification while maintaining robust generalizability, all with relatively modest computational requirements. The source code used in our work is publicly available at https://github.com/bouthainas/ViTReg-IP.
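
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how MAE and the Pearson correlation coefficient can be computed between reference severity scores and model predictions. It is not taken from the ViTReg-IP repository. The function names and the example scores are hypothetical, chosen only for illustration, and do not reproduce any data or results from the paper.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_correlation(y_true, y_pred):
    """Pearson correlation coefficient between two score sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)


# Hypothetical severity scores, made up for illustration (not data from the paper).
reference = [0.0, 2.0, 4.0, 6.0]
predicted = [0.5, 1.5, 4.5, 5.5]

mae = mean_absolute_error(reference, predicted)
pc = pearson_correlation(reference, predicted)
print(f"MAE = {mae:.3f}, PC = {pc:.3f}")
```

A lower MAE indicates predictions closer to the reference scores, while a PC closer to 1 indicates that predictions rise and fall with the reference scores. The two are complementary, since a model can correlate well while being systematically offset.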

Keywords: Automatic prediction; Chest X-ray; Severity quantification; Vision transformer.

Grants and funding

PID2021-126701OB-I00/Agencia Estatal de Investigación