Comparison of Convolutional Neural Networks and Transformers for the Classification of Images of COVID-19, Pneumonia and Healthy Individuals as Observed with Computed Tomography

Azucena Ascencio-Cabral; Constantino Carlos Reyes-Aldasoro

doi:10.3390/jimaging8090237

J Imaging. 2022 Sep 1;8(9):237. doi: 10.3390/jimaging8090237.

Authors

Azucena Ascencio-Cabral¹, Constantino Carlos Reyes-Aldasoro¹

Affiliation

¹ giCentre, Department of Computer Science, School of Science and Technology, City, University of London, London EC1V 0HB, UK.

Abstract

In this work, the performance of five deep learning architectures in classifying COVID-19 in a multi-class set-up is evaluated. The classifiers were built on pretrained ResNet-50, ResNet-50r (with kernel size 5×5 in the first convolutional layer), DenseNet-121, MobileNet-v3 and the state-of-the-art CaiT-24-XXS-224 (CaiT) transformer. The cross entropy and weighted cross entropy were minimised with Adam and AdamW. In total, 20 experiments were conducted, each with 10 repetitions, and the following metrics were obtained: accuracy (Acc), balanced accuracy (BA), F₁ and F₂ from the general Fβ macro score, Matthews correlation coefficient (MCC), sensitivity (Sens) and specificity (Spec), followed by bootstrapping. The performance of the classifiers was compared using the Friedman–Nemenyi test. The results show that less complex architectures such as ResNet-50, ResNet-50r and DenseNet-121 achieved better generalization, with rankings of 1.53, 1.71 and 3.05 for the MCC, respectively, while MobileNet-v3 and CaiT obtained rankings of 3.72 and 5.0, respectively.
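The metrics listed in the abstract can all be derived from a multi-class confusion matrix. The following is a minimal sketch in plain Python, not the authors' implementation. It assumes that the macro Fβ score is the unweighted mean of the per-class Fβ scores, that sensitivity and specificity are macro-averaged one-vs-rest values, and that the MCC is the standard multi-class generalisation. The function name `multiclass_metrics` and the confusion matrix `toy_cm` are hypothetical, and the counts are illustrative only; they are not results from the paper.

```python
from math import sqrt


def multiclass_metrics(cm, beta=1.0):
    """Compute summary metrics from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Macro averaging (unweighted mean over classes) is an assumption here.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    true_counts = [sum(cm[i]) for i in range(k)]                        # row sums
    pred_counts = [sum(cm[i][j] for i in range(k)) for j in range(k)]   # column sums
    correct = sum(cm[i][i] for i in range(k))

    recalls, specificities, fbetas = [], [], []
    b2 = beta * beta
    for c in range(k):
        # One-vs-rest counts for class c.
        tp = cm[c][c]
        fn = true_counts[c] - tp
        fp = pred_counts[c] - tp
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = b2 * precision + recall
        fbeta = (1 + b2) * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        specificities.append(specificity)
        fbetas.append(fbeta)

    # Multi-class Matthews correlation coefficient.
    cov_tp = correct * total - sum(t * p for t, p in zip(true_counts, pred_counts))
    cov_tt = total * total - sum(t * t for t in true_counts)
    cov_pp = total * total - sum(p * p for p in pred_counts)
    mcc = cov_tp / sqrt(cov_tt * cov_pp) if cov_tt and cov_pp else 0.0

    macro_recall = sum(recalls) / k
    return {
        "accuracy": correct / total,
        "balanced_accuracy": macro_recall,   # mean per-class recall
        "sensitivity": macro_recall,         # same quantity under macro averaging
        "specificity": sum(specificities) / k,
        "fbeta": sum(fbetas) / k,
        "mcc": mcc,
    }


# Illustrative 3-class matrix (rows: true class, columns: predicted class).
# These counts are made up for the example, not taken from the paper.
toy_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]

f1_metrics = multiclass_metrics(toy_cm, beta=1.0)
f2_metrics = multiclass_metrics(toy_cm, beta=2.0)
print(f1_metrics)
print("F2:", f2_metrics["fbeta"])
```

Under these definitions, macro-averaged sensitivity coincides with balanced accuracy. Setting `beta=2.0` weights recall more heavily than precision, which is the usual reason for reporting F₂ alongside F₁.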

Keywords: COVID-19; Friedman–Nemenyi tests; bootstrap; deep neural networks; transformer; weighted cross entropy.

Grants and funding

This research received no external funding.