A Deep CNN Transformer Hybrid Model for Skin Lesion Classification of Dermoscopic Images Using Focal Loss

Diagnostics (Basel). 2022 Dec 27;13(1):72. doi: 10.3390/diagnostics13010072.

Abstract

Skin cancers are the most cancers diagnosed worldwide, with an estimated > 1.5 million new cases in 2020. Use of computer-aided diagnosis (CAD) systems for early detection and classification of skin lesions helps reduce skin cancer mortality rates. Inspired by the success of the transformer network in natural language processing (NLP) and the deep convolutional neural network (DCNN) in computer vision, we propose an end-to-end CNN transformer hybrid model with a focal loss (FL) function to classify skin lesion images. First, the CNN extracts low-level, local feature maps from the dermoscopic images. In the second stage, the vision transformer (ViT) globally models these features, then extracts abstract and high-level semantic information, and finally sends this to the multi-layer perceptron (MLP) head for classification. Based on an evaluation of three different loss functions, the FL-based algorithm is aimed to improve the extreme class imbalance that exists in the International Skin Imaging Collaboration (ISIC) 2018 dataset. The experimental analysis demonstrates that impressive results of skin lesion classification are achieved by employing the hybrid model and FL strategy, which shows significantly high performance and outperforms the existing work.

Keywords: deep learning; focal loss; hybrid model; skin lesion.

Grants and funding

This research received no external funding.