FusionM4Net: A multi-stage multi-modal learning algorithm for multi-label skin lesion classification

Peng Tang; Xintong Yan; Yang Nan; Shao Xiang; Sebastian Krammer; Tobias Lasser

doi:10.1016/j.media.2021.102307

Med Image Anal. 2022 Feb:76:102307. doi: 10.1016/j.media.2021.102307. Epub 2021 Nov 22.

Authors

Peng Tang¹, Xintong Yan², Yang Nan³, Shao Xiang⁴, Sebastian Krammer⁵, Tobias Lasser⁶

Affiliations

¹ Department of Informatics and Munich School of BioEngineering, Technical University of Munich, Munich, Germany. Electronic address: tangp@in.tum.de.
² State Grid Henan Economic Research Institute, Zhengzhou, Henan 450052, China.
³ National Heart and Lung Institute, Imperial College London, London, UK.
⁴ Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Hubei 430079, China.
⁵ Department of Dermatology and Allergy, University Hospital, LMU Munich, Munich, Germany.
⁶ Department of Informatics and Munich School of BioEngineering, Technical University of Munich, Munich, Germany.

PMID: 34861602
DOI: 10.1016/j.media.2021.102307

Abstract

Skin disease is one of the most common diseases in the world. Deep learning-based methods have achieved excellent skin lesion recognition performance, but most of them rely on dermoscopy images alone. Recent works that use multi-modality data (patient's meta-data, clinical images, and dermoscopy images) adopt a one-stage fusion approach and optimize the information fusion only at the feature level. Because these methods do not fuse information at the decision level, they cannot fully use the data of all modalities. This work proposes a novel two-stage multi-modal learning algorithm (FusionM4Net) for multi-label skin disease classification. In the first stage, we construct a FusionNet, which exploits and integrates the representations of clinical and dermoscopy images at the feature level and then uses Fusion Scheme 1 to conduct information fusion at the decision level. In the second stage, to further incorporate the patient's meta-data, we propose Fusion Scheme 2, which integrates the multi-label predictive information from the first stage with the patient's meta-data to train an SVM cluster. The final diagnosis is formed by fusing the predictions from the first and second stages. Our algorithm was evaluated on the seven-point checklist dataset, a well-established multi-modality multi-label skin disease dataset. Without using the patient's meta-data, the first stage of FusionM4Net (FusionM4Net-FS) achieved an average accuracy of 75.7% for multi-classification tasks and 74.9% for diagnostic tasks, which is more accurate than other state-of-the-art methods. By further fusing the patient's meta-data in the second stage (FusionM4Net-SS), the entire FusionM4Net raises the average accuracy to 77.0% and the diagnostic accuracy to 78.5%, which indicates its robust and excellent classification performance on the label-imbalanced dataset.
The corresponding code is available at: https://github.com/pixixiaonaogou/MLSDR.
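
The two-stage data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). The abstract does not specify the fusion rules, so the sketch assumes a weighted average of class probabilities for both decision-level fusions. It also assumes three first-stage branches (clinical, dermoscopy, and feature-level fused). The function names `fuse_decisions`, `second_stage_features`, and `final_prediction` are hypothetical, and the second-stage SVMs are represented only by their output probabilities.

```python
from typing import Dict, List, Sequence

Probs = List[float]  # class probabilities for one label of one sample


def fuse_decisions(branch_probs: Sequence[Probs], weights: Sequence[float]) -> Probs:
    """Decision-level fusion as a weighted average (an assumed rule)."""
    if len(branch_probs) != len(weights):
        raise ValueError("one weight per branch is required")
    total = sum(weights)
    n_classes = len(branch_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, branch_probs)) / total
        for c in range(n_classes)
    ]


def second_stage_features(stage1: Dict[str, Probs], meta: Sequence[float]) -> List[float]:
    """Concatenate first-stage probabilities of all labels with the meta-data.

    The resulting vector is what a second-stage classifier (an SVM per label
    in the paper) would be trained on; the training itself is omitted here.
    """
    features: List[float] = []
    for label in sorted(stage1):  # fixed label order keeps the layout stable
        features.extend(stage1[label])
    features.extend(meta)
    return features


def final_prediction(
    stage1: Dict[str, Probs],
    stage2: Dict[str, Probs],
    stage1_weight: float = 0.5,  # assumed value, not taken from the paper
) -> Dict[str, int]:
    """Fuse first- and second-stage probabilities and pick a class per label."""
    result: Dict[str, int] = {}
    for label, p1 in stage1.items():
        fused = fuse_decisions([p1, stage2[label]], [stage1_weight, 1.0 - stage1_weight])
        result[label] = max(range(len(fused)), key=fused.__getitem__)
    return result


# Toy example with a single label ("diagnosis") and three classes.
clinical = [0.6, 0.3, 0.1]
dermoscopy = [0.2, 0.5, 0.3]
feature_fused = [0.3, 0.4, 0.3]
stage1_out = {"diagnosis": fuse_decisions([clinical, dermoscopy, feature_fused], [1, 1, 1])}
svm_input = second_stage_features(stage1_out, meta=[1.0, 0.0])  # hypothetical encoded meta-data
```

In this sketch `svm_input` would be passed to the second-stage classifier, and its per-label probabilities would then be combined with `stage1_out` through `final_prediction`.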

Keywords: Multi-label classification; Multi-modal learning; Multi-stage information fusion; Seven-points checklist criteria; Skin disease recognition.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Humans
Skin Diseases* / diagnostic imaging