A deep learning fusion network trained with clinical and high-frequency ultrasound images in the multi-classification of skin diseases in comparison with dermatologists: a prospective and multicenter study

EClinicalMedicine. 2024 Jan 5:67:102391. doi: 10.1016/j.eclinm.2023.102391. eCollection 2024 Jan.

Abstract

Background: Clinical appearance and high-frequency ultrasound (HFUS) are both indispensable for diagnosing skin diseases, providing external and internal information, respectively. However, interpreting the two in combination is challenging for primary care physicians and dermatologists. We therefore developed a deep multimodal fusion network (DMFN) model that jointly analyzes clinical close-up and HFUS images for binary and multiclass classification of skin diseases.

Methods: Between Jan 10, 2017, and Dec 31, 2020, the DMFN model was trained and validated using 1269 close-ups and 11,852 HFUS images from 1351 skin lesions. For comparison, a monomodal convolutional neural network (CNN) model was trained and validated with the same close-up images. Subsequently, we conducted a prospective, multicenter study in China. Both models were tested prospectively on 422 cases from 4 hospitals and compared with the results from human raters (general practitioners, general dermatologists, and dermatologists specialized in HFUS). Performance in binary classification (benign vs. malignant) and multiclass classification (the specific diagnoses of 17 types of skin diseases) was evaluated using the area under the receiver operating characteristic curve (AUC). This study is registered with www.chictr.org.cn (ChiCTR2300074765).
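
As an illustration of the AUC metric named in the Methods, the following is a minimal sketch, not the authors' evaluation code. It computes a binary AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, and extends it to the multiclass setting by one-vs-rest macro averaging. The abstract does not state how the multiclass AUC was aggregated, so the macro average is an assumption, and the function names `binary_auc` and `macro_ovr_auc` are hypothetical.

```python
from typing import Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores.

    Equals the probability that a random positive case (label 1) receives a
    higher score than a random negative case (label 0); ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(
    labels: Sequence[int],
    probs: Sequence[Sequence[float]],
    n_classes: int,
) -> float:
    """One-vs-rest AUC averaged equally over classes.

    Assumption: macro averaging; the source does not specify the scheme.
    Every class must appear at least once in `labels`.
    """
    aucs = []
    for c in range(n_classes):
        one_vs_rest = [1 if y == c else 0 for y in labels]
        class_scores = [row[c] for row in probs]
        aucs.append(binary_auc(one_vs_rest, class_scores))
    return sum(aucs) / n_classes


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    toy_labels = [0, 0, 1, 1]  # 0 = benign, 1 = malignant
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(binary_auc(toy_labels, toy_scores))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a test set of a few hundred cases; a rank-based implementation would be preferable for larger data.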

Findings: In the binary classification, the DMFN model (AUC, 0.876) outperformed the monomodal CNN model (AUC, 0.697; P = 0.0063), general practitioners (AUC, 0.651; P = 0.0025), and general dermatologists (AUC, 0.838; P = 0.0038). By integrating close-up and HFUS images, the DMFN model attained an almost identical performance to that of dermatologists (AUC, 0.876 vs. AUC, 0.891; P = 0.0080). In the multiclass classification, the DMFN model (AUC, 0.707) exhibited superior prediction performance compared with both general dermatologists (AUC, 0.514; P = 0.0043) and dermatologists specialized in HFUS (AUC, 0.640; P = 0.0083). Compared with dermatologists specialized in HFUS, the DMFN model showed better or comparable performance in diagnosing 9 of the 17 skin diseases.

Interpretation: The DMFN model, which combines analysis of clinical close-up and HFUS images, exhibited satisfactory performance in binary and multiclass classification compared with dermatologists. It may be a valuable tool for general dermatologists and primary care providers.

Funding: This work was supported in part by the National Natural Science Foundation of China and the Clinical research project of Shanghai Skin Disease Hospital.

Keywords: Convolutional neural network; High-frequency ultrasound; Multi-classification; Skin disease.