Practical Training Approaches for Discordant Atopic Dermatitis Severity Datasets: Merging Methods With Soft-Label and Train-Set Pruning

IEEE J Biomed Health Inform. 2023 Jan;27(1):166-175. doi: 10.1109/JBHI.2022.3218166. Epub 2023 Jan 4.

Abstract

Objective assessment of atopic dermatitis (AD) is essential for choosing proper management strategies. This study investigated the performance of convolutional neural network (CNN) models in grading the severity of AD. Five board-certified dermatologists independently evaluated the severity of 9,192 AD images. Severity was evaluated based on the Investigator's Global Assessment (IGA) and six signs of AD. For CNN training, we compared three design choices: 1) ensemble vs. integration, 2) hard-label vs. soft-label, and 3) train-set pruning. For IGA prediction, the two best models were chosen based on the macro-averaged AUROC and F1-score. The ensemble-soft-label-pruning model was chosen based on its AUROC (0.943 and 0.927 for the internal and external validation sets, respectively), and the integration-soft-label-whole-dataset model was chosen based on its F1-score (0.750 and 0.721 for the internal and external validation sets, respectively). CNN models trained on the multi-evaluator dataset outperformed models trained on an individual evaluator's dataset, and they performed better on the dataset in which the dermatologists' assessments were concordant. In conclusion, CNN models for AD could be improved by using datasets labeled by multiple evaluators, merging methods with soft labels, and train-set pruning.
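The hard-label, soft-label, and train-set pruning ideas named in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' method: the five-grade IGA scale (0 to 4), the tie-breaking rule for the majority vote, the agreement threshold used for pruning, and all function names and example ratings are hypothetical. The abstract does not define "ensemble" or "integration", so those merging methods are not sketched.

```python
from collections import Counter

# Assumption: IGA graded 0-4 (five classes); the abstract does not state the scale.
NUM_GRADES = 5


def soft_label(ratings, num_classes=NUM_GRADES):
    """Convert one image's evaluator ratings into a probability vector.

    Each entry is the fraction of evaluators who chose that grade.
    """
    counts = Counter(ratings)
    total = len(ratings)
    return [counts.get(grade, 0) / total for grade in range(num_classes)]


def hard_label(ratings):
    """Majority vote over evaluator ratings.

    Ties resolve to the lowest tied grade (an arbitrary illustrative choice).
    """
    counts = Counter(ratings)
    top = max(counts.values())
    return min(grade for grade, count in counts.items() if count == top)


def prune(dataset, min_agreement=0.6):
    """Keep only images whose most common grade reaches min_agreement.

    The threshold is hypothetical; the paper's pruning criterion may differ.
    """
    return {
        name: ratings
        for name, ratings in dataset.items()
        if max(soft_label(ratings)) >= min_agreement
    }


# Hypothetical ratings from five evaluators per image.
example_ratings = {
    "img_a": [2, 2, 2, 3, 2],  # mostly concordant
    "img_b": [0, 1, 3, 4, 2],  # fully discordant
    "img_c": [4, 4, 3, 4, 4],  # mostly concordant
}

if __name__ == "__main__":
    for name, ratings in example_ratings.items():
        print(name, hard_label(ratings), soft_label(ratings))
    print(sorted(prune(example_ratings)))
```

In this sketch a soft label preserves evaluator disagreement as a distribution that a CNN could be trained against with a cross-entropy loss, whereas a hard label collapses it to a single grade. Pruning drops the discordant image from the training set entirely.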

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Dermatitis, Atopic*
  • Humans
  • Immunoglobulin A
  • Neural Networks, Computer

Substances

  • Immunoglobulin A