Association between different scale bars in dermoscopic images and diagnostic performance of a market-approved deep learning convolutional neural network for melanoma recognition

Eur J Cancer. 2021 Mar:145:146-154. doi: 10.1016/j.ejca.2020.12.010. Epub 2021 Jan 16.

Abstract

Background: Studies systematically unravelling possible causes for false diagnoses of deep learning convolutional neural networks (CNNs) are scarce, yet needed before broader application.

Objectives: The objective of the study was to investigate whether scale bars in dermoscopic images are associated with the diagnostic accuracy of a market-approved CNN.

Methods: This cross-sectional analysis applied a CNN trained with more than 150,000 images (Moleanalyzer-pro®, FotoFinder Systems Inc., Bad Birnbach, Germany) to investigate seven dermoscopic image sets depicting the same 130 melanocytic lesions (107 nevi, 23 melanomas) without or with digitally superimposed scale bars of different manufacturers. Sensitivity, specificity and area under the curve (AUC) of receiver operating characteristics (ROC) for the CNN's binary classification of images with or without superimposed scale bars were assessed.

Results: Six dermoscopic image sets with different scale bars and one control set without scale bars (overall 910 images) were submitted to CNN analysis. In images without scale bars, the CNN attained a sensitivity [95% confidence interval] of 87.0% [67.9%-95.5%] and a specificity of 87.9% [80.3%-92.8%]. ROC AUC was 0.953 [0.914-0.992]. Scale bars were not associated with significant changes in sensitivity (range 87%-95.7%, all p = 1.0). However, four scale bars induced a decrease of the CNN's specificity (range 0%-43.9%, all p < 0.001). Moreover, ROC AUC was significantly reduced by two scale bars (range 0.520-0.848, both p ≤ 0.042).
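The confidence intervals reported for the control set can be checked with a short calculation. The abstract does not state which interval method was used, and it does not give raw counts. The sketch below therefore rests on two assumptions: that 20 of the 23 melanomas and 94 of the 107 nevi were classified correctly (counts inferred from the reported 87.0% and 87.9%), and that a Wilson score interval is a reasonable choice. The function name `wilson_interval` is illustrative, not taken from the study.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width


# Assumed counts, inferred from the reported percentages (not stated in the abstract).
sens_lo, sens_hi = wilson_interval(20, 23)    # melanomas correctly flagged
spec_lo, spec_hi = wilson_interval(94, 107)   # nevi correctly cleared

print(f"sensitivity 95% CI: {sens_lo:.1%} to {sens_hi:.1%}")  # reported: 67.9%-95.5%
print(f"specificity 95% CI: {spec_lo:.1%} to {spec_hi:.1%}")  # reported: 80.3%-92.8%
```

Under these assumed counts the Wilson interval agrees with the reported bounds to one decimal place, which suggests, but does not confirm, that this or a closely related method was used.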

Conclusions: Superimposed scale bars in dermoscopic images may impair the CNN's diagnostic accuracy, mostly by increasing the rate of false-positive diagnoses. We recommend avoiding scale bars in images intended for CNN analysis unless specific measures counteracting these effects are implemented.

Clinical trial number: This study was registered at the German Clinical Trial Register (DRKS-Study-ID: DRKS00013570; URL: https://www.drks.de/drks_web/).

Keywords: Convolutional neural network; Deep learning; Dermoscopy; Melanoma; Nevus; Scale bar.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artifacts
  • Cross-Sectional Studies
  • Deep Learning*
  • Dermoscopy*
  • Diagnosis, Computer-Assisted*
  • Humans
  • Image Interpretation, Computer-Assisted*
  • Melanoma / pathology*
  • Nevus / pathology*
  • Predictive Value of Tests
  • Reproducibility of Results
  • Retrospective Studies
  • Skin Neoplasms / pathology*

Associated data

  • DRKS/DRKS00013570