Diagnostic performance of augmented intelligence with 2D and 3D total body photography and convolutional neural networks in a high-risk population for melanoma under real-world conditions: A new era of skin cancer screening?

Eur J Cancer. 2023 Sep:190:112954. doi: 10.1016/j.ejca.2023.112954. Epub 2023 Jun 24.

Abstract

Background: Convolutional neural networks (CNNs) have outperformed dermatologists in classifying pigmented skin lesions under artificial conditions. We investigated, for the first time, the performance of three-dimensional (3D) and two-dimensional (2D) CNNs and dermatologists in the early detection of melanoma in a real-world setting.

Methods: In this prospective study, 1690 melanocytic lesions in 143 patients with high-risk criteria for melanoma were evaluated by dermatologists, 2D-FotoFinder-ATBM and 3D-Vectra WB360 total body photography (TBP). Excision was based on the dermatologists' dichotomous decision, an elevated CNN risk score (study-specific malignancy cut-off: FotoFinder >0.5, Vectra >5.0) and/or the second dermatologist's assessment with CNN support. The diagnostic accuracy of the 2D and 3D CNN classification was compared with that of the dermatologists and the augmented intelligence based on histopathology and dermatologists' assessment. Secondary end-points included reproducibility of risk scores and naevus counts per patient by medical staff (gold standard) compared to automated 3D and 2D TBP CNN counts.
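The excision rule described in the Methods can be sketched in a few lines. This is only an illustration, not the study's software: the function names, the dictionary keys, and the reading of "and/or" as a logical OR across the three triggers are assumptions. Only the two cut-off values (FotoFinder >0.5, Vectra >5.0) come from the abstract.

```python
# Study-specific malignancy cut-offs quoted in the Methods.
# The dictionary keys are hypothetical labels chosen for this sketch.
CUTOFFS = {"fotofinder": 0.5, "vectra": 5.0}


def cnn_flags_lesion(device: str, risk_score: float) -> bool:
    """Return True when the risk score is strictly above the device cut-off."""
    return risk_score > CUTOFFS[device]


def should_excise(dermatologist_suspicious: bool,
                  cnn_scores: dict,
                  second_dermatologist_suspicious: bool) -> bool:
    """Assumed reading of the rule: any single trigger is sufficient."""
    any_cnn_flag = any(
        cnn_flags_lesion(device, score) for device, score in cnn_scores.items()
    )
    return (dermatologist_suspicious
            or any_cnn_flag
            or second_dermatologist_suspicious)


# Hypothetical lesion: rated benign by both dermatologists,
# but with a Vectra score above the 5.0 cut-off.
example = should_excise(False, {"fotofinder": 0.3, "vectra": 6.2}, False)
```

Because the comparison is strict, a score exactly at the cut-off (0.5 or 5.0) does not count as elevated in this sketch.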

Results: For 3D-CNN risk scores compared with histopathology, the sensitivity, specificity and receiver operating characteristic area under the curve (ROC-AUC) were 90.0%, 64.6% and 0.92 (95% confidence interval [CI] 0.85-1.00), respectively. Dermatologists and augmented intelligence achieved the same sensitivity (90%) as the 3D-CNN and a comparable ROC-AUC (0.91 [CI 0.80-1.00] and 0.88 [CI 0.77-1.00], respectively), but their specificity was superior (92.3% and 86.2%, respectively). The 2D-CNN (sensitivity: 70%, specificity: 40%, ROC-AUC: 0.68 [CI 0.46-0.90]) was outperformed by the 3D-CNN and dermatologists. The 3D-CNN showed a higher correlation coefficient for repeated measurements of 246 lesions (R = 0.89) than the 2D-CNN (R = 0.79). The mean naevus count per patient varied significantly between methods (gold standard: 210 lesions; 3D-CNN: 469; 2D-CNN: 1324; p < 0.0001).
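For readers less familiar with the metrics, sensitivity and specificity follow directly from a confusion matrix against histopathology. The sketch below is illustrative only: the abstract does not report the underlying counts, so the numbers used here are hypothetical, chosen merely because they reproduce the reported 3D-CNN percentages (90.0% and 64.6%). The function name is likewise an assumption.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Compute sensitivity and specificity from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of malignant lesions flagged.
    specificity = TN / (TN + FP): share of benign lesions not flagged.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# HYPOTHETICAL counts (not taken from the study): 10 malignant and
# 65 benign lesions, picked only to match the reported percentages.
sens, spec = sensitivity_specificity(tp=9, fn=1, tn=42, fp=23)

sens_pct = round(sens * 100, 1)
spec_pct = round(spec * 100, 1)
print(f"sensitivity: {sens_pct}%, specificity: {spec_pct}%")
```

With equal sensitivity, a lower specificity means more benign lesions are flagged as suspicious, which is the practical limitation the Conclusions point to.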

Conclusions: Our study emphasises the importance of validating the classification of CNNs in real life. The novel 3D-CNN device outperformed the 2D-CNN and achieved comparable sensitivity with dermatologists. The low specificity of CNNs and the lack of automated counting of TBP naevi currently limit the use of augmented intelligence in clinical practice.

Trial registration: ClinicalTrials.gov NCT04605822.

Keywords: Artificial intelligence; Convolutional neural network; Deep learning; Melanoma; Photography augmented intelligence; Pigmented naevus; Skin neoplasm; Three-dimensional (3D); Total body photography; Two-dimensional (2D).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Dermatologists
  • Early Detection of Cancer
  • Humans
  • Melanoma* / diagnostic imaging
  • Melanoma* / pathology
  • Neural Networks, Computer
  • Nevus* / pathology
  • Nevus, Pigmented* / diagnostic imaging
  • Photography
  • Prospective Studies
  • Reproducibility of Results
  • Risk Factors
  • Skin Neoplasms* / diagnostic imaging
  • Skin Neoplasms* / pathology

Associated data

  • ClinicalTrials.gov/NCT04605822