Assessment of Accuracy of an Artificial Intelligence Algorithm to Detect Melanoma in Images of Skin Lesions

Michael Phillips; Helen Marsden; Wayne Jaffe; Rubeta N Matin; Gorav N Wali; Jack Greenhalgh; Emily McGrath; Rob James; Evmorfia Ladoyanni; Anthony Bewley; Giuseppe Argenziano; Ioulios Palamaras

doi:10.1001/jamanetworkopen.2019.13436

JAMA Netw Open. 2019 Oct 2;2(10):e1913436. doi: 10.1001/jamanetworkopen.2019.13436.

Authors

Michael Phillips¹,², Helen Marsden³, Wayne Jaffe⁴, Rubeta N Matin⁵, Gorav N Wali⁵, Jack Greenhalgh³, Emily McGrath⁶, Rob James⁶, Evmorfia Ladoyanni⁷, Anthony Bewley⁸,⁹, Giuseppe Argenziano¹⁰, Ioulios Palamaras¹¹

Affiliations

¹ Harry Perkins Institute of Medical Research, Perth, Western Australia, Australia.
² Centre for Medical Research, University of Western Australia, Perth, Western Australia, Australia.
³ Skin Analytics Limited, London, United Kingdom.
⁴ Royal Stoke University Hospital, University Hospital North Midlands, Stoke, United Kingdom.
⁵ Oxford University Hospitals NHS Foundation Trust, Oxford, United Kingdom.
⁶ Royal Devon and Exeter NHS Foundation Trust, Exeter, United Kingdom.
⁷ Dudley Group NHS Foundation Trust, Corbett Hospital, Stourbridge, United Kingdom.
⁸ Barts Health, London, United Kingdom.
⁹ Queen Mary School of Medicine, University of London, London, United Kingdom.
¹⁰ Dermatology Unit, University of Campania, Naples, Italy.
¹¹ Barnet and Chase Farm Hospitals, Royal Free NHS Foundation Trust, London, United Kingdom.

Abstract

Importance: A high proportion of suspicious pigmented skin lesions referred for investigation are benign. Techniques to improve the accuracy of melanoma diagnoses throughout the patient pathway are needed to reduce the pressure on secondary care and pathology services.

Objective: To determine the accuracy of an artificial intelligence algorithm in identifying melanoma in dermoscopic images of lesions taken with smartphone and digital single-lens reflex (DSLR) cameras.

Design, setting, and participants: This prospective, multicenter, single-arm, masked diagnostic trial took place in dermatology and plastic surgery clinics in 7 UK hospitals. Dermoscopic images of suspicious and control skin lesions from 514 patients with at least 1 suspicious pigmented skin lesion scheduled for biopsy were captured on 3 different cameras. Data were collected from January 2017 to July 2018. Clinicians and the Deep Ensemble for Recognition of Malignancy, a deterministic artificial intelligence algorithm trained to identify melanoma in dermoscopic images of pigmented skin lesions using deep learning techniques, assessed the likelihood of melanoma. Initial data analysis was conducted in September 2018; further analysis was conducted from February 2019 to August 2019.

Interventions: Clinician and algorithmic assessment of melanoma.

Main outcomes and measures: Area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity of the algorithmic and specialist assessment, determined using histopathology diagnosis as the criterion standard.
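For readers less familiar with these measures, the following is a minimal sketch of how AUROC, sensitivity, and specificity can be computed from per-lesion scores against a histopathology reference. It is not the study's analysis code. The function names, the 0.5 threshold, and the toy labels and scores are all hypothetical, and AUROC is computed here through its rank interpretation (the probability that a randomly chosen melanoma scores higher than a randomly chosen non-melanoma).

```python
def auroc(labels, scores):
    """AUROC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Call a lesion positive when its score is at or above the threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = melanoma on histopathology, 0 = not melanoma.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]

area = auroc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUROC={area:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUROC summarizes performance across all thresholds, which is why the three measures are reported together.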

Results: The study population of 514 patients included 279 women (55.7%) and 484 white patients (96.8%), with a mean (SD) age of 52.1 (18.6) years. A total of 1550 images of skin lesions were included in the analysis (551 [35.6%] biopsied lesions; 999 [64.4%] control lesions); 286 images (18.6%) were used to train the algorithm, and a further 849 (54.8%) images were missing or unsuitable for analysis. Of the biopsied lesions that were assessed by the algorithm and specialists, 125 (22.7%) were diagnosed as melanoma. Of these, 77 (16.7%) were used for the primary analysis. The algorithm achieved an AUROC of 90.1% (95% CI, 86.3%-94.0%) for biopsied lesions and 95.8% (95% CI, 94.1%-97.6%) for all lesions using iPhone 6s images; an AUROC of 85.8% (95% CI, 81.0%-90.7%) for biopsied lesions and 93.8% (95% CI, 91.4%-96.2%) for all lesions using Galaxy S6 images; and an AUROC of 86.9% (95% CI, 80.8%-93.0%) for biopsied lesions and 91.8% (95% CI, 87.5%-96.1%) for all lesions using DSLR camera images. At 100% sensitivity, the algorithm achieved a specificity of 64.8% with iPhone 6s images. Specialists achieved an AUROC of 77.8% (95% CI, 72.5%-81.9%) and a specificity of 69.9%.
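The "specificity at 100% sensitivity" figure corresponds to an operating point at which no melanoma is missed. One way to illustrate this, as a sketch under stated assumptions rather than the study's method, is to set the threshold at the lowest score given to any melanoma and count how many non-melanomas fall below it. The function name and the toy data are hypothetical.

```python
def specificity_at_full_sensitivity(labels, scores):
    """Return (threshold, specificity) for the highest threshold that still
    flags every positive case, i.e. the lowest score among positives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    threshold = min(pos)  # every positive scores at or above this value
    true_negatives = sum(1 for s in neg if s < threshold)
    return threshold, true_negatives / len(neg)


# Hypothetical toy data: 1 = melanoma on histopathology, 0 = not melanoma.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]

thr, spec_full = specificity_at_full_sensitivity(toy_labels, toy_scores)
print(f"threshold={thr} specificity at 100% sensitivity={spec_full:.3f}")
```

In this toy example one non-melanoma scores above the lowest-scoring melanoma, so it stays flagged as suspicious and specificity falls below 100%.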

Conclusions and relevance: In this study, the algorithm demonstrated an ability to identify melanoma from dermoscopic images of selected lesions with an accuracy similar to that of specialists.

Publication types

Clinical Trial
Multicenter Study
Research Support, Non-U.S. Gov't

MeSH terms

Adult
Aged
Area Under Curve
Biopsy
Deep Learning*
Dermoscopy* / instrumentation
Female
Humans
Male
Melanoma / diagnostic imaging*
Melanoma / pathology
Middle Aged
Photography / instrumentation
Prospective Studies
ROC Curve
Skin Neoplasms / diagnostic imaging*
Skin Neoplasms / pathology
Smartphone