Accuracy of Computer-Aided Diagnosis of Melanoma: A Meta-analysis

JAMA Dermatol. 2019 Nov 1;155(11):1291-1299. doi: 10.1001/jamadermatol.2019.1375.

Abstract

Importance: The recent advances in the field of machine learning have raised expectations that computer-aided diagnosis will become the standard for the diagnosis of melanoma.

Objective: To critically review the current literature and compare the diagnostic accuracy of computer-aided diagnosis with that of human experts.

Data sources: The MEDLINE, arXiv, and PubMed Central databases were searched to identify eligible studies published between January 1, 2002, and December 31, 2018.

Study selection: Studies that reported on the accuracy of automated systems for the diagnosis of melanoma were selected. Search terms included melanoma, diagnosis, detection, computer aided, and artificial intelligence.

Data extraction and synthesis: Evaluation of the risk of bias was performed using the QUADAS-2 tool, and quality assessment was based on predefined criteria. Data were analyzed from February 1 to March 10, 2019.

Main outcomes and measures: Summary estimates of sensitivity and specificity and summary receiver operating characteristic curves were the primary outcomes.
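
For readers less familiar with the two primary outcome measures, the following is a minimal sketch of how sensitivity and specificity are derived from the 2x2 counts of a single diagnostic study. The class name, function names, and counts are hypothetical and chosen only for illustration; they are not data from any included study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """2x2 counts for one diagnostic study (hypothetical container)."""
    tp: int  # melanomas correctly flagged as melanoma
    fn: int  # melanomas missed by the system
    tn: int  # benign lesions correctly classified as benign
    fp: int  # benign lesions wrongly flagged as melanoma


def sensitivity(c: ConfusionCounts) -> float:
    # Proportion of true melanomas that the system detects.
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    # Proportion of benign lesions that the system correctly clears.
    return c.tn / (c.tn + c.fp)


# Illustrative counts only (assumption, not taken from the meta-analysis).
example = ConfusionCounts(tp=37, fn=13, tn=168, fp=32)
print(f"sensitivity = {sensitivity(example):.2f}")
print(f"specificity = {specificity(example):.2f}")
```

Each study contributes one such pair of values; the summary estimates combine these pairs across studies.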

Results: The literature search yielded 1694 potentially eligible studies, of which 132 were included and 70 offered sufficient information for a quantitative analysis. Most studies came from the field of computer science. Prospective clinical studies were rare. Combining the results for automated systems gave a melanoma sensitivity of 0.74 (95% CI, 0.66-0.80) and a specificity of 0.84 (95% CI, 0.79-0.88). Sensitivity was lower in studies that used independent test sets than in those that did not (0.51; 95% CI, 0.34-0.69 vs 0.82; 95% CI, 0.77-0.86; P < .001); however, the specificity was similar (0.83; 95% CI, 0.71-0.91 vs 0.85; 95% CI, 0.80-0.88; P = .67). In comparison with dermatologists' diagnosis, computer-aided diagnosis showed similar sensitivity and a specificity that was 10 percentage points lower, but the difference was not statistically significant. Studies were heterogeneous, and substantial risk of bias was found in all but 4 of the 70 studies included in the quantitative analysis.
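
As a rough illustration of how per-study proportions can be combined into a summary estimate with a 95% CI, the sketch below pools proportions on the logit scale with a DerSimonian-Laird random-effects model. This is an assumption made for illustration: the abstract does not state which statistical model the authors used, and meta-analyses of diagnostic accuracy often fit sensitivity and specificity jointly rather than one at a time as done here. The function name and the input counts are hypothetical.

```python
import math


def pool_proportions(events, totals):
    """Pool proportions across studies (univariate random-effects sketch).

    events[i] / totals[i] is the proportion in study i, for example
    true positives over all melanomas when pooling sensitivity.
    Returns (pooled, ci_low, ci_high) on the probability scale.
    """
    ys, vs = [], []
    for e, n in zip(events, totals):
        # A 0.5 continuity correction keeps the logit finite at 0% or 100%.
        pos, neg = e + 0.5, (n - e) + 0.5
        ys.append(math.log(pos / neg))   # logit of the study proportion
        vs.append(1 / pos + 1 / neg)     # approximate variance of the logit

    # Fixed-effect weights and Cochran's Q heterogeneity statistic.
    w = [1 / v for v in vs]
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))

    # DerSimonian-Laird estimate of between-study variance (tau squared).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(ys) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights, pooled logit, and its standard error.
    w_re = [1 / (v + tau2) for v in vs]
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(y_re), inv_logit(y_re - 1.96 * se), inv_logit(y_re + 1.96 * se)


# Hypothetical per-study counts (not from the included studies).
pooled, low, high = pool_proportions(events=[45, 30, 62], totals=[60, 50, 70])
print(f"pooled = {pooled:.2f} (95% CI, {low:.2f}-{high:.2f})")
```

The same routine could be run separately on subgroups, such as studies with and without independent test sets, to compare their pooled estimates.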

Conclusions and relevance: Although the accuracy of computer-aided diagnosis for melanoma detection is comparable to that of experts, the real-world applicability of these systems is unknown and potentially limited owing to overfitting and the risk of bias in the included studies.