Performances of Machine Learning in Detecting Glaucoma Using Fundus and Retinal Optical Coherence Tomography Images: A Meta-Analysis

Jo-Hsuan Wu; Takashi Nishida; Robert N Weinreb; Jou-Wei Lin

doi:10.1016/j.ajo.2021.12.008

Performances of Machine Learning in Detecting Glaucoma Using Fundus and Retinal Optical Coherence Tomography Images: A Meta-Analysis

Am J Ophthalmol. 2022 May:237:1-12. doi: 10.1016/j.ajo.2021.12.008. Epub 2021 Dec 21.

Authors

Jo-Hsuan Wu, Takashi Nishida, Robert N Weinreb, Jou-Wei Lin

PMID: 34942113
DOI: 10.1016/j.ajo.2021.12.008

Abstract

Purpose: To evaluate the performance of machine learning (ML) in detecting glaucoma using fundus and retinal optical coherence tomography (OCT) images.

Design: Meta-analysis.

Methods: PubMed and EMBASE were searched on August 11, 2021. A bivariate random-effects model was used to pool ML's diagnostic sensitivity, specificity, and area under the curve (AUC). Subgroup analyses were performed based on ML classifier categories and dataset types.

Results: One hundred and five studies (3.3%) were retrieved. Seventy-three (69.5%), 30 (28.6%), and 2 (1.9%) studies tested ML using fundus, OCT, and both image types, respectively. Total testing data numbers were 197,174 for fundus and 16,039 for OCT. Overall, ML showed excellent performances for both fundus (pooled sensitivity = 0.92 [95% CI, 0.91-0.93]; specificity = 0.93 [95% CI, 0.91-0.94]; and AUC = 0.97 [95% CI, 0.95-0.98]) and OCT (pooled sensitivity = 0.90 [95% CI, 0.86-0.92]; specificity = 0.91 [95% CI, 0.89-0.92]; and AUC = 0.96 [95% CI, 0.93-0.97]). ML performed similarly using all data and external data for fundus and the external test result of OCT was less robust (AUC = 0.87). When comparing different classifier categories, although support vector machine showed the highest performance (pooled sensitivity, specificity, and AUC ranges, 0.92-0.96, 0.95-0.97, and 0.96-0.99, respectively), results by neural network and others were still good (pooled sensitivity, specificity, and AUC ranges, 0.88-0.93, 0.90-0.93, 0.95-0.97, respectively). When analyzed based on dataset types, ML demonstrated consistent performances on clinical datasets (fundus AUC = 0.98 [95% CI, 0.97-0.99] and OCT AUC = 0.95 [95% 0.93-0.97]).

Conclusions: Performance of ML in detecting glaucoma compares favorably to that of experts and is promising for clinical application. Future prospective studies are needed to better evaluate its real-world utility.

Publication types

Meta-Analysis
Review

MeSH terms

Fundus Oculi
Glaucoma* / diagnosis
Humans
Machine Learning
Neural Networks, Computer
ROC Curve
Tomography, Optical Coherence* / methods