The Validation of Deep Learning-Based Grading Model for Diabetic Retinopathy

Wen-Fei Zhang; Dong-Hong Li; Qi-Jie Wei; Da-Yong Ding; Li-Hui Meng; Yue-Lin Wang; Xin-Yu Zhao; You-Xin Chen

doi:10.3389/fmed.2022.839088

Front Med (Lausanne). 2022 May 16;9:839088. doi: 10.3389/fmed.2022.839088. eCollection 2022.

Authors

Wen-Fei Zhang¹ ², Dong-Hong Li³, Qi-Jie Wei³, Da-Yong Ding³, Li-Hui Meng¹ ², Yue-Lin Wang¹ ², Xin-Yu Zhao¹ ², You-Xin Chen¹ ²

Affiliations

¹ Department of Ophthalmology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing, China.
² Key Laboratory of Ocular Fundus Diseases, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China.
³ Visionary Intelligence Ltd., Beijing, China.

Abstract

Purpose: To evaluate the performance of EyeWisdom V1, a deep learning (DL)-based artificial intelligence (AI) hierarchical diagnosis software for diabetic retinopathy (DR).

Materials and methods: This prospective study was a multicenter, double-blind, self-controlled clinical trial. Non-dilated posterior pole fundus images were graded by ophthalmologists and by EyeWisdom V1. Manual grading was taken as the gold standard. The primary evaluation indices (sensitivity and specificity) and secondary evaluation indices, such as positive predictive value (PPV) and negative predictive value (NPV), were calculated to evaluate the performance of EyeWisdom V1.
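As an illustration of the evaluation indices named above, the following is a minimal sketch of how sensitivity, specificity, PPV, and NPV can be computed from paired binary labels, with the manual grade as the reference standard. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the study; the abstract does not describe the authors' actual analysis code.

```python
from typing import Dict, Sequence


def binary_metrics(reference: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV of `predicted` against `reference`.

    `reference` plays the role of the manual (gold standard) grading and
    `predicted` the software output; both names are illustrative assumptions.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # true positives
    fn = sum(1 for r, p in pairs if r and not p)      # false negatives
    fp = sum(1 for r, p in pairs if not r and p)      # false positives
    tn = sum(1 for r, p in pairs if not r and not p)  # true negatives

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a class is absent.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy data (hypothetical): 4 diseased and 6 healthy images.
reference = [True] * 4 + [False] * 6
predicted = [True, True, True, False, True, False, False, False, False, False]
metrics = binary_metrics(reference, predicted)
print(metrics)
```

With these toy labels there are 3 true positives, 1 false negative, 1 false positive, and 5 true negatives, so sensitivity and PPV are both 3/4 and specificity and NPV are both 5/6.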

Results: A total of 1,089 fundus images from 630 patients (mean age 56.52 ± 11.13 years) were included. For any DR, the sensitivity, specificity, PPV, and NPV were 98.23% (95% CI 96.93-99.08%), 74.45% (95% CI 69.95-78.60%), 86.38% (95% CI 83.76-88.72%), and 96.23% (95% CI 93.50-98.04%), respectively. For sight-threatening DR (STDR; severe non-proliferative DR or worse), the corresponding values were 80.47% (95% CI 75.07-85.14%), 97.96% (95% CI 96.75-98.81%), 92.38% (95% CI 88.07-95.50%), and 94.23% (95% CI 92.46-95.68%). For referral DR (moderate non-proliferative DR or worse), the sensitivity and specificity were 92.96% (95% CI 90.66-94.84%) and 93.32% (95% CI 90.65-95.42%), with a PPV of 94.93% (95% CI 92.89-96.53%) and an NPV of 90.78% (95% CI 87.81-93.22%). The kappa score of EyeWisdom V1 was 0.860 (0.827-0.890), and the AUC for referral DR was 0.958.
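The three binary endpoints reported above (any DR, referral DR, and STDR) can be read as different cutoffs on a single ordinal grade. The sketch below shows one way to express that mapping. It assumes a five-level scale (no DR, mild, moderate, and severe non-proliferative DR, then proliferative DR) and assumes that "any DR" means mild non-proliferative DR or worse; the abstract states only the referral DR and STDR cutoffs, and the names `DRGrade`, `THRESHOLDS`, and `binarize` are hypothetical.

```python
from enum import IntEnum
from typing import Iterable, List


class DRGrade(IntEnum):
    """Assumed five-level ordinal DR scale (not specified in the abstract)."""
    NO_DR = 0
    MILD_NPDR = 1
    MODERATE_NPDR = 2
    SEVERE_NPDR = 3
    PDR = 4


# Lowest grade counted as positive for each binary endpoint.
# The "any_dr" cutoff is an assumption; the other two follow the abstract.
THRESHOLDS = {
    "any_dr": DRGrade.MILD_NPDR,
    "referral_dr": DRGrade.MODERATE_NPDR,  # moderate NPDR or worse
    "stdr": DRGrade.SEVERE_NPDR,           # severe NPDR or worse
}


def binarize(grades: Iterable[int], endpoint: str) -> List[bool]:
    """Convert ordinal grades to positive/negative labels for one endpoint."""
    cutoff = THRESHOLDS[endpoint]
    return [DRGrade(g) >= cutoff for g in grades]


# One image per grade, purely for illustration.
example_grades = [0, 1, 2, 3, 4]
for name in THRESHOLDS:
    print(name, binarize(example_grades, name))
```

Applying the same cutoff to both the manual grades and the software grades yields the paired binary labels from which per-endpoint sensitivity and specificity are computed.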

Conclusion: EyeWisdom V1 could provide reliable DR grading and referral recommendations based on fundus images of patients with diabetes.

Keywords: artificial intelligence; diabetic retinopathy; EyeWisdom V1; sensitivity; specificity; validation.