Counteracting Data Bias and Class Imbalance: Towards a Useful and Reliable Retinal Disease Recognition System

Diagnostics (Basel). 2023 May 29;13(11):1904. doi: 10.3390/diagnostics13111904.

Abstract

Multiple studies have reported satisfactory performance for the treatment of various ocular diseases. To date, however, no study has described a medically accurate multiclass model trained on a large, diverse dataset, and none has addressed the class imbalance problem in a single large dataset assembled from multiple large, diverse eye fundus image collections. To approximate a real-life clinical environment and mitigate the problem of biased medical image data, 22 publicly available datasets were merged. To secure medical validity, only Diabetic Retinopathy (DR), Age-Related Macular Degeneration (AMD) and Glaucoma (GL) were included. The state-of-the-art models ConvNext, RegNet and ResNet were utilized. The resulting dataset contained 86,415 normal, 3,787 GL, 632 AMD and 34,379 DR fundus images. ConvNextTiny achieved the best results, recognizing most of the examined eye diseases best on most metrics. The overall accuracy was 80.46 ± 1.48. Class-specific accuracy values were: 80.01 ± 1.10 for normal eye fundus, 97.20 ± 0.66 for GL, 98.14 ± 0.31 for AMD, and 80.66 ± 1.27 for DR. A suitable screening model for the most prevalent retinal diseases in ageing societies was designed. The model was developed on a large, diverse, combined dataset, which made the obtained results less biased and more generalizable.
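To make the scale of the class imbalance concrete, the class counts reported above can be turned into inverse-frequency class weights. This is only an illustrative sketch: the abstract does not state which rebalancing technique the authors used, and the function name `inverse_frequency_weights` and the weighting scheme are assumptions chosen for illustration, not the paper's method.

```python
# Image counts per class, as reported in the abstract.
CLASS_COUNTS = {"normal": 86415, "GL": 3787, "AMD": 632, "DR": 34379}


def inverse_frequency_weights(counts):
    """Hypothetical helper: weight each class by total / (n_classes * count).

    Rare classes receive larger weights; the weights are scaled so that
    sum(weight * count) equals the total number of images.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


weights = inverse_frequency_weights(CLASS_COUNTS)

# Ratio between the largest class (normal) and the smallest class (AMD).
imbalance_ratio = max(CLASS_COUNTS.values()) / min(CLASS_COUNTS.values())
```

With these counts the largest class outnumbers the smallest by roughly 137 to 1. Weights of this kind could, for example, be passed to a weighted loss function during training; whether that matches the authors' approach is not specified here.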

Keywords: convolutional neural networks; deep learning; medical image classification.

Grants and funding

This work was supported by the statutory funds of the Department of Artificial Intelligence, Wroclaw University of Science and Technology.