Performance Evaluation of Different Object Detection Models for the Segmentation of Optical Cups and Discs

Gendry Alfonso-Francia; Jesus Carlos Pedraza-Ortega; Mariana Badillo-Fernández; Manuel Toledano-Ayala; Marco Antonio Aceves-Fernandez; Juvenal Rodriguez-Resendiz; Seok-Bum Ko; Saul Tovar-Arriaga

doi:10.3390/diagnostics12123031

Diagnostics (Basel). 2022 Dec 2;12(12):3031. doi: 10.3390/diagnostics12123031.

Authors

Gendry Alfonso-Francia¹ ², Jesus Carlos Pedraza-Ortega¹, Mariana Badillo-Fernández³, Manuel Toledano-Ayala¹, Marco Antonio Aceves-Fernandez¹, Juvenal Rodriguez-Resendiz¹, Seok-Bum Ko², Saul Tovar-Arriaga¹

Affiliations

¹ Faculty of Engineering, Autonomous University of Querétaro, Santiago de Querétaro 76010, Mexico.
² Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK S7N 5A9, Canada.
³ Instituto Mexicano de Oftalmología (IMO) I.A.P., Circuito Exterior Estadio Corregidora sn, Centro Sur, Santiago de Querétaro 76010, Mexico.

Abstract

Glaucoma is an eye disease that gradually deteriorates vision. Much research focuses on extracting information from the optic disc and optic cup, the structures used for measuring the cup-to-disc ratio. These structures are commonly segmented with deep learning techniques, primarily Encoder-Decoder models, which are hard and time-consuming to train. Object detection models using convolutional neural networks can extract features from fundus retinal images with good precision. However, which model is superior for a specific task remains undetermined. The main goal of our approach is to compare the performance of object detection models for automatically segmenting cups and discs on fundus images. The novelty of this study is its examination of how different object detection models (Mask R-CNN, MS R-CNN, CARAFE, Cascade Mask R-CNN, GCNet, SOLO, Point_Rend) behave in the detection and segmentation of the optic disc and cup, evaluated on the Retinal Fundus Images for Glaucoma Analysis (REFUGE) and G1020 datasets. Reported metrics were Average Precision (AP), F1-score, Intersection over Union (IoU), and area under the precision-recall curve (AUCPR). On REFUGE, several models achieved a perfect AP of 1.000 when the IoU threshold was set at 0.50, and the lowest was Cascade Mask R-CNN with an AP of 0.997. On the G1020 dataset, the best model was Point_Rend with an AP of 0.956, and the worst was SOLO with 0.906. It was concluded that the methods reviewed achieved excellent performance with high precision and recall values, showing efficiency and effectiveness. The question of how many images are needed was addressed with an initial value of 100, with excellent results. Data augmentation, multi-scale handling, and anchor box size brought improvements. The capability to transfer knowledge from one database to another shows promising results too.

Keywords: Cascade Mask R-CNN; Mask R-CNN; average precision; glaucoma; instance segmentation; intersection over union; object detection; segmentation.
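
The abstract names IoU, F1-score, and the cup-to-disc ratio without defining them. As a rough illustration only, the sketch below computes pixel-level versions of these quantities on small binary masks. It is not the authors' evaluation code: the function names (`mask_iou`, `mask_f1`, `vertical_cdr`, `rect_mask`), the toy 6x6 masks, and the choice of the vertical cup-to-disc ratio are all assumptions made for this example.

```python
def _pixels(mask):
    """Return the set of (row, col) coordinates where the mask is foreground."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def mask_iou(pred, truth):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    p, t = _pixels(pred), _pixels(truth)
    union = p | t
    # Two empty masks are treated as a perfect match (an assumption).
    return len(p & t) / len(union) if union else 1.0


def mask_f1(pred, truth):
    """Pixel-level F1-score: 2*TP / (2*TP + FP + FN)."""
    p, t = _pixels(pred), _pixels(truth)
    total = len(p) + len(t)  # equals 2*TP + FP + FN
    return 2 * len(p & t) / total if total else 1.0


def vertical_extent(mask):
    """Height in pixels between the first and last rows containing foreground."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0


def vertical_cdr(cup, disc):
    """Vertical cup-to-disc ratio: cup height divided by disc height."""
    disc_height = vertical_extent(disc)
    if disc_height == 0:
        raise ValueError("disc mask is empty")
    return vertical_extent(cup) / disc_height


def rect_mask(size, top, left, bottom, right):
    """Hypothetical helper: a size x size mask with one filled rectangle."""
    return [
        [1 if top <= r <= bottom and left <= c <= right else 0 for c in range(size)]
        for r in range(size)
    ]


# Toy data: a 4x4 disc, a 2x2 ground-truth cup, and a 2x3 predicted cup.
disc = rect_mask(6, 1, 1, 4, 4)
cup_truth = rect_mask(6, 2, 2, 3, 3)
cup_pred = rect_mask(6, 2, 2, 3, 4)

iou = mask_iou(cup_pred, cup_truth)   # 4 shared pixels / 6 in the union
f1 = mask_f1(cup_pred, cup_truth)     # 2*4 / (6 + 4)
cdr = vertical_cdr(cup_truth, disc)   # 2 rows / 4 rows

print(f"IoU={iou:.3f}  F1={f1:.3f}  vertical CDR={cdr:.2f}")
```

In this toy case the predicted cup overlaps the ground truth with an IoU of about 0.667. Under the common convention for AP at an IoU threshold of 0.50, such a prediction would count as a correct detection, although the exact matching protocol used in the study is not stated in the abstract.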

Grants and funding

This research received no external funding.