Improving the generalization of glaucoma detection on fundus images via feature alignment between augmented views

Biomed Opt Express. 2022 Mar 11;13(4):2018-2034. doi: 10.1364/BOE.450543. eCollection 2022 Apr 1.

Abstract

Convolutional neural networks (CNNs) are commonly used in glaucoma detection. However, due to various data distribution shifts, a well-performing model may suffer a sharp drop in performance when deployed in a new environment. Meanwhile, the most straightforward remedy, collecting more data, is costly and often impractical. To address these challenges, we propose a new method named data augmentation-based (DA) feature alignment (DAFA) to improve out-of-distribution (OOD) generalization with a single dataset; it builds on the principle of feature alignment to learn invariant features and eliminate the effect of data distribution shifts. DAFA creates two views of a sample by data augmentation and performs feature alignment between the augmented views through latent feature recalibration and semantic representation alignment. Latent feature recalibration normalizes intermediate features to a common distribution with instance normalization (IN) layers. Semantic representation alignment is conducted by minimizing the Top-k NT-Xent loss and the maximum mean discrepancy (MMD), which maximizes semantic agreement across augmented views at the individual and population levels. Furthermore, a benchmark is established with seven glaucoma detection datasets and a new metric named mean of clean area under the curve (mcAUC) for a comprehensive evaluation of model performance. Experimental results of five-fold cross-validation demonstrate that DAFA consistently and significantly improves out-of-distribution generalization (up to +16.3% mcAUC) regardless of the training data, network architecture, and augmentation policy, and outperforms many state-of-the-art methods.
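The sketch below illustrates, in PyTorch-style Python, how the alignment objectives described in the abstract could be combined: a Top-k NT-Xent contrastive term between the two augmented views and an RBF-kernel MMD term between the two view populations, added on top of a supervised classification loss computed on features from an IN-based backbone. This is a minimal illustration, not the authors' released code; the function and module names (backbone, projector, classifier) and the hyperparameters (temperature, k, sigma, loss weights) are assumptions and may differ from the paper's exact formulation.

```python
# Minimal sketch (not the authors' implementation) of the DAFA-style alignment losses.
import torch
import torch.nn.functional as F


def topk_nt_xent(z1, z2, temperature=0.5, k=2):
    """NT-Xent-style contrastive loss keeping only the k hardest negatives per anchor.
    Temperature and k are illustrative choices."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    n = z1.size(0)
    z = torch.cat([z1, z2], dim=0)                        # (2n, d)
    sim = z @ z.t() / temperature                         # scaled cosine similarities
    idx = torch.arange(2 * n, device=z.device)
    pos_idx = torch.cat([idx[n:], idx[:n]])               # view i pairs with view i±n
    pos = sim[idx, pos_idx]                               # positive similarity per anchor
    neg_mask = ~torch.eye(2 * n, dtype=torch.bool, device=z.device)
    neg_mask[idx, pos_idx] = False                        # exclude positives from negatives
    neg = sim.masked_fill(~neg_mask, float("-inf"))
    topk_neg, _ = neg.topk(k, dim=1)                      # k hardest negatives per anchor
    logits = torch.cat([pos.unsqueeze(1), topk_neg], dim=1)
    labels = torch.zeros(2 * n, dtype=torch.long, device=z.device)
    return F.cross_entropy(logits, labels)


def mmd_rbf(z1, z2, sigma=1.0):
    """Biased estimate of squared MMD with a Gaussian kernel (bandwidth is an assumption)."""
    def kernel(a, b):
        return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))
    return kernel(z1, z1).mean() + kernel(z2, z2).mean() - 2 * kernel(z1, z2).mean()


def training_step(backbone, projector, classifier, x1, x2, y,
                  lambda_nt=1.0, lambda_mmd=1.0):
    """One step: x1 and x2 are two augmentations of the same fundus images; the backbone
    is assumed to use instance-norm layers (latent feature recalibration)."""
    h1, h2 = backbone(x1), backbone(x2)                   # recalibrated latent features
    z1, z2 = projector(h1), projector(h2)                 # semantic representations
    loss = F.cross_entropy(classifier(h1), y)             # supervised glaucoma classification
    loss = loss + lambda_nt * topk_nt_xent(z1, z2) + lambda_mmd * mmd_rbf(z1, z2)
    return loss
```

In this reading, the contrastive term enforces agreement between the two views of each individual image, while the MMD term aligns the overall feature distributions of the two view populations; the loss weights would be tuned in practice.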