Structure-Guided Cross-Attention Network for Cross-Domain OCT Fluid Segmentation

Xingxin He; Zhun Zhong; Leyuan Fang; Min He; Nicu Sebe

doi:10.1109/TIP.2022.3228163

IEEE Trans Image Process. 2022 Dec 14:PP. doi: 10.1109/TIP.2022.3228163. Online ahead of print.

PMID: 37015552

Abstract

Accurate retinal fluid segmentation on Optical Coherence Tomography (OCT) images plays an important role in diagnosing and treating various eye diseases. State-of-the-art deep models have shown promising performance on OCT image segmentation given pixel-wise annotated training data. However, a learned model performs poorly on OCT images obtained from different devices (domains) because of domain shift. This problem largely limits the real-world application of OCT image segmentation, since device types usually differ from hospital to hospital. In this paper, we study the task of cross-domain OCT fluid segmentation, where we are given a labeled dataset from the source device (domain) and an unlabeled dataset from the target device (domain). The goal is to learn a model that performs well on the target domain. To solve this problem, we propose a novel Structure-guided Cross-Attention Network (SCAN), which leverages the retinal layer structure to facilitate domain alignment. SCAN is inspired by the fact that the retinal layer structure is robust across domains and reflects regions that are important to fluid segmentation. In light of this, we build SCAN in a multi-task manner by jointly learning retinal structure prediction and fluid segmentation. To exploit the mutual benefit between layer structure and fluid segmentation, we further introduce a cross-attention module that measures the correlation between the layer-specific feature and the fluid-specific feature, encouraging the model to concentrate on highly relevant regions during domain alignment. Moreover, an adaptation difficulty map is computed from the retinal structure predictions of the different domains, which forces the model to focus on hard regions during structure-aware adversarial learning. Extensive experiments on the three domains of the RETOUCH dataset demonstrate the effectiveness of the proposed method and show that our approach achieves state-of-the-art performance on cross-domain OCT fluid segmentation.
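
The abstract names two mechanisms without giving formulas: a cross-attention module relating layer-specific and fluid-specific features, and an adaptation difficulty map that re-weights adversarial learning. The NumPy sketch below is one plausible reading of those ideas, not the SCAN implementation. Every name in it is hypothetical, and so are the design choices: scaled dot-product attention with a residual connection, normalized entropy of the structure prediction as a stand-in for difficulty, and a `1 + difficulty` weight on a per-pixel binary cross-entropy.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(fluid_feat, layer_feat):
    """Attend from fluid features (queries) to layer features (keys/values).

    Both inputs have shape (C, H, W). Returns the attended fluid features
    with shape (C, H, W) and the attention matrix with shape (H*W, H*W).
    Assumption: plain scaled dot-product attention with a residual add.
    """
    c, h, w = fluid_feat.shape
    q = fluid_feat.reshape(c, h * w).T          # (N, C) queries
    k = layer_feat.reshape(c, h * w).T          # (N, C) keys
    v = k                                       # values share the key features
    attn = softmax(q @ k.T / np.sqrt(c), axis=-1)
    attended = (attn @ v).T.reshape(c, h, w)
    return fluid_feat + attended, attn


def difficulty_map(struct_probs, eps=1e-8):
    """Per-pixel difficulty in roughly [0, 1] from structure probabilities.

    struct_probs has shape (K, H, W) and sums to 1 over K.
    Assumption: normalized entropy is used as the difficulty proxy.
    """
    k = struct_probs.shape[0]
    entropy = -(struct_probs * np.log(struct_probs + eps)).sum(axis=0)
    return entropy / np.log(k)


def weighted_adversarial_loss(disc_prob, domain_label, difficulty, eps=1e-8):
    """Per-pixel binary cross-entropy, up-weighted on difficult pixels.

    disc_prob: (H, W) discriminator probability that a pixel is from the
    source domain; domain_label: 1.0 for source, 0.0 for target.
    """
    bce = -(domain_label * np.log(disc_prob + eps)
            + (1.0 - domain_label) * np.log(1.0 - disc_prob + eps))
    return float(((1.0 + difficulty) * bce).mean())
```

In this reading, pixels where the structure branch is uncertain contribute more to the adversarial objective, which is one simple way to make alignment "focus on hard regions"; the paper's actual difficulty measure may differ.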