CrossU-Net: Dual-modality cross-attention U-Net for segmentation of precancerous lesions in gastric cancer

Comput Med Imaging Graph. 2024 Mar:112:102339. doi: 10.1016/j.compmedimag.2024.102339. Epub 2024 Jan 19.

Abstract

Gastric precancerous lesions (GPL) significantly elevate the risk of gastric cancer, and precise diagnosis and timely intervention are critical for patient survival. Due to the elusive pathological features of precancerous lesions, the early detection rate is less than 10%, which hinders lesion localization and diagnosis. In this paper, we provide a GPL pathological dataset and propose a novel method for improving segmentation accuracy on a limited-scale dataset: the RGB and hyperspectral dual-modal pathological image Cross-attention U-Net (CrossU-Net). First, we present a self-supervised pre-training model for hyperspectral images to serve downstream segmentation tasks. Second, we design a dual-stream U-Net-based network to extract features from the two image modalities. To promote exchange between the spatial information in RGB images and the spectral information in hyperspectral images, we customize a cross-attention mechanism between the two networks, and we use an intermediate agent in this mechanism to improve computational efficiency. Finally, we add a distillation loss to align the predictions of the two branches, improving network generalization. Experimental results show that CrossU-Net achieves an accuracy of 96.53% and a Dice score of 91.62% for GPL segmentation, providing a promising spectral research approach for the localization and subsequent quantitative analysis of pathological features in early diagnosis.
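The two reported metrics, accuracy and Dice, can be illustrated with a minimal sketch. It assumes binary masks flattened to sequences of 0/1 labels and the standard definitions (pixel accuracy and the Dice coefficient 2TP / (2TP + FP + FN)). The abstract does not state how the authors aggregate scores across images or classes, so the function names and the toy masks below are hypothetical and are not the authors' evaluation code.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN over two equal-length sequences of 0/1 labels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the ground truth."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    total = tp + fp + fn + tn
    # Assumption: an empty input counts as a perfect match.
    return (tp + tn) / total if total else 1.0


def dice_score(pred, target):
    """Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    # Assumption: two empty masks count as a perfect match.
    return 2 * tp / denom if denom else 1.0


# Toy 8-pixel example (hypothetical data): TP=2, FP=1, FN=1, TN=4.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
target = [1, 0, 0, 0, 1, 1, 0, 0]
print(pixel_accuracy(pred, target))  # 0.75
print(dice_score(pred, target))
```

On this toy example the accuracy (6 of 8 pixels) is higher than the Dice score (4/6), because accuracy also rewards correctly labeled background pixels while Dice only measures overlap on the lesion class. The same pattern appears in the reported 96.53% accuracy versus 91.62% Dice.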

Keywords: Cross attention; Dual-modality; Hyperspectral imaging; Image segmentation.

MeSH terms

  • Humans
  • Image Processing, Computer-Assisted
  • Precancerous Conditions* / diagnostic imaging
  • Stomach Neoplasms* / diagnostic imaging