EG-TransUNet: a transformer-based U-Net with enhanced and guided models for biomedical image segmentation

BMC Bioinformatics. 2023 Mar 7;24(1):85. doi: 10.1186/s12859-023-05196-1.

Abstract

Although various methods based on convolutional neural networks have improved the performance of biomedical image segmentation to meet the precision requirements of medical imaging segmentation tasks, deep-learning-based medical image segmentation methods still need to address the following problems: (1) difficulty in extracting discriminative features of the lesion region during encoding, owing to its variable size and shape; (2) difficulty in effectively fusing spatial and semantic information of the lesion region during decoding, owing to redundant information and the semantic gap. In this paper, we employ attention-based Transformers in both the encoder and decoder stages to improve feature discrimination at the level of spatial detail and semantic location through multi-head self-attention. To this end, we propose an architecture called EG-TransUNet, which incorporates three Transformer-enhanced modules: a progressive enhancement module, channel spatial attention, and semantic guidance attention. The proposed EG-TransUNet architecture captures object variability and yields improved results on different biomedical datasets. EG-TransUNet outperforms other methods on two popular colonoscopy datasets (Kvasir-SEG and CVC-ClinicDB), achieving mDice scores of 93.44% and 95.26%, respectively. Extensive experiments and visualization results demonstrate that our method advances performance on five medical segmentation datasets with better generalization ability.
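To make the two attention ideas named in the abstract concrete, below is a minimal, illustrative PyTorch sketch of (a) a generic channel-then-spatial attention block and (b) multi-head self-attention applied to flattened feature-map tokens. This is an assumption-laden sketch of the general techniques, not the authors' EG-TransUNet implementation; the class names `ChannelSpatialAttention` and `SelfAttentionBlock` and all hyperparameters are hypothetical.

```python
# Illustrative sketch only: generic channel-spatial attention plus multi-head
# self-attention over spatial tokens. NOT the authors' EG-TransUNet modules.
import torch
import torch.nn as nn


class ChannelSpatialAttention(nn.Module):
    """Hypothetical channel-then-spatial attention over a feature map (B, C, H, W)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: squeeze spatial dims, produce per-channel weights.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: a single-channel weight map over H x W locations.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_mlp(x)                    # re-weight channels
        avg_map = x.mean(dim=1, keepdim=True)          # (B, 1, H, W)
        max_map = x.amax(dim=1, keepdim=True)          # (B, 1, H, W)
        x = x * self.spatial_conv(torch.cat([avg_map, max_map], dim=1))
        return x


class SelfAttentionBlock(nn.Module):
    """Hypothetical multi-head self-attention on flattened spatial tokens."""

    def __init__(self, channels: int, num_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)          # (B, H*W, C)
        q = self.norm(tokens)
        tokens = tokens + self.attn(q, q, q)[0]        # residual self-attention
        return tokens.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    feat = torch.randn(2, 64, 32, 32)
    out = SelfAttentionBlock(64)(ChannelSpatialAttention(64)(feat))
    print(out.shape)  # torch.Size([2, 64, 32, 32])
```

In a U-Net-style design, blocks like these would typically be applied to encoder features and to skip connections before decoding, so that channel/spatial re-weighting suppresses redundant information while self-attention aggregates long-range context; the exact placement in EG-TransUNet is described in the full paper.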

Keywords: Channel spatial attention; Medical image segmentation; Progressive enhancement module; Self-attention; Semantic guidance attention; Transformer.

MeSH terms

  • Neural Networks, Computer*
  • Semantics