ST-Unet: Swin Transformer boosted U-Net with Cross-Layer Feature Enhancement for medical image segmentation

Comput Biol Med. 2023 Feb:153:106516. doi: 10.1016/j.compbiomed.2022.106516. Epub 2023 Jan 6.

Abstract

Medical image segmentation is an essential task in clinical diagnosis and case analysis. Most existing methods are based on U-shaped convolutional neural networks (CNNs), and one of their disadvantages is that long-range dependencies and global contextual connections cannot be effectively established, which results in inaccurate segmentation. To make full use of low-level features to enhance global features and to reduce the semantic gap between the encoding and decoding stages, we propose a novel Swin Transformer boosted U-Net (ST-Unet) for medical image processing, in which a Swin Transformer and CNNs are used as the encoder and decoder, respectively. A novel Cross-Layer Feature Enhancement (CLFE) module is proposed to realize cross-layer feature learning, and a Spatial and Channel Squeeze & Excitation module is adopted to highlight the saliency of specific regions. Finally, the features fused by the CLFE module are learned through CNNs to recover low-level features and localize local features, yielding more accurate semantic segmentation. Experiments on the widely used public datasets Synapse and ISIC 2018 show that the proposed ST-Unet achieves a Dice score of 78.86 and a recall of 0.9243, outperforming most current medical image segmentation methods.
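
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a Dice score and recall are commonly computed for a single binary segmentation mask. It is not the authors' evaluation code. The function name `dice_and_recall`, the nested-list mask format, and the convention of returning 1.0 when a denominator is zero are all assumptions made for illustration.

```python
def dice_and_recall(pred, target):
    """Compute (dice, recall) for two equally sized binary masks.

    Masks are nested lists of 0/1 values (a hypothetical format chosen
    to keep this sketch dependency-free).
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1          # foreground pixel correctly predicted
            elif p and not t:
                fp += 1          # predicted foreground, actually background
            elif t and not p:
                fn += 1          # missed foreground pixel

    # Dice = 2*TP / (2*TP + FP + FN); recall = TP / (TP + FN).
    # Assumption: an empty denominator is scored as a perfect match (1.0).
    dice_den = 2 * tp + fp + fn
    recall_den = tp + fn
    dice = 2 * tp / dice_den if dice_den else 1.0
    recall = tp / recall_den if recall_den else 1.0
    return dice, recall


# Toy example: the prediction covers both true pixels plus one extra pixel.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 0]]
dice, recall = dice_and_recall(pred, target)
```

In this toy case the prediction over-segments by one pixel, so recall stays at 1.0 while the Dice score drops to 0.8. Multi-class benchmarks typically average such per-class scores, and the exact averaging protocol used in the paper is not stated in the abstract.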

Keywords: Cross-layer feature enhancement; Medical image segmentation; ST-Unet; Swin Transformer.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Image Processing, Computer-Assisted*
  • Learning*
  • Neural Networks, Computer
  • Semantics