TPFR-Net: U-shaped model for lung nodule segmentation based on transformer pooling and dual-attention feature reorganization

Med Biol Eng Comput. 2023 Aug;61(8):1929-1946. doi: 10.1007/s11517-023-02852-9. Epub 2023 May 27.

Abstract

Accurate segmentation of lung nodules is key to diagnosing the lesion type. The complex boundaries of lung nodules and their visual similarity to surrounding tissues make precise segmentation challenging. Traditional CNN-based lung nodule segmentation models focus on extracting local features from neighboring pixels and ignore global contextual information, which tends to produce incomplete segmentation of nodule boundaries. In the U-shaped encoder-decoder structure, the changes in image resolution caused by up-sampling and down-sampling lose feature information, which reduces the reliability of the output features. This paper proposes a transformer pooling module and a dual-attention feature reorganization module to address these two shortcomings. The transformer pooling module fuses the self-attention layer and the pooling layer in the transformer, which compensates for the limitations of the convolution operation, reduces the loss of feature information during pooling, and significantly decreases the computational complexity of the transformer. The dual-attention feature reorganization module applies a dual-attention mechanism over the channel and spatial dimensions to improve sub-pixel convolution, minimizing the loss of feature information during up-sampling. In addition, two convolutional modules are proposed, which together with the transformer pooling module form an encoder that can adequately extract both local features and global dependencies. A fusion loss function and a deep supervision strategy are used in the decoder to train the model. The proposed model was extensively evaluated on the LIDC-IDRI dataset; the highest Dice Similarity Coefficient is 91.84 and the highest sensitivity is 92.66, indicating that the model's overall performance surpasses that of the state-of-the-art UTNet.
The proposed model achieves superior segmentation performance for lung nodules and can support a more in-depth assessment of their shape, size, and other characteristics, which has clinical significance and practical value in assisting physicians with the early diagnosis of lung nodules.
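
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how the Dice Similarity Coefficient and sensitivity are commonly computed for binary segmentation masks. The abstract does not state how the authors computed them, so the standard definitions are assumed here (Dice = 2TP / (2TP + FP + FN), sensitivity = TP / (TP + FN)). The function names, the nested-list mask representation, and the convention of returning 1.0 for empty masks are all illustrative assumptions, not taken from the paper.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives and false negatives
    over two equally shaped binary masks (nested lists of 0/1)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def dice_coefficient(pred, truth):
    """Dice = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2 * tp / denom


def sensitivity(pred, truth):
    """Sensitivity (recall) = TP / (TP + FN)."""
    tp, _, fn = confusion_counts(pred, truth)
    # Assumption: with no positive pixels in the ground truth, return 1.0.
    return 1.0 if tp + fn == 0 else tp / (tp + fn)


# Toy 2x3 masks, purely illustrative (not data from the paper).
truth_mask = [[0, 1, 1],
              [0, 1, 0]]
pred_mask = [[0, 1, 0],
             [1, 1, 0]]

dice = dice_coefficient(pred_mask, truth_mask)
sens = sensitivity(pred_mask, truth_mask)
print(f"Dice: {dice:.4f}, sensitivity: {sens:.4f}")
```

In this toy case there are two true positives, one false positive, and one false negative, so both metrics come out to 2/3. Values reported on a 0 to 100 scale, as in the abstract, would correspond to these ratios multiplied by 100 under the assumed definitions.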

Keywords: Deep supervision; Dual-attention feature reorganization; Lung nodule segmentation; Transformer pooling.

MeSH terms

  • Clinical Relevance*
  • Electric Power Supplies
  • Humans
  • Image Processing, Computer-Assisted
  • Lung / diagnostic imaging
  • Physicians*
  • Reproducibility of Results