Pact-Net: Parallel CNNs and Transformers for medical image segmentation

Comput Methods Programs Biomed. 2023 Dec:242:107782. doi: 10.1016/j.cmpb.2023.107782. Epub 2023 Sep 1.

Abstract

Background and objective: In medical image analysis, segmenting diseased regions supports clinical diagnosis and treatment. Medical images usually have low contrast, and some structures vary widely in size and shape, which leads to over-segmentation and under-segmentation. These problems are particularly evident in skin lesion segmentation, where blurred lesion boundaries and patient-specific variation further increase the difficulty. Currently, most researchers use deep learning networks to address these problems. However, traditional convolutional methods often fail to achieve satisfactory segmentation performance because they are weak at capturing global features. Recently, Transformers, which extract global information well, have achieved strong results in computer vision, offering new ways to further optimize medical image segmentation models.

Methods: To extract more features relevant to medical image segmentation and use them effectively to optimize skin image segmentation, we designed a network that combines CNNs and Transformers to improve both local and global features, called Parallel CNNs and Transformers for Medical Image Segmentation (Pact-Net). Specifically, to exploit the strength of Transformers in extracting global information, we created a novel fusion module, CSMF, which uses channel and spatial attention mechanisms together with a multi-scale mechanism to fuse the global information extracted by Transformers into the local features extracted by CNNs. The two branches of Pact-Net thus run in parallel to capture global and local information effectively.
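
To make the fusion idea in the Methods paragraph concrete, the following is a minimal NumPy sketch of how channel attention, spatial attention and multi-scale pooling might combine a Transformer feature map with a CNN feature map. It is not the authors' CSMF module: the abstract gives no layer details, so the function names (`channel_attention`, `spatial_attention`, `fuse`), the choice of scales and the residual sum are all assumptions, and the learned convolutions a real module would use are replaced here by fixed pooling and sigmoid gates.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    # feat has shape (C, H, W). Global average pooling gives one value
    # per channel, which is squashed into a gate in (0, 1).
    weights = sigmoid(feat.mean(axis=(1, 2)))
    return feat * weights[:, None, None]


def spatial_attention(feat):
    # Pool across channels (mean plus max) to get one value per pixel,
    # then gate every location with it.
    pooled = feat.mean(axis=0) + feat.max(axis=0)
    return feat * sigmoid(pooled)[None, :, :]


def avg_pool(feat, k):
    # Non-overlapping k x k average pooling; H and W must be divisible by k.
    c, h, w = feat.shape
    return feat.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))


def upsample(feat, k):
    # Nearest-neighbour upsampling by an integer factor k.
    return feat.repeat(k, axis=1).repeat(k, axis=2)


def fuse(cnn_feat, trans_feat, scales=(1, 2, 4)):
    # Hypothetical fusion (an assumption, not the published design):
    # gate the global branch by channel and the local branch by location,
    # average the result over several scales, and add it back to the
    # CNN features as a residual.
    combined = channel_attention(trans_feat) + spatial_attention(cnn_feat)
    multi_scale = [upsample(avg_pool(combined, k), k) for k in scales]
    return cnn_feat + np.mean(multi_scale, axis=0)


rng = np.random.default_rng(0)
cnn_feat = rng.standard_normal((4, 8, 8))    # local features from the CNN branch
trans_feat = rng.standard_normal((4, 8, 8))  # global features from the Transformer branch
fused = fuse(cnn_feat, trans_feat)
```

In this sketch both branches must produce feature maps of the same shape; a real dual-branch network would typically need a projection or resizing step before fusion.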

Results: Pact-Net outperforms the models submitted on the three datasets ISIC 2016, ISIC 2017 and ISIC 2018, reaching 86.95%, 79.31% and 84.14%, respectively, on the indicators required for those datasets. We also conducted medical image segmentation experiments on cell and polyp datasets to evaluate the robustness, learning ability and generalization ability of the network. An ablation study of each part of Pact-Net confirms the validity of each component, and comparison with state-of-the-art methods on different indicators demonstrates the superiority of the network.
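
The abstract does not name the indicators behind these percentages. Overlap measures such as the Jaccard index and the Dice coefficient are commonly reported for segmentation, so the sketch below shows how they could be computed for a pair of binary masks. The function name `jaccard_and_dice` and the toy masks are illustrative assumptions, not the paper's evaluation code.

```python
import numpy as np


def jaccard_and_dice(pred, target):
    # Both inputs are binary masks of the same shape.
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # Two empty masks are treated as a perfect match (a convention chosen here).
    jaccard = intersection / union if union else 1.0
    dice = 2 * intersection / total if total else 1.0
    return float(jaccard), float(dice)


# Toy example: 2 overlapping pixels, 4 pixels in the union, 3 + 3 in total.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
jaccard, dice = jaccard_and_dice(pred, target)
```

For these toy masks the Jaccard index is 2/4 and the Dice coefficient is 4/6, which illustrates that Dice is never lower than Jaccard on the same prediction.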

Conclusions: This paper exploits the strengths of CNNs and Transformers in extracting local and global features, respectively, and fuses those features for skin lesion segmentation. Compared with state-of-the-art methods, Pact-Net achieves the best segmentation performance on the skin lesion segmentation datasets, which can help doctors diagnose and treat diseases.

Keywords: Convolutional neural networks; Fusion; Medical image segmentation; Transformers.

MeSH terms

  • Electric Power Supplies
  • Humans
  • Image Processing, Computer-Assisted
  • Information Storage and Retrieval
  • Physicians*
  • Polyps*
  • Skin / diagnostic imaging