Multi-modality self-attention aware deep network for 3D biomedical segmentation

BMC Med Inform Decis Mak. 2020 Jul 9;20(Suppl 3):119. doi: 10.1186/s12911-020-1109-0.

Abstract

Background: Deep learning based segmentation models have been gradually applied to biomedical images and have achieved state-of-the-art performance for 3D biomedical segmentation. However, most existing biomedical segmentation studies consider only a single type of medical image obtained from one examination method. In clinical practice, multiple imaging examinations are normally required for a final diagnosis, especially for severe diseases such as cancer. We therefore study the multi-modal case and explore effective multi-modality fusion based on deep networks, aiming to make full use of the complementary information in multi-modal images by drawing on the clinical experience of radiologists in image analysis.

Methods: Drawing on how radiologists make diagnoses, we propose a new self-attention aware mechanism that improves segmentation performance by paying different amounts of attention to different image modalities and different symptoms. First, we propose a multi-path encoder-decoder deep network for 3D biomedical segmentation. Second, to leverage the complementary information among modalities, we introduce an attention structure called the Multi-Modality Self-Attention Aware (MMSA) convolution. The multi-modal images used in this paper are different modalities of MR scans, which are input into the network separately. The multi-modal features are then fused with self-attention weights by the proposed MMSA, which adaptively adjusts the fusion weights according to the contribution, learned from the labeled data, of the different modalities and of the different features that reveal different symptoms.
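The idea of adaptively weighted fusion can be illustrated with a minimal NumPy sketch. This is not the MMSA convolution itself, whose exact structure the abstract does not specify. It is a squeeze-and-excitation-style gate with a softmax across modalities, chosen here only as an assumption to show the general mechanism. The function name `attention_fuse`, the weight matrices `w1` and `w2`, the choice of four modalities, and all tensor sizes are hypothetical, and the weights are random rather than learned.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_fuse(features, w1, w2):
    """Fuse per-modality 3D feature maps with attention weights.

    features: array of shape (M, C, D, H, W), one feature map per modality.
    w1, w2:   hypothetical gate parameters of shape (C, R) and (R, C).
    Returns the fused map (C, D, H, W) and the weights (M, C).
    """
    # Squeeze: global average pooling gives one descriptor per modality and channel.
    desc = features.mean(axis=(2, 3, 4))            # (M, C)
    # Excite: small two-layer gate with a ReLU in between.
    hidden = np.maximum(desc @ w1, 0.0)             # (M, R)
    scores = hidden @ w2                            # (M, C)
    # Normalize across modalities so each channel's weights sum to 1.
    weights = softmax(scores, axis=0)               # (M, C)
    # Weighted sum over the modality axis.
    fused = (weights[:, :, None, None, None] * features).sum(axis=0)
    return fused, weights


# Assumed toy sizes: 4 modalities, 8 channels, 4x4x4 feature volumes.
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8, 4, 4, 4))
w1 = rng.standard_normal((8, 4))
w2 = rng.standard_normal((4, 8))
fused, weights = attention_fuse(features, w1, w2)
```

In a trained network the gate parameters would be learned from the labeled data, so that a modality contributes more to a channel when its features are more informative for that channel.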

Results: Experiments were conducted on the public competition dataset BRATS-2015. Our proposed method achieves Dice scores of 0.8726, 0.6563, and 0.8313 for the whole tumor, the tumor core, and the enhancing tumor core, respectively. Compared with the U-Net with SE block, these scores are higher by 0.0212, 0.031, and 0.0304, respectively.
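For readers unfamiliar with the metric, the Dice score of a predicted mask P against a reference mask T is 2|P ∩ T| / (|P| + |T|). The sketch below computes it for binary 3D masks with NumPy. The function name `dice_score`, the toy volumes, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not the evaluation code used for the reported results.

```python
import numpy as np


def dice_score(pred, target):
    """Dice overlap between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / denom


# Toy 3D volumes: 3 predicted voxels, 3 reference voxels, 2 in common.
pred = np.zeros((2, 3, 3), dtype=bool)
target = np.zeros((2, 3, 3), dtype=bool)
pred[0, 0, :] = True
target[0, 0, 1:] = True
target[1, 0, 0] = True

score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
```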

Conclusions: We present a multi-modality self-attention aware convolution, which achieves better segmentation results through an adaptive weighted fusion mechanism that exploits multiple medical image modalities. Experimental results demonstrate the effectiveness of our method and its potential for application in multi-modality fusion based medical image analysis.

Keywords: 3D biomedical segmentation; Attention mechanism; Multi-modal fusion.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Image Processing, Computer-Assisted*