CoT-UNet++: A medical image segmentation method based on contextual transformer and dense connection

Math Biosci Eng. 2023 Mar 1;20(5):8320-8336. doi: 10.3934/mbe.2023364.

Abstract

Accurate depiction of individual teeth from CBCT images is a critical step in the diagnosis of oral diseases. Traditional methods are tedious and laborious, so automatic segmentation of individual teeth in CBCT images is important for assisting physicians in diagnosis and treatment. TransUNet, which combines the advantages of Transformer and CNN, has achieved success in medical image segmentation tasks. However, the skip connections used by TransUNet lead to unnecessarily restrictive fusion, and the model ignores the rich context between adjacent keys. To solve these problems, this paper proposes a context-transformed TransUNet++ (CoT-UNet++) architecture, which consists of a hybrid encoder, a dense connection, and a decoder. Specifically, the hybrid encoder first obtains the contextual information between adjacent keys through CoTNet and the global context encoded by the Transformer. The decoder then upsamples the encoded features with cascaded upsamplers to recover the original resolution. Finally, multi-scale fusion between the encoded and decoded features at different levels is performed by dense concatenation to obtain more accurate location information. In addition, we employ a weighted loss function consisting of focal, Dice, and cross-entropy terms to reduce the training error and achieve pixel-level optimization. Experimental results demonstrate that the proposed CoT-UNet++ method outperforms the baseline models and achieves better performance in tooth segmentation.
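The weighted loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the weights, the focal parameter, or the exact formulation, so everything below is an assumption for illustration: a binary (tooth vs. background) setting, equal weights `w_focal`, `w_dice`, `w_ce`, a focusing parameter `gamma=2.0`, and a Dice smoothing constant of 1. The function names are hypothetical and are not taken from the paper.

```python
import math

def cross_entropy_loss(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over a flat list of pixel probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

def focal_loss(probs, labels, gamma=2.0, eps=1e-7):
    """Mean binary focal loss; gamma is an assumed value, not from the paper."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        pt = p if y == 1 else 1.0 - p  # probability assigned to the true class
        total += -((1.0 - pt) ** gamma) * math.log(pt)  # down-weights easy pixels
    return total / len(probs)

def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 minus the smoothed Dice overlap coefficient."""
    intersection = sum(p * y for p, y in zip(probs, labels))
    denominator = sum(probs) + sum(labels)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)

def weighted_loss(probs, labels, w_focal=1.0, w_dice=1.0, w_ce=1.0):
    """Weighted sum of the three terms; the weights here are placeholders."""
    return (w_focal * focal_loss(probs, labels)
            + w_dice * dice_loss(probs, labels)
            + w_ce * cross_entropy_loss(probs, labels))

# Toy example: four pixels, two of which belong to a tooth.
labels = [1, 0, 1, 0]
good_prediction = [0.9, 0.1, 0.8, 0.2]
bad_prediction = [0.4, 0.6, 0.3, 0.7]
print(weighted_loss(good_prediction, labels))
print(weighted_loss(bad_prediction, labels))
```

In this sketch the cross-entropy term penalizes every pixel, the focal term concentrates on hard pixels, and the Dice term measures region overlap, which is one plausible reason to combine them for segmentation with class imbalance. In practice the weights would be tuned on validation data.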

Keywords: contextual transformer; dense connection; medical image segmentation; tooth segmentation; weighted loss function.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Electric Power Supplies*
  • Entropy