SurgiNet: Pyramid Attention Aggregation and Class-wise Self-Distillation for Surgical Instrument Segmentation

Med Image Anal. 2022 Feb:76:102310. doi: 10.1016/j.media.2021.102310. Epub 2021 Dec 4.

Abstract

Surgical instrument segmentation plays an important role in robot-assisted surgery. However, illumination issues often arise in surgical scenes, altering the color and texture of surgical instruments; these changes in visual features make surgical instrument segmentation difficult. To address illumination issues, SurgiNet is proposed to learn pyramid attention features. A double attention module is designed to capture semantic dependencies between locations and channels. Based on these dependencies, the semantic features in a disturbed area can be inferred, mitigating illumination issues. Pyramid attention is aggregated to capture multi-scale features and make predictions more accurate. For model compression, class-wise self-distillation is proposed to enhance the representation learning of the network: feature distillation is performed within each class to eliminate interference from other classes. Top-down, multi-stage knowledge distillation is designed to distill class probability maps. Through inter-layer supervision, high-level probability maps are used to calibrate the probability distributions of low-level probability maps. Since class-wise distillation enhances the self-learning of the network, the network can achieve excellent performance with a lightweight backbone. The proposed network achieves state-of-the-art performance of 89.14% mIoU on CataIS with only 1.66 GFLOPs and 2.05 M parameters. It also takes first place on EndoVis 2017 with 66.30% mIoU.
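The top-down distillation of class probability maps described above can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's implementation: it treats the high-level (deeper) prediction as the teacher and a low-level prediction as the student, and accumulates a KL-divergence term class by class so that each class's distillation signal is computed in isolation, in the spirit of "within-class" distillation. The function name and shapes are hypothetical.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax over the class axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def classwise_distill_loss(low_logits, high_logits, eps=1e-8):
    """Hypothetical sketch of class-wise self-distillation.

    low_logits, high_logits: arrays of shape (C, H, W) holding the
    student's (low-level) and teacher's (high-level) class logits.
    Returns a scalar KL-style loss, accumulated one class at a time
    so each class's term is computed without the other classes.
    """
    p_t = softmax(high_logits, axis=0)  # teacher class probability maps
    p_s = softmax(low_logits, axis=0)   # student class probability maps
    loss = 0.0
    for c in range(p_t.shape[0]):       # per-class distillation term
        loss += np.mean(p_t[c] * (np.log(p_t[c] + eps) - np.log(p_s[c] + eps)))
    return loss
```

Because the per-class terms sum to the pixel-wise KL divergence, the loss is zero when the low-level maps already match the high-level maps and positive otherwise, which is what lets the deeper layer calibrate the shallower one.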

Keywords: Class-wise Self-Distillation; Pyramid Attention; Surgical Instrument Segmentation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Attention
  • Humans
  • Image Processing, Computer-Assisted*
  • Semantics
  • Surgical Instruments