RadFormer: Transformers with global-local attention for interpretable and accurate Gallbladder Cancer detection

Med Image Anal. 2023 Jan:83:102676. doi: 10.1016/j.media.2022.102676. Epub 2022 Nov 19.

Abstract

We propose a novel deep neural network architecture to learn interpretable representations for medical image analysis. Our architecture generates global attention for the region of interest, and then learns bag-of-words style deep feature embeddings with local attention. The global and local feature maps are combined using a contemporary transformer architecture for highly accurate Gallbladder Cancer (GBC) detection from Ultrasound (USG) images. Our experiments indicate that the detection accuracy of our model exceeds even that of human radiologists, which supports its use as a second reader for GBC diagnosis. The bag-of-words embeddings allow our model to be probed to generate interpretable explanations for GBC detection that are consistent with those reported in the medical literature. We show that the proposed model not only helps in understanding the decisions of neural network models but also aids in the discovery of new visual features relevant to the diagnosis of GBC. Source code is available at https://github.com/sbasu276/RadFormer.
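The general idea described in the abstract (a global feature, a bag-of-words style summary of local features, and attention-based fusion of the two) can be illustrated with a toy sketch. This is not the RadFormer implementation, which is available in the repository linked above. The function names, the soft-assignment codebook, the identity attention projections, the mean pooling, and the random inputs are all assumptions made for illustration; a real model would use learned weights and image-derived features.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bag_of_words_embedding(local_feats, codebook):
    """Soft-assign each local feature to the codebook 'words' and
    average the assignments into one histogram-like vector.
    (Hypothetical helper; the codebook here is random, not learned.)"""
    hist = [0.0] * len(codebook)
    for feat in local_feats:
        weights = softmax([dot(feat, word) for word in codebook])
        hist = [h + w for h, w in zip(hist, weights)]
    return [h / len(local_feats) for h in hist]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.
    Assumption: identity query/key/value projections, to keep the sketch small."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in tokens])
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


def fuse(global_feat, local_embedding):
    """Treat the global and local vectors as two tokens, let them attend
    to each other, then mean-pool into a single fused representation."""
    attended = self_attention([global_feat, local_embedding])
    d = len(global_feat)
    return [sum(tok[i] for tok in attended) / len(attended) for i in range(d)]


# Toy inputs: random vectors stand in for image-derived features.
rng = random.Random(0)
dim = 4  # codebook size equals feature size so both tokens share a dimension
global_feat = [rng.uniform(-1, 1) for _ in range(dim)]
local_feats = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(6)]
codebook = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

bow = bag_of_words_embedding(local_feats, codebook)
fused = fuse(global_feat, bow)
print("bag-of-words embedding:", bow)
print("fused representation:", fused)
```

Because each entry of the bag-of-words vector corresponds to one codebook word, inspecting which entries are large is one simple way such an embedding could be probed for explanations. How the authors actually probe their model is described in the paper itself.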

Keywords: Explainable AI; Gallbladder Cancer; Ultrasound Sonography; Visual transformer.

MeSH terms

  • Gallbladder Neoplasms* / diagnostic imaging
  • Humans
  • Learning
  • Neural Networks, Computer
  • Software