Coarse-to-fine visual representation learning for medical images via class activation maps

Comput Biol Med. 2024 Mar:171:108203. doi: 10.1016/j.compbiomed.2024.108203. Epub 2024 Feb 29.

Abstract

This work investigates the value of coarsely labeled datasets for learning transferable representations of medical images. Compared to fine labels, which require meticulous annotation effort, coarse labels can be acquired at significantly lower cost and still provide useful training signals for data-hungry deep neural networks. We consider coarse labels in the form of binary labels that differentiate a normal (healthy) image from an abnormal (diseased) image, and propose CAMContrast, a two-stage representation learning framework for medical images. Using class activation maps, CAMContrast uses the binary labels to generate heatmaps that serve as positive views for contrastive representation learning. Specifically, the learning objective maximizes the agreement within fixed crops of each image-heatmap pair to learn fine-grained representations that generalize to different downstream tasks. We empirically validate the transfer learning performance of CAMContrast on several public datasets, covering classification and segmentation tasks on fundus photographs and chest X-ray images. The experimental results show that our method outperforms other self-supervised and supervised pretraining methods in terms of data efficiency and downstream performance.
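
As a rough illustration of the idea described above, the following minimal sketch shows how a class activation map could be turned into a heatmap view and how agreement between matching image and heatmap crops could be scored. It is not the authors' implementation. It assumes the common class activation map formulation (a weighted sum of the final feature maps, using the classifier weights of the target class) and a generic InfoNCE-style contrastive loss, neither of which the abstract specifies. All function names (class_activation_map, fixed_crop, info_nce) and the temperature value are hypothetical, and the embeddings are stand-ins for the output of an encoder.

```python
import math


def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W), min-max scaled to [0, 1].

    Assumption: the standard CAM formulation, not necessarily the paper's.
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [
        [
            sum(w * fmap[y][x] for w, fmap in zip(class_weights, feature_maps))
            for x in range(width)
        ]
        for y in range(height)
    ]
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    scale = (hi - lo) or 1.0  # avoid division by zero on a constant map
    return [[(v - lo) / scale for v in row] for row in cam]


def fixed_crop(grid, top, left, size):
    """Cut the same square window from an image or from its heatmap."""
    return [row[left:left + size] for row in grid[top:top + size]]


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(image_embs, heatmap_embs, temperature=0.1):
    """Generic InfoNCE loss (an assumption, not the paper's stated objective).

    image_embs[i] and heatmap_embs[i] come from the same crop location and
    form the positive pair; all other heatmap embeddings act as negatives.
    """
    total = 0.0
    for i, img in enumerate(image_embs):
        logits = [cosine(img, h) / temperature for h in heatmap_embs]
        peak = max(logits)  # log-sum-exp trick for numerical stability
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(image_embs)
```

In this sketch, cropping the image and its heatmap at the same fixed location yields a positive pair, and the loss is low when the embeddings of matching crops are more similar to each other than to the embeddings of other crops.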

Keywords: Class activation map; Contrastive learning; Fundus; Weakly supervised learning; X-ray.

MeSH terms

  • Learning*
  • Neural Networks, Computer*
  • Thorax