Pixel Diffuser: Practical Interactive Medical Image Segmentation without Ground Truth

Bioengineering (Basel). 2023 Nov 2;10(11):1280. doi: 10.3390/bioengineering10111280.

Abstract

Medical image segmentation is essential for doctors to diagnose diseases and manage patient status. Although deep learning has shown potential for segmentation in the medical domain, obtaining a substantial amount of data with accurate ground truth for training high-performance segmentation models is time-consuming and demands careful attention. Interactive segmentation methods can reduce the cost of acquiring segmentation labels for training supervised models, but they often still require considerable amounts of ground truth data. Moreover, achieving precise segmentation during the refinement phase requires a larger number of user interactions. In this work, we propose PixelDiffuser, an interactive medical segmentation method that uses a VGG19-based autoencoder, requires no medical segmentation ground truth data, and needs only a few clicks to obtain high-quality segmentation. As the name suggests, PixelDiffuser starts with a small area around the initial click and gradually detects the target segmentation region. Specifically, we segment the image by introducing a distortion into the image and repeating it while the image is encoded and decoded by the autoencoder. Consequently, the user can click a part of the organ they wish to segment, and the segmented region expands to nearby areas with pixel values similar to those of the chosen organ. To evaluate PixelDiffuser, we used the Dice score, reported as a function of the number of clicks, to compare the inferred segmentation with the ground truth. For validation, we used the BTCV dataset, which contains CT images of various organs, and the CHAOS dataset, which contains both CT and MRI images of the liver, kidneys, and spleen. Our proposed model is an efficient and effective tool for medical image segmentation, achieving performance competitive with previous work in fewer than five clicks, with very low memory consumption and without additional training.
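
For readers unfamiliar with the evaluation metric, the Dice score between an inferred binary mask A and a ground truth mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard formula using NumPy. It is not taken from the paper's code: the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")

    # Number of pixels marked as foreground in both masks.
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)


# Toy 2x2 masks: the prediction covers two pixels, the ground truth one.
pred_mask = np.array([[1, 1], [0, 0]])
true_mask = np.array([[1, 0], [0, 0]])
score = dice_score(pred_mask, true_mask)
```

In an interactive setting, such a score would be computed after each click, so that segmentation quality can be reported against the number of clicks.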

Keywords: CT segmentation; autoencoder; interactive medical segmentation; iterative segmentation; reconstruction noise.