Pixel Diffuser: Practical Interactive Medical Image Segmentation without Ground Truth

Mingeon Ju; Jaewoo Yang; Jaeyoung Lee; Moonhyun Lee; Junyung Ji; Younghoon Kim

doi:10.3390/bioengineering10111280

Bioengineering (Basel). 2023 Nov 2;10(11):1280. doi: 10.3390/bioengineering10111280.

Authors

Mingeon Ju¹, Jaewoo Yang¹, Jaeyoung Lee¹, Moonhyun Lee², Junyung Ji¹, Younghoon Kim¹

Affiliations

¹ Major in Bio Artificial Intelligence, Department of Applied Artificial Intelligence, Hanyang University at Ansan, Ansan 15588, Republic of Korea.
² Major in Bio Artificial Intelligence, Department of Computer Science & Engineering, Hanyang University at Ansan, Ansan 15588, Republic of Korea.

Abstract

Medical image segmentation is essential for doctors to diagnose diseases and manage patient status. While deep learning has demonstrated potential in addressing segmentation challenges within the medical domain, obtaining a substantial amount of data with accurate ground truth for training high-performance segmentation models is both time-consuming and labor-intensive. Interactive segmentation methods can reduce the cost of acquiring segmentation labels for training supervised models, but they often still require considerable amounts of ground truth data. Moreover, achieving precise segmentation during the refinement phase requires additional interactions. In this work, we propose an interactive medical segmentation method called PixelDiffuser that uses a VGG19-based autoencoder, requires no medical segmentation ground truth data, and needs only a few clicks to obtain high-quality segmentation. As the name suggests, PixelDiffuser starts with a small area around the initial click and gradually expands it to cover the target segmentation region. Specifically, we segment the image by introducing a distortion into the image and repeatedly encoding and decoding the distorted image through the autoencoder. Consequently, the user clicks a part of the organ they wish to segment, and the segmented region expands to nearby areas with pixel values similar to those of the chosen organ. To evaluate the performance of PixelDiffuser, we used the Dice score, reported as a function of the number of clicks, to compare the inferred segment with the ground truth. For validation, we used the BTCV dataset, which contains CT images of various organs, and the CHAOS dataset, which contains both CT and MRI images of the liver, kidneys, and spleen. Our proposed model is an efficient and effective tool for medical image segmentation, achieving performance competitive with previous work in fewer than five clicks, with very low memory consumption and without additional training.
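
For readers unfamiliar with the evaluation metric, the Dice score between a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard formula using NumPy, not the authors' evaluation code; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice coefficient between two binary masks of the same shape.

    Hypothetical helper for illustration; not taken from the paper.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")

    total = pred.sum() + truth.sum()
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0

    intersection = np.logical_and(pred, truth).sum()
    return float(2.0 * intersection / total)


if __name__ == "__main__":
    # Toy 2x2 masks: one overlapping pixel, three foreground pixels in total.
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    print(dice_score(predicted, reference))
```

In an interactive setting such as the one described above, a score of this kind would be computed after each click so that accuracy can be reported against the number of interactions.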

Keywords: CT segmentation; autoencoder; interactive medical segmentation; iterative segmentation; reconstruction noise.
