Semi-Supervised Cell Detection with Reliable Pseudo-Labels

J Comput Biol. 2022 Oct;29(10):1061-1073. doi: 10.1089/cmb.2022.0108. Epub 2022 Jun 15.

Abstract

Pathological images play an important role in the diagnosis, treatment, and prognosis of cancer. These images usually contain complex backgrounds and cells of varying shapes, so analyzing and discriminating the cells costs pathologists considerable time and labor. As a result, fully annotated pathological image data sets are difficult to obtain. To address the shortage of labeled data, we feed a large number of unlabeled images into a pretrained model to generate accurate pseudo-labels. In this article, we propose two methods to improve the quality of pseudo-labels: pseudo-labeling based on an adaptive threshold and pseudo-labeling based on cell count. Both methods take into account the distribution of cells in different pathological images when removing background noise, and ensure that accurate pseudo-labels are generated for each unlabeled image. When the pseudo-labels are used for model retraining, we also perform data distillation on the feature maps of unlabeled images through an attention mechanism, which further improves the quality of the training data. In addition, we propose a multi-task learning model that learns the cell detection task and the cell count task simultaneously and improves cell detection performance through feature sharing. We verified these methods on three different data sets. The results show that retraining with a large number of unlabeled images improves detection performance by 9%-13% compared with a model pretrained on only a small number of labeled images. Moreover, the methods show good applicability on all three data sets.
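The abstract does not give the exact rules behind the two pseudo-labeling methods, so the following is only a minimal sketch of how per-image filtering of this kind could look, not the authors' implementation. It assumes the pretrained model outputs a 2-D cell probability map for each unlabeled image. The function names, the `ratio` parameter, and the choice of a threshold proportional to each image's maximum response are all hypothetical.

```python
import numpy as np


def local_peaks(prob_map):
    """Return (row, col) coordinates of pixels strictly greater than all 8 neighbours."""
    prob_map = np.asarray(prob_map, dtype=float)
    h, w = prob_map.shape
    # Pad with -inf so border pixels can still qualify as peaks.
    padded = np.pad(prob_map, 1, mode="constant", constant_values=-np.inf)
    is_peak = np.ones((h, w), dtype=bool)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbour = padded[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
            is_peak &= prob_map > neighbour
    return np.argwhere(is_peak)


def pseudo_labels_adaptive(prob_map, ratio=0.5):
    """Keep peaks above a threshold scaled to this image's own maximum response.

    Assumption: 'adaptive' is modelled here as ratio * max(prob_map), so each
    image gets its own cut-off instead of one global constant.
    """
    prob_map = np.asarray(prob_map, dtype=float)
    peaks = local_peaks(prob_map)
    scores = prob_map[peaks[:, 0], peaks[:, 1]]
    threshold = ratio * prob_map.max()
    return peaks[scores >= threshold]


def pseudo_labels_by_count(prob_map, predicted_count):
    """Keep the k highest-scoring peaks, where k is the rounded predicted cell count.

    Assumption: the count comes from a separate counting head or model.
    """
    prob_map = np.asarray(prob_map, dtype=float)
    peaks = local_peaks(prob_map)
    scores = prob_map[peaks[:, 0], peaks[:, 1]]
    k = max(int(round(predicted_count)), 0)
    order = np.argsort(-scores)  # highest score first
    return peaks[order[:k]]
```

In this sketch, both functions discard low-confidence responses per image: the first adapts the cut-off to the image's strongest response, and the second lets an estimated cell count decide how many candidates survive.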

Keywords: adaptive threshold; cell count; cell detection; multi-task learning; pseudo-labeling.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Learning*