Hierarchical Active Learning With Qualitative Feedback on Regions

IEEE Trans Hum Mach Syst. 2023 Jun;53(3):581-589. doi: 10.1109/thms.2023.3252815. Epub 2023 Mar 23.

Abstract

Learning classification models in practice usually requires a large amount of labeled data for training. However, instance-based annotation can be inefficient for humans to perform. In this article, we propose and study a new type of human supervision that is fast to perform and useful for model learning. Instead of labeling individual instances, humans provide supervision on data regions, which are subspaces of the input data space that represent subpopulations of data. Because labeling is now performed at the region level, a binary 0/1 label becomes imprecise. We therefore design the region label to be a qualitative assessment of the class proportion, which preserves labeling precision coarsely while remaining easy for humans to provide. To identify informative regions for labeling and learning, we further devise a hierarchical active learning process that recursively constructs a region hierarchy. This process is semisupervised in the sense that it is driven by both active learning strategies and human expertise, where humans can provide discriminative features. To evaluate our framework, we conducted extensive experiments on nine datasets as well as a real user study on a survival analysis of colorectal cancer patients. The results demonstrate that our region-based active learning framework outperforms many instance-based active learning methods.
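
The idea of qualitative region labels and a recursively built region hierarchy can be illustrated with a minimal sketch. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the three-level qualitative scale and its proportion values, the axis-aligned `Region` boxes, the midpoint split, the rule of refining only regions labeled "mixed", and the simulated oracle that stands in for a human annotator.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical qualitative scale: each region label maps to an assumed
# proportion of positive instances inside the region.
QUALITATIVE_LEVELS = {
    "mostly negative": 0.1,
    "mixed": 0.5,
    "mostly positive": 0.9,
}


@dataclass
class Region:
    """Axis-aligned box; each feature has a half-open range [low, high)."""
    bounds: List[Tuple[float, float]]
    label: Optional[str] = None
    children: List["Region"] = field(default_factory=list)

    def contains(self, x: List[float]) -> bool:
        return all(lo <= v < hi for v, (lo, hi) in zip(x, self.bounds))

    def split(self, feature: int) -> List["Region"]:
        """Split this region in half along one feature (assumed split rule)."""
        lo, hi = self.bounds[feature]
        mid = (lo + hi) / 2
        left, right = list(self.bounds), list(self.bounds)
        left[feature], right[feature] = (lo, mid), (mid, hi)
        self.children = [Region(left), Region(right)]
        return self.children


def build_hierarchy(root: Region,
                    oracle: Callable[[Region], str],
                    choose_feature: Callable[[Region], int],
                    budget: int) -> int:
    """Query region labels and refine 'mixed' regions until the budget is spent.

    Returns the number of region queries issued.
    """
    root.label = oracle(root)
    queries = 1
    frontier = [root] if root.label == "mixed" else []
    while frontier and queries + 2 <= budget:
        target = frontier.pop(0)  # breadth-first refinement
        for child in target.split(choose_feature(target)):
            child.label = oracle(child)
            queries += 1
            if child.label == "mixed":
                frontier.append(child)
    return queries


def leaves(region: Region) -> List[Region]:
    if not region.children:
        return [region]
    return [leaf for child in region.children for leaf in leaves(child)]


def predict_proportion(root: Region, x: List[float]) -> float:
    """Descend to the leaf containing x (x must lie inside root's half-open
    bounds) and return the proportion assumed for that leaf's label."""
    node = root
    while node.children:
        node = next(c for c in node.children if c.contains(x))
    return QUALITATIVE_LEVELS[node.label]


def simulated_oracle(region: Region) -> str:
    """Stand-in for a human: the true concept is 'positive iff x[0] >= 0.25'."""
    lo, hi = region.bounds[0]
    positive_share = max(0.0, hi - max(lo, 0.25)) / (hi - lo)
    if positive_share < 0.2:
        return "mostly negative"
    if positive_share > 0.8:
        return "mostly positive"
    return "mixed"


# Demo on a one-dimensional input space [0, 1).
root = Region(bounds=[(0.0, 1.0)])
queries = build_hierarchy(root, simulated_oracle,
                          choose_feature=lambda region: 0, budget=10)
for leaf in leaves(root):
    print(leaf.bounds, leaf.label)
```

In this sketch, `choose_feature` is where human expertise about discriminative features would enter, and the frontier ordering is where an active learning strategy would decide which region to query next; both are simplified to fixed rules here.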

Keywords: Active learning (AL); learning from alternative human feedback; semisupervised learning.