Learning From Crowds With Multiple Noisy Label Distribution Propagation

IEEE Trans Neural Netw Learn Syst. 2022 Nov;33(11):6558-6568. doi: 10.1109/TNNLS.2021.3082496. Epub 2022 Oct 27.

Abstract

Crowdsourcing services provide a fast, efficient, and cost-effective way to obtain large amounts of labeled data for supervised learning. Unfortunately, the quality of crowdsourced labels often cannot satisfy the standards of practical applications. Ground-truth inference, simply called label integration, designs proper aggregation methods to infer the unknown true label of each instance (sample) from the multiple noisy label set provided by ordinary crowd labelers (workers). However, nearly all existing label integration methods focus solely on the multiple noisy label set of each individual instance while ignoring the intercorrelation among the multiple noisy label sets of different instances. To solve this problem, a multiple noisy label distribution propagation (MNLDP) method is proposed in this article. MNLDP first estimates the multiple noisy label distribution of each instance from its multiple noisy label set and then propagates this distribution to the instance's nearest neighbors. Consequently, each instance absorbs a fraction of the multiple noisy label distributions of its nearest neighbors while retaining a fraction of its own original multiple noisy label distribution. Empirical studies on an artificial dataset, six simulated UCI datasets, and three real-world crowdsourced datasets show that MNLDP outperforms existing state-of-the-art label integration methods in terms of both integration accuracy and classification accuracy.
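The propagation scheme described above can be sketched in a few lines of numpy. This is a simplified reading of the abstract, not the paper's exact formulation: the blending weight `alpha`, the neighbor count `k`, the Euclidean distance metric, and the fixed number of iterations are all illustrative assumptions.

```python
import numpy as np

def label_distribution(noisy_labels, n_classes):
    # Estimate an instance's noisy label distribution by normalizing
    # the label counts in its multiple noisy label set.
    counts = np.bincount(noisy_labels, minlength=n_classes)
    return counts / counts.sum()

def propagate(X, noisy_label_sets, n_classes, k=2, alpha=0.5, n_iter=10):
    # Each instance keeps a fraction `alpha` of its own original
    # distribution and absorbs (1 - alpha) of the mean distribution
    # of its k nearest neighbors (hypothetical parameterization).
    D0 = np.array([label_distribution(np.asarray(s), n_classes)
                   for s in noisy_label_sets])
    # Pairwise Euclidean distances; exclude self before picking neighbors.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    nbrs = np.argsort(dist, axis=1)[:, :k]
    D = D0.copy()
    for _ in range(n_iter):
        D = alpha * D0 + (1 - alpha) * D[nbrs].mean(axis=1)
        D = D / D.sum(axis=1, keepdims=True)  # keep rows as distributions
    return D
```

The integrated label of each instance would then be the arg max of its propagated distribution; an instance whose own noisy labels narrowly favor the wrong class can be corrected by cleaner neighbors, which plain per-instance majority voting cannot do.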

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Neural Networks, Computer*