Max-Margin Majority Voting for Learning from Crowds

IEEE Trans Pattern Anal Mach Intell. 2019 Oct;41(10):2480-2494. doi: 10.1109/TPAMI.2018.2860987. Epub 2018 Jul 31.

Abstract

Learning-from-crowds aims to design proper aggregation strategies to infer the unknown true labels from the noisy labels provided by ordinary web workers. This paper presents max-margin majority voting (M$^3$3V) to improve the discriminative ability of majority voting and further presents a Bayesian generalization to incorporate the flexibility of generative methods on modeling noisy observations with worker confusion matrices for different application settings. We first introduce the crowdsourcing margin of majority voting, then we formulate the joint learning as a regularized Bayesian inference (RegBayes) problem, where the posterior regularization is derived by maximizing the margin between the aggregated score of a potential true label and that of any alternative label. Our Bayesian model naturally covers the Dawid-Skene estimator and M$^3$3V as its two special cases. Due to the flexibility of our model, we extend it to handle crowdsourced labels with an ordinal structure with the main ideas about the crowdsourcing margin unchanged. Moreover, we consider an online learning-from-crowds setting where labels coming in a stream. Empirical results demonstrate that our methods are competitive, often achieving better results than state-of-the-art estimators.

Publication types

  • Research Support, Non-U.S. Gov't