Crowdsourcing Literature

Notes on the crowdsourcing literature that looks at recovering true labels from the noisy labels reported by workers.

Much of the work is theory-focussed, giving probabilistic bounds on the error in terms of the number of workers, their reliabilities, the number of items, and other parameters.

Addresses truth-tracking both in the sense of sample complexity and in the asymptotic regime.

Basic framework

  • A lot of work is set in the Dawid-Skene model [DS79]

  • Often for the specific case of binary labels, although multi-class work exists too

  • Model (a toy generative sketch follows this list):

    • A number of workers

    • A number of items, each of which has a true label among a set of possibilities (often same set of labels for all items)

    • Confusion probabilities for each worker: \(\pi_{kl}^{(i)}\) is the probability that worker \(i\) provides label \(l\) when \(k\) is the true label

    • Sometimes workers do not provide labels for all items:

      • Worker-item interaction modelled as a bipartite graph

      • The graph may be generated randomly [GKM11], be completely arbitrary [DDKR13], or be constructed as part of the proposed algorithm [KOS13]
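
A toy generative version of this setup, just to fix notation (a minimal sketch: the uniform label prior, the worker accuracies, and the random assignment probability are illustrative choices, not taken from any particular paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, n_items, n_labels = 5, 100, 3

# True label for each item, drawn uniformly here for simplicity.
true_labels = rng.integers(n_labels, size=n_items)

# Confusion matrices: pi[i, k, l] = P(worker i reports l | true label is k).
# Each worker is modelled as "mostly diagonal" with accuracy p_i (illustrative).
accuracies = rng.uniform(0.5, 0.95, size=n_workers)
pi = np.empty((n_workers, n_labels, n_labels))
for i in range(n_workers):
    off_diag = (1 - accuracies[i]) / (n_labels - 1)
    pi[i] = np.full((n_labels, n_labels), off_diag)
    np.fill_diagonal(pi[i], accuracies[i])

# Worker-item bipartite graph: here each worker labels each item
# independently with probability q (one simple random-graph choice).
q = 0.6
assigned = rng.random((n_workers, n_items)) < q

# Observed labels; -1 marks "worker i did not label item j".
obs = np.full((n_workers, n_items), -1)
for i in range(n_workers):
    for j in np.flatnonzero(assigned[i]):
        obs[i, j] = rng.choice(n_labels, p=pi[i, true_labels[j]])
```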

Papers

  • Who moderates the moderators? (2011)

    • Binary aggregation

    • Algorithm for finding true labels using a spectral decomposition of a matrix associated with the input (a rough sketch follows this entry)

    • Probabilistic bounds on fraction of errors made, parametrised by number of users, items, and average informativeness of workers

    • Informativeness is \((2p_i - 1)^2\): the algorithm works well even for workers with accuracy close to 0 (since it can effectively flip their labels)

    • The catch: need to know the identity of one trustworthy worker (reliability greater than \(1/2\))

    • An online version of the algorithm estimates worker reliability levels, but the authors do not analyse how close these estimates are to the true levels

    • Basic look at robustness to adversarial workers
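
A rough sketch of this kind of rank-1 spectral aggregation (my own simplification, not the paper's exact algorithm; it assumes every worker labels every item, and the function and variable names are mine):

```python
import numpy as np

def spectral_binary_aggregate(A, trusted_worker):
    """Sketch: rank-1 spectral aggregation of binary labels.

    A: (n_workers, n_items) array with entries in {+1, -1}; for this
       simplified sketch every worker labels every item.
    trusted_worker: index of one worker known to have reliability > 1/2,
       used only to resolve the global sign ambiguity.
    """
    # In expectation A is rank 1: E[A_ij] = (2 p_i - 1) * y_j, where y_j is
    # the true +/-1 label. The top singular vectors therefore recover the
    # worker informativeness vector and the labels, up to one global sign.
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    worker_scores = U[:, 0]   # proportional to (2 p_i - 1), up to sign
    label_scores = Vt[0]      # proportional to the true labels, up to sign

    # A worker with reliability > 1/2 must get a positive score; this pins
    # down the sign. Workers with accuracy near 0 get a large *negative*
    # score, i.e. their labels are implicitly flipped.
    sign = 1.0 if worker_scores[trusted_worker] >= 0 else -1.0
    return np.sign(sign * label_scores), sign * worker_scores
```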

  • Aggregating Crowdsourced Binary Ratings (2013)

    • Binary aggregation

    • Arbitrary worker-item answering graph

    • Spectral methods to produce estimates of both the true labels and the true worker reliability levels (a rough sketch of the general idea follows this entry)

    • Probabilistic bounds on error for both of these estimates, in terms of average reliability (amongst other things)
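
The general flavour of this kind of joint estimation, as I understand it (a toy sketch, not the paper's algorithm; it works from pairwise agreement rates rather than their exact construction, and all names are mine):

```python
import numpy as np

def estimate_reliabilities_and_labels(obs):
    """Toy sketch: estimate worker reliabilities and labels from a
    sparse answering graph via pairwise agreement statistics.

    obs: (n_workers, n_items) array with entries +1/-1 for reported
         binary labels and 0 where a worker did not label an item.
    """
    answered = (obs != 0).astype(float)

    # Agreement statistic for each pair of workers over the items they both
    # labelled: in expectation roughly (2 p_i - 1)(2 p_j - 1), i.e. a
    # rank-1 matrix whose top eigenvector is ~ the vector (2 p_i - 1).
    overlap = answered @ answered.T
    agree = (obs @ obs.T).astype(float)
    C = np.divide(agree, overlap, out=np.zeros_like(agree), where=overlap > 0)
    np.fill_diagonal(C, 0.0)  # ignore trivial self-agreement

    eigvals, eigvecs = np.linalg.eigh(C)
    v = eigvecs[:, -1]
    if v.sum() < 0:           # fix sign, assuming workers beat random on average
        v = -v
    w_hat = np.sqrt(max(eigvals[-1], 0.0)) * v   # estimate of (2 p_i - 1)
    reliabilities = np.clip((w_hat + 1) / 2, 0.0, 1.0)

    # Plug the estimates into a weighted vote (a simple proxy for the
    # optimal log-odds weights).
    labels = np.sign(w_hat @ obs)
    return reliabilities, labels
```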

  • Spectral methods meet EM: A provably optimal algorithm for crowdsourcing (2014)

    • Any finite number of labels

    • Uses spectral methods to initialise an EM algorithm that estimates the confusion matrices (basically per-worker reliability levels); a sketch of the EM step follows this entry

    • Gives theoretical guarantees on the accuracy of their algorithm
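
A compressed sketch of the EM half of such an approach (standard Dawid-Skene EM; a soft majority-vote initialisation stands in here for the spectral one, and all names are mine):

```python
import numpy as np

def dawid_skene_em(obs, n_labels, n_iter=50):
    """Sketch: EM for Dawid-Skene confusion matrices.

    obs: (n_workers, n_items) integer array of reported labels,
         with -1 where a worker gave no label.
    Returns (q, pi): per-item posteriors over the true label and
    per-worker confusion matrices.
    """
    n_workers, n_items = obs.shape

    # One-hot encode: votes[i, j, l] = 1 if worker i reported l on item j.
    votes = np.zeros((n_workers, n_items, n_labels))
    for i in range(n_workers):
        for j in range(n_items):
            if obs[i, j] >= 0:
                votes[i, j, obs[i, j]] = 1.0

    # Initialisation: soft majority vote (the paper instead uses a
    # spectral / method-of-moments estimate here).
    counts = votes.sum(axis=0) + 1e-6
    q = counts / counts.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: confusion matrices pi[i, k, l] and class prior rho[k].
        pi = np.einsum('jk,ijl->ikl', q, votes) + 1e-6
        pi /= pi.sum(axis=2, keepdims=True)
        rho = q.mean(axis=0)

        # E-step: posterior over each item's true label under the current
        # parameters (missing labels simply contribute nothing).
        log_q = np.log(rho) + np.einsum('ijl,ikl->jk', votes, np.log(pi))
        log_q -= log_q.max(axis=1, keepdims=True)
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)

    return q, pi
```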

Papers to read

  • Learning from crowdsourced labeled data: a survey. Loads of references; outlines the different strands of the literature.

  • CDAS: A Crowdsourcing Data Analytics System. Seems to be more focussed on actual implementation of a crowdsourcing platform, but read this just to extract the bit about quality control.

  • Adaptive Task Assignment for Crowdsourced Classification (already in pl). Have skimmed the first few pages and highlighted some further references. It also seems to contain some good background info on the area. Just noticed that Lemma 1 is basically a more general result (and, in their different but equivalent formalism, simpler to show) than the thing I was working on all weekend…

  • Efficient crowdsourcing for multi-class labeling, Karger et al.

  • Aggregating Ordinal Labels from Crowds by Minimax Conditional Entropy

  • Reputation-based Worker Filtering in Crowdsourcing. From the abstract it seems to look more generally at arbitrary (deterministic) adversarial strategies, so it might be free from the independent-judgements-with-same-accuracy-on-each-item weakness.

  • A Permutation-based Model for Crowd Labeling: Optimal Estimation and Robustness

  • Exact Exponent in Optimal Rates for Crowdsourcing

  • Multicategory Crowdsourcing Accounting for Variable Task Difficulty, Worker Skill, and Worker Intention

  • Domain-Weighted Majority Voting for Crowdsourcing

  • Identifying unreliable and adversarial workers in crowdsourced labeling tasks. Models adversarial sources explicitly

  • Iterative learning for reliable crowdsourcing systems

  • Crowdsourcing with Arbitrary Adversaries