Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition...
Transcript of Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition...
![Page 1: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/1.jpg)
Aggregating Ordinal Labels from Crowds by Minimax Conditional Entropy
Denny Zhou Qiang Liu John Platt Chris Meek
![Page 2: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/2.jpg)
2
![Page 3: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/3.jpg)
Crowds vs experts labeling: strength
3
Big labeled data
Time saving Money saving
More data beats cleverer algorithms
![Page 4: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/4.jpg)
Crowds vs experts labeling: weakness
4
Crowdsourced labels may be highly noisy
Garbage in … … Garbage out
![Page 5: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/5.jpg)
Orange (O) vs. Mandarin (M)
M
O
O
M
O O O
O
M
M
O
O
M
M
M
M
Non-experts, redundant labels
5
![Page 6: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/6.jpg)
Orange (O) vs. Mandarin (M)
M
O
O
M
O O O
O
M
M
O
O
M
M
M
M
Non-experts, redundant labels
6
![Page 7: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/7.jpg)
1 2 … 𝑗
1 𝑥11 𝑥12 … 𝑥1𝑗
2 𝑥21 𝑥21 … 𝑥2𝑗
… … … … …
𝑖 𝑥𝑖1 𝑥𝑖2 … 𝑥𝑖𝑗
… … … … …
…
…
…
…
…
…
Workers
Items
Observed worker labels
Unobserved true labels: 𝑦𝑗
7
![Page 8: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/8.jpg)
Roadmap: from multiclass to ordinal
1. Develop a method to aggregate general multiclass labels
2. Adapt the general method to ordinal labels
8
![Page 9: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/9.jpg)
Examples on multiclass labeling
9
Image categorization Speech recognition
![Page 10: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/10.jpg)
Introduce two fundamental concepts
Empirical count of wrong/correct labels
Expected number of wrong/correct labels
: worker label distribution : true label distribution
10
![Page 11: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/11.jpg)
Multiclass maximum conditional entropy
Given the true labels , estimate by
subject to
11
worker constraints
item constraints
![Page 12: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/12.jpg)
Multiclass minimax conditional entropy
Jointly estimate and by
subject to
12
worker constraints
item constraints
![Page 13: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/13.jpg)
Lagrangian dual
constraints
13
![Page 14: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/14.jpg)
Probabilistic labeling model
By the optimization theory, the dual problem leads to
normalization factor
worker ability item difficulty
14
![Page 15: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/15.jpg)
Dual problem
1. This only generates deterministic labels 2. Equivalent to maximizing complete likelihood
15
![Page 16: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/16.jpg)
Roadmap: from multiclass to ordinal
1. Develop a method to aggregate general multiclass labels
2. Adapt the general method to ordinal labels
16
![Page 17: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/17.jpg)
An example on ordinal labeling
search results
Perfect 1
Excellent 2
Good 3
Fair 4
Bad 5
17
![Page 18: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/18.jpg)
To proceed to ordinal labels
• Formulate assumptions which are specific for ordinal labeling
• Coincide with the previous multiclass method in the case of binary labeling
18
![Page 19: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/19.jpg)
Our assumption for ordinal labeling
1
2
3
4
5
likely to confuse
unlikely to confuse
adjacency confusability
19
![Page 20: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/20.jpg)
Reference label
True label Worker label
≥,<≥,<
Indirect label comparison
Formulating this assumption though pairwise comparison
20
![Page 21: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/21.jpg)
Ordinal minimax conditional entropy
Jointly estimate and by
subject to
Δ: take on values < or ≥𝛻: take on values < or ≥
21
worker constraints
item constraints
![Page 22: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/22.jpg)
Ordinal minimax conditional entropy
Jointly estimate and by
subject to
true label worker label
reference label
22
worker constraints
item constraints
![Page 23: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/23.jpg)
Ordinal minimax conditional entropy
Jointly estimate and by
subject to
difference from multiclass true label worker label
reference label
23
worker constraints
item constraints
![Page 24: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/24.jpg)
counting mistakes in ordinal sense
Explaining the ordinal constraints
For example, let Δ = <, 𝛻 = ≥:
24
![Page 25: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/25.jpg)
Probabilistic rating model
By the KKT conditions, the dual problem leads to
worker ability
item difficulty
structured
25
![Page 26: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/26.jpg)
Regularization
Two goals:
1. Prevent over fitting
2. Fix the deterministic label issue to generate probabilistic labels
26
![Page 27: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/27.jpg)
Regularized minimax conditional entropy
Jointly estimate and by
subject to
+ regularization terms
27
worker constraints
item constraints
![Page 28: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/28.jpg)
Regularized minimax conditional entropy
Jointly estimate and by
subject to
28
worker constraints
item constraints
![Page 29: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/29.jpg)
Dual problem
1. This generates probabilistic labels 2. Equivalent to maximizing marginal likelihood
29
![Page 30: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/30.jpg)
Choosing regularization parameters
• Cross-validation: 5 or 10 folds
• Random split
• Compare the likelihood of worker labels
30
Don’t need ground truth labels for cross-validation!
![Page 31: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/31.jpg)
Experiments: metrics
• Evaluation metrics
– L0 error:
– L1 error:
– L2 error:
31
![Page 32: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/32.jpg)
Experiments: baselines
• Compare regularized minimax condition entropy to
– Majority voting
– Dawid-Skene method (1979, see also its Bayesian version in Raykar et al. 2010, Liu et al. 2012, Chen at al. 2013)
– Latent trait analysis (Andrich 1978, Master 1982, Uebersax and Grove 1993, Mineiro 2011)
32
![Page 33: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/33.jpg)
Web search data
search results
Perfect 1
Excellent 2
Good 3
Fair 4
Bad 5
33
![Page 34: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/34.jpg)
Web search data
• Some facts about the data:
– 2665 query-URL pairs and a relevance rating scale from 1 to 5
– 177 non-expert workers with average error rate 63%
– Each query-URL pair is judged by 6 workers
– True labels are created via consensus from 9 experts
– Dataset created by Gabriella Kazai of Microsoft
34
![Page 35: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/35.jpg)
Web search data
L0 Error L1 Error L2 Error
Majority vote 0.269 0.428 0.930
Dawid & Skene 0.170 0.205 0.539
Latent trait 0.201 0.211 0.481
Entropy multiclass 0.111 0.131 0.419
Entropy ordinal 0.104 0.118 0.384
35
![Page 36: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/36.jpg)
Probabilistic labels vs error rates
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
(0, 0.5) (0.5, 0.6) (0.6, 0.7) (0.7, 0.8) (0.8, 0.9) (0.9, 1)
L0 error L1 error L2 error
36
![Page 37: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/37.jpg)
Price prediction data
$0 – $50 1
$51 – $100 2
$101 – $250 3
$251 – $500 4
$501 – $1000 5
$1001 – $2000 6
$2001 – $5000 7
37
![Page 38: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/38.jpg)
Price prediction data
• Some facts about the data:
– 80 household items collected from stores like Amazon and Costco
– Prices predicted by 155 students of UC Irvine
– Average error rate 69% and systematically biased
– Dataset created by Mark Steyvers of UC Irvine
38
![Page 39: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/39.jpg)
Price prediction data
L0 Error L1 Error L2 Error
Majority vote 0.675 1.125 1.605
Dawid & Skene 0.650 1.050 1.517
Latent trait 0.688 1.063 1.504
Entropy multiclass 0.675 1.150 1.643
Entropy ordinal 0.613 0.975 1.492
39
![Page 40: Aggregating Ordinal Labels from Crowds by Minimax ... · •Compare regularized minimax condition entropy to –Majority voting –Dawid-Skene method (1979, see also its Bayesian](https://reader030.fdocuments.us/reader030/viewer/2022041003/5ea61734a0035a42982b4b27/html5/thumbnails/40.jpg)
Summary
• Minimax conditional entropy principle for crowdsourcing
• Adjacency confusability assumption in ordinal labeling
• Ordinal labeling model with structured confusion matrices
http://research.microsoft.com/en-us/projects/crowd/
40