1 Efficiently Learning the Accuracy of Labeling Sources for Selective Sampling by Pinar Donmez,...
1
Efficiently Learning the Accuracy of
Labeling Sources for Selective Sampling
by Pinar Donmez, Jaime Carbonell, Jeff Schneider
School of Computer Science, Carnegie Mellon University
KDD ’09
June 30th 2009
Paris, France
2
Problem Illustration
[Figure: instances and oracles; values shown: 0.74, 0.55, 0.8, 0.9, 0.67, 0.83, 0.58, 0.69]
3
Interval Estimate Threshold (IEThresh)
Goal: find the labeler(s) with the highest expected accuracy
Our work builds upon Interval Estimation [L. P. Kaelbling]
1. Estimate the reward of each labeler (more on next slide)
2. Compute the upper confidence interval for each labeler
3. Select labelers whose upper interval is higher than a threshold
4. Observe the output of the chosen oracles to estimate their reward
5. Return to step 1
Filter out unreliable labelers
Reduce labeling cost
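The selection steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the slide does not give the interval formula or the threshold, so the sketch assumes a mean-plus-scaled-standard-error upper bound with a fixed critical value of 1.96, and a threshold equal to a fraction `epsilon` of the highest upper interval. The names `upper_interval`, `select_labelers`, `critical_value`, and `epsilon` are hypothetical.

```python
import math
import statistics


def upper_interval(rewards, critical_value=1.96):
    """Upper end of a confidence interval on a labeler's mean reward.

    ASSUMPTION: mean + critical_value * stdev / sqrt(n), with a fixed
    critical value. The slide only says "upper confidence interval".
    """
    n = len(rewards)
    mean = statistics.fmean(rewards)
    if n < 2:
        # A sample standard deviation needs at least two observations.
        return mean
    return mean + critical_value * statistics.stdev(rewards) / math.sqrt(n)


def select_labelers(reward_history, epsilon=0.9):
    """Return the ids of labelers whose upper interval reaches the threshold.

    ASSUMPTION: threshold = epsilon * (highest upper interval), compared
    with >= so that the top labeler is always selected.
    """
    uppers = {k: upper_interval(r) for k, r in reward_history.items()}
    threshold = epsilon * max(uppers.values())
    return sorted(k for k, u in uppers.items() if u >= threshold)


# Hypothetical reward histories (1 = agreed with the majority, 0 = did not).
history = {
    "good": [1, 1, 1, 0, 1, 1],
    "bad": [0, 0, 0, 1, 0, 0],
}
chosen = select_labelers(history)
```

With few observations the standard error is large, so a labeler with little history keeps a high upper interval and continues to be queried; as rewards accumulate, consistently poor labelers drop below the threshold.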
4
Reward of the labelers
The reward of each labeler is unknown => it needs to be estimated
Reward of a labeler: eliciting the true label
The true label is also unknown => it is estimated by the majority vote
We propose the following reward function:
reward = 1 if the labeler agrees with the majority label
reward = 0 otherwise
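As an illustration of this reward function, here is a small Python sketch. It assumes the labels collected for one instance arrive as a dictionary from labeler id to label; the function names and the tie-breaking rule (first label encountered wins) are assumptions made for the example and are not specified on the slide.

```python
from collections import Counter


def majority_label(labels):
    """Most frequent label among the queried labelers.

    ASSUMPTION: ties go to the label that was encountered first.
    """
    return Counter(labels.values()).most_common(1)[0][0]


def agreement_rewards(labels):
    """Reward 1 for labelers that agree with the majority label, else 0."""
    majority = majority_label(labels)
    return {labeler: int(label == majority) for labeler, label in labels.items()}


# Hypothetical labels from three labelers for one instance.
votes = {"a": 1, "b": 1, "c": 0}
rewards = agreement_rewards(votes)
```

Here labelers `a` and `b` receive reward 1 and `c` receives reward 0, since the majority label is 1.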
5
IEThresh at the Beginning
[Figure: axis labels: Oracles; Expected reward (increasing)]
6
IEThresh Oracle Selection
[Figure: axis labels: Oracles (1 to 5); Expected reward (increasing); a Threshold line is marked]
7
IE Learning Snapshot II
[Figure: axis labels: Oracles (1 to 5); Expected reward (increasing); a Threshold line is marked]
8
IEThresh Instance Selection
[Figure: elements numbered 1, 3, 4, 5, 2]
9
Uniform Expert Accuracy ∈ (0.5, 1]
Repeated Labeling [Sheng et al., 2008]: querying all experts for labeling
[Figure: axis label: Classification error]
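The Repeated Labeling baseline in this setting can be simulated with a short Python sketch. It is only an illustration under assumptions not stated on the slide: labels are binary (0 or 1), each expert errs independently, and the collected labels are combined by majority vote. All function names are hypothetical.

```python
import random
from collections import Counter


def draw_accuracies(num_experts, rng):
    """Draw each expert's accuracy uniformly from (0.5, 1]."""
    # rng.random() lies in [0, 1), so 1 - 0.5 * r lies in (0.5, 1].
    return [1.0 - 0.5 * rng.random() for _ in range(num_experts)]


def noisy_label(true_label, accuracy, rng):
    """Return the true binary label with probability `accuracy`, else flip it."""
    return true_label if rng.random() < accuracy else 1 - true_label


def repeated_labeling(true_label, accuracies, rng):
    """Query every expert and return the majority label (ASSUMED combiner)."""
    votes = [noisy_label(true_label, a, rng) for a in accuracies]
    return Counter(votes).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the simulation is repeatable
accuracies = draw_accuracies(5, rng)
label = repeated_labeling(1, accuracies, rng)
```

Every instance costs one query per expert under this baseline, which is the labeling cost that a selective strategy tries to reduce.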
10
# Oracle Queries vs. Accuracy
[Figure legend: first 10 iterations; next 40 iterations; next 100 iterations]
11
# Oracle queries to reach a target accuracy
[Figure annotations: skew increases; better]
12
Results on AMT Data with Human Annotators
IEThresh reaches the best performance with effort similar to Repeated Labeling
The Repeated Labeling baseline needs 840 queries in total to reach 0.95 accuracy
Dataset at http://nlpannotations.googlepages.com/ made available by [Snow et al., 2008]
[Figure panels: 5 annotators; 6 annotators]
13
Conclusions and Future Work
Conclusions
IEThresh is effective in balancing the exploration vs. exploitation tradeoff
Early filtering of unreliable labelers boosts performance
Utilizing labeler accuracy estimates is more effective than querying all labelers or selecting them at random
Future Work
From consistent to time-variant labeler quality
Label noise conditioned on the data instance
Correlated labeling errors
14
THANK YOU!