Analytic Methods for Optimizing Realtime Crowdsourcing
-
Upload
michael-bernstein -
Category
Technology
-
view
365 -
download
0
description
Transcript of Analytic Methods for Optimizing Realtime Crowdsourcing
MIT HUMAN-COMPUTER INTERACTION
Analytic Methods for Optimizing Realtime Crowdsourcing
Michael Bernstein, David Karger, Rob Miller, and Joel BrandtMIT CSAIL and Adobe Systems
Tuesday, May 8, 12
MIT HUMAN-COMPUTER INTERACTION
Use queueing theory to understand and optimize performance of a paid, realtime crowdsourcing platform.
β’Relationship between crowd size and response time
β’Algorithm for optimizing crowd size & cost vs. response time
β’ Improvements to the platform: 500 millisecond feedback
Tuesday, May 8, 12
Realtime CrowdsAnswering visual questions for blind users[Bigham et al. 2010]
Tuesday, May 8, 12
Realtime CrowdsAnswering visual questions for blind users[Bigham et al. 2010]
Crowd-assisted photography[Bernstein et al. 2011]
Tuesday, May 8, 12
Realtime CrowdsAnswering visual questions for blind users[Bigham et al. 2010]
Crowd-assisted photography[Bernstein et al. 2011]
Tuesday, May 8, 12
Paid CrowdsourcingPay small amounts of money for short tasks
Amazon Mechanical Turk: Roughly five million tasks completed per year at 1-5Β’ each [Ipeirotis 2010]
Label an imageRequester: Matt C.
Reward: $0.01
Transcribe short audio clipRequester: Gordon L.
Reward: $0.04
Tuesday, May 8, 12
Retainer RecruitmentWorkers sign up in advanceΒ½Β’ per minute to remain on callAlert when the task is ready
Wait at most: 5 minutesTask:Click on the verbs in the paragraph
He leapt the fence anddashed toward the door.
[Bernstein et al. 2011]Tuesday, May 8, 12
Retainer RecruitmentWorkers sign up in advanceΒ½Β’ per minute to remain on callAlert when the task is ready
Wait at most: 5 minutesTask:Click on the verbs in the paragraph
alert()
Start now! OK
He leapt the fence anddashed toward the door.
[Bernstein et al. 2011]Tuesday, May 8, 12
Retainer RecruitmentWorkers sign up in advanceΒ½Β’ per minute to remain on callAlert when the task is ready
50% of workers return in two seconds, and 75% of workers return in three seconds.
[Bernstein et al. 2011]Tuesday, May 8, 12
State of the LiteratureRealtime Crowds
β’ Recruit crowds in two seconds, execute traditional tasks (e.g., votes) in five seconds
β’ Maintain continuous control of remote interfaces
β’ Opportunities in deployable, intelligently reactive software
[Bigham et al. 2010, Bernstein et al. 2011, Lasecki et al. 2011]Tuesday, May 8, 12
The ChallengeRunning Out of Retainer Workers
Tuesday, May 8, 12
The ChallengeRunning Out of Retainer Workers
Tuesday, May 8, 12
The ChallengeRunning Out of Retainer Workers
Tuesday, May 8, 12
The ChallengeRunning Out of Retainer Workers
Tuesday, May 8, 12
The ChallengeRunning Out of Retainer Workers
Tuesday, May 8, 12
The ChallengeRunning Out of Retainer Workers
Tuesday, May 8, 12
The ChallengeRunning Out of Retainer Workers
LossNon-realtime response
Tuesday, May 8, 12
The TradeoffMissed tasks, non-realtime results
Extra retainer workers,extra cost
Tuesday, May 8, 12
The Goal
Optimize the tradeoff betweenrecruiting too many workers anddropping too many tasks.
Tuesday, May 8, 12
The Goal
Optimize the tradeoff betweenrecruiting too many workers anddropping too many tasks.
Budget-optimal crowdsourcing is possible in non-realtime scenarios [Dai, Mausam and Weld 2010; Kamar, Hacker and Horvitz 2012; Karger, Oh, and Shah 2011]
Tuesday, May 8, 12
1 Model
2 Optimization
3 PlatformOut
line
Tuesday, May 8, 12
Queueing Theory
β’ Formal framework for stochastic arrival and service processes
β’ Basic idea: random task arrivals and random processing times for workers
β’ Quantify how long tasks will need to waitin line
ModelOptimizePlatformO
utlin
e
Queueing theory for completion times: [Ipeirotis 2010]Tuesday, May 8, 12
Queueing TheoryM/M/1 queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate One server
Β΅οΏ½
Server
MM1
Tuesday, May 8, 12
Server
Queueing TheoryM/M/1 queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate One server
Β΅οΏ½M
M1
Tuesday, May 8, 12
ServerTask
Queueing TheoryM/M/1 queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate One server
Β΅οΏ½M
M1
Tuesday, May 8, 12
ServerTask
Queueing TheoryM/M/1 queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate One server
Β΅οΏ½M
M1
Tuesday, May 8, 12
Server
Queueing TheoryM/M/1 queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate One server
Β΅οΏ½M
M1
Tuesday, May 8, 12
Server
Queueing TheoryM/M/1 queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate One server
Β΅οΏ½M
M1
Tuesday, May 8, 12
Server
Queueing TheoryM/M/1 queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate One server
Β΅οΏ½M
M1
Tuesday, May 8, 12
Server
Queueing TheoryM/M/1 queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate One server
Β΅οΏ½M
M1
Tuesday, May 8, 12
Server
Queueing TheoryM/M/1 queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate One server
Β΅οΏ½M
M1
Tuesday, May 8, 12
Server
Queueing TheoryM/M/1 queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate One server
Β΅οΏ½M
M1
Tuesday, May 8, 12
Server
Queueing TheoryM/M/1 queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate One server
Β΅οΏ½M
M1
Tuesday, May 8, 12
Server
Queueing TheoryM/M/1 queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate One server
Β΅οΏ½M
M1
Tuesday, May 8, 12
Queueing TheoryM/M/c/c queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate c serversc max tasks in servers and queue
Β΅οΏ½M
Mcc
Tuesday, May 8, 12
Queueing TheoryM/M/c/c queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate c serversc max tasks in servers and queue
Β΅οΏ½M
Mcc
Tuesday, May 8, 12
Queueing TheoryM/M/c/c queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate c serversc max tasks in servers and queue
Β΅οΏ½M
Mcc
Tuesday, May 8, 12
Queueing TheoryM/M/c/c queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate c serversc max tasks in servers and queue
Β΅οΏ½M
Mcc
Tuesday, May 8, 12
Queueing TheoryM/M/c/c queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate c serversc max tasks in servers and queue
Β΅οΏ½M
Mcc
Tuesday, May 8, 12
Queueing TheoryM/M/c/c queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate c serversc max tasks in servers and queue
Β΅οΏ½M
Mcc
Tuesday, May 8, 12
Queueing TheoryM/M/c/c queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate c serversc max tasks in servers and queue
Β΅οΏ½M
Mcc
Tuesday, May 8, 12
All servers busy
Queueing TheoryM/M/c/c queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate c serversc max tasks in servers and queue
Β΅οΏ½M
Mcc
Tuesday, May 8, 12
Queueing TheoryM/M/c/c queue
Markovian (Poisson process) task arrivals, rateMarkovian (Poisson process) server work time, rate c serversc max tasks in servers and queue
Β΅οΏ½M
Mcc
Tuesday, May 8, 12
Modeling Retainer Recruitment
Tuesday, May 8, 12
Retainer QueueM/M/c/c queue
c workers, no waiting queueTask arrivals: Poisson process, rateWorker recruitment time: Poisson process, rate Β΅
οΏ½
Crowd
Tuesday, May 8, 12
Retainer QueueM/M/c/c queue
c workers, no waiting queueTask arrivals: Poisson process, rateWorker recruitment time: Poisson process, rate Β΅
οΏ½
Crowd
Tuesday, May 8, 12
Retainer QueueM/M/c/c queue
c workers, no waiting queueTask arrivals: Poisson process, rateWorker recruitment time: Poisson process, rate Β΅
οΏ½
Crowd
Tuesday, May 8, 12
c workers, no waiting queueTask arrivals: Poisson process, rateWorker recruitment time: Poisson process, rate Β΅
οΏ½
Retainer QueueM/M/c/c queue
Crowd
Tuesday, May 8, 12
c workers, no waiting queueTask arrivals: Poisson process, rateWorker recruitment time: Poisson process, rate Β΅
οΏ½
Retainer QueueM/M/c/c queue
Crowd
Tuesday, May 8, 12
c workers, no waiting queueTask arrivals: Poisson process, rateWorker recruitment time: Poisson process, rate Β΅
οΏ½
Retainer QueueM/M/c/c queue
Crowd
Tuesday, May 8, 12
c workers, no waiting queueTask arrivals: Poisson process, rateWorker recruitment time: Poisson process, rate Β΅
οΏ½
Retainer QueueLoss
Crowd
Tuesday, May 8, 12
c workers, no waiting queueTask arrivals: Poisson process, rateWorker recruitment time: Poisson process, rate Β΅
οΏ½
Retainer QueueLoss
Crowd
Tuesday, May 8, 12
c workers, no waiting queueTask arrivals: Poisson process, rateWorker recruitment time: Poisson process, rate Β΅
οΏ½
Retainer QueueLoss
Crowd
All servers busy
Tuesday, May 8, 12
Retainer QueueLoss
Crowd
All servers busy
Tuesday, May 8, 12
Retainer QueueLoss
Crowd
All servers busy
Tuesday, May 8, 12
Retainer QueueLoss
Crowd
All servers busy
P (i servers busy) = β‘(i)
P (all servers busy) = β‘(c)
Tuesday, May 8, 12
Retainer QueueLoss
Crowd
All servers busy
P (i servers busy) = β‘(i)
P (all servers busy) = β‘(c)
Tuesday, May 8, 12
Retainer QueueLoss
Crowd
All servers busy
P (i servers busy) = β‘(i)
P (all servers busy) = β‘(c)
Tuesday, May 8, 12
Retainer QueueLoss
Crowd
All servers busy
P (i servers busy) = β‘(i)
P (all servers busy) = β‘(c)
Tuesday, May 8, 12
Model Predictions
1. Probability that all workers are busy:β the task has to wait for expected time
2. Cost of keeping a retainer pool of size cβ cost depends on number of idle servers
1/Β΅β‘(c)
Tuesday, May 8, 12
Probability of Loss
β’ Draw on Erlangβs Loss Formula from queueing theory: probability of a rejected request in an M/M/c/c queue
β’ Let be the traffic intensity:
(roughly, the number of new tasks that will arrive in the time it takes to recruit a worker)
β’β’ = οΏ½/Β΅
Tuesday, May 8, 12
Probability of Loss
Erlangβs Loss Formula says:
Remarkably, this result makes no assumptionsabout the arrival distribution.
β‘(c) = P (c servers busy)
=β’c/c!Pci=0 β’
i/i!
Tuesday, May 8, 12
Probability of Loss
β
β
β
ββ
ββ
ββ
β
2 4 6 8 10
0.1
0.2
0.5
1.0
2.0
5.0
(a) Cost of retainer
Size of retainer pool
Paym
ent u
nits
per
uni
t tim
e
β
β
β
β
ββ
ββ
ββ
β
β
β
β
ββ
ββ
ββ
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
ββ
β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
β
2 4 6 8 10
1eβ1
01eβ0
71eβ0
41eβ0
1
(b) Probability of waiting
Size of retainer pool
Prob
abilit
y of
wai
ting
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β ββ
ββ
β
β β β β β β β β β β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
2 4 6 8 10
1eβ1
01eβ0
71eβ0
41eβ0
11e
+02
(c) Expected wait time
Size of retainer pool
Expe
cted
wai
t tim
e (s
econ
ds)
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β β ββ
ββ
β β β β β β β β β β
Traffic intensity Ο0.10.51510
β
β
β
ββ
ββ
ββ
β
2 4 6 8 10
0.1
0.2
0.5
1.0
2.0
5.0
(a) Cost of retainer
Size of retainer pool
Paym
ent u
nits
per
uni
t tim
e
β
β
β
β
ββ
ββ
ββ
β
β
β
β
ββ
ββ
ββ
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
ββ
β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
β
2 4 6 8 10
1eβ1
01eβ0
71eβ0
41eβ0
1
(b) Probability of waiting
Size of retainer pool
Prob
abilit
y of
wai
ting
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β ββ
ββ
β
β β β β β β β β β β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
2 4 6 8 10
1eβ1
01eβ0
71eβ0
41eβ0
11e
+02
(c) Expected wait time
Size of retainer pool
Expe
cted
wai
t tim
e (s
econ
ds)
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β β ββ
ββ
β β β β β β β β β β
Traffic intensity Ο0.10.51510
Tuesday, May 8, 12
Expected Waiting Time
P (c servers busy)β₯ (expected recruitment time)
= β‘(c)1
Β΅
=β’
c/c!Pc
i=0 β’i/i!
1
Β΅
P (c servers busy)β₯ (expected recruitment time)
= β‘(c)1
Β΅
=β’
c/c!Pc
i=0 β’i/i!
1
Β΅
P (c servers busy)β₯ (expected recruitment time)
= β‘(c)1
Β΅
=β’
c/c!Pc
i=0 β’i/i!
1
Β΅
Tuesday, May 8, 12
Expected CostHow much do we pay in steady-state?
Depends on how many workers are usually waiting on retainer.
Tuesday, May 8, 12
Expected CostProbability of i busy servers in an M/M/c/c queue is a more general version of Erlangβs Loss Formula:
Derive the expected number of busy workers:
β‘(i) =β’i/i!Pci=0 β’
i/i!
E[i] = β’[1οΏ½ β‘(c)]
Tuesday, May 8, 12
Expected CostProbability of i busy servers in an M/M/c/c queue is a more general version of Erlangβs Loss Formula:
Derive the expected number of busy workers:
Total cost is the number of idle workers:
β‘(i) =β’i/i!Pci=0 β’
i/i!
E[i] = β’[1οΏ½ β‘(c)]
cοΏ½ β’[1οΏ½ β‘(c)]
Tuesday, May 8, 12
Expected Cost
β
β
β
ββ
ββ
ββ
β
2 4 6 8 10
0.1
0.2
0.5
1.0
2.0
5.0
(a) Cost of retainer
Size of retainer pool
Paym
ent u
nits
per
uni
t tim
e
β
β
β
β
ββ
ββ
ββ
β
β
β
β
ββ
ββ
ββ
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
ββ
β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
β
2 4 6 8 10
1eβ1
01eβ0
71eβ0
41eβ0
1
(b) Probability of waiting
Size of retainer pool
Prob
abilit
y of
wai
ting
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β ββ
ββ
β
β β β β β β β β β β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
2 4 6 8 10
1eβ1
01eβ0
71eβ0
41eβ0
11e
+02
(c) Expected wait time
Size of retainer pool
Expe
cted
wai
t tim
e (s
econ
ds)
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β β ββ
ββ
β β β β β β β β β β
Traffic intensity Ο0.10.51510
Cost goes down when ,but performance suffers.
β
β
β
ββ
ββ
ββ
β
2 4 6 8 10
0.1
0.2
0.5
1.0
2.0
5.0
(a) Cost of retainer
Size of retainer pool
Paym
ent u
nits
per
uni
t tim
e
β
β
β
β
ββ
ββ
ββ
β
β
β
β
ββ
ββ
ββ
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
ββ
β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
β
2 4 6 8 10
1eβ1
01eβ0
71eβ0
41eβ0
1
(b) Probability of waiting
Size of retainer pool
Prob
abilit
y of
wai
ting
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β ββ
ββ
β
β β β β β β β β β β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
2 4 6 8 10
1eβ1
01eβ0
71eβ0
41eβ0
11e
+02
(c) Expected wait time
Size of retainer pool
Expe
cted
wai
t tim
e (s
econ
ds)
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β β ββ
ββ
β β β β β β β β β β
Traffic intensity Ο0.10.51510
c < β’
Tuesday, May 8, 12
Expected Cost
β
β
β
ββ
ββ
ββ
β
2 4 6 8 10
0.1
0.2
0.5
1.0
2.0
5.0
(a) Cost of retainer
Size of retainer pool
Paym
ent u
nits
per
uni
t tim
e
β
β
β
β
ββ
ββ
ββ
β
β
β
β
ββ
ββ
ββ
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
ββ
β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
β
2 4 6 8 101eβ1
01eβ0
71eβ0
41eβ0
1
(b) Probability of waiting
Size of retainer pool
Prob
abilit
y of
wai
ting
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β ββ
ββ
β
β β β β β β β β β β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
2 4 6 8 10
1eβ1
01eβ0
71eβ0
41eβ0
11e
+02
(c) Expected wait time
Size of retainer pool
Expe
cted
wai
t tim
e (s
econ
ds)
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β β ββ
ββ
β β β β β β β β β β
Traffic intensity Ο0.10.51510
Cost goes down when ,but performance suffers.
β
β
β
ββ
ββ
ββ
β
2 4 6 8 10
0.1
0.2
0.5
1.0
2.0
5.0
(a) Cost of retainer
Size of retainer pool
Paym
ent u
nits
per
uni
t tim
e
β
β
β
β
ββ
ββ
ββ
β
β
β
β
ββ
ββ
ββ
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
ββ
β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
β
2 4 6 8 10
1eβ1
01eβ0
71eβ0
41eβ0
1
(b) Probability of waiting
Size of retainer pool
Prob
abilit
y of
wai
ting
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β ββ
ββ
β
β β β β β β β β β β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
2 4 6 8 10
1eβ1
01eβ0
71eβ0
41eβ0
11e
+02
(c) Expected wait time
Size of retainer pool
Expe
cted
wai
t tim
e (s
econ
ds)
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β β ββ
ββ
β β β β β β β β β β
Traffic intensity Ο0.10.51510
c < β’
β
β
β
ββ
ββ
ββ
β
2 4 6 8 10
0.1
0.2
0.5
1.0
2.0
5.0
(a) Cost of retainer
Size of retainer pool
Paym
ent u
nits
per
uni
t tim
e
β
β
β
β
ββ
ββ
ββ
β
β
β
β
ββ
ββ
ββ
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
ββ
β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
β
2 4 6 8 101eβ1
01eβ0
71eβ0
41eβ0
1
(b) Probability of waiting
Size of retainer pool
Prob
abilit
y of
wai
ting
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β ββ
ββ
β
β β β β β β β β β β
Traffic intensity Ο0.10.51510
β
β
β
β
β
β
β
β
2 4 6 8 10
1eβ1
01eβ0
71eβ0
41eβ0
11e
+02
(c) Expected wait time
Size of retainer pool
Expe
cted
wai
t tim
e (s
econ
ds)
β
β
β
β
β
β
β
β
β
β
ββ
β
β
β
β
β
β
β
β
β β β β β β ββ
ββ
β β β β β β β β β β
Traffic intensity Ο0.10.51510
Tuesday, May 8, 12
Optimal Retainer Sizeβ’ Size of retainer pool is typically the only
value that requesters can manipulate
β’ Minimize costs by keeping the retainer pool small while keeping low
ModelOptimizePlatformO
utlin
e
β‘(c)
Tuesday, May 8, 12
Optimal Retainer SizeBased on Maximum Miss Probability
Given a maximum desired probability of a miss :
Minimize c subject to
β
β
β
β
β
β
β
β
β
0 2 4 6 8
1eβ1
01eβ0
4
Cost vs. probability of waiting
Expected payments per unit time
Prob
abilit
y of
wai
ting
β
β
β
β
β
β
β
β
β
ββ
ββ
β
β
β
β
β
β
ββ β β β β β ββ
ββ
ββ
β
βββββββ β β β β β β β β β ββ
β
Traffic intensity Ο0.10.51510
pmax
β‘(c) pmax
Tuesday, May 8, 12
Optimal Retainer SizeBased on Maximum Miss Probability
Given a maximum desired probability of a miss :
Minimize c subject to
β
β
β
β
β
β
β
β
β
0 2 4 6 8
1eβ1
01eβ0
4
Cost vs. probability of waiting
Expected payments per unit time
Prob
abilit
y of
wai
ting
β
β
β
β
β
β
β
β
β
ββ
ββ
β
β
β
β
β
β
ββ β β β β β ββ
ββ
ββ
β
βββββββ β β β β β β β β β ββ
β
Traffic intensity Ο0.10.51510
pmax
β‘(c) pmax
Tuesday, May 8, 12
Optimal Retainer SizeBased on Joint Cost
If the βpizza deliveryβ property holds: we can quantify the cost of loss
ββ
ββ
ββ
β β β β
2 4 6 8 10
15
2010
050
0
Total cost of task and retainer
Size of retainer pool
Com
bine
d co
st o
f ret
aine
r and
mis
sed
task
β
ββ
ββ
ββ β β β
β
β
β
β ββ
β β β β
β
β
β
β
ββ β β β β
Cost of a missed task1101001000
Tuesday, May 8, 12
Improving the Retainer Model
ModelOptimizePlatformO
utlin
e
1 Subscriptions2 Shared Pools3 Predictive
Recruitment
Tuesday, May 8, 12
Retainer Subscriptions
β’ Proposal: increase by allowing workers to subscribe to realtime tasks
β’ Instead of posting to the global task list, the platform sends a message to subscribers
β’ Change crowdsourcing from a pull model to a push model
Β΅
Tuesday, May 8, 12
Global Retainer Poolsβ’ Sharing one global retainer pool across
requesters improves performance
β’ Intuition: Most workers are padding for unlikely runs of arrivals)
Time
Task 1
Tuesday, May 8, 12
Global Retainer Poolsβ’ Sharing one global retainer pool across
requesters improves performance
β’ Intuition: Most workers are padding for unlikely runs of arrivals)
Time
Task 1Task 2
Tuesday, May 8, 12
Global Retainer Poolsβ’ Sharing one global retainer pool across
requesters improves performance
β’ Intuition: Most workers are padding for unlikely runs of arrivals)
Time
Combined
Tuesday, May 8, 12
Global Retainer Poolsβ’ Sharing one global retainer pool across
requesters improves performance
β’ Intuition: Most workers are padding for unlikely runs of arrivals)
Time
Combined
Tuesday, May 8, 12
Global Retainer Poolsβ’ Through approximation, individual pools:
β’ Shared pools across k requesters:
β’ Loss rate declines exponentially with the number of bundled retainer pools
β‘(c) β‘p2β‘c
οΏ½eοΏ½β’(eβ’/c)c
οΏ½
β‘(c) β‘p2β‘kc
οΏ½eοΏ½β’(eβ’/c)c
οΏ½kβ‘(c) β‘p2β‘c
οΏ½eοΏ½β’(eβ’/c)c
οΏ½
β‘(c) β‘p2β‘kc
οΏ½eοΏ½β’(eβ’/c)c
οΏ½k
β‘(c) β‘
β‘(c) β‘
Tuesday, May 8, 12
Global Retainer Poolsβ’ Through approximation, individual pools:
β’ Shared pools across k requesters:
β’ Loss rate declines exponentially with the number of bundled retainer pools
β‘(c) β‘p2β‘c
οΏ½eοΏ½β’(eβ’/c)c
οΏ½
β‘(c) β‘p2β‘kc
οΏ½eοΏ½β’(eβ’/c)c
οΏ½kβ‘(c) β‘p2β‘c
οΏ½eοΏ½β’(eβ’/c)c
οΏ½
β‘(c) β‘p2β‘kc
οΏ½eοΏ½β’(eβ’/c)c
οΏ½k
β‘(c) β‘
β‘(c) β‘
Tuesday, May 8, 12
Global Retainer Pools
Cost dramatically decreases as you combine retainers: k dollars to log(k) dollars
Tuesday, May 8, 12
Global Retainer Routing
β’ Not every worker in a global retainer pool is good at every task
β’ If we assigned each worker to any task they could do, some tasks would starve
Tuesday, May 8, 12
Global Retainer Routing
β’ We want to maintain a buffer of workers to respond to all kinds of tasks
β’ A linear programming technique can balance the traffic intensities across all tasks
Tuesday, May 8, 12
Precruitmentβ’ Predictive Recruitment: notify workers
before the task arrives
β’ Recall workers in expectation of having a task by the time they arrive 2β3 seconds later
Tuesday, May 8, 12
PrecruitmentFormative Study, N=373 tasks
β’ 3Β’ for 3-minute retainer task: whack-a-mole
β’ βLoading...β screen for randomly-selected time [0, 20] seconds after worker returns
β’ Click on randomly-placed mole
Tuesday, May 8, 12
PrecruitmentFormative Study, N=373 tasks
β’ 3Β’ for 3-minute retainer task: whack-a-mole
β’ βLoading...β screen for randomly-selected time [0, 20] seconds after worker returns
β’ Click on randomly-placed mole
Tuesday, May 8, 12
PrecruitmentResults
β’ Median time to mouse move: 0.50 seconds
β’
β’ Standard retainer model (start timer @ alert): median mouse move in 1.36 seconds
Time waiting for task (sec)
Res
pons
e tim
e to
mov
e m
ouse
(sec
)
0.51.0
5.010.0
β
β
β
β
β
β
β
ββ
β
ββ
ββ
β β ββ
β
β
β
β
β β ββ
β
β
β ββ
β
ββ
β
ββ β
β
ββ
βββ
ββββ
ββ
β
ββ
β
βββ
β
β
ββ
ββ
β β
ββ β
β
β
β
βββ
β
ββ
βββ
β
ββ
β
β
β
ββ β
β
β
ββ
β
β
β
β β
β
β
β
β
ββ
β β βββ
β
β
ββ
ββ
β
β
β
β
β β β βββ
β
β
β
ββ
β β
β
β
β
ββ
ββ
β
ββ
β
β
ββ
β
ββ
β
β
β
β
β
β
β
β
β β
β
β
β
ββ
β
ββ
β
β
β
β
β β ββ
β
ββ
β
ββ
ββ βββ
β
β
ββ
ββ
βββ
ββ
β
β
β ββ
β
β β
β
ββ
β
β
β
β
ββ
βββ
β
β
β
ββ
β
β
ββ
ββ
βββ
β
β
β
β
β
β
β
β ββ ββ
β
β
β
β
ββ βββ
ββ
β
β
β
β
β
β
ββ β
β
ββ
β
β
β
β
β
β
ββ
βββ
β
β
β
β
β β
β
ββ β
β
ββ
ββ
ββ
β ββ
β
β
ββ
β
β
β
β
β
β β
β
β
β
ββ
β
ββ
β ββ
β
β
β
β
β
β
β
β
β
ββ
β
ββ
β
β
β
β
β
β
β
β
β
β
β
β
β
β
β β ββ
β
ββ
ββ
β
ββ
β
β
β
β
ββ
ββ β
β
ββββ
0 5 10 15 20
Tuesday, May 8, 12
Discussionβ’ Empirics: Can deployed crowdsourcing
platforms support lots of realtime tasks?
β’ Theory: Crowds as queueing systems
β’ Reputation: median response time, overall response rate
Tuesday, May 8, 12
MIT HUMAN-COMPUTER INTERACTION
Use queueing theory to understand and optimize performance of a paid, realtime crowdsourcing platform.
β’Relationship between crowd size and response time
β’Algorithm for optimizing crowd size vs. response time
β’ Improvements to the platform: 500 millisecond feedback
Tuesday, May 8, 12
MIT HUMAN-COMPUTER INTERACTION
Analytic Methods for Optimizing Realtime Crowdsourcing
Michael Bernstein, David Karger, Rob Miller, and Joel BrandtMIT CSAIL and Adobe Systems
Tuesday, May 8, 12