Making Better Use of the Crowd Jenn Wortman Vaughan Microsoft Research

Page 1:

Making Better Use of the Crowd

Jenn Wortman Vaughan, Microsoft Research

Page 2:

A few disclaimers…

Page 3:

Are there better ways to make use of the crowd?

Page 4:

What other problems can the crowd solve?

Page 5:

1. Direct Applications to NLP and Machine Learning

2. Hybrid Intelligence Systems

3. Large Scale Studies of Human Behavior

Part 1: The Potential of Crowdsourcing

Page 6:

“Crowd”

guitar

man

Page 7:

Part 2: The Crowd is Made of People

• What motivates workers?
• Are workers independent?
• Are workers honest?

What does this teach us about how to effectively interact with the crowd?

Hint: Be respectful. Be responsive. Be clear.

Page 8:

Extensive notes, slides, and eventually video at

http://www.jennwv.com/projects/crowdtutorial.html

Page 9:

Part 1: The Potential of Crowdsourcing

Page 10:

1. Direct Applications to NLP and Machine Learning

2. Hybrid Intelligence Systems

3. Large Scale Studies of Human Behavior

The Potential of Crowdsourcing

Page 11:

Generating Labeled Data

Page 12:

Learner

Page 13:

Learner

“dog” “cat”

“dog” “cat”

“cat” “cat”

Aggregation of noisy labels

Page 14:

Learner

“dog” “cat”

Trained Model

Aggregation of noisy labels

“dog” “cat”

“cat” “cat”

“cat”

Page 15:

Learner

“dog” “cat”

Trained Model

Aggregation of noisy labels

“dog” “cat”

“cat” “cat”

Used to annotate medical images, label text, extract and label features of scenes.

Inspired huge amounts of algorithmic work on aggregation.

Page 16:

Aggregating Labels with EM

• Input: Worker-generated labels for each instance

• Calculate an initial estimate of each instance’s label based on a simple majority vote

• Repeat until convergence:
  – Treating the current label estimates as truth, estimate each worker’s quality
  – Treating the quality estimates as truth, calculate the most likely label for each instance

• Output: One aggregated label for each instance

[Dawid and Skene, 1979]
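
The alternating procedure on this slide can be sketched in a few lines of Python. This is a simplified illustration, not the full Dawid and Skene model: it assumes a single accuracy number per worker (rather than a per-class confusion matrix), uses hard label assignments, and adds a small smoothing constant so accuracies never hit 0 or 1. The function name `aggregate_labels`, its parameters, and the example votes are all hypothetical.

```python
from collections import Counter, defaultdict
import math


def aggregate_labels(votes, classes, max_iters=50, smoothing=1.0):
    """votes: list of (worker, instance, label) triples; needs >= 2 classes.

    Returns a dict mapping each instance to one aggregated label.
    """
    by_instance = defaultdict(list)
    for worker, instance, label in votes:
        by_instance[instance].append((worker, label))

    # Initial estimate: simple majority vote for each instance.
    estimates = {
        inst: Counter(lbl for _, lbl in wl).most_common(1)[0][0]
        for inst, wl in by_instance.items()
    }

    for _ in range(max_iters):
        # Treat the current estimates as truth; estimate each worker's accuracy.
        correct, total = defaultdict(float), defaultdict(float)
        for worker, instance, label in votes:
            total[worker] += 1
            correct[worker] += label == estimates[instance]
        accuracy = {
            w: (correct[w] + smoothing) / (total[w] + 2 * smoothing)
            for w in total
        }

        # Treat the accuracies as truth; pick the most likely label per instance.
        new_estimates = {}
        for inst, wl in by_instance.items():
            def log_likelihood(candidate):
                score = 0.0
                for worker, label in wl:
                    p = accuracy[worker]
                    if label == candidate:
                        score += math.log(p)
                    else:
                        # Assumed: errors are spread evenly over other classes.
                        score += math.log((1 - p) / (len(classes) - 1))
                return score

            new_estimates[inst] = max(classes, key=log_likelihood)

        if new_estimates == estimates:  # converged
            break
        estimates = new_estimates
    return estimates


# Hypothetical votes: three workers label two images.
votes = [
    ("a", "x", "cat"), ("b", "x", "cat"), ("c", "x", "dog"),
    ("a", "y", "dog"), ("b", "y", "dog"), ("c", "y", "dog"),
]
labels = aggregate_labels(votes, classes=["cat", "dog"])
```

Because worker accuracies act as weights in the second step, a consistently reliable worker can outvote several unreliable ones, which plain majority voting cannot do.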

Page 17:

Aggregating Labels with EM

• No guarantees on optimality, but tends to work pretty well in practice

• Many recent variants have been proposed to incorporate the varying difficulty levels of instances, worker expertise, the existence of “gold” tasks, etc.

[Dawid and Skene, 1979]

Page 18:

Beyond Simple Labels: Crowd Translation

• Crowdworkers are asked to
  – Translate sentences from one language to another
  – Edit other workers’ translations to make them more fluent and grammatical
  – Rank the quality of the resulting translations

• Can use machine learning to predict the highest quality translation based on sentence-level features, worker-level features, and ranks

[Zaidan and Callison-Burch, 2011]

Page 19:

Generating Similarity Measures

[Gomes et al., 2011]

Page 20:

Generating Similarity Measures

flags / no flags

[Gomes et al., 2011]

Page 21:

Generating Similarity Measures

Democrats / Republicans

[Gomes et al., 2011]

Page 22:

Crowd Clustering

[Gomes et al., 2011]

Bayesian model

Page 23:

Crowdsourcing for Evaluation

Page 24:

cheese, kale, bread, steak, mushroom, pizza, ...

election, senate, bill, delegate, president, proposal, ...

Evaluating Topic Models

To be useful for data exploration or summarization, topics must be human-interpretable!

[Chang et al., 2009]

Page 25:

Evaluating Topic Models

Word intrusion task:

mushroom, kale, cheese, bread, election, steak

worker accuracy

human-interpretability

Previous measures of success (e.g., log likelihood of held-out data) do not imply interpretability!

[Chang et al., 2009]
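
As a rough illustration of how a word intrusion task can be scored, the sketch below computes the fraction of workers who pick out the intruder word from a topic's word list. The function name and the worker responses are hypothetical; only the word list comes from the slide, and treating "election" as the intruder is an assumption based on it being the one non-food word.

```python
def intruder_detection_rate(worker_choices, intruder):
    """Fraction of workers whose chosen odd-word-out is the true intruder."""
    if not worker_choices:
        raise ValueError("need at least one worker response")
    hits = sum(1 for choice in worker_choices if choice == intruder)
    return hits / len(worker_choices)


words = ["mushroom", "kale", "cheese", "bread", "election", "steak"]
# Hypothetical responses from four workers shown the list above.
choices = ["election", "election", "kale", "election"]
rate = intruder_detection_rate(choices, intruder="election")
```

A high rate suggests the remaining words form a coherent, human-interpretable topic; a rate near chance suggests they do not.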

Page 26:

Human Debugging

Page 27:

Human Debugging

• Semantic segmentation: partition an image into semantically meaningful parts, label each part

[Parikh & Zitnick, 2011; Mottaghi et al., 2013]

“cat”

Page 28:

Human Debugging

• Semantic segmentation: partition an image into semantically meaningful parts, label each part

Which component is the weakest link?

segment classifier

supersegment classifier

scene classifier

shape prior

object detector

CRF model

[Parikh & Zitnick, 2011; Mottaghi et al., 2013]

Page 29:

Human Debugging

segment classifier

supersegment classifier

scene classifier

shape prior

object detector

CRF model

[Parikh & Zitnick, 2011; Mottaghi et al., 2013]

Page 30:

Human Debugging

segment classifier

supersegment classifier

scene classifier

shape prior

object detector

CRF model

[Parikh & Zitnick, 2011; Mottaghi et al., 2013]

Page 31:

Human Debugging

segment classifier

supersegment classifier

scene classifier

shape prior

object detector

CRF model

[Parikh & Zitnick, 2011; Mottaghi et al., 2013]

Page 32:

Human Debugging

segment classifier

supersegment classifier

scene classifier

shape prior

object detector

CRF model

[Parikh & Zitnick, 2011; Mottaghi et al., 2013]

Humans less accurate at task, but system performance still improved

Page 33:

1. Direct Applications to NLP and Machine Learning

2. Hybrid Intelligence Systems

3. Large Scale Studies of Human Behavior

The Potential of Crowdsourcing

Page 34:

Hybrid Intelligence for Speech Recognition

Page 35:

Crowd-Based Closed Captioning

Is it possible to provide real-time closed captioning of lectures, meetings, or other day-to-day conversations?

[Lasecki et al., 2012]

Page 36:

The system merges real-time partial inputs from dynamic, untrained crowds to outperform individuals

Crowd-Based Closed Captioning

[Lasecki et al., 2012]

Page 37:

Hybrid Intelligence for Scheduling

Page 38:

Cobi: Communitysourced Scheduling

[projectcobi.com]

A big constrained optimization problem with no access to the constraints!

Page 39:

1. Committeesourcing 2. Authorsourcing

3. Scheduling 4. Attendeesourcing

[projectcobi.com]

Page 40:

Authorsourcing

crowdsourced clustering!

[projectcobi.com]

87% response rate!

Page 41:

Scheduling

[projectcobi.com]

The system solves an optimization problem to propose a schedule, but chairs retain control.

Page 42:

Hybrid Intelligence for Writing

Page 43:

The Selfsourcing Process

1. Collect content

2. Organize content

3. Turn content into writing

[Teevan et al., 2016]

Page 44:

Collect Content

The MicroWriter breaks writing into microtasks.

Collaborative writing typically requires coordination.

Microtasks can be done while mobile.

Structure turns big tasks into small microtasks.

Microtasks can be shared with collaborators.

Collaborators can be known or crowd workers.

People have spare time when mobile.

Microtasks make it easy to get started.

[Teevan et al., 2016]

Page 45:

Organize Content

collaboration

microtask

mobile

The MicroWriter breaks writing into microtasks.

Collaborative writing requires coordination.

Microtasks can be done while mobile.

Structure turns big tasks into small microtasks.

Microtasks can be shared with collaborators.

Collaborators can be known or crowd workers.

People have spare time when mobile.

Microtasks make it easy to get started.

[Teevan et al., 2016]

Page 46:

Turn Content into Writing

Collaborative writing typically requires coordination, but microtasks are easy to share with collaborators without the need for coordination. The collaborators can be known colleagues or paid crowd workers.

[Teevan et al., 2016]

Collaborative writing requires coordination.

Microtasks can be shared with collaborators.

Collaborators can be known or crowd workers.

collaboration

Page 47:

Turn Content into Writing

Collaborative writing typically requires coordination, but microtasks are easy to share with collaborators without the need for coordination. The collaborators can be known colleagues or paid crowd workers.

Structure makes it possible to turn big tasks into a series of smaller microtasks. For example, the MicroWriter breaks writing into microtasks. These microtasks make the larger task easier to start.

People have spare time when mobile, and these micromoments are ideal for doing microtasks.

[Teevan et al., 2016]

Page 48:

The Selfsourcing Process

1. Collect content

2. Organize content

3. Turn content into writing

• Steps 2 & 3 could be done by crowdworkers, traditional ML/AI approaches, or a combination

• Author takes final pass, no need for perfection

Crowdsourcing

[Teevan et al., 2016]

Page 49:

Hybrid Intelligence in Industry

Page 50:

1. Direct Applications to NLP and Machine Learning

2. Hybrid Intelligence Systems

3. Large Scale Studies of Human Behavior

The Potential of Crowdsourcing

Page 51:

User Studies for Security Research

Page 52:

How well do Internet users understand security risks?

Who tries to guess passwords?

Only 14% mentioned both strangers and familiar people as threats

p@ssw0rd vs. pAsswOrd

[Ur et al., 2016]

Page 53:

User Studies to Improve the Communication of Numbers

Page 54:

[Barrio et al., 2016]

Page 55:

Perspectives

• Is a one hundred billion dollar cut to the US federal budget big or small?

• One hundred billion dollars is about...
  – 3% of the 2015 US federal budget
  – 1/6 of annual US spending on military
  – 30% of the net worth of Beyoncé
  – $5 for every person in New York state

[Barrio et al., 2016]

Page 56:

Step 1: Perspective Generation

Six months of New York Times front page articles

64 quotes with measurements

370 crowd-generated perspectives with incentives for quality

Workers rated other workers’ perspectives for helpfulness

Chose the highest-rated perspectives

[Barrio et al., 2016]

Page 57:

Perspective Examples

• The Ohio National Guard brought 33,000 gallons of drinking water to the region.

• To put this into perspective, 33,000 gallons of water is about equal to the amount of water it takes to fill 2 average swimming pools.

[Barrio et al., 2016]

Page 58:

Perspective Examples

• They also recommended safety programs for the nation’s gun owners; Americans own almost 300 million firearms.

• To put this into perspective, 300 million firearms is about 1 firearm for every person in the United States.

[Barrio et al., 2016]

Page 59:

Step 2: Perspective Experiments

• Randomized experiments run on 3200+ subjects on AMT to test three proxies of comprehension
  – Recall
  – Estimation
  – Error detection

• Support found for the benefits of perspectives across all experiments
  – Example: 55% remembered number of firearms in US with perspective, only 40% without

[Barrio et al., 2016]

Page 60:

User Studies for Online Advertising

Page 61:

The Cost of Annoying Ads

[Goldstein et al., 2013]

Advertisers pay publishers to display ads, but annoying ads cost publishers page views.

How much do annoying ads cost publishers in dollars?

vs.

Page 62:

The Cost of Annoying Ads

[Goldstein et al., 2013]

Step 1: Use the crowd to identify annoying ads.

vs.

Page 63:

Good Ads

[Goldstein et al., 2013]

Page 64:

Bad Ads

[Goldstein et al., 2013]

Page 65:

Step 2: Estimate the Cost

• Workers asked to label email as spam or not
• Shown good, bad, or no ads; paid varying amounts per email
• How much more must a worker be paid to do the same tasks when shown bad ads?

[Goldstein et al., 2013]

Page 66:

Step 2: Estimate the Cost

• Good ads lead to about the same number of views (emails classified) as no ads

• Costs more than $1 extra to generate 1000 views of bad ads instead of no ads or good ads

• Takeaway: Publishers lose money by showing bad ads unless they are paid significantly more to show them

[Goldstein et al., 2013]

Page 67:

1. Direct Applications to NLP and Machine Learning

2. Hybrid Intelligence Systems

3. Large Scale Studies of Human Behavior

Summary of Part 1

Page 68:

Part 2: The Crowd is Made of People

Page 69:

Traditional computer science tools let us reason about programs run on machines (runtime, scalability, correctness, ...)

What happens when there are humans in the loop?

Need a model of human behavior. (Are they accurate? Honest? Do they respond rationally to incentives?)

Wrong assumptions lead to suboptimal systems!

Page 70:

“But I only want to use crowdsourcing to generate training data or evaluate my model.”

Understanding the crowd can teach you
– How much to pay for your tasks and what payment structure to use
– How much you really need to worry about spam
– How and why to communicate with workers
– Whether your labels/evaluations are independent
– How to avoid common pitfalls

Page 71:

The Crowd is Made of People

• Crowdworker demographics
• Honesty of crowdworkers
• Monetary incentives
• Intrinsic motivation
• The network within the crowd

Best practices! Tips and tricks!

Page 72:

Crowdsourcing Platforms

Page 73:

Amazon Mechanical Turk

Workers Requesters

Page 74:

Alternative Platforms

• German platform with many European workers offering support for translation and web research plus mobile crowdsourcing

• Offers enterprise solutions for businesses with AI/data needs (search relevance evaluation, sentiment analysis, data classification)

Page 75:

Alternative Platforms

• Marketplace for freelancers with larger jobs like writing articles or designing websites

• UK-based platform focused on connecting researchers with subjects for experiments

Page 76:

Crowdworker Demographics

Page 77:

Demographics of Mechanical Turk

[mturk-tracker.com]

Page 78:

Demographics of Mechanical Turk

[mturk-tracker.com]

• 70-80% US, 10-20% India

• Roughly equal gender split

• Median (reported) household income:
  – $40K-$60K for US workers
  – Less than $15K for Indian workers

• Can be big changes depending on time of day

Page 79:

Are workers dishonest?

Page 80:

Experimental Paradigm

• Ask participants about demographics
  – Sex, Age, Location, Income, Education

• Ask participants to privately roll a die (or simulate it on an external website) and report the outcome

payment = $0.25 + ($0.25 * roll)

• If workers are honest, the mean reported roll should be about 3.5... What do you think the mean was?

[Suri et al., 2011]
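
The honest-reporting benchmark for this experiment can be checked with a few lines of arithmetic. The sketch below assumes a fair six-sided die and uses the payment formula from the slide; the variable and function names are illustrative only.

```python
from fractions import Fraction


def payment(roll):
    """Payment in dollars for a reported roll: $0.25 + ($0.25 * roll)."""
    return Fraction(1, 4) + Fraction(1, 4) * roll


faces = range(1, 7)
# Mean of a fair die: (1 + 2 + ... + 6) / 6 = 7/2 = 3.5
expected_roll = Fraction(sum(faces), 6)
# Expected payment under honest reporting: 9/8 = $1.125
expected_payment = sum(payment(r) for r in faces) / 6
# Payment for always reporting a six: 7/4 = $1.75
max_payment = payment(6)
```

The gap between the expected honest payment and the maximum payment is the incentive a worker has to misreport, which is what makes the mean reported roll a useful signal of dishonesty.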

Page 81:

Baseline

• Average reported roll higher than expectation
  – M = 3.91, p < 0.0005

• Players under-reported ones and twos and over-reported fives

• But many workers were honest!

• Similar to Fischbacher & Heusi lab study

[Figure: histogram of reported rolls; x-axis: Roll (1 to 6), y-axis: Proportion (0.00 to 0.25)]

[Suri et al., 2011]

Page 82:

Thirty rolls

• Overall, much less dishonesty

• Average reported roll much closer to expectation
  – M = 3.57, p < 0.0005

• Only 3 of 232 reported significantly unlikely outcomes

• Only 1 was fully income maximizing (all sixes)

• Why is this the case?

[Figure: histogram of reported rolls; x-axis: Roll (1 to 6), y-axis: Proportion (0.00 to 0.15)]

[Suri et al., 2011]

Page 83:

Dishonesty Can Add Up

• “Are you the parent or guardian of a child with autism?”
  – 4.3% of participants said yes in control
  – 7.8% of participants said yes when told that this was a prescreening test for a further study

• Seems like a small difference, but would lead to (at least) 45% imposters in the subsequent study!

[Chandler and Paolacci, 2011]
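
The 45% figure follows from the two rates on the slide. The sketch below assumes the control rate approximates the share of participants who truthfully qualify, so the excess "yes" answers under prescreening are imposters; it is a lower bound because some control "yes" answers could also be false.

```python
control_rate = 0.043     # said yes in the control condition
prescreen_rate = 0.078   # said yes when told it was a prescreening test

# Among those who pass the prescreen, the share not explained by the
# control rate: (7.8 - 4.3) / 7.8, roughly 0.45
imposter_share = (prescreen_rate - control_rate) / prescreen_rate
```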

Page 84:

Takeaways & Related Best Practices

• Most workers are honest most of the time.

• But some are not. You should still use care to avoid attacks.

• Workers may deceive requesters to gain access to work. Prescreening should be done with care, ideally as part of a separate task.

Page 85:

Monetary Incentives

Page 86:

How much should you pay?

A useful trick:
• Pilot your task on students, colleagues, or a few workers to see how long it generally takes.
• Use that to make sure your payments work out to at least the US minimum wage.

Benefits:
• It’s the decent thing to do!
• It helps maintain good relationships with workers.
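
One way to turn pilot timings into a per-task payment is sketched below. It assumes the median pilot completion time is a fair summary of how long the task takes and uses the US federal minimum wage of $7.25 per hour as a default; both are assumptions to adjust (many states and cities set higher minimums). The function name and the pilot timings are hypothetical.

```python
import math
import statistics


def min_payment_per_task(pilot_seconds, hourly_wage=7.25):
    """Smallest per-task payment in dollars (rounded up to a whole cent)
    that meets hourly_wage, given pilot completion times in seconds."""
    typical_seconds = statistics.median(pilot_seconds)
    cents = math.ceil(typical_seconds / 3600 * hourly_wage * 100)
    return cents / 100


# Hypothetical pilot: three people took 90, 120, and 150 seconds.
pay = min_payment_per_task([90, 120, 150])
```

Rounding up rather than to the nearest cent keeps the payment at or above the target wage for the typical worker.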

Page 87:

Can performance-based payments improve the quality of crowdwork?

Proofread this text, earn $0.50

Earn an extra $0.10 for every typo found

[Ho et al., 2015]

Page 88:

Prior Work on Crowd Payments

– Paying more increases the quantity of work, but not the quality [MW09, RK+11, BKG11, LRR14]
– Performance-based payments (PBPs) improve quality [H11, YCS14]
– PBPs do not improve quality [SHC11]
– Bonus sizes don’t matter [YCS13]

[Ho et al., 2015]

Page 89:

Performance-Based Payments

We explore when, where, and why performance-based payments improve the quality of crowdwork on Amazon Mechanical Turk.

[Ho et al., 2015]

Page 90:

Can PBPs work?

• Warm-up to verify that PBPs can lead to higher quality crowdwork on some task.

• Test whether there exists an implicit PBP effect: workers have subjective beliefs on the quality of work they must produce to receive the base payment, and so already behave as if payments are (implicitly) performance-based.

[Ho et al., 2015]

Page 91:

Can PBPs work?

• Task: Proofread an article and find spelling errors.
• We randomly insert 20 typos
  • sufficiently -> sufficently
  • existence -> existance
  • …
• Useful properties:
  • Quality is measurable
  • Exerting more effort -> better results

[Ho et al., 2015]

Page 92:

Can PBPs work?

Base payment: $0.50; Bonus payment: $1.00

Three Bonus Treatments:
• No Bonus: no bonus or mention of a bonus
• Bonus for All: get the bonus unconditionally
• PBP: get the bonus if you find 75% of the typos found by others

Two Base Treatments:
– Guaranteed: guaranteed to get paid
– Non-Guaranteed: no mention of a guarantee

[Ho et al., 2015]
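
As a rough illustration of the three bonus treatments, the payment rule might be sketched like this. The treatment labels, the `payment` function, and the reading of "75% of the typos found by others" as "at least 75% of the distinct typos reported by other workers" are assumptions made for this sketch, not details confirmed by the slides.

```python
import math

BASE_PAYMENT = 0.50   # dollars, from the slide
BONUS_PAYMENT = 1.00  # dollars, from the slide


def payment(treatment, found, found_by_others, threshold=0.75):
    """Total payment in dollars for one worker.

    treatment: "no_bonus", "bonus_for_all", or "pbp" (hypothetical labels).
    found: set of typo ids this worker reported.
    found_by_others: set of typo ids reported by any other worker.
    """
    if treatment == "no_bonus":
        return BASE_PAYMENT
    if treatment == "bonus_for_all":
        return BASE_PAYMENT + BONUS_PAYMENT
    if treatment == "pbp":
        # Assumed reading: the worker must report at least `threshold`
        # of the typos that other workers collectively reported.
        needed = math.ceil(threshold * len(found_by_others))
        hits = len(found & found_by_others)
        bonus = BONUS_PAYMENT if hits >= needed else 0.0
        return BASE_PAYMENT + bonus
    raise ValueError(f"unknown treatment: {treatment}")


others = set(range(20))
print(payment("pbp", set(range(15)), others))  # meets the 75% threshold
print(payment("pbp", set(range(14)), others))  # falls one typo short
```

The guaranteed and non-guaranteed base treatments differ only in what workers are told, so they do not change this computation.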

Page 93:

Can PBPs work?

• Results from 1000 unique workers

• Guaranteed payments hurt (implicit PBP)

• PBPs improve quality

• Unlike in prior work, paying more also improves quality

[Ho et al., 2015]

Page 94:

Under what conditions do PBPs work?

Bonus threshold (585 unique workers)
• $0.50 base + $1.00 bonus for finding X typos

[Figure: results by treatment (Ctrl, 5T, 25%, 75%, All)]

• PBPs work for a wide range of thresholds

• Subjective beliefs (5 typos vs. 25% of typos) can improve quality

[Ho et al., 2015]

Page 95:

Under what conditions do PBPs work?

Bonus amounts (451 unique workers)
• $0.50 base + $X bonus for finding 75% of typos
• PBPs work as long as the bonus is large enough

[Figure: Typos Found (y-axis, ticks 11 to 14) vs. Bonus Amount (x-axis, ticks 0.00, 0.25, 0.50, 0.75, 1.00). Annotations: "could explain Shaw et al., 2011" and "could explain Yin et al., 2013"]

[Ho et al., 2015]

Page 96:

Which tasks do PBPs work on?

• What properties of a task lead to quality improvements from performance-based pay?

• Some pilot experiments on audio transcription suggested that
  – PBPs improve quality for effort-responsive tasks
  – It is not always straightforward to guess which tasks are effort-responsive

[Ho et al., 2015]

Page 97:

Which tasks do PBPs work on?

[Ho et al., 2015]

Page 98:

Takeaways & Related Best Practices

• Aim to pay at least US minimum wage. Pilot your task to find out how long it takes.

• Performance-based payments can improve quality for effort-responsive tasks. Pilot to check the relationship between time and quality.

• Bonus payments should be large relative to the base. The precise amount and precise criteria for receiving the bonus don’t matter too much.
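
The first takeaway can be turned into a small calculation: time a pilot, then set the per-task payment so that a typical worker earns at least the target hourly wage. The sketch below assumes the US federal minimum wage of $7.25 per hour as the default and uses the median pilot time; both choices, and the function names, are assumptions for illustration. State minimums or a living-wage target may be more appropriate.

```python
import math
import statistics

FEDERAL_MIN_WAGE = 7.25  # USD per hour; assumed default, check the current rate


def min_payment_per_task(pilot_seconds, hourly_wage=FEDERAL_MIN_WAGE):
    """Smallest per-task payment, rounded up to a whole cent, that meets
    hourly_wage at the median completion time observed in the pilot."""
    median_s = statistics.median(pilot_seconds)
    # Work in cents; the small epsilon guards against float noise in ceil.
    cents = math.ceil(hourly_wage * 100 * median_s / 3600 - 1e-9)
    return cents / 100


def effective_hourly_rate(payment, seconds):
    """Hourly wage implied by a payment for a task taking `seconds`."""
    return payment * 3600 / seconds


# Hypothetical pilot: completion times in seconds for five workers.
pilot = [240, 300, 360, 600, 280]
pay = min_payment_per_task(pilot)
```

Using the median rather than the mean keeps one unusually slow pilot worker from dominating the estimate, but workers slower than the median will still earn less than the target rate, so rounding up generously is reasonable.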

Page 99:

Intrinsic Motivation

Page 100:

Work That Matters

• Three treatments:
  – control: no context given
  – meaningful: told they were labeling tumor cells to assist medical researchers
  – shredded: no context, told work would be discarded

• Meaningful -> quantity up, but quality similar
• Shredded -> quality down, but quantity similar

[Chandler and Kapelner, 2013]

Page 101:

Page 102:

Gamification

[von Ahn and Dabbish, 2004]

Page 103:

Gamification

[von Ahn, Kedia, and Blum, 2006]

Page 104:

Takeaways & Related Best Practices

• Workers produce more work when they know they are performing a meaningful task, but the quality of their work might not improve.

• Gamification can also increase productivity. Well-calibrated timed responses and score keeping (with or without high score lists) can both increase enjoyment.

Page 105:

The Communication Network Within the Crowd

Page 106:

Implicit assumption: Crowd workers are independent

[Yin et al., 2016]

Page 107:

In reality workers talk and collaborate

Ethnographic field studies show that crowd workers...
• Recreate social connections and support
• Help each other with administrative overhead
• Share tasks and reputable employers (“Ming’s tasks are great!”)

M. L. Gray, S. Suri, S. S. Ali and D. Kulkarni. The Crowd is a Collaborative Network. CSCW 2016

N. Gupta, D. Martin, B. V. Hanrahan and J. O’Neil. Turk-life in India. Group 2014

[Yin et al., 2016]

Page 108:

A Communication Network

What is the scale? What is the structure? How is it used?

[Yin et al., 2016]

Page 109:

Our goal: Open the black box of crowdsourcing to map the communication network of crowd workers

[Yin et al., 2016]

Page 110:

Why is it challenging?

The network is not accessible from the API so we can’t simply download, crawl, or scrape it!

Want to map the network in a way that
#1 Elicits only “true” edges
#2 Elicits as many true edges as possible
#3 Preserves workers’ privacy

[Yin et al., 2016]

Page 111:

A Web App

• Workers self-report their connections

• Provides some value back to the workers so that it’s in their best interest to report as many true connections as possible

[Yin et al., 2016]

Page 112:

5,268 connections

10,354 workers (roughly a census of Mechanical Turk [Stewart et al. 2015])

[Yin et al., 2016]

Page 113:

1,389 (13%) connected workers

On average, workers communicate with 7.6 others

Max degree is 321

[Yin et al., 2016]

Page 114:

Largest component includes 994 (72%) workers

[Yin et al., 2016]
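
The figures on the last three slides are consistent with each other: 1,389 of 10,354 workers is about 13%, the mean degree among connected workers is 2 × 5,268 / 1,389 ≈ 7.6, and 994 of 1,389 is about 72%. Given a list of self-reported connections, the same statistics could be computed with a sketch like the one below. The `network_stats` function and the edge-list representation are assumptions for illustration and say nothing about how Yin et al. stored or analyzed their data.

```python
from collections import defaultdict, deque


def network_stats(edges):
    """Summarize an undirected network given as (worker_a, worker_b) pairs.

    Only workers with at least one edge ("connected workers") are counted.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            adjacency[a].add(b)
            adjacency[b].add(a)

    degrees = [len(neighbors) for neighbors in adjacency.values()]

    # Breadth-first search to find the size of the largest component.
    seen, largest = set(), 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            worker = queue.popleft()
            size += 1
            for neighbor in adjacency[worker]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        largest = max(largest, size)

    n = len(adjacency)
    return {
        "connected_workers": n,
        "mean_degree": sum(degrees) / n if n else 0.0,
        "max_degree": max(degrees, default=0),
        "largest_component": largest,
    }


# Toy example with made-up worker ids.
stats = network_stats([("a", "b"), ("b", "c"), ("d", "e")])
```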

Page 115:

A Network Enabled By Forums

• 59% of all workers and 83% of connected workers reported using at least one forum.

• 90% of all edges are between pairs of workers who communicate via forums, and 86% are between pairs who communicate exclusively through forums.

[Yin et al., 2016]

Page 116:

Forums Create Subcommunities

Reddit HWTF, MTurkGrind, TurkerNation, Facebook, MTurkForum

[Yin et al., 2016]

Page 117:

Subcommunities Are Different

Topological Structure: How tightly connected is each subcommunity?

Temporal Dynamics: Do relationships endure over time?

Communication Content: Is communication social or strictly business?

[Yin et al., 2016]

Page 118:

Measures of Success

Property            Connected   Unconnected
Be active >1 year   55%         46%
Use forums          83%         56%
Master              11%         7%
Approval rate       98.6%       97.4%

Connected workers were also more likely than unconnected workers to find our task early.

[Yin et al., 2016]

Page 119:

Takeaways and Related Best Practices

• Forum usage is widespread. Forums are the virtual “water coolers” of crowd workers.

• Engage with workers on forums. Introduce yourself. Introduce your tasks.

• Actively monitor forum discussion about your task. When appropriate, request that workers do not discuss your task. Monitor anyway.

• Be careful about assuming independence!

Page 120:

Additional Best Practices

Page 121:

Page 122:

Maintain Good Relationships with Workers

• Set aside time to actively monitor your requester email account and respond to questions.

• Approve work quickly.

• Avoid rejecting work except in the most extreme of circumstances.

• Strive to be an ethical requester.

Page 123:

http://guidelines.wearedynamo.org

Page 124:

Tips to Make Your Project Run Smoothly

• Pilot, pilot, pilot! Test your task on your collaborators, other colleagues, and eventually small batches of workers.

• Iterate as many times as needed.

If you remember one slide from this talk, remember this!

Page 125:

Tips to Make Your Project Run Smoothly

• Create clear instructions. Include quiz questions if needed. Pilot them and collect feedback.

• Create an attractive and easy-to-use interface. Pilot this too!

• Ask workers for feedback. Ask them to report bugs. Conduct exit surveys when appropriate. Workers generally want to help!

Page 126:

Thanks...

To Chien-Ju Ho, Andrew Mao, Joelle Pineau, Sid Suri, Hanna Wallach, and especially Ming Yin for extensive discussions and feedback

To Dan Goldstein, Chien-Ju Ho, Jake Hofman, Roozbeh Mottaghi, Sid Suri, Jaime Teevan, Ming Yin, Haoqi Zhang, and all of their collaborators for the use of material from their slides

And to all the people who sent me pointers to cool research... this tutorial was a crowdsourced effort!

Page 127:

Extensive notes, slides, and eventually video at

http://www.jennwv.com/projects/crowdtutorial.html

Page 128:

[email protected]
http://jennwv.com
@jennwvaughan