© 2009 Amazon.com, Inc. or its Affiliates. Amazon Mechanical Turk New York City Meet Up September 1, 2009 WELCOME!

Page 1:

© 2009 Amazon.com, Inc. or its Affiliates.

Amazon Mechanical Turk New York City Meet Up

September 1, 2009

WELCOME!

Page 2:

AGENDA

• Welcoming Statements
• Introductions
• Dolores Labs – Video Directory Use Case
• Knewton – Adaptive Learning Use Case
• FreedomOSS – Enterprise Integration
• New York University – Worker Quality Solution
• Panel Questions and Answers

Page 3:

Amazon Mechanical Turk Requester Meetup

Howie Liu, Dolores Labs

Page 4:

Dolores Labs Introduction

• Founded in 2008 by Lukas Biewald (Senior Scientist, Powerset (MSFT); Yahoo! Search; Stanford AI Lab)
– Recognized enormous potential of AMT platform
• Dolores Labs develops quality control technology (CrowdControl™) to make AMT more accessible and reliable

Page 5:

Case Study

A large video directory needed to select relevant thumbnails for 200k+ videos

Page 6:

Why Mechanical Turk?

• Size of project and turnaround speed made MTurk the obvious solution
– Given the needs of the client, traditional outsourcing or hiring employees was not an option
– However, the client was concerned about quality of results
• Inherent variability of Mechanical Turk workers
– Unlike other Amazon marketplaces, workers are not a perfect commodity
– Significant variations in quality (accuracy)
– Need to ensure workers diligently completed work
– Intelligently aggregate multiple responses to find the single best thumbnail for a video

Page 7:

Page 8:

3 Step Process for Optimizing the Task

Baseline Performance

• Create a custom interactive UI

• 74% result accuracy

CrowdControl™

• Apply statistical quality control

• 90% result accuracy

CrowdControl™ + 2 pass

• Second pass for Turkers to verify results

• 98% result accuracy

Page 9:

High Quality on Mechanical Turk: Best Practices

• Statistical inference algorithms to dynamically assess quality
– …Of each worker, of each result
– …While the task is live
– Smart allocation of worker resources
  • Blindly increasing redundancy is expensive
• Aggregating all responses from workers with varying quality into a single “best” answer
• White paper with Stanford AI Lab about quality on AMT: http://bit.ly/DLpaper

[Chart: CrowdControl™ vs Baseline Result Accuracy. Bars: Baseline Performance; CrowdControl™; CrowdControl™ + Custom Solutions. Axis: 70 to 100.]
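The idea of aggregating responses from workers of varying quality into a single "best" answer can be illustrated with an accuracy-weighted vote. This is only a minimal sketch under stated assumptions: the slides do not describe how CrowdControl™ works, and the function name `weighted_vote`, the log-odds weighting, the default accuracy, and the sample worker IDs and accuracies are all hypothetical.

```python
import math
from collections import defaultdict

def weighted_vote(responses, accuracy, default_accuracy=0.7):
    """Pick one answer from (worker_id, answer) pairs, weighting each vote
    by the log-odds of that worker's estimated accuracy (an assumption)."""
    scores = defaultdict(float)
    for worker_id, answer in responses:
        # Clamp so the log-odds stay finite for accuracies of exactly 0 or 1.
        p = min(max(accuracy.get(worker_id, default_accuracy), 0.01), 0.99)
        scores[answer] += math.log(p / (1.0 - p))
    return max(scores, key=scores.get)

# Hypothetical data: one reliable worker disagrees with two weaker ones.
responses = [("w1", "thumb_a"), ("w2", "thumb_b"), ("w3", "thumb_b")]
accuracy = {"w1": 0.95, "w2": 0.60, "w3": 0.60}
best = weighted_vote(responses, accuracy)
```

With these made-up numbers the single reliable worker outweighs the two weaker ones, whereas an unweighted majority vote would pick the other thumbnail.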

Page 10:

Other Insights

• Clear task instructions are crucial for good results
– Garbage in, garbage out
• Intuitive and efficient task interface makes the task faster (read: cheaper) and more fun!
• Mechanical Turk is an unprecedented, hyper-efficient labor marketplace
– Need to understand its dynamics through experience in order to harness its power

Page 11:

Amazon Mechanical Turk Requester Meetup

Dahn Tamir, Knewton Inc.

Page 12:

Knewton - Introduction

Live online GMAT and LSAT prep courses customized for each student, powered by the world’s most advanced adaptive learning engine.

Selected to the 2009 AlwaysOn Global 250 List. Named Category Winner in the Digital Education field.

Page 13:

How we use MTurk

Quality assurance

Focus Groups and Surveys

Database building

Marketing

Calibration for computer-adaptive testing

Page 14:

Why MTurk?

Cost

Appropriate worker population for each task

Quality

Speed

Page 15:

What We Learned

Use qualification tests

Invest in building good HITs

Hesitate to reject work (but not cheaters)

Turkers are a diverse and capable population

Meet Turker Nation

Page 16:

Thank you!

Questions?

[email protected]

Page 17:

Amazon Mechanical Turk Requester Meet-up

Max Yankelevich, Chief Architect, Freedom OSS

Page 18:

Freedom OSS - Introduction

• Freedom OSS is a professional services organization with a focus on Practical Implementations using Cloud Computing & Open Source Technologies
• International Firm
– US Offices: PA, NYC, GA, KC, NV, WA, NC
– 4 Large Solution Centers in Eastern Europe (Russia, Belarus, Ukraine and Lithuania)
• Practical Approach to Cloud Computing – most successfully completed Enterprise Cloud Computing projects in the Industry
• Key Cloud Computing Partnerships
– Top Amazon AWS Enterprise System Integrator
– Top Eucalyptus Enterprise Partner
• Key Open Source Partnerships
– Top Red Hat Advanced Business Partner
– #1 JBoss Advanced Business Partner in US
• 2008 “JBoss SOA Innovation” Award Winner
• 2007-08 “Practical SOA” Award Winner
• 2008 “Red Hat Extensive Ecosystem” Award Winner
• Leading technology partner for many Fortune 2000 companies
• Freedom is a privately held corporation

Page 19:

MTurk and Enterprise Integration

Most legacy systems are not architected to include human intervention

Providing a technological interface to maintain the workflow while inserting human intelligence and building self-adjudicating business flows

Leveraging Mechanical Turk programmatically in your everyday systems

Freedom OSS has leveraged the power of Enterprise Service Bus (ESB) & Practical Service Oriented Architecture (SOA) to make on-boarding and managing MTurk workers rapid and cost-effective

Using its Professional Open Source ESB, freeESB, Freedom has developed many powerful Connectors for some of the most used Enterprise Systems and Technologies, such as SAP, Mainframe, Siebel, Java/J2EE, Oracle, IBM MQ, etc.

Page 20:

Master Data Cleansing & Validation Use Case

• Keeping Master Customer Data File (Master Data Management)
– Record de-duping
– Contact information validation
• Traditional MDM tactics
– Expensive software
– Big Bang approach
– Invasive Code Changes to Legacy Applications
• Clean and consistent customer data

Page 21:

[Architecture diagram. Components shown:]

• AWS Cloud
• freeESB: Routing, Transformation, Connectivity, QoS
• Business Applications (real-time events, real-time access)
• Legacy Applications: Mainframe, Client-Server, Oracle, .NET, SAP, Siebel, etc.
• API
– First Turk Task – Simple Data Checking
– Second Turk Task – Deeper Data Checking
– Third Turk Task – Data Edit/Trusted Task
• Master Data
• Business Process Orchestration & Workflow
• Business Rules Engine
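The three Turk tasks in the diagram suggest a tiered flow, which could be sketched roughly as below. This is an illustration under assumptions, not the freeESB implementation: the slides do not say how a record moves between tasks, so the stop-at-first-acceptance rule, the `Record` type, `run_tiers`, and the stand-in check functions are all hypothetical. In a real integration each check would post a HIT and wait for the workers' answer.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    record_id: str
    data: dict
    history: list = field(default_factory=list)

def run_tiers(record, tiers):
    """Send a record through ordered (name, check) tiers and stop at the
    first tier that accepts it (an assumed escalation rule)."""
    for name, check in tiers:
        accepted = check(record)
        record.history.append((name, accepted))
        if accepted:
            return name
    return None  # no tier accepted the record; leave it for manual follow-up

# Stand-in checks that simulate worker answers (hypothetical logic).
tiers = [
    ("simple_check", lambda r: "@" in r.data.get("email", "")),
    ("deeper_check", lambda r: bool(r.data.get("phone"))),
    ("trusted_edit", lambda r: True),
]
rec = Record("c-1", {"email": "not-an-email", "phone": ""})
resolved_by = run_tiers(rec, tiers)
```

Here the sample record fails the two cheaper checks and ends up at the trusted edit task, and `rec.history` records each step for auditing.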

Page 22:

Outcome

Low operational costs

Non-invasive data integration

High degree of accuracy due to multi-task distribution

Some Best Practices when integrating MTurk within an Enterprise

– Deliver value incrementally

– Inversion of Control

Page 23:

Thank you!

Questions?

Page 24:

Amazon Mechanical Turk Requester Meetup

Panos Ipeirotis, New York University

Page 25:

Panos Ipeirotis - Introduction

New York University, Stern School of Business

“A Computer Scientist in a Business School”

http://behind-the-enemy-lines.blogspot.com/

Email: [email protected]

Page 26:

Example: Build an Adult Web Site Classifier

• Need a large number of hand-labeled sites
• Get people to look at sites and classify them as: G (general), PG (parental guidance), R (restricted), X (porn)

Cost/Speed Statistics
• Undergrad intern: 200 websites/hr, cost: $15/hr
• MTurk: 2500 websites/hr, cost: $12/hr
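The cost and speed figures above can be turned into a cost per labeled site. A minimal sketch using only the numbers on the slide; the helper name `cost_per_site` is an illustration, not something from the talk.

```python
def cost_per_site(sites_per_hour, dollars_per_hour):
    """Dollars spent per labeled website at a given throughput and hourly cost."""
    return dollars_per_hour / sites_per_hour

intern = cost_per_site(200, 15.0)    # undergrad intern figures from the slide
mturk = cost_per_site(2500, 12.0)    # single-vote MTurk figures from the slide
ratio = intern / mturk               # how many times more the intern costs per site
```

On these figures the intern costs 7.5 cents per site against roughly half a cent on MTurk, before any redundancy is added for quality.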

Page 27:

Bad news: Spammers!

Worker ATAMRO447HWJQ labeled X (porn) sites as G (general audience)

Page 28:

Improve Data Quality through Repeated Labeling

• Get multiple, redundant labels using multiple workers
• Pick the correct label based on majority vote
• Probability of correctness increases with number of workers
• Probability of correctness increases with quality of workers

[Figure: 1 worker, 70% correct; 11 workers, 93% correct]
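Why majority voting helps can be seen with a simple binomial calculation. This is a sketch under simplifying assumptions that the slides do not state: every worker has the same accuracy, workers err independently, and the label is a binary choice. The function name `majority_correct` is hypothetical.

```python
from math import comb

def majority_correct(p, n):
    """Probability that a strict majority of n independent workers, each
    correct with probability p, agrees on the correct binary label (n odd)."""
    need = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))

eleven_at_70 = majority_correct(0.7, 11)
five_at_80 = majority_correct(0.8, 5)
```

Under these assumptions, 11 workers at 70% accuracy reach roughly 92%, close to the 93% quoted on the slide, and 5 workers at 80% reach about 94%, which matches the figure given later for the spam-free case.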

Page 29:

But Majority Voting is Expensive

Single Vote Statistics
• MTurk: 2500 websites/hr, cost: $12/hr
• Undergrad: 200 websites/hr, cost: $15/hr

11-vote Statistics
• MTurk: 227 websites/hr, cost: $12/hr
• Undergrad: 200 websites/hr, cost: $15/hr
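The 11-vote throughput follows directly from dividing the single-vote rate by the number of votes per site. A short sketch of that arithmetic, using only the slide's figures; `effective_rate` is a hypothetical helper name.

```python
def effective_rate(labels_per_hour, votes_per_site):
    """Sites fully labeled per hour when each site needs several votes."""
    return labels_per_hour / votes_per_site

rate_11 = effective_rate(2500, 11)   # sites per hour with 11 votes each
cost_11 = 12.0 / rate_11             # dollars per site at $12/hr
```

With 11 votes per site, the MTurk rate drops to about 227 sites per hour and the cost rises to about 5.3 cents per site, which is why reducing redundancy matters.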

Page 30:

Using redundant votes, we can infer worker quality

Look at our spammer friend ATAMRO447HWJQ together with 9 other workers

Our “friend” ATAMRO447HWJQ mainly marked sites as G. Obviously a spammer…

We can compute error rates for each worker

Error rates for ATAMRO447HWJQ
P[X → X]=9.847%   P[X → G]=90.153%
P[G → X]=0.053%   P[G → G]=99.947%
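One simple way to estimate per-worker error rates from redundant votes is to treat the majority label as a stand-in for the truth and tabulate what each worker assigned. This is a sketch of that simplification, not necessarily the method behind the numbers above; the function names and the sample votes are hypothetical.

```python
from collections import Counter, defaultdict

def majority_labels(votes):
    """votes: list of (worker, item, label). Returns {item: most common label}.
    Ties go to the label seen first for that item."""
    by_item = defaultdict(list)
    for _, item, label in votes:
        by_item[item].append(label)
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in by_item.items()}

def error_rates(votes, worker):
    """Estimate P[true -> assigned] for one worker, using the majority
    label of each item as the assumed true label."""
    truth = majority_labels(votes)
    counts = defaultdict(Counter)
    for w, item, label in votes:
        if w == worker:
            counts[truth[item]][label] += 1
    return {t: {a: n / sum(c.values()) for a, n in c.items()}
            for t, c in counts.items()}

# Hypothetical votes: "spam" answers G for everything.
votes = [("a", "s1", "X"), ("b", "s1", "X"), ("spam", "s1", "G"),
         ("a", "s2", "G"), ("b", "s2", "G"), ("spam", "s2", "G")]
rates = error_rates(votes, "spam")
```

In this toy data the worker who always answers G ends up with P[X → G] = 100%, the same pattern as the spammer on the slide.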

Page 31:

Rejecting spammers and Benefits

Random answers error rate = 50%

Average error rate for ATAMRO447HWJQ: 45.2%
P[X → X]=9.847%   P[X → G]=90.153%
P[G → X]=0.053%   P[G → G]=99.947%

Action: REJECT and BLOCK

Results:
• Over time you block all spammers
• Spammers learn to avoid your HITs
• You can decrease redundancy, as quality of workers is higher
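An average error rate can be read off a worker's error-rate table by averaging the probability of a wrong label across the true classes. The sketch below assumes an unweighted mean, which the slide does not state; `average_error_rate` is a hypothetical name.

```python
def average_error_rate(confusion):
    """Unweighted mean, over true classes, of the probability that the
    worker assigns a label other than the true one."""
    per_class = [1.0 - row.get(true, 0.0) for true, row in confusion.items()]
    return sum(per_class) / len(per_class)

# Error rates for ATAMRO447HWJQ as given on the slide.
spammer = {
    "X": {"X": 0.09847, "G": 0.90153},
    "G": {"X": 0.00053, "G": 0.99947},
}
rate = average_error_rate(spammer)
```

With these figures the unweighted mean comes out at about 45.1%, slightly below the 45.2% on the slide; how the slide's figure was averaged is not stated. Either way it sits close to the 50% expected from random answers.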

Page 32:

After rejecting spammers, quality goes up
• Spam keeps quality down
• Without spam, workers are of higher quality
• Need less redundancy for same quality
• Same quality of results for lower cost

[Figure: With spam: 1 worker, 70% correct; 11 workers, 93% correct. Without spam: 1 worker, 80% correct; 5 workers, 94% correct.]

Page 33:

Correcting biases

• Classifying sites as G, PG, R, X
• Sometimes workers are careful but biased
• Classifies G → P and P → R
• Average error rate for ATLJIK76YH1TF: 45.0%

Error Rates for Worker: ATLJIK76YH1TF
P[G → G]=20.0%   P[G → P]=80.0%   P[G → R]=0.0%     P[G → X]=0.0%
P[P → G]=0.0%    P[P → P]=0.0%    P[P → R]=100.0%   P[P → X]=0.0%
P[R → G]=0.0%    P[R → P]=0.0%    P[R → R]=100.0%   P[R → X]=0.0%
P[X → G]=0.0%    P[X → P]=0.0%    P[X → R]=0.0%     P[X → X]=100.0%

Is ATLJIK76YH1TF a spammer?

Page 34:

Correcting biases

For ATLJIK76YH1TF, we simply need to compute the “non-recoverable” error-rate (technical details omitted)

Non-recoverable error-rate for ATLJIK76YH1TF: 9%

Error Rates for Worker: ATLJIK76YH1TF
P[G → G]=20.0%   P[G → P]=80.0%   P[G → R]=0.0%     P[G → X]=0.0%
P[P → G]=0.0%    P[P → P]=0.0%    P[P → R]=100.0%   P[P → X]=0.0%
P[R → G]=0.0%    P[R → P]=0.0%    P[R → R]=100.0%   P[R → X]=0.0%
P[X → G]=0.0%    P[X → P]=0.0%    P[X → R]=0.0%     P[X → X]=100.0%
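The intuition behind a recoverable bias can be illustrated with Bayes' rule: if a worker's mistakes are consistent, the assigned label still tells us a lot about the true one. The sketch below is not the omitted technical detail and does not reproduce the 9% figure. It assumes uniform class priors, which the slides do not give, and `posterior_true_label` is a hypothetical name. "P" stands for PG, as in the table above.

```python
def posterior_true_label(confusion, priors, assigned):
    """P(true class | worker assigned `assigned`), by Bayes' rule.
    confusion[t][a] is P[t -> a]; priors[t] is the assumed class prior."""
    joint = {t: priors[t] * confusion[t].get(assigned, 0.0) for t in priors}
    total = sum(joint.values())
    if total == 0:
        return {}  # this worker never assigns that label
    return {t: v / total for t, v in joint.items() if v > 0}

# Error rates for ATLJIK76YH1TF from the slide (zero entries omitted).
biased = {
    "G": {"G": 0.2, "P": 0.8},
    "P": {"R": 1.0},
    "R": {"R": 1.0},
    "X": {"X": 1.0},
}
uniform = {c: 0.25 for c in "GPRX"}  # assumed priors, not from the talk

when_p = posterior_true_label(biased, uniform, "P")
when_r = posterior_true_label(biased, uniform, "R")
```

Under these assumptions a "P" from this worker always means the site is really G, so that bias is fully correctable. An "R" is ambiguous between P and R, and that residual ambiguity is the kind of error that cannot be recovered.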

Page 35:

Too much theory?

Open source implementation available at: http://code.google.com/p/get-another-label/

Input:
– Labels from Mechanical Turk
– Cost of incorrect labelings (e.g., X → G costlier than G → X)

Output:
– Corrected labels
– Worker error rates
– Ranking of workers according to their quality

Alpha version, more improvements to come! Suggestions and collaborations welcomed!

Page 36:

Thank you!

Questions?

“A Computer Scientist in a Business School”

http://behind-the-enemy-lines.blogspot.com/

Email: [email protected]