Human Computation and Crowdsourcing Uichin Lee May 8, 2011.


Human Computation and Crowdsourcing

Uichin Lee
May 8, 2011


Content Networking

• Human intelligence:
  – Distributed human computation, crowdsourcing
• Mobile device intelligence:
  – Sensing (camera, GPS)
• Network intelligence:
  – Internet, mobile networks (w/ advanced services)
• Application intelligence:
  – Agent, processing, mining

[Figure: content networking diagram. Labels: Internet, smart home/office, on the move, applications, content provider, fixed access, radio access; layers: device, network, application, human, crowd]


Contents

• Overview
• Genres of distributed human computation
  – Games with a purpose, mechanized labor, wisdom of crowds, crowdsourcing, dual-purpose work, grand search, human-based genetic algorithms, knowledge collection from volunteer contributors
• Dimensions
  – Motivation, quality, human skill, participation time, cognitive load
• Analyzing the Amazon Mechanical Turk Marketplace


Overview

• Distributed human computation (DHC) aims to solve rich computational problems through collaboration between humans and computers
  – Particularly in problem domains where humans can be much better than machines
  – Examples: artificial intelligence, natural language processing, and computer vision
• Artificial intelligence research has long tried to solve these problems using machines alone
  – But the quality of the results may not be satisfactory
• DHC offers the possibility of combining humans and computers: faster than individual human efforts, with quality as good as human efforts (or even better)

Taxonomy of Distributed Human Computation, Alexander J. Quinn and Benjamin B. Bederson, 2009


Overview

• How? The system has global knowledge of the problem and forms small sub-problems that take advantage of humans' special abilities
  – Sub-problems are delegated to a large number of people connected via the Internet (who may be geographically dispersed)
• Examples:
  – Searching for a person in a large number of satellite photos covering thousands of square miles of ocean (e.g., the search for Jim Gray)
  – Image labeling (e.g., ESP game)
  – Human-computer interaction, cryptography, business, genetic algorithms, etc. (and many others!)
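
The decomposition step described on this slide can be sketched roughly as follows. This is only an illustration under assumptions that are not in the slides: the search area is a rectangle measured in pixels, sub-problems are square tiles, and each tile is handed to a fixed number of distinct workers in round-robin order. The names `make_tiles`, `assign`, and `redundancy` are hypothetical.

```python
from itertools import cycle


def make_tiles(width, height, tile):
    """Split a width x height area into sub-regions (x, y, w, h) of at most tile x tile."""
    return [
        (x, y, min(tile, width - x), min(tile, height - y))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


def assign(tiles, workers, redundancy=2):
    """Hand every tile to `redundancy` distinct workers, round-robin (assumed policy)."""
    if redundancy > len(workers):
        raise ValueError("redundancy cannot exceed the number of workers")
    pool = cycle(workers)
    plan = {w: [] for w in workers}
    for t in tiles:
        # Consecutive picks from the cycle are distinct while redundancy <= len(workers).
        for _ in range(redundancy):
            plan[next(pool)].append(t)
    return plan


tiles = make_tiles(100, 50, 40)
plan = assign(tiles, ["alice", "bob", "carol"], redundancy=2)
```

Only the system sees the whole area; each worker sees a handful of small tiles, which is the "global knowledge, small sub-problems" split the slide describes.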


DHC Genres

• Games with a purpose
• Mechanized labor
• Wisdom of crowds
• Crowdsourcing
• Dual-purpose work
• Grand search
• Human-based genetic algorithms
• Knowledge collection from volunteer contributors
• People sensing


Games with a purpose

• A game that requires the player to perform some computation to gain points or to succeed

• Defining factor: people are motivated by the fun of a game


Mechanized labor

• Crowdsourcing with monetary rewards
• Amazon's Mechanical Turk, ChaCha (paid per micro task)
  – Cf. Mturk was launched in 2005 to meet Amazon's own needs: they wanted to eliminate as many duplicate pages as possible, which could not be done using automated algorithms


Wisdom of crowds

• Crowd intelligence: problems that are very difficult for one individual but become easy when the opinions of a crowd are aggregated

• Example services: online polling, prediction markets
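
As a rough illustration of aggregation, the sketch below takes the median of many noisy guesses. It rests on assumptions that are not in the slides: guesses are simulated as independent, unbiased Gaussian noise around a made-up true value, and the median is just one possible aggregation rule. `crowd_estimate` is a hypothetical helper name.

```python
import random
import statistics


def crowd_estimate(guesses):
    """Aggregate individual guesses with the median, which is robust to outliers."""
    return statistics.median(guesses)


# Simulated crowd (assumption): independent, unbiased, noisy guesses.
random.seed(0)
true_value = 1000
guesses = [random.gauss(true_value, 300) for _ in range(501)]

crowd_error = abs(crowd_estimate(guesses) - true_value)
typical_individual_error = statistics.median(abs(g - true_value) for g in guesses)
print(f"crowd error: {crowd_error:.1f}, typical individual error: {typical_individual_error:.1f}")
```

Under these assumptions the aggregated error should be far smaller than a typical individual error. The effect weakens if guesses are correlated or share a bias, which is why real polling and prediction-market systems need more care than this.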


Crowdsourcing

• Coined by Jeff Howe in a Wired magazine article
  – Displacement of usual internal labor by soliciting unpaid help from the general public
  – Motivated by curiosity or serendipity while browsing the web (e.g., online product reviews)
• Examples:
  – Question answering services: Naver KiN, Yahoo! Answers, Askville, Aardvark
  – Stardust@Home (finding elusive particles from space images)


Dual-purpose work

• Translating a computation into an activity that many people were already doing frequently


Grand search

• Finding a solution (instead of aggregation)
• Examples: finding an image that contains something (e.g., search for a missing person, or for elusive particles as in Stardust@Home)


Human-based genetic algorithms

• Humans contribute solutions to problems, and subsequent participants perform functions such as initialization, mutation, and recombinant crossover

• The defining factor is that solutions consist of a sequence of small parts and that they evolve in a way that is controlled by human evaluation
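
The loop below is a minimal sketch of how such a process could be organized, not the algorithm of any particular system. Every step that a person would perform (`initialize`, `mutate`, `crossover`, `rate`) is passed in as a callback. To keep the example runnable, those callbacks are simulated here with random choices and a made-up target string; all names and parameters are hypothetical.

```python
import random
import string

TARGET = "crowd"  # hypothetical goal, used only to simulate human judgment


def initialize(rng):
    """Stand-in for a participant proposing a first solution (a sequence of letters)."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in TARGET)


def mutate(solution, rng):
    """Stand-in for a participant changing one small part of a solution."""
    i = rng.randrange(len(solution))
    return solution[:i] + rng.choice(string.ascii_lowercase) + solution[i + 1:]


def crossover(a, b, rng):
    """Stand-in for a participant combining parts of two solutions."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def rate(solution):
    """Stand-in for human evaluation: higher is better."""
    return sum(x == y for x, y in zip(solution, TARGET))


def hbga(initialize, mutate, crossover, rate, pop_size=8, generations=20, seed=0):
    rng = random.Random(seed)
    population = [initialize(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Human evaluation decides which solutions survive to the next round.
        population.sort(key=rate, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=rate)


best = hbga(initialize, mutate, crossover, rate)
```

Because the top-rated half is carried over unchanged each round, the best rating never decreases. In a real deployment each callback would be a task shown to a person rather than a function call.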


Knowledge collection from volunteer contributors

• Aims to advance artificial intelligence research by using humans to build large databases of common sense facts
  – E.g., "people cannot brush their hair with a table"

• Common approaches have used data mining, e.g., Cyc

• Human-based methods could help, e.g., FACTory, Verbosity, 1001 Paraphrases, etc.


People sensing

• Community awareness (participatory sensing)
• Emergency/rescue operations

Safecast.org seeks to aggregate worldwide sensor information

Geiger counter (radiation meter)

Pictures from http://news.cnet.com/japan-radiation-monitoring-goes-crowd-open-source/8301-17938_105-20060639-1.html


Dimensions

• Motivation
  – Pay (e.g., Mturk), altruism (e.g., Naver KiN, Wikipedia), fun (e.g., games), implicit (e.g., embedded in regular activities)
• Quality
  – Mechanisms: forced agreement (e.g., games), economic models (when money is involved), defensive task design, redundancy
  – Checking: statistical, redundant work, multilevel review, expert review, forced agreement, automatic check, reputation systems
• Aggregation
  – Knowledge base, statistical, grand search, unit tasks (ChaCha, Mturk)
• Human skill
  – Language understanding, vision, communications, reasoning, common knowledge/sense
• Participation time: <2 min, 2-10 min, >10 min
• Cognitive load (affecting contributors' willingness to help)
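
Of the quality mechanisms listed, redundancy is the simplest to illustrate: the same task goes to several workers and the answers are compared. The sketch below uses a plain majority vote with an agreement threshold. The function name `aggregate`, the 0.6 threshold, and the sample answers are assumptions made for this example, not values from the slides.

```python
from collections import Counter


def aggregate(labels, min_agreement=0.6):
    """Majority vote over redundant answers.

    Returns (label, agreement). The label is None when the share of workers
    who chose the most common answer is below min_agreement.
    """
    if not labels:
        raise ValueError("need at least one answer")
    label, votes = Counter(labels).most_common(1)[0]
    agreement = votes / len(labels)
    return (label if agreement >= min_agreement else None), agreement


# Hypothetical answers from three workers per image-labeling task.
answers = {
    "img-1": ["cat", "cat", "dog"],
    "img-2": ["car", "bus", "truck"],
}
results = {task: aggregate(votes) for task, votes in answers.items()}
```

Tasks that come back with `None` could be sent to more workers or to an expert, which corresponds to the multilevel and expert review options in the list above.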


Analyzing the Amazon Mechanical Turk Marketplace

Panagiotis G. Ipeirotis (NYU)


AMT Screenshot


Screenshot


AMT questions

• Who are the workers that complete these tasks?

• What type of tasks can be completed in the marketplace?

• How much does it cost?

• How fast can I get results back?

• How big is the AMT marketplace?


Demographics

• Countries: US 46.8%, India 34%, Misc 19.2% (from 66 different countries)

http://behind-the-enemy-lines.blogspot.com/2010/03/new-demographics-of-mechanical-turk.html



Demographics

• Why do you complete tasks in Mechanical Turk? Please check any of the following that applies:
  – [1] Fruitful way to spend free time and get some cash (e.g., instead of watching TV)
  – [2] I find the tasks to be fun
  – [3] To kill time
  – [4] For "primary" income purposes (e.g., gas, bills, groceries, credit cards)
  – [5] For "secondary" income purposes, pocket change (for hobbies, gadgets, going out)
  – [6] I am currently unemployed, or have only a part time job

[Figure: response charts for options 1, 2, and 3]


Demographics (continued)

[Figure: response charts for options 4, 5, and 6 of the same question]


Type of tasks


Requester distribution


Price distribution


Keywords vs. Ranks


Posting vs. completion rate


Completion time