
SCG Court: A Crowdsourcing Platform for Innovation

Karl Lieberherr
Northeastern University
College of Computer and Information Science
Boston, MA

joint work with Ahmed Abdelmeged

Supported by Novartis

4/24/2011 Crowdsourcing

SOLVE ORGANIZATIONAL PROBLEMS

HOW TO COMBINE THE WORK OF HUNDREDS OF SCHOLARS?

HOW TO AVOID ECONOMIC NON-SENSE

HOW TO FOCUS SCHOLARS

FoldIt

• Predicting protein structures with a multiplayer online game

• S. Cooper, F. Khatib, A. Treuille, J. Barbero, J. Lee, M. Beenen, A. Leaver-Fay, D. Baker and Z. Popovic

• Nature. Vol. 466: 756-760, 2010

• http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2956414/

FoldIt

• Collaborative/Competitive
• Minimize energy
• Acquire domain expertise through game play
• Co-adaptation of game and players
• See EteRNA from Stanford/CMU

Organizing Genius: The Secrets of Creative Collaboration

Warren Bennis
• Take Home Lessons
  – conditions that have to be met
    • Focus
    • You create an atmosphere of stress, creative stress, everyone competing to solve one problem. And you have one ringmaster.

Organizational Problem Solved

• How to organize a loosely coupled collaboration among several scholars to agree on claims that can be refuted or defended constructively using a dialog.
  – fair recognition of scholars
    • strong scholars cannot be ignored
  – answer: “is claim refuted” plus the dialog
• When game is over: interested in
  – know-how!
  – list of claims that scholars agree with.

defend(Alice,Bob,c) = !refute(Alice,Bob,c)

Organizational Problem Solved

• How to design a happy scientific community that creates the science that society needs.

• Classical game solution: Egoistic scholars produce social welfare: a knowledge base and the know-how to defend it.

• Control of scientific community (is governance effective? political science)
  – SCG rules
  – Specific domain

happy = no scholar is ignored; rich, immersive experience.

What is a loose collaboration?

• Scholars can work independently on an aspect of the same problem.

• Problem = decide which claims in playground to oppose or agree with.

• How is know-how combined? Using a protocol.
  – Alice claimed that for the input that Alice provides, Bob cannot find an output of quality q. But Bob finds such an output. Alice corrects.
  – Bug reports that need to be addressed and corrections.

Playground = Instantiation of Platform

Controlled Communication instead of Isolation

[Figure: two arrangements of scholars s1, s2, s3, s4 (s: scholar), each with an admin. Labels: undesirable; more learning and collaboration; team evaluation.]

Claims

• Protocol. Defines scientific discourse.
• Scholars make a prediction about their performance in protocol.
• Predicate that decides whether refutation is successful. Refutation protocol collects data for predicate.
• As a starter: Think of a claim as a mathematical statement: EA or AE.
  – all planar graphs have a 4 coloring.

Benefits for Playground Designers

Return On Investment for playground designers: a small investment in defining a playground (Domain=(Instance,Solution,valid,quality), Claim=(Protocol, etc.)) produces an interactive environment to assimilate and create domain knowledge.


Benefits for Playground Designers

Playground Design

Financial Incentive

Organize the thought processes of hundreds of scholars to creatively focus on one problem

Benefits for Scholars

• Return on Investment for scholars and avatar designers: The SCG rules need to be learned only once because they are the same across playgrounds. A small investment in learning the SCG rules and a domain leads to numerous learning and teaching and innovation opportunities. The more a scholar teaches, the higher the scholar’s reputation.


Benefits for Scholars: Immersion Experience

[Figure: "me" surrounded by scholars (s), with the interactions: show me (solve); I challenge you (propose); you are wrong (oppose, teaching and learning); tell me (provide).]

Innovation: Scholars are free to invent; game rules don’t limit creativity; social engineering: answers to “why did I lose?” may lead to better solutions.

Global Warming

• Alice’s claim: The earth is warming significantly.
  – Refutation protocol: Bob tries to refute.
    • Alice must provide a data set DS satisfying a property defined precisely by the refutation protocol.
    • Bob applies one of the allowed analysis methods M defined precisely by the refutation protocol.
    • Bob wins iff M(DS) holds.

Independent Set

• Protocol / claim: At Least As Good
  – Bob provides undirected graph G.
  – Bob computes independent set sB for G (secret).
  – Alice computes independent set sA for G.
  – Alice wins, if size(sA) >= size(sB).
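The "At Least As Good" round can be sketched in a few lines. This is only an illustration, not SCG Court code: the function names, the edge-list graph encoding, and the rule that an invalid solution loses the round are assumptions of this sketch.

```python
from typing import Iterable, List, Set, Tuple

Edge = Tuple[int, int]

def is_independent_set(edges: Iterable[Edge], nodes: Set[int]) -> bool:
    """valid(G, s): no edge of G has both endpoints inside s."""
    return not any(u in nodes and v in nodes for u, v in edges)

def at_least_as_good(edges: Iterable[Edge], s_alice: Set[int], s_bob: Set[int]) -> str:
    """Return the winner ("Alice" or "Bob") of one round.

    Assumption: a player who hands in an invalid solution loses;
    Bob's secret solution is checked first because he moves first.
    """
    edge_list: List[Edge] = list(edges)
    if not is_independent_set(edge_list, s_bob):
        return "Alice"
    if not is_independent_set(edge_list, s_alice):
        return "Bob"
    return "Alice" if len(s_alice) >= len(s_bob) else "Bob"
```

For the path graph 0-1-2, Bob's secret set {0, 2} beats Alice's {1}, while Alice wins if she also finds a set of size 2.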

Overview

1. Organizational problem that SCG solves
2. What is SCG in detail?
3. Crowdsourcing
4. Formal Properties of SCG
5. Applications
6. Disadvantages
7. Conclusions

Big Picture

• Weaker form of logic.
• Approximate truth.
• Don’t focus on proofs but on refutations.

Logic with Soundness

[Figure: the sentences (claims) are partitioned into good and bad.]

Not just true/false claims, but optimum/non-optimum claims:
good: true/optimum
bad: false/non-optimum

Scientific Community Game Logic with Community Principle

[Figure: the sentences (claims) are partitioned into good and bad, with regions "agreed by two scholars" and "disagreed by two scholars". There exists a two-party certificate to expose misclassification.]

agree c (informally): both successfully defend c and both successfully refute negation !c.

Comparison Logic and SCG

Logic
• sentences
  – true
  – false
• proof for being true
  – proof system, checkable
  – guaranteed defense
• proof for being false
  – proof system, checkable
  – guaranteed refutation
• Universal sentences

Scientific Community Game
• sentences = claims
  – good
  – bad
• evidence for goodness
  – defense, checkable
  – uncertainty of defense
• evidence for badness
  – refutation, checkable
  – uncertainty of refutation
• Personified sentences

Example: Highest Safe Rung

• You are doing stress-testing on various models of glass jars to determine the height from which they can be dropped and still not break. The setup for this experiment, on a particular type of jar, is as follows.


Highest Safe Rung

Only two identical bottles to determine highest safe rung

You have a ladder with n rungs, and you want to find the highest rung from which you can drop a copy of the jar and not have it break. We call this the highest safe rung. You have a fixed "budget" of k > 0 jars.

[Figure: Alice and Bob, with the admin.]
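A small sketch of how many drops ("questions") are needed in the worst case. It relies on a standard counting fact: q drops with k jars can tell apart at most C(q,0) + C(q,1) + ... + C(q,k) different answers. One assumption is made loudly here: `n` counts the possible answers (the leaves of a decision tree), which matches claims listed elsewhere in this deck such as HSR(8,3)=3 and HSR(5,1)=4; the function names are hypothetical.

```python
from math import comb

def capacity(q: int, k: int) -> int:
    """Largest number of distinct answers that q drops with k jars can separate."""
    return sum(comb(q, i) for i in range(k + 1))

def min_questions(n: int, k: int) -> int:
    """Smallest q such that n possible answers can be separated with k jars.

    Assumption: n counts possible answers (decision-tree leaves), n >= 1, k >= 1.
    """
    if n < 1 or k < 1:
        raise ValueError("need n >= 1 answers and k >= 1 jars")
    q = 0
    while capacity(q, k) < n:
        q += 1
    return q
```

With one jar the search is forced to be linear; with enough jars it degenerates to binary search.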

Highest Safe Rung

Only two identical bottles to determine highest safe rung

Alice: HSR(9,2) ≤ 4/9
Bob: I doubt it: refutation attempt!

Alice constructs decision tree T of depth 4 and gives it to Bob. He checks whether T is valid. Bob wins if he finds a flaw.

Highest Safe Rung Decision Tree
HSR(10,2)=5/10

[Figure: decision tree with node and leaf labels from 0 to 9. Legend labels: x, y, z, yes, no, u (highest safe rung).]

Other playgrounds might have only two claims: C and !C.

Scientific Community Game Logic with Community Principle

[Figure: the sentences (claims) are partitioned into good and bad, with regions "agreed by two scholars" and "disagreed by two scholars". There exists a two-party certificate to expose misclassification. Example claims placed in the diagram: HSR(11,2)=4, HSR(27,3)=5, HSR(15,3)=5, HSR(57,4)=6, HSR(17,4)=4, HSR(5,1)=4, HSR(8,3)=3.]

HSR(n,k)=q/n, abbreviated as HSR(n,k)=q
n: rungs
k: to break
q: questions

Community Principle for SCG

• Every faulty decision has a certificate to assign blame to the faulty decision maker.

• Certificate contains information from both parties.

• A certificate is a sequence of moves leading to a loss for the scholar making the faulty decision.
  – The certificate can be checked efficiently assuming all basic game operations (valid, belongsTo, quality) take constant time.

Example

• HSR(15,3)=5
• This claim was misclassified as a good claim because both Alice and Bob could only find a decision tree of depth 5.
• Nina, a newcomer to the HSR scientific community, could find a decision tree of depth 4.
• This exposes the faulty decision of Alice and Bob.
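The misclassification in this example can be checked mechanically. The sketch below reads a claim HSR(n,k)=q as "q questions suffice" and calls it optimum when no smaller q suffices. Both that reading and the convention that `n` counts the possible answers are assumptions of this sketch, and `classify` is a hypothetical helper, not part of SCG Court.

```python
from math import comb

def min_questions(n: int, k: int) -> int:
    """Smallest q with sum_{i<=k} C(q, i) >= n (n answers, k jars, k >= 1)."""
    if n < 1 or k < 1:
        raise ValueError("need n >= 1 answers and k >= 1 jars")
    q = 0
    while sum(comb(q, i) for i in range(k + 1)) < n:
        q += 1
    return q

def classify(n: int, k: int, q: int) -> str:
    """Classify the claim HSR(n,k)=q as false, optimum, or true but improvable."""
    best = min_questions(n, k)
    if q < best:
        return "false"
    if q == best:
        return "optimum"
    return "true, not optimum"
```

Under these assumptions HSR(15,3)=5 is true but can be strengthened to HSR(15,3)=4, which is what Nina's depth-4 tree demonstrates.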

Big Picture

• Replace soundness with community principle.
• Participants have to work hard to approximate soundness; if they don’t achieve soundness, they risk being caught and losing reputation.
  – as in a real scientific community: mistakes are made, even in mathematical journals.

What is SCG(X)

[Figure: degree of automation used by scholar, on a scale from 0 to 1: no automation (human plays); some automation (human plays); full automation (avatar plays). Example players: scholar Alice, avatar Bob.]

more applications: test constructive knowledge

transfer to reliable, efficient software

A Virtual World: Avatar’s View

[Figure: Administrator (does simple checking, usually efficient) and Avatar (does complex work). Exchanged between them: opponents’ communication, feedback; claims, instances, solutions.]

Results
• Agreed Claims: statements about algorithms
• = Social welfare
• Algorithms in Avatar

Avatars propose and (oppose|agree)

[Figure: egoistic Alice (reputation 1000) with proposed claims CA1, CA2, CA3, CA4; egoistic Bob (reputation 10) with proposed claims CB1, CB2. Steps: opposes (1); provides instance (2); solves instance not as well as she expected based on CA2 (3). Outcome labels: WINS! / LOSES; transfer 200.]

Life of an avatar: (propose+ (oppose | agree)+ provide* solve*)*
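The "life of an avatar" expression is a regular expression over moves, so a move log can be checked with an ordinary regex. A minimal sketch, assuming each move is recorded as one of the five words below; the single-letter encoding and the function name are inventions of this example.

```python
import re
from typing import Iterable

# One letter per move so the life-cycle expression becomes a plain regex.
CODES = {"propose": "p", "oppose": "o", "agree": "a", "provide": "v", "solve": "s"}

# (propose+ (oppose | agree)+ provide* solve*)*
LIFE = re.compile(r"(?:p+[oa]+v*s*)*")

def is_valid_life(moves: Iterable[str]) -> bool:
    """True if the move sequence matches the life-of-an-avatar expression.

    Raises KeyError for a move name that is not one of the five known moves.
    """
    encoded = "".join(CODES[m] for m in moves)
    return LIFE.fullmatch(encoded) is not None
```

For example, a log that starts with "oppose" is rejected because every round has to begin with at least one "propose".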

What Scholars think about!

• If I propose claim C, what is the probability that
  – C is successfully refuted?
  – C is successfully strengthened?
• If I try to refute claim C, what is the probability that I will fail?
• If I try to strengthen claim C, what is the probability that I will fail?

Essence of Game Rules
• actors:
  – proposer=verifier (1st argument to propose, oppose, refute; usually Alice)
  – opposer=falsifier (2nd argument to propose, oppose, refute; usually Bob)
• LifeOfClaim(c) = propose(Alice,c) followed by (oppose(Alice,Bob,c)|agree(Alice,Bob,c)).
• oppose(Alice,Bob,c) = (refute(Alice,Bob,c)|strengthen(Alice,Bob,c,cs)), where stronger(c,cs).
• strengthen(Alice,Bob,c,cs) = !refute(Bob,Alice,cs).
• agree(Alice,Bob,c) = !refute(Alice,Bob,c) and !refute(Bob,Alice,c) and refute(Alice,Bob,!c) and refute(Bob,Alice,!c)

blamed decisions:
propose(Alice,c)
refute(A,B,c)
strengthen(Alice,Bob,c,cs)
agree(A,B,c)
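The rule definitions above reduce every decision to calls of refute. The sketch below mirrors them directly. Everything concrete in it is an assumption for illustration: claims are plain strings, negation toggles a leading "!", `refute` is a caller-supplied callback that returns True when the opposer wins, and the stronger(c,cs) side condition is not checked.

```python
from typing import Callable, Optional

# refute(proposer, opposer, claim) -> True if the opposer refutes the claim.
# A real playground would run the refutation protocol here (hypothetical hook).
Refute = Callable[[str, str, str], bool]

def negate(claim: str) -> str:
    """Toy negation: toggle a leading '!' (assumption of this sketch)."""
    return claim[1:] if claim.startswith("!") else "!" + claim

def strengthen(refute: Refute, alice: str, bob: str, cs: str) -> bool:
    """strengthen(Alice,Bob,c,cs) = !refute(Bob,Alice,cs): Bob must defend cs."""
    return not refute(bob, alice, cs)

def oppose(refute: Refute, alice: str, bob: str, c: str,
           cs: Optional[str] = None) -> bool:
    """oppose = refute | strengthen; strengthening is chosen when cs is given."""
    if cs is None:
        return refute(alice, bob, c)
    return strengthen(refute, alice, bob, cs)

def agree(refute: Refute, alice: str, bob: str, c: str) -> bool:
    """Both defend c against each other and both refute the negation !c."""
    not_c = negate(c)
    return (not refute(alice, bob, c) and not refute(bob, alice, c)
            and refute(alice, bob, not_c) and refute(bob, alice, not_c))
```

Passing `refute` as a parameter keeps the rules independent of any particular playground, which is the point of the slide: the same rules apply in every domain.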

Winning/Losing

• propose(Alice,c), refutationTry(Alice,Bob,c)
• If Alice first violates a game rule, Bob is the winner.
• If Bob first violates a game rule, Alice is the winner.
• If neither violates a game rule: the claim predicate c.p(Alice,Bob,in,out) decides.
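The winning rule is short enough to write down as a function. In this sketch a game is a chronological list of (player, move_was_legal) pairs, and the claim predicate is a zero-argument callable that returns True when the claim is defended. Both encodings, and the convention that a True predicate means Alice wins, are assumptions made here.

```python
from typing import Callable, Iterable, Tuple

def decide_winner(moves: Iterable[Tuple[str, bool]],
                  claim_predicate: Callable[[], bool]) -> str:
    """Return "Alice" or "Bob".

    The first player to violate a game rule loses immediately.
    If nobody violates a rule, the claim predicate decides
    (assumption: True means the proposer Alice defended her claim).
    """
    for player, legal in moves:
        if not legal:
            return "Bob" if player == "Alice" else "Alice"
    return "Alice" if claim_predicate() else "Bob"
```

Note that the predicate is only evaluated when all moves were legal, so a rule violation can never be "repaired" by a good result.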

Game Rules for Playground

• legal(in)
• legal(out)
• valid(in,out)
• belongsTo(in, instanceSet)
• each move must be within time-limit

Protocol Language

ProtocolSpec = <steps> List(Step).
Step = <action> Action "from" <role> Role.
interface Role = Alice | Bob.
Alice = "Alice". Bob = "Bob".
interface Action = ProvideAction | SolveAction.
ProvideAction = "instance".
// solve the instance provided in step # stepNo.
// stepNo is 0-based.
SolveAction = "solution" "of" <stepNo> int.

Example:
instance from Bob // r
solution of 0 from Bob // sB for r
solution of 0 from Alice // sA for r
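A protocol written in this language can be read with a few lines of code. The sketch below is not the SCG Court parser; it assumes one step per line and "//" comments, and it adds one check that the grammar itself does not state, namely that a "solution of N" step must refer to an earlier "instance" step. `Step` and `parse_protocol` are hypothetical names.

```python
import re
from typing import List, NamedTuple, Optional

class Step(NamedTuple):
    action: str              # "instance" or "solution"
    role: str                # "Alice" or "Bob"
    step_no: Optional[int]   # 0-based step solved; None for "instance"

STEP_RE = re.compile(r"^(?:instance|solution\s+of\s+(\d+))\s+from\s+(Alice|Bob)$")

def parse_protocol(text: str) -> List[Step]:
    """Parse protocol text, one step per line, ignoring '//' comments."""
    steps: List[Step] = []
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()
        if not line:
            continue
        match = STEP_RE.match(line)
        if match is None:
            raise ValueError(f"not a protocol step: {raw!r}")
        step_no = int(match.group(1)) if match.group(1) is not None else None
        if step_no is not None:
            # Extra check (assumption): the solved step must be an earlier instance.
            if step_no >= len(steps) or steps[step_no].action != "instance":
                raise ValueError(f"step {step_no} is not an earlier instance step")
        action = "instance" if step_no is None else "solution"
        steps.append(Step(action, match.group(2), step_no))
    return steps
```

Parsing the three-line example from the slide yields one instance step by Bob followed by two solution steps for step 0, first by Bob and then by Alice.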

How to achieve loosely coupled collaboration?

• Information exchange is based on values. Knowledge how to produce values is secret.

• Assign blame correctly to Alice or Bob based on outcome of refutation protocol.

• Every claim has a negation (using the idea of Hintikka’s dual game).

• Negation of HSR(n,k)=q: HSR(n,k)<q.


Dual Game / Negation

• Each game G has a dual game which is the same as G except that the players Alice and Bob are transposed in both the rules for playing and the rules for winning. The game G(¬φ) is the dual of G(φ).


How is collaboration working?

• Scholars make claims about their performance in a given context.

• Scholars make claims about the performance of their avatar in a given context.

• Opponent finds input in context that contradicts claim. Claim is refuted.

Playground Design

• Define several languages
  – Instance
  – Solution
  – Claim
• InstanceSet
• Define protocol or reuse existing protocol.
• Implement interfaces for corresponding classes.

Who are the scholars?

• Students in a classroom
  – High school
  – University
• Members of the Gig Economy
  – Between 1995 and 2005, the number of self-employed independent workers grew by 27 percent.
• Potential employees
• Anyone with web access; Intelligent crowd.

How to engage scholars? Opposition

• Central to opposition is refutation.
• Claim defined by protocol.
• Simplest protocol:
  – Alice provides Input in.
  – Bob computes Output out: valid(in,out)
  – Alice defends if quality(in,out)<q.
  – Bob refutes if quality(in,out)>=q.
• Claims: C(q), q in [0,1].

instance from Alice // in
solution of 0 from Bob // out for in
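The simplest protocol can be run as a single function once a playground supplies valid and quality. This is a hedged sketch: the parameter names are hypothetical, and treating an invalid output as a loss for Bob (so the claim counts as defended) is an assumption based on the game rules rather than something this slide states.

```python
from typing import Callable, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")

def simplest_protocol(q: float,
                      provide: Callable[[], In],
                      solve: Callable[[In], Out],
                      valid: Callable[[In, Out], bool],
                      quality: Callable[[In, Out], float]) -> str:
    """Play one round for claim C(q); return "defended" or "refuted".

    provide is Alice's move, solve is Bob's move.
    """
    instance = provide()            # instance from Alice
    output = solve(instance)        # solution of 0 from Bob
    if not valid(instance, output):
        return "defended"           # assumption: Bob broke valid(in,out)
    return "refuted" if quality(instance, output) >= q else "defended"
```

In a toy playground where the instance is a list of numbers, a valid output is any element of the list, and quality is the output divided by the maximum, a Bob who returns the maximum refutes C(0.9), while a Bob who returns a small element does not.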

Overview

1. Organizational problem solved by SCG
2. What is SCG in detail?
3. Crowdsourcing
4. Formal Properties of SCG
5. Applications
6. Disadvantages
7. Conclusions

Crowdsourcing

• Active area: recent Communications of the ACM article.
• Wikipedia, FoldIt, TopCoder, …
• We want a family of crowdsourcing systems with provable properties.

Crowdsourcing Platform

• Crowdsourcing
  – is the act of taking a job traditionally performed by a designated agent (usually an employee) and outsourcing it to an undefined, generally large group of people in the form of an open call.
  – enlists a crowd of humans to help solve a problem defined by the system owners.
• A crowdsourcing platform is a generic tool that makes it easy to develop a crowdsourcing system.

Crowdsourcing Platform

• The job, target problem is
  – to solve instances of a problem and make claims about the solution process.
  – to build a knowledge base of claims and techniques to defend the claims

Requirements for Crowdsourcing Platform

• Find a good way to combine user contributions to solve the target problem.

• Find a good way to evaluate users and their contributions.

• Find a good way to recruit and retain users.


SCG Court

• Web application
• Software developers register with SCG Court and choose playgrounds they want to compete in.
• They register their avatars in the appropriate playgrounds in time for the next tournament.
• Avatars get improved between tournaments based on ranking achieved and game history.

Combine user contributions

• Users build on each other’s work: strengthening and checking.
• Users check each other’s claims for correct judgment.
  – Claims are defended and refuted.
• Users trade reputation for information.
  – Example: HSR(15,3)=5

Learning cycle

• Alice wins reputation with claim c because Bob made a wrong decision
  – Alice gives information about artifact related to c. Alice teaches Bob.
• Bob integrates information into his know-how. Bob learns from Alice.
  – Bob hopefully has learned enough and will no longer make a wrong decision about c.

Voting with Justification

• I vote
  – for this claim (agree) because I can defend it and refute its negation.
  – against this claim because I can oppose it (refute or strengthen).

Evaluate users and their contributions

• Calculate reputation
  – confidence by the proposer that a claim is good (gc)
  – confidence by the opposer (refute or strengthen) that the claim is bad (bc)
• The scholars are encouraged to set their confidences truthfully. Otherwise they don't gain enough reputation or they lose too much reputation.

Reputation Update

Claim     good   bad
propose   up     down
oppose    down   up

up: if you are good, there is a chance that you win
down: if the other is good, there is a chance that you lose

up: reputation goes up, but has to provide knowledge that might reveal secret technique.
down: reputation goes down, but might gain knowledge that reveals secret technique.

Reputation Update

Claim     good   bad
propose   up     down
oppose    down   up

up: if you are good, there is a chance that you win
down: if the other is good, there is a chance that you lose

confidence:
proposer: claim is good: gc
opposer: claim is bad: bc

r = result of reputation protocol.
Reputation update: r*gc*bc (various refinements are possible)
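A sketch of the update rule r*gc*bc. The slide gives only the product, so several things here are assumptions: r lies in [-1, 1] with positive values meaning the proposer won, both confidences lie in [0, 1], and the amount is moved from the opposer to the proposer so that total reputation is conserved.

```python
from typing import Tuple

def update_reputation(rep_proposer: float, rep_opposer: float,
                      r: float, gc: float, bc: float) -> Tuple[float, float]:
    """Return the new (proposer, opposer) reputations after one game.

    r  : result of the protocol (assumed in [-1, 1]; > 0 favors the proposer)
    gc : proposer's confidence that the claim is good (assumed in [0, 1])
    bc : opposer's confidence that the claim is bad (assumed in [0, 1])
    """
    if not (0.0 <= gc <= 1.0 and 0.0 <= bc <= 1.0):
        raise ValueError("confidences must lie in [0, 1]")
    if not (-1.0 <= r <= 1.0):
        raise ValueError("result must lie in [-1, 1]")
    delta = r * gc * bc   # the update from the slide
    return rep_proposer + delta, rep_opposer - delta
```

Because the stake is the product of both confidences, a scholar who states low confidence limits both the possible gain and the possible loss, which is the incentive for truthful confidences described earlier in the deck.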

Overview

1. Organizational problem solved by SCG
2. What is SCG in detail?
3. Crowdsourcing
4. Formal Properties of SCG
5. Applications
6. Disadvantages
7. Conclusions

Formal Properties of SCG

• Soundness:
  – Only false claims are refuted.
  – Only true claims are defended.
• SCG is not sound because it adapts to the skill level of the scholars. E.g.,
  – Alice proposes a false claim and still defends it, because Alice and Bob are weak, or
  – Alice proposes a true claim and does not defend it, because Alice is weak.
• We want to prove formal properties that don’t imply soundness.

Formal Properties

• Properties
  – Community Property
  – Equilibrium
  – Convergence
• Assumption: claims are bivalent (true or false); disallow indeterminate claims.

For every faulty decision action there exists an exposing reaction.

• decision propose(A,c): if c is not true, refute(A,B,c) or strengthen(A,B,c,cs) exposes it.
• decision oppose(Alice,Bob,c)|agree(Alice,Bob,c):
  – if Bob decides to oppose but does not oppose successfully, his oppose action is blamed. Bob is discouraged from attacking without good reason.
  – if Bob decides to agree but does not agree successfully, his agree action is blamed.

Community Property

• For every faulty decision action there exists an exposing reaction that blames the bad decision.
  – Reasons:
    • We want the system to be egalitarian.
      – It is important that clever crowd members can shine and expose others who don’t promote the social welfare of the community.
    • Faulty decisions must be exposable. It may take effort.

Community Property: Alternative formulation

• If all decisions by Alice are not faulty, there is no chance of Alice losing against Bob.
  – if Alice is perfect, there is no chance of losing.
• If there exists a faulty decision by Alice, there is a chance of Alice losing against Bob.
  – egalitarian game

Summary: faulty decisions

1. propose(Alice,c), c=bad
2. refute(Alice,Bob,c), c=good
3. agree(Alice,Bob,c), c=bad

Summary: faulty decisions

1. propose(Alice,c), c=false
2. propose(Alice,c), c=not optimum, c=true
3. refute(Alice,Bob,c), c=true
4. strengthen(Alice,Bob,c,cs), c=optimum
5. strengthen(Alice,Bob,c,cs), c=false
6. agree(Alice,Bob,c), c=false
7. agree(Alice,Bob,c), c=not optimum, c=true

Community Property: Case 1

1. propose(Alice,c), c=false

• Alice’s decision propose(Alice,c) proposes claim c as a claim that is true. Let’s assume c is false. Alice introduced a fault into the knowledge base.

• There must be a reaction that assigns blame to Alice’s decision. Here it is: Bob decides to oppose: oppose(Alice,Bob,c), specifically to refute: refute(Alice,Bob,c). There must be a successful refutation.

Community Property: Case 2

2. propose(Alice,c), c=not optimum, c=true

• Alice’s decision propose(Alice,c) proposes claim c as a claim that is optimum. Let’s assume c is not optimum, but true, and can be strengthened. Alice introduced a fault into the knowledge base.

• There must be a reaction that assigns blame to Alice’s decision. Here it is: Bob decides to oppose: oppose(Alice,Bob,c), specifically to strengthen: strengthen(Alice,Bob,c,cs). There must be a choice for cs so that refute(Bob,Alice,cs) returns false, independent of Alice’s strategy.

Community Property: Case 3

3. refute(Alice,Bob,c), c=true

• Bob’s decision refute(Alice,Bob,c) is faulty, if c is true. Bob tries to introduce a fault into the knowledge base.

• There must be a reaction by Alice that assigns blame to Bob’s decision to refute. Because c is true, there must be a defense of c by Alice, i.e., refute(Alice,Bob,c) returns false independent of Bob’s strategy. Bob’s decision to refute is blamed.

Community Property: Case 4

4. strengthen(Alice,Bob,c,cs), c=optimum

• Bob’s decision strengthen(Alice,Bob,c,cs) is faulty, if c is optimum. Bob tries to introduce a fault into the knowledge base.

• There must be a reaction by Alice that assigns blame to Bob’s decision to strengthen. Because c is optimum, there must be a refutation of cs by Alice, i.e., refute(Bob,Alice,cs) returns true independent of Bob’s strategy. Bob’s decision to strengthen is blamed.

Community Property: Case 5

5. strengthen(Alice,Bob,c,cs), c=false

• Bob’s decision strengthen(Alice,Bob,c,cs) is faulty, if c is false. Bob tries to introduce a fault into the knowledge base.

• There must be a reaction by Alice that assigns blame to Bob’s decision to strengthen. Because c is false, there must be a refutation of cs by Alice, i.e., refute(Bob,Alice,cs) returns true independent of Bob’s strategy. Bob’s decision to strengthen is blamed.

Case 5 discussion

• Alice already made a faulty decision to propose c. But Bob did not catch that.

• In case 5 Bob makes things worse by trying to strengthen a false claim.

• Alice has an opportunity to force Bob to fail to defend his strengthened claim.


Community Property: Case 6

6. agree(Alice,Bob,c), c=false

• Bob’s decision agree(Alice,Bob,c) is faulty, if c is false. Let’s assume c is false. Bob tries to introduce a fault into the knowledge base.

• There must be a reaction by Alice that assigns blame to Bob’s decision to agree. Because c is false, there is a strategy for Alice so that refute(Bob,Alice,c) returns true independent of Bob’s strategy, which violates the condition !refute(Bob,Alice,c) required by agree. Bob’s decision to agree is blamed.

Community Property: Case 7

7. agree(Alice,Bob,c), c=not optimum, c=true

• Bob’s decision agree(Alice,Bob,c) is faulty, if c is not optimum. Let’s assume c is not optimum, but true. Bob tries to introduce a fault into the knowledge base.

• There must be a reaction by Alice that assigns blame to Bob’s decision to agree. Because c is not optimum and true, there must be a strengthening of c by Alice to cs, i.e., refute(Alice,Bob,cs) returns false independent of Bob’s strategy. Bob’s decision to agree is blamed.

SCG Equilibrium

• reputations of scholars are stable
• the science does not progress; bugs are not fixed, no new ideas are introduced
• extreme example: All scholars are perfect: they propose optimal claims that can neither be strengthened nor refuted.

Claims

[Figure: claims arranged along a quality axis from 0 to 1. Labels: strengthening; correct valuation; over strengthening; true claims (defendable); false claims (refutable).]

Convergence

• If every faulty action is exposed, convergence is guaranteed.

Related Work

• Argumentation Theory
• Argumentation Mechanism Design
  – strategy-proof mechanism
• Logic
  – Paul Lorenzen Dialog games
  – Independence Friendly Logic by Hintikka/Sandu
    • Logical games of imperfect information.

Independence Friendly Logic (Hintikka and Sandu)

Verifier = Alice
Falsifier = Bob

• Protocol / claim
  – Bob provides positive real number r in R+.
  – Bob computes square root sB of r in R (secret).
  – Alice computes square root sA of r in R.
  – Alice wins, if sA and sB are equal (within a small error bound).
• Claim is neither true nor false (imperfect information).
• ForAll r in R+ ForAll sB in R Exists sA/sB in R: (sA=sB) and (sB=B(r) and sA=A(r))
• Exists sA/sB means that the Verifier’s choice prompted by Exists sA is independent of the Falsifier’s choice prompted by ForAll sB.

In SCG Protocol Language

• instance from Bob // r
• solution of 0 from Bob // sB for r
• solution of 0 from Alice // sA for r

Independence Friendly Logic (IF Logic)

• Protocol / claim: At Least As Good
  – Bob provides undirected graph G.
  – Bob computes independent set sB for G (secret).
  – Alice computes independent set sA for G.
  – Alice wins, if size(sA) >= size(sB).
• Alice has a winning strategy: search for the maximum independent set.
• But does she have a practical winning strategy?

Claims that are neither true nor false

• ForAll x Exists y/x (x=y)
• Has indeterminate truth in any model with cardinality > 1.
• Reason: game of imperfect information.
• Verifier and Falsifier will choose values for x and y without knowing each other’s choice.
• Classical logic is bivalent. IF logic is more expressive than ordinary first-order languages.
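The indeterminacy of ForAll x Exists y/x (x=y) can be checked by brute force on a finite model. Because neither player sees the other's choice, a pure strategy for each player is just one fixed element of the domain. The sketch below considers pure strategies only, which is an assumption of this example, and the function names are hypothetical.

```python
from typing import Collection, Hashable

def verifier_has_winning_strategy(domain: Collection[Hashable]) -> bool:
    """Is there a fixed y that equals every x the Falsifier might pick?"""
    return any(all(x == y for x in domain) for y in domain)

def falsifier_has_winning_strategy(domain: Collection[Hashable]) -> bool:
    """Is there a fixed x that differs from every y the Verifier might pick?"""
    return any(all(x != y for y in domain) for x in domain)

def truth_status(domain: Collection[Hashable]) -> str:
    """Game-theoretic truth value of ForAll x Exists y/x (x=y) on the domain."""
    if verifier_has_winning_strategy(domain):
        return "true"
    if falsifier_has_winning_strategy(domain):
        return "false"
    return "indeterminate"
```

On a one-element domain the Verifier trivially wins; on any larger domain neither player has a winning strategy, so the sentence is neither true nor false.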

Game-Theoretic Semantics

• Every sentence is associated with a game with two players: the Verifier (Alice) and the Falsifier (Bob).
• Universal quantifier prompts move of Falsifier.
• Existential quantifier prompts move of Verifier.
• A sentence is said to be true (false) if there exists a winning strategy for the Verifier (Falsifier).
• A sentence is said to be refuted (defended) if the Falsifier (Verifier) wins on a specific game.

Long History

• (It came to light sometime later that C. S. Peirce had already suggested explaining the difference between ‘every’ and ‘some’ in terms of who chooses the object, in 1898)


Significance of Refutation or Defense

• Forget about winning strategies for Verifier and Falsifier.

• Want to come up with winning strategies incrementally.

• When Verifier wins a game, we have some evidence that claim is true. Falsifier is blamed for trying to refute.

• When Falsifier wins a game, we have some evidence that claim is false. Verifier is blamed for proposing the claim.


Collaboration between Verifier (Alice) and Falsifier (Bob)

• IF formulas have special form:
  – ForAll i Exists oA: p(i,oA) and oA=A(i) and PB(i)
  – ForAll i ForAll oB Exists oA/oB: p(i,oA,oB) and oA=A(i) and oB=B(i) and PB(i)
  – Exists i ForAll oB: p(i,oB) and oB=B(i) and PA(i)
• We are interested in improving A, B and PB through playing the game several times. A is the know-how of Alice and B the know-how of Bob. A and B are functions. PB(i) is Bob’s provide relation to find hard inputs i.
• The claim makes a prediction about A and B and PB.
• A game defends the prediction or refutes it.

Collaboration between Verifier (Alice) and Falsifier (Bob)

• After a successful defense, the blame is assigned to Bob. Specifically to Bob’s decision to oppose the claim.

• After a successful refutation, the blame is assigned to Alice. Specifically to Alice’s decision to propose the claim.

• It is the responsibility of Alice and Bob to assign the blame more specifically and improve their know-how about A, B, PA, PB and the claim.

Overview

1. Organizational problem that SCG solves
2. What is SCG in detail?
3. Crowdsourcing
4. Formal Properties of SCG
5. Applications
6. Disadvantages
7. Conclusions

SCG for different audiences

• Logicians
• Computer scientists
• Programmers
• Laymen
• Managers
• Experimental scientists

SCG for programmers

Programming
• claims about programs
  – Provide input where claim fails.
• claims about problems
  – I have an algorithm that solves instances with quality q.
    • Provide algorithm and other provides instance where algorithm does not achieve quality q.

SCG
• claims
  – input = instance.
• claims
  – algorithm = instance
  – instance = solution


Laymen

• Group problem solving for problems with constructive solutions.

• Solutions are evaluated by group.
• Reputation based: Scholar s1 is better than scholar s2 if reputation(s1) > reputation(s2).
• Game is egalitarian: scholars with good ideas can force reputation win.
• Scholars challenge each other and try to figure out each other’s solution processes.

Applications

• My Applications of SCG in teaching
  – Software Development classes
    • Developing SCG Court
    • Developing software for MAX CSP
  – Algorithms classes (e.g., HSR)
• Crowdsourcing know-how in constructive domains.

Claim involving Experiment

Claim ExperimentalTechnique(X,Y,q,r)
I claim, given raw materials x in X, I can produce product y in Y of quality q and using resources at most r.

Gamification of Software Development etc.

• Want reliable software to solve a computational problem? Design a game where the winning team will create the software you want.

• Want to teach a STEM domain? Design a game where the winning students demonstrate superior domain knowledge.

Doesn’t TopCoder already do this?

STEM = Science, Technology, Engineering, and Mathematics

SCG and TopCoder

• SCG is an abstraction and generalization of TopCoder.


Planned Applications Require Prize Money

• IT recruiting tool: need employees good in a computational domain? Design a game and pick the winners.

• Need a software package for solving an optimization problem? Design a game and pick the winning avatar.


What we want

• Engage software developers
  – let them produce software that models an organism that fends for itself in a real virtual world while producing the software we want. Have fun. Focus them.
  – let them propose claims about the software they produce. Reward them when they
    • defend their claims successfully or
    • oppose the claims of others successfully.

Clear Feedback
Sense of Progress
Possibility of Success
Authenticity (Facebook)

Overview

1. Organizational problem that SCG solves
2. What is SCG in detail?
3. Crowdsourcing
4. Formal Properties of SCG
5. Applications
6. Disadvantages
7. Conclusions

Disadvantages

• Overhead for avatar developers
  – Overhead of learning SCG (rules)
  – Overhead of learning SCG Court (how to register your avatar)
  – Amortization: SCG(X1) -> SCG(X2) -> SCG(X3)
• Overhead for playground developers
  – Playgrounds need to be well tested (cheating)
  – Definition of what you want must be precise
  – Get what you ordered

Disadvantages of SCG

• The game is addictive. After Bob has spent 4 hours fixing his avatar and is still losing against Alice, Bob really wants to know why!

Disadvantages of SCG

• The administrator for SCG(X) must perfectly supervise the game.
  – if the admin does not, cheap play is possible
  – watching over the admin

How to compensate for those disadvantages

• Warn the scholars about the addictive game.
• Use a gentleman’s security policy: report administrator problems, don’t exploit them to win.
• Occasionally hold non-counting “attack the administrator” competitions to find vulnerabilities in the administrator.
  – both generic and X-specific vulnerabilities.

Overview

1. Organizational problem that SCG solves
2. What is SCG in detail?
3. Crowdsourcing
4. Formal Properties of SCG
5. Applications
6. Disadvantages
7. Conclusions

Conclusions

• SCG Court is a platform for creating happy communities of scholars/avatars that create science in specific domains. It solves the organizational problems from the introduction: loose coupling and effective combination of results, economically meaningful, fair evaluation.
• The egoistic scholars create social welfare: knowledge and the know-how to support it.
• Evaluates fairly, frequently, constructively and dynamically. Encourages retrieval of state-of-the-art know-how, integration and discovery.
• Challenges humans, drives innovation, both competitive and collaborative.

The End


Highest Safe Rung

• You are doing stress-testing on various models of glass jars to determine the height from which they can be dropped and still not break. The setup for this experiment, on a particular type of jar, is as follows.


Highest Safe Rung

Only two identical bottles to determine highest safe rung

Alice Bob

You have a ladder with n rungs, and you want to find the highest rung from which you can drop a copy of the jar and not have it break. We call this the highest safe rung. You have a fixed “budget” of k > 0 jars.

Highest Safe Rung

Only two identical bottles to determine highest safe rung

Alice: HSR(9,2) ≤ 4

Bob: I doubt it: refutation attempt!

Alice constructs decision tree T of depth 4 and gives it to Bob. He checks whether T is valid. Bob wins if he finds a flaw.

[Figure: Highest Safe Rung decision tree over rungs 0 to 9. Internal nodes are the rungs tested, branches are labeled yes/no, and each leaf gives the highest safe rung.]

Highest Safe Rung Decision Tree HSR(9,2)=5

Finding solution for HSR(n,2)

• Approximate: min x in [0,n] of (n/x) + x
• Exact
  – MaxRungs(x,y) = the largest number of rungs we can test with y jars and x experiments.
  – MaxRungs(x,y) = MaxRungs(x-1,y-1) + MaxRungs(x-1,y)
    (first term: jar breaks at root; second term: jar does not break at root)
  – MaxRungs(x,2) = x + MaxRungs(x-1,2)
  – MaxRungs(0,2) = 1
  – Find minimum x, s.t. MaxRungs(x,2) > n
  – Applied to HSR(9,2)
    • MaxRungs(3,2) = 7 < 9
    • MaxRungs(4,2) = 11 > 9

Keith Levin CS 4800 Fall 2010
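The MaxRungs recurrence can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the names `max_rungs` and `min_experiments` are hypothetical, and the base case for zero jars (a single outcome) is an assumption, since the slide only states MaxRungs(0,2) = 1.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def max_rungs(x: int, y: int) -> int:
    """MaxRungs(x, y) for x experiments and y jars, via the slide's recurrence."""
    if x == 0 or y == 0:
        # Assumed base case: with no experiments or no jars left,
        # only one outcome can be distinguished.
        return 1
    # First term: the jar breaks at the root (one jar and one experiment used).
    # Second term: the jar does not break (one experiment used, same jars).
    return max_rungs(x - 1, y - 1) + max_rungs(x - 1, y)


def min_experiments(n: int, jars: int) -> int:
    """Smallest x such that MaxRungs(x, jars) > n."""
    if jars < 1:
        raise ValueError("need at least one jar")
    x = 0
    while max_rungs(x, jars) <= n:
        x += 1
    return x
```

For the slide's example, `min_experiments(9, 2)` returns 4, in line with MaxRungs(3,2) = 7 < 9 and MaxRungs(4,2) = 11 > 9.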

MaxRungs

• MaxRungs(x,y) = sum [k=0 .. y] binomial(x,k)
• All paths are of length x. At most y branches on a path may be left branches.
• Note: y = x implies MaxRungs(x,y) = 2^x, meaning a complete binary tree of depth x.
• Example: MaxRungs(3,2) = binomial(3,2) + binomial(3,1) + binomial(3,0) = 7
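The closed form can be checked with a few lines of Python. This is an illustrative sketch; the function name `max_rungs_closed_form` is hypothetical and not part of SCG.

```python
from math import comb


def max_rungs_closed_form(x: int, y: int) -> int:
    """Sum of binomial(x, k) for k = 0 .. y."""
    # math.comb(x, k) is 0 when k > x, so y > x is handled without a special case.
    return sum(comb(x, k) for k in range(y + 1))


# The slide's example: binomial(3,2) + binomial(3,1) + binomial(3,0) = 7.
example = max_rungs_closed_form(3, 2)

# y = x gives a complete binary tree of depth x, that is 2^x leaves.
complete = max_rungs_closed_form(5, 5)
```

Here `example` is 7 and `complete` is 32, which equals 2^5.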

Formal: HSR

• Domain:
  – Problem: (n,k), k <= n.
  – Solution: Decision tree to determine highest safe rung.
  – quality(problem, solution): depth of decision tree / number of rungs
  – valid(problem, solution): at most k left branches, ...
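One possible reading of valid(problem, solution) is sketched below in Python. Everything here is an assumption made for illustration: the tree representation (`Leaf`, `Drop`), the convention that the left branch means the jar broke, and the interval check standing in for the conditions the slide leaves elided. The function returns the depth, which is the quantity the quality measure uses.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    answer: int  # claimed highest safe rung


@dataclass
class Drop:
    rung: int         # rung from which a jar is dropped
    breaks: "Tree"    # left branch: jar broke, answer is below `rung`
    survives: "Tree"  # right branch: jar survived, answer is at least `rung`


Tree = Union[Leaf, Drop]


def check_tree(tree: Tree, lo: int, hi: int, jars: int) -> int:
    """Return the depth of `tree` if it decides every answer in lo..hi
    using at most `jars` left (breaking) branches per path; raise
    ValueError otherwise."""
    if isinstance(tree, Leaf):
        if lo != hi or tree.answer != lo:
            raise ValueError(f"leaf {tree.answer} does not decide {lo}..{hi}")
        return 0
    if jars == 0:
        raise ValueError("no jar left for another drop")
    if not lo < tree.rung <= hi:
        raise ValueError(f"dropping from rung {tree.rung} is useless in {lo}..{hi}")
    depth_breaks = check_tree(tree.breaks, lo, tree.rung - 1, jars - 1)
    depth_survives = check_tree(tree.survives, tree.rung, hi, jars)
    return 1 + max(depth_breaks, depth_survives)
```

For a problem (n,k) the check would be called as `check_tree(tree, 0, n, k)`. In the refutation game, Bob could run such a check on the tree Alice hands him and win if it raises.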


Community Principle 2

• If all decisions by Alice are good, there is no chance of Alice losing against Bob.
  – if Alice is perfect, there is no chance of losing.
• If there exists a bad decision by Alice, there is a chance of Alice losing against Bob.
  – egalitarian game

Bad Decisions (detectable efficiently during game)

a. Proposing a claim and not supporting it.
b. Opposing a claim and not opposing it successfully.
c. Agreeing with a claim when one can neither defend it nor refute its negation.

Under the Radar

• Under the radar: a game can progress without detectable faults of kinds a, b, c.
• Still not sound.
• With 7 fault kinds: if no faults: have soundness but cannot check it efficiently.
• With a, b, c: guaranteed loss if caught.

Questions from ETH Talk

• Michael Franz
  – electronic trading analogy, improve trading software overnight
• Walter Huersch
  – value created by game: how to distribute it among participants? Based on reputation of scholars.
  – Does it make economic sense? Is it more efficient?
  – get scholars to evaluate each other.
• Christoph Roduner
  – intranet: start collaboration. Game as collaboration starter. Focused brainstorming.
• From CMU: Poersch?
  – How does it work with students? Mention baby avatar. MAX CSP.
  – Constructive nature.

Questions

• Thomas Gross
  – meta game: trying to break the game.
  – students pose each other questions and correct each other’s answers
    • still need a TA because of unsoundness of game

Emanuele (by email)

• Claim sets to share (closed under negation)
  – HSR(n,k) <= q
  – CNF(k) >= 1 - 2^(-k)
  – MAX-CSP(R) >= t_R
  – MAX(ProblemName,i) >= o
    • MAX(NetworkFlow,g) >= f

Terminology

• decision or move. The following adjectives are equivalent
  – faulty
  – wrong
  – erroneous

Modified Game

• Build list of agreed claims: social welfare.
• Choose from the list of allowed claims or from social welfare claims.
• Special rules for social welfare claims.
• Who has agreed with claim? Choose highest reputation member. Gets extra time to defend.

• certificate must be agreed by both parties, constructed according to refutation protocol.
• “true” claim that is false will be refuted
• “false” claim that is true will be defended
• faulty decisions about claims
• faulty decisions during game
• root cause analysis

Root Cause Analysis
False claim in KB

• Alice proposed false claim.
• Bob failed to propose hard instance.
• Bob failed to solve instance well enough.

Root Cause Analysis
True claim not in KB

claim HSR(16,3)=4: false

• How can it be agreed? Find DT of depth 4 with 16 leaves numbered 0..15. Cannot find a legal one. Agreement protocol
  – InstanceSet has size one
  – if there is no solution and there is only one instance: everybody notices. No agreement is possible.
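Using the closed form MaxRungs(x,y) = sum [k=0 .. y] binomial(x,k) from the MaxRungs slide, the arithmetic behind "cannot find a legal one" can be spelled out. This worked step is added for illustration and assumes the 3 in HSR(16,3) is the number of jars:

```latex
\[
\mathrm{MaxRungs}(4,3)
  = \binom{4}{0} + \binom{4}{1} + \binom{4}{2} + \binom{4}{3}
  = 1 + 4 + 6 + 4
  = 15 < 16
\]
```

A decision tree of depth 4 with at most 3 left branches per path has at most 15 leaves, so 16 leaves numbered 0..15 cannot all appear.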

claim HSR(16,3)=5: non-optimal

• How can it be agreed? Find DT of depth 5 with 16 leaves numbered 0..15. Can find a legal one. Agreement protocol
  – InstanceSet has size one
  – if there is no solution and there is only one instance: everybody notices. No agreement is possible.

|InstanceSet|=1, Minimization

claim kinds         kinds (refined)      agreement
bad, non-optimal    false (too low)      no
bad, non-optimal    true (too high)      yes
good, optimal       true (just right)    yes

Bad, non-optimal, false claims are properly excluded from social welfare.
Proof: none of the scholars participating in the agreement protocol will successfully defend because there is no solution. Because InstanceSet has size 1, there is only one instance to test.

Perfect

• Being perfect means to make perfect decisions.
• up: if you are perfect, you will not lose.
• down: if the other is perfect, you will not win.

Claim      good    bad
propose    up      down
oppose     down    up

up: if you are good, there is a chance that you win
down: if the other is good, there is a chance that you lose

SCG from different perspectives

• Organizational
• Educational
• Logical

SCG for different audiences

• Logicians
• Computer scientists
• Programmers
• Laymen
• Managers
• Experimental scientists

SCG for programmers

Programming
• claims about programs
  – Provide input where claim fails.
• claims about problems
  – I have algorithm that solves instances with quality q.
    • Provide algorithm and other provides instance where algorithm does not achieve quality q.

SCG
• claims
  – Provide instance where claim fails.
• claims
  – algorithm = instance
  – instance = solution

Comparison Logic and SCG

Logic
• sentences
  – true
  – false
• proof for being true
  – proof system, checkable
  – guaranteed defense
• proof for being false
  – proof system, checkable
  – guaranteed refutation
• Universal sentences

Scientific Community Game
• sentences = claims
  – good
  – bad
• evidence for goodness
  – defense, checkable
  – uncertainty of defense
• evidence for badness
  – refutation, checkable
  – uncertainty of refutation
• Personified sentences

Laymen

• Group problem solving for problems with constructive solutions.
• Solutions are evaluated by group.
• Reputation based: Scholar s1 is better than scholar s2 if reputation(s1) > reputation(s2).
• Game is egalitarian: scholars with good ideas can force reputation win.
• Scholars challenge each other and try to figure out each other’s solution processes.

Organizational Problem Solved

• Avoid economic inefficiency of some crowdsourcing systems: n participants solve the same problem independently without interaction.
  – reduced learning
    • instead learning through interaction
  – isolation of participants
    • instead integration and collaboration
  – exploitation of participants
    • if they don’t make money, they have the opportunity to learn
  – evaluation is subjective
    • evaluation is done by community

Avoid Economic Nonsense

• 1 contest, 500 responses, 1 selected.
  – no interaction between participants
  – 499 don’t learn much, nor put bread on their table.
• We want: each participant interacts with several others
  – feedback, opportunities for learning

A claim is

• information about one’s performance when interacting with another clever being.

• information about the performance of one’s program.


How to engage scholars?

• Several binary games between Alice and Bob.
• Alice must propose C or !C for one of the allowed C.
• Bob must agree with or oppose what Alice proposes.
• Agree(C)
  – Bob defends C against Alice.
  – Bob refutes !C against Alice.
  – Alice defends C against Bob.
  – Alice refutes !C against Bob.