Truth-Revealing Social Choice ADT-15 Tutorial Lirong Xia.

Truth-Revealing Social Choice

ADT-15 Tutorial

Lirong Xia

• Member of Parliament election: Plurality rule → Alternative vote?

• 68% No vs. 32% Yes

2011 UK Referendum

Ordinal Preference Aggregation: Social Choice

[Figure: a profile of rankings over A, B, C from Alice, Bob, and Carol is fed into a social choice mechanism, which outputs A]

Ranking pictures [PGM+ AAAI-12]

[Figure: Turker 1, Turker 2, …, Turker n each rank the pictures A, B, C; the rankings are aggregated]

Social choice

[Figure: Profile (R1, R1*), (R2, R2*), …, (Rn, Rn*) → social choice mechanism → Outcome]

Ri, Ri*: full rankings over a set A of alternatives

Applications: real world

• People/agents often have conflicting preferences, yet they have to make a joint decision

• Multi-agent systems [Ephrati and Rosenschein 91]

• Recommendation systems [Ghosh et al. 99]

• Meta-search engines [Dwork et al. 01]

• Belief merging [Everaere et al. 07]

• Human computation (crowdsourcing) [Mao et al. AAAI-13]

• etc.

Applications: academic world

How to design a good social choice mechanism?

8

What is being “good”?

Two goals for social choice mechanisms

GOAL1: democracy

9

GOAL2: truth

THIS TUTORIAL

• Axiomatic social choice

• The Condorcet Jury Theorem (CJT)

• Break

• Four directions of extending CJT

• Beyond CJT: the objective decision-making perspective

Outline

• Research questions + Basic models

– tip of the iceberg

• More references

– Survey by Nitzan and Paroush (online): Collective Decision Making and Jury Theorem

– Survey by Gerling et al. [2005]: Information acquisition and decision making in committees: A survey

– My personal summary, send me an email

Flavor of this tutorial

• Joerg’s textbook

• Handbook of Computational Social Choice

Computational social choice




Common voting rules (what has been done in the past two centuries)

• Mathematically, a social choice mechanism (voting rule) is a mapping from {all profiles} to {outcomes}

– an outcome is usually a winner, a set of winners, or a ranking

– m: number of alternatives (candidates)

– n: number of agents (voters)

– D = (P1, …, Pn): a profile

• Positional scoring rules

– A score vector (s1, …, sm)

– For each vote V, the alternative ranked in the i-th position gets si points

– The alternative with the most total points is the winner

– Special cases

• Borda, with score vector (m-1, m-2, …,0)

• Plurality, with score vector (1,0,…,0) [Used in the US]

An example

• Three alternatives {c1, c2, c3}

• Score vector (2,1,0) (=Borda)

• 3 votes: c1 ≻ c2 ≻ c3, c2 ≻ c1 ≻ c3, c3 ≻ c1 ≻ c2 (positions score 2, 1, 0 in each vote)

• c1 gets 2+1+1=4, c2 gets 1+2+0=3, c3 gets 0+0+2=2

• The winner is c1
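The Borda example above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `positional_scores` and the tuple encoding of votes are illustrative choices, and ties are resolved by whichever alternative `max` meets first, which the slides do not specify.

```python
def positional_scores(profile, score_vector):
    """Total score of each alternative under a positional scoring rule.

    profile: list of rankings, each a tuple listing alternatives from best to worst.
    score_vector: points awarded to positions 1..m.
    """
    totals = {}
    for vote in profile:
        for position, alternative in enumerate(vote):
            totals[alternative] = totals.get(alternative, 0) + score_vector[position]
    return totals


# The three votes from the example, scored with the Borda vector (2, 1, 0).
profile = [("c1", "c2", "c3"), ("c2", "c1", "c3"), ("c3", "c1", "c2")]
borda = positional_scores(profile, (2, 1, 0))
borda_winner = max(borda, key=borda.get)  # tie-breaking here is an assumption

# Plurality is the same rule with score vector (1, 0, 0).
plurality = positional_scores(profile, (1, 0, 0))
```

Swapping the score vector is the only change needed to move between Borda, plurality, and any other positional scoring rule.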

The Kemeny rule

• Kendall tau distance

– K(V, W) = #{pairwise comparisons on which V and W differ}

– Example: K(b ≻ c ≻ a, a ≻ b ≻ c) = 2 (the rankings disagree on {a, b} and on {a, c})

• Kemeny(D) = argmin_W K(D, W) = argmin_W Σ_{P∈D} K(P, W)

• For single winner, choose the top-ranked alternative in Kemeny(D)

• [Has a statistical interpretation]
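A brute-force sketch of the Kendall tau distance and the Kemeny rule defined above. It enumerates all m! rankings, so it is only meant for tiny examples; the function names and the small profile are illustrative, and when several rankings are tied the one enumerated first is returned.

```python
from itertools import combinations, permutations


def kendall_tau(v, w):
    """Number of pairs of alternatives that rankings v and w order differently."""
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(
        1
        for x, y in combinations(v, 2)
        if (pos_v[x] - pos_v[y]) * (pos_w[x] - pos_w[y]) < 0
    )


def kemeny(profile):
    """Ranking W minimizing the total Kendall tau distance to the profile."""
    alternatives = profile[0]
    return min(
        permutations(alternatives),
        key=lambda w: sum(kendall_tau(p, w) for p in profile),
    )


distance = kendall_tau(("b", "c", "a"), ("a", "b", "c"))
example_profile = [("a", "b", "c"), ("a", "b", "c"), ("b", "c", "a")]
kemeny_ranking = kemeny(example_profile)
kemeny_winner = kemeny_ranking[0]  # single winner: top of the Kemeny ranking
```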

• Approval, Baldwin, Black, Bucklin, Coombs, Copeland, Dodgson, maximin, Nanson, Range voting, Schulze, Slater, ranked pairs, etc.

…and many others

18

• Q: How to evaluate rules in terms of achieving democracy?

• A: Axiomatic approach

Axiomatic approach (what has been done in the past 50 years)

• Anonymity: names of the voters do not matter

– Fairness for the voters

• Non-dictatorship: there is no dictator, whose top-ranked alternative is always the winner

– Fairness for the voters

• Neutrality: names of the alternatives do not matter

– Fairness for the alternatives

• Consistency: if r(D1)∩r(D2)≠∅, then r(D1∪D2)=r(D1)∩r(D2)

• Condorcet consistency: if there exists a Condorcet winner, then it must win

– A Condorcet winner beats all other alternatives in pairwise elections

• Easy to compute: winner determination is in P

– Computational efficiency of preference aggregation

• Hard to manipulate: computing a beneficial false vote is hard

Which axiom is more important?

• Some of these axiomatic properties are not compatible with others

                          Condorcet consistency   Consistency   Easy to compute
Positional scoring rules  N                       Y             Y
Kemeny                    Y                       N             N
Ranked pairs              Y                       N             Y

An easy fact

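• Theorem. For voting rules that select a single winner, anonymity is not compatible with neutrality

– Proof: take two voters, Alice with a ≻ b and Bob with b ≻ a. W.L.O.G. the winner is a. Swapping the names of the alternatives gives Alice b ≻ a and Bob a ≻ b, so by neutrality the winner of the new profile is b. Swapping the two voters turns this profile back into the original one, so by anonymity its winner is still b, contradicting that the winner is a.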

Not-So-Easy facts

• Arrow’s impossibility theorem

– Google it!

• Gibbard-Satterthwaite theorem

– Google it!

• Axiomatic characterization

– Template: A voting rule satisfies axioms A1, A2, A3 if and only if it is rule X

– If you believe that A1, A2, A3 are the most desirable properties, then X is optimal

– (anonymity+neutrality+consistency+continuity) ⇔ positional scoring rules [Young SIAMAM-75]

– (neutrality+consistency+Condorcet consistency) ⇔ Kemeny [Young&Levenglick SIAMAM-78]




• Given

– two alternatives {a,b}.

– competence 0.5<p<1,

• Suppose

– agents’ signals are i.i.d. conditioned on the ground truth

• w/p p, the same as the ground truth

• w/p 1-p, different from the ground truth

– agents truthfully report their signals

• The majority rule reveals ground truth as n→∞

24

The Condorcet Jury theorem (CJT) [Condorcet 1785, Laplace 1812]

• It justifies democracy and the wisdom of the crowd

• It “lays, among other things, the foundations of the ideology of the democratic regime” [Paroush SCW-98]

Why is CJT important?

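Proof

• Group competence

– Pr(maj(Pn)=a | a)

– Pn: n i.i.d. votes given ground truth a

• Random variable Xj: takes 1 w/p p, 0 otherwise

– encoding whether signal=ground truth

• (Σ_{j=1}^{n} Xj)/n converges to p in probability (Law of Large Numbers)

– since p>0.5, the fraction of correct votes exceeds 1/2 with probability approaching 1, so the majority rule selects a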

The group competence

1. is higher than that of any single agent

2. increases in the group size n

3. goes to 1 as n→∞

27

Three parts of CJT
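The three parts of the CJT can be checked numerically from the binomial distribution. The sketch below assumes i.i.d. votes with competence p and, for even n, that ties are broken by a fair coin (the tie-breaking convention is an assumption, chosen to match the monotonicity argument); `group_competence` is an illustrative name.

```python
from math import comb


def group_competence(n, p):
    """Pr(majority of n i.i.d. votes matches the ground truth).

    Each vote is correct with probability p; exact ties (possible only
    for even n) are assumed to be broken by a fair coin.
    """
    total = 0.0
    for correct in range(n + 1):
        prob = comb(n, correct) * p**correct * (1 - p) ** (n - correct)
        if 2 * correct > n:
            total += prob
        elif 2 * correct == n:
            total += 0.5 * prob
    return total


# Competence of groups of size 1, 3, 5, ..., 21 with p = 0.6.
odd_sizes = [group_competence(n, 0.6) for n in range(1, 22, 2)]
```

With p = 0.6 the values for odd n increase strictly, and each even n matches the preceding odd n, which is the pattern the monotonicity proof describes.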

Proof of competence monotonicity

• From 2k to 2k+1

– The extra vote breaks ties with higher probability in favor of the ground truth

– k@a+k@b

• From 2k+1 to 2k+2

– (k+1)@a+k@b → (k+1)@a+(k+1)@b, when the extra vote is for b (w/p 1-p given ground truth a)

– k@a+(k+1)@b → (k+1)@a+(k+1)@b, when the extra vote is for a (w/p p given ground truth a)

• Given

– two alternatives {a,b}.

– competence 0.5<p<1,

• Suppose

– agents’ signals are i.i.d. conditioned on the ground truth

• w/p p, the same as the ground truth

• w/p 1-p, different from the ground truth

– agents truthfully report their signals

• The majority rule reveals ground truth as n→∞

29

Limitations of CJT

more than two?

heterogeneous agents?

dependent agents?

strategic agents?

other rules?




• Dependent agents

• Heterogeneous agents

• Strategic agents

• More than two alternatives

31

Extensions

32

An active area

[Figure: venues and authors active in this area: Social Choice and Welfare; American Political Science Review; Games and Economic Behavior; Mathematical Social Sciences; Theory and Decision; Public Choice; Econometrica + JET; Myerson; Shapley&Grofman; MSS special issue on ADT-15]





Does CJT hold for dependent agents?

– Not always (mimicking one leader)

– Yes for some dependency models [Berg 92; Ladha 92, 93; Peleg&Zamir 12]

Dependent agents

• Positive correlations

– agents are likely to receive similar signals even conditioned on the ground truth

• Negative correlations

– agents are likely to receive different signals

• Conjecture: Positive correlations reduce group competence

– positive correlation effectively reduces the number of agents

Opinion leader model [Boland et al. 89]

• One leader (Y), 2k followers (X1, …, X2k), same competence p

– Pr(Y=1) = Pr(Xj=1) = p

– Xj’s are independent conditioned on Y

• Correlation r²

– Pr(Xj=1 | Y=1) = p + r(1-p)

– Pr(Xj=0 | Y=0) = (1-p) + rp

• Theorem. In the opinion leader model

– when p>0.5 the group competence decreases in r

– when p<0.5 the group competence increases in r

– when p=0.5 the group competence does not change in r
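A small numerical sketch of the opinion leader model, under the assumption that the leader also votes, so the majority is taken over 2k+1 votes (the leader plus 2k followers). The function names are illustrative. Conditioned on the leader being right, each follower is right with probability p + r(1-p); conditioned on the leader being wrong, each follower is right with probability 1 - ((1-p) + rp) = p(1-r).

```python
from math import comb


def binom_tail(n, q, at_least):
    """Pr(Binomial(n, q) >= at_least)."""
    return sum(
        comb(n, i) * q**i * (1 - q) ** (n - i) for i in range(at_least, n + 1)
    )


def leader_group_competence(k, p, r):
    """Pr(majority of leader + 2k followers is correct) in the opinion leader model.

    Assumes the leader's vote is counted, so k+1 correct votes out of 2k+1 are needed.
    """
    q_leader_right = p + r * (1 - p)  # Pr(follower correct | leader correct)
    q_leader_wrong = p * (1 - r)      # Pr(follower correct | leader wrong)
    # Leader correct: at least k of the 2k followers must also be correct.
    right = binom_tail(2 * k, q_leader_right, k)
    # Leader wrong: at least k+1 of the 2k followers must be correct.
    wrong = binom_tail(2 * k, q_leader_wrong, k + 1)
    return p * right + (1 - p) * wrong


curve = [leader_group_competence(1, 0.7, r) for r in (0.0, 0.5, 1.0)]
```

For k = 1 and p = 0.7 the three values decrease from the independent case (r = 0) down to the leader's own competence (r = 1), in line with the theorem for p > 0.5.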

Common evidence model [Boland et al. 89]

• One common evidence (E), 2k+1 agents (X1, …, X2k+1), same competence p

– Pr(E=1) = Pr(Xj=1) = p

– Xj’s are independent conditioned on E

• Correlation r²

– Pr(Xj=1 | E=1) = p + r(1-p)

– Pr(Xj=0 | E=0) = (1-p) + rp

• Theorem. In the common evidence model

– when p>0.5 the group competence decreases in r

– when p<0.5 the group competence increases in r

– when p=0.5 the group competence does not change in r

Common evidence model [Dietrich and List 2004]

• Ground truth G

• Common evidence E

• Given any ideal vote function f: E → G

– Competence pe = Pr(Xj = f(e) | e)

• Theorem. The majority rule converges to f(e) as n→∞

[Figure: graphical model G → E → X1, …, Xn]





Does CJT hold for heterogeneous agents?

– Not always: competences (1, 0.9, 0.8, …)

– Not always: pj = 0.5 + 1/n

– Yes under some condition [Berend&Paroush, 1998]

Group competence for heterogeneous agents

• Independent signals

• Agent j’s competence is pj

• Theorem [Berend&Paroush, 1998]. CJT holds if and only if

1. [condition not recoverable from the source], or

2. for every sufficiently large n, [condition not recoverable from the source]

Competence monotonicity [Berend&Sapir 05]

• Given the competence {p1, …, pn} of n agents where pj ≥ 0.5

– Ml: average competence of a group of l randomly chosen agents

• Theorem [Berend&Sapir 05]. For two alternatives and all l ≤ n-1

– Ml ≤ Ml+1 if l is even

– Ml = Ml+1 if l is odd

Optimal voting rule for two alternatives

• Theorem [Shapley and Grofman 1984]. Given the competence {p1, …, pn} of n agents, the maximum likelihood estimator is the weighted majority voting with weights wj = log(pj/(1-pj))

• Proof. Suppose the ground truth is a; the log likelihood of the profile is Σ_{j votes a} log pj + Σ_{j votes b} log(1-pj)
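A minimal sketch of weighted majority voting as a maximum likelihood estimator for two alternatives. The weight formula is not legible in the slide text, so the code assumes the standard log-odds weights w_j = log(p_j/(1-p_j)) and independent signals; the function names and the three-agent example are illustrative.

```python
from math import log


def log_likelihood(truth, votes, competences):
    """Log likelihood of the votes if `truth` is the ground truth (independent agents)."""
    return sum(
        log(p) if vote == truth else log(1 - p)
        for vote, p in zip(votes, competences)
    )


def weighted_majority(votes, competences):
    """Winner under weighted majority with assumed weights log(p_j / (1 - p_j)).

    Returns "a" or "b", or None if the weighted totals are exactly tied.
    """
    score = 0.0
    for vote, p in zip(votes, competences):
        weight = log(p / (1 - p))
        score += weight if vote == "a" else -weight
    if score > 0:
        return "a"
    if score < 0:
        return "b"
    return None


# One expert (p = 0.9) votes a; two weaker agents (p = 0.6) vote b.
votes = ["a", "b", "b"]
competences = [0.9, 0.6, 0.6]
winner = weighted_majority(votes, competences)
```

In this example the unweighted majority would pick b, while the weighted rule picks a; the weighted score equals the difference between the two log likelihoods, which is why the rule coincides with the MLE.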





Does CJT hold for strategic agents?

– Not always (same-vote equilibrium)

– Yes for some models and informative equilibrium

Strategic voting

• Common interest Bayesian voting game [Austen-Smith&Banks APSR-96]

– two alternatives {a, b}, two signals {A, B}, a prior, Pr(signal | truth)

• pa = Pr(signal=A | truth=a)

• pb = Pr(signal=B | truth=b)

– agents have the same utility function: U(outcome, ground truth) = 1 if outcome = ground truth, and 0 otherwise

– sincere voting: vote for the alternative with the highest posterior probability

– informative voting: vote for the signal

– strategic voting: vote for the alternative with the highest expected utility

Timeline of the game

1. Nature chooses a ground truth g

2. Every agent j receives a signal sj ~ Pr(sj | g)

3. Every agent computes the posterior distribution (belief) over the ground truth using Bayes’ rule

4. Every agent chooses a vote to maximize her expected utility according to her belief

5. The outcome is computed by the voting rule

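High level example

• Two signals, two voters

• Model: Pr(signal | matching ground truth) = p > 0.5 for both signals

[Figure: case analysis of one agent's utility when the other agent is truthful. Labels: posterior (p, 1-p); the other signal (p, 1-p); my vote and the winner, with ties broken half/half; utility for voting (1, 0.5, 0, 0.5)]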

Sincere voting = informative voting?

• Setting

– Two alternatives {a, b}, two signals {A, B}

– Three agents

– pa=0.8, pb=0.6

– Non-uniform prior: Pr(a)=0.1, Pr(b)=0.9

• An agent receives A

– Informative voting: a

– Posterior probability: Pr(a|A) ∝ 0.1×0.8 = 0.08 vs. Pr(b|A) ∝ 0.9×0.4 = 0.36

– Sincere voting: b
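The posterior in this example follows directly from Bayes' rule. The sketch below uses an illustrative helper `posterior_a` to compare the informative vote (follow the signal) with the sincere vote (follow the posterior) for the parameters on the slide.

```python
def posterior_a(prior_a, p_a, p_b, signal):
    """Pr(truth = a | signal) by Bayes' rule.

    p_a = Pr(signal = A | truth = a), p_b = Pr(signal = B | truth = b).
    """
    like_a = p_a if signal == "A" else 1 - p_a
    like_b = 1 - p_b if signal == "A" else p_b
    joint_a = prior_a * like_a
    joint_b = (1 - prior_a) * like_b
    return joint_a / (joint_a + joint_b)


# Slide parameters: pa = 0.8, pb = 0.6, Pr(a) = 0.1, and the agent receives A.
post = posterior_a(0.1, 0.8, 0.6, "A")   # 0.08 / (0.08 + 0.36)
informative_vote = "a"                    # vote for the signal
sincere_vote = "a" if post > 0.5 else "b"

# With a uniform prior the two votes agree.
post_uniform = posterior_a(0.5, 0.8, 0.6, "A")
```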

Sincere voting = strategic voting?

• Setting

– Two alternatives {a, b}, two signals {A, B}

– Three agents

– pa=0.8, pb=0.6

– Uniform prior: Pr(a)=Pr(b)=0.5

• An agent receives A, other two agents are sincere/informative

– Informative voting: a

– Posterior probability: 2/3@a + 1/3@b

• sincere voting: a

– Probability of a tie (other two agents’ votes are {a, b})

• 0.32 given a, 0.48 given b

– Expected utility for voting a: 0.32×2/3 ≈ 0.213

– Expected utility for voting b: 0.48×1/3 = 0.16

– Strategic voting: a

The “pivotal” approach

• Setting

– Two alternatives {a, b}, two signals {A, B}

– Three agents

– pa=0.8, pb=0.6

– Uniform prior: Pr(a)=Pr(b)=0.5

• An agent receives A, other two agents are sincere/informative

– Conditioned on the other two votes being {a, b}

– Signal profile is (A, A, B)

– Posterior probabilities

• Pr(a|A,A,B) ∝ Pr(a)×Pr(A|a)×Pr(A|a)×Pr(B|a) = 0.5 pa²(1-pa) = 0.064

• Pr(b|A,A,B) ∝ Pr(b)×Pr(A|b)×Pr(A|b)×Pr(B|b) = 0.5 (1-pb)² pb = 0.048

– Strategic voting: a
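The two ways of reasoning about the strategic vote, expected utility on the tie event and the “pivotal” posterior, can be compared numerically. This is a sketch for the three-agent setting with the other two agents voting informatively; the function names are illustrative.

```python
def expected_utilities(prior_a, p_a, p_b):
    """Expected utility of voting a or b after receiving signal A.

    Only the event where the other two (informative) votes split matters,
    since otherwise the agent's vote does not change the majority outcome.
    """
    joint_a = prior_a * p_a
    joint_b = (1 - prior_a) * (1 - p_b)
    post_a = joint_a / (joint_a + joint_b)
    tie_given_a = 2 * p_a * (1 - p_a)   # other two signals are {A, B} given truth a
    tie_given_b = 2 * p_b * (1 - p_b)   # other two signals are {A, B} given truth b
    return post_a * tie_given_a, (1 - post_a) * tie_given_b


def pivotal_weights(prior_a, p_a, p_b):
    """Unnormalized posteriors of a and b given the signal profile (A, A, B)."""
    weight_a = prior_a * p_a * p_a * (1 - p_a)
    weight_b = (1 - prior_a) * (1 - p_b) * (1 - p_b) * p_b
    return weight_a, weight_b


eu_a, eu_b = expected_utilities(0.5, 0.8, 0.6)
w_a, w_b = pivotal_weights(0.5, 0.8, 0.6)
strategic_vote = "a" if w_a > w_b else "b"
```

The ratio eu_a / eu_b equals w_a / w_b, so both computations recommend the same vote; the pivotal approach simply conditions on the only event in which the vote matters.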

Bayesian Nash Equilibrium

• Given a Bayesian game, a Bayesian Nash Equilibrium is a strategy profile (s1, …, sn) such that

– sj: signal → vote

– every agent j prefers sj to any other strategy, conditioned on the other agents playing their strategies in the profile

• Example of strategy

– Informative voting: s(A)=a, s(B)=b

– You can also: s(A)=b, s(B)=a

Equilibrium under the optimal voting rule

• Theorem [McLennan 98]. Let r* denote the voting rule with maximum expected utility given informative voting. Informative voting is a Bayesian Nash Equilibrium under r*.

Subsequent work

• Key question:

– What are the equilibria of the game (hopefully informative voting)?

– Does CJT hold in equilibria?

• Similar model for juries

– [Feddersen&Pesendorfer Econometrica-97, APSR-98, PNAS-99]

• Number of voters is uncertain, following a Poisson distribution

– [Myerson GEB-98, JET-02]

• Three alternatives

– [Nunez JTP-10; Goertz&Maniquet JET-11; Bouton&Castanheira Econometrica-12; Goertz SCW-14; Goertz&Maniquet EL-14]






Condorcet’s MLE approach

• Parametric ranking model Mr: given a “ground truth” parameter Θ

– each vote V is drawn i.i.d. conditioned on Θ, according to Pr(V|Θ)

– Each V is a ranking

• For any profile P=(V1, …, Vn),

– The likelihood of Θ is L(Θ|P) = Pr(P|Θ) = ∏_{V∈P} Pr(V|Θ)

– The MLE mechanism: MLE(P) = argmax_Θ L(Θ|P)

– Break ties randomly

• What if Decision space ≠ Parameter space?

[Figure: graphical model, “ground truth” Θ → V1, V2, …, Vn]

Mallows’ model [Mallows-1957]

• Fix the dispersion ϕ<1

• Parameter space

– all full rankings over alternatives

• Sample space

– i.i.d. generated full rankings

• Probabilities: given a ground truth ranking W, generate a ranking V w.p. Pr_W(V) ∝ ϕ^Kendall(V,W)

• MLE is the Kemeny rule
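The claim that the MLE under Mallows' model is the Kemeny rule can be checked by brute force on a small profile. The sketch relies on the fact that the normalizing constant of Pr_W(V) ∝ ϕ^Kendall(V,W) does not depend on W, so unnormalized likelihoods are enough for comparing rankings; function names and the sample profile are illustrative.

```python
from itertools import combinations, permutations


def kendall_tau(v, w):
    """Number of pairs of alternatives that rankings v and w order differently."""
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(
        1
        for x, y in combinations(v, 2)
        if (pos_v[x] - pos_v[y]) * (pos_w[x] - pos_w[y]) < 0
    )


def mallows_likelihood(profile, w, phi):
    """Unnormalized likelihood of ground truth ranking w: product of phi^K(V, w)."""
    likelihood = 1.0
    for v in profile:
        likelihood *= phi ** kendall_tau(v, w)
    return likelihood


def mallows_mle(profile, phi):
    """Ranking with the highest (unnormalized) Mallows likelihood."""
    return max(
        permutations(profile[0]),
        key=lambda w: mallows_likelihood(profile, w, phi),
    )


def kemeny(profile):
    """Ranking minimizing the total Kendall tau distance to the profile."""
    return min(
        permutations(profile[0]),
        key=lambda w: sum(kendall_tau(v, w) for v in profile),
    )


sample_profile = [("a", "b", "c"), ("a", "b", "c"), ("b", "c", "a")]
mle_ranking = mallows_mle(sample_profile, 0.5)
kemeny_ranking = kemeny(sample_profile)
```

Because ϕ < 1, maximizing ϕ raised to the total distance is the same as minimizing the total distance, which is exactly the Kemeny objective.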

Condorcet’s model [Condorcet-1785, Young-1988]

• Fix the dispersion ϕ<1

• Parameter space

– all binary relations over alternatives

• Sample space

– i.i.d. generated binary relations

• Probabilities: given a ground truth relation W, generate a relation V w.p. Pr_W(V) ∝ ϕ^Kendall(V,W)

Recent Work in Computer Science

• Understanding truth-revealing property of existing rules

– MLE: [Conitzer&Sandholm UAI-05; Conitzer, Rognlie&Xia IJCAI-09; Xia, Conitzer&Lang AAMAS-10; Xia&Conitzer AAAI-11]

– Consistent estimator: [Caragiannis, Procaccia&Shah EC-13]

– Most probable winner: [Procaccia, Reddi&Shah UAI-13; Elkind&Shah UAI-14; Azari Soufiani, Parkes&Xia NIPS-14]

• Learning ranking models

– Mallows’ model: [Lu&Boutilier ICML-11; Hughes, Hwang&Xia UAI-15; Awasthi et al. NIPS-14; Chierichetti et al. ITCS-15]

– Random Utility Models [too many to show]




Beyond CJT

• Thinking about Arrow’s impossibility theorem

– axiomatic properties are used to evaluate and compare voting rules

• New perspective

– an objective measurement for voting rules

– can be seen as another numerical “axiomatic” property

CJT: the optimal objective decision-making perspective

• How to make objectively optimal decisions using voting?

• Goal: new computationally tractable voting rule with desirable axiomatic+statistical properties

– 2 alternatives: majority rule

– Kemeny’s rule (for ranking), NP-hard to compute

• Especially when Decision space ≠ Parameter space

– e.g. use Mallows’ model to choose a single winner

• Social choice community

– statistical models are compelling

• Statistics/Machine Learning community

– some axioms are desirable

• strategy-proofness

• monotonicity

• agents have less incentive to lie

63

Why care?

Statistical decision-theoretic framework for social choice [Azari Soufiani, Parkes&Xia NIPS-14]

[Figure labels: StatML, Social Choice, Inputs, The rule]

• Statistical model: Θ, S, Prθ(s)

• Decision space: D

• Loss function: L(θ, d) ∈ ℝ, where θ is the unknown ground truth and d is the decision to make

• The rule f: Profiles ⟶ D with minimum Bayesian expected loss:

– f(P) ∈ argmin_d E_{θ|P} L(θ, d)

• fB1 (Mallows)

– Statistical model: Mallows’ model

– Decision space: single winners

– Loss function: the top loss function

• L(W, a) =0 if a is top-ranked in W, otherwise it is 1

– Bayesian estimator with uniform prior

• fB2 (Condorcet)

– Statistical model: Condorcet’s model

– Decision space: single winners

– Loss function: the top loss function

• L(W, a) =0 if a is top-ranked in W, otherwise it is 1

– Bayesian estimator with uniform prior

Two examples
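A brute-force sketch of the first example, fB1: Mallows' model, single winners as the decision space, the top loss function, and a uniform prior over ground truth rankings. Minimizing the expected top loss is the same as maximizing the posterior probability of being top-ranked, which is what the code computes. It enumerates all rankings, so it only illustrates the definition for a handful of alternatives; the names and the sample profile are illustrative.

```python
from itertools import combinations, permutations


def kendall_tau(v, w):
    """Number of pairs of alternatives that rankings v and w order differently."""
    pos_v = {a: i for i, a in enumerate(v)}
    pos_w = {a: i for i, a in enumerate(w)}
    return sum(
        1
        for x, y in combinations(v, 2)
        if (pos_v[x] - pos_v[y]) * (pos_w[x] - pos_w[y]) < 0
    )


def top_posteriors(profile, phi):
    """Posterior probability that each alternative is top-ranked in the ground truth.

    Uniform prior over rankings; likelihood of W is the product of phi^K(V, W).
    """
    weights = dict.fromkeys(profile[0], 0.0)
    for w in permutations(profile[0]):
        likelihood = 1.0
        for v in profile:
            likelihood *= phi ** kendall_tau(v, w)
        weights[w[0]] += likelihood
    total = sum(weights.values())
    return {a: weight / total for a, weight in weights.items()}


def bayes_winner(profile, phi):
    """Alternative with minimum expected top loss (maximum posterior of being top)."""
    posteriors = top_posteriors(profile, phi)
    return max(posteriors, key=posteriors.get)


sample_profile = [("a", "b", "c"), ("a", "b", "c"), ("b", "c", "a")]
posteriors = top_posteriors(sample_profile, 0.5)
winner = bayes_winner(sample_profile, 0.5)
```

Unlike the MLE, this estimator sums over all ground truth rankings that share a top alternative, which is how it handles a decision space that differs from the parameter space.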

Comparisons

Columns: Anonymity, neutrality, monotonicity | Consistency | Majority, Condorcet | Complexity | Min Bayesian risk

Kemeny (Fishburn): ✔ ✗ ✔ ✗ ✗

fB1 (Mallows): ✗ ✗ ✔ for Mallows

fB2 (Condorcet): ✔ ✔ for Condorcet

Marks are listed in source order; their column alignment is not recoverable.

Highlight: fB2 does well in many aspects.

CJT: numerically evaluate the effect of strategic agents

• How much do strategic agents hurt the truth-revealing power?

• Price of Anarchy (PoA) [Koutsoupias&Papadimitriou STACS-99]

– PoA = (optimal truth-revealing power) / (WORST truth-revealing power in equilibrium)

• Price of Stability (PoS) [Anshelevich et al. FOCS-04]

– PoS = (optimal truth-revealing power) / (BEST truth-revealing power in equilibrium)

• Theorem [Xia-15]. Informative voting is a BNE under plurality for a wide range of statistical models with m>2

• Theorem [Xia-15]. The PoA of plurality is at least m, the PoS of plurality is 1 as n→∞

• The Condorcet Jury Theorem

• Four extensions

– dependent agents

– heterogeneous agents

– strategic agents

– more than two alternatives

• The new perspective

– design new mechanisms

– PoA and PoS

Wrap up

Open questions

• Numerical extensions of the CJT to

– dependent, heterogeneous, and strategic agents

– with m>2

– for commonly studied voting rules

• The new perspectives

– New frameworks and rules balancing axiomatic, computational, and truth-revealing properties

Thank you!

• Given

– a similarity function d

• symmetric, coincidence axiom

• not necessarily triangle inequality

– a dispersion 0<ϕ<1

• Pr_b(a) ∝ ϕ^d(a, b)

Mallows-like models

Sincere voting is not always a BNE

• d: [Figure: four alternatives a1, a2, a3, a4 with pairwise similarities labeled t, t, 3, 3, 2, 2]

• Suppose an agent receives a1

– When t is sufficiently small, reporting a2 has a higher expected utility given that other agents are sincere under the plurality rule

– When triangle inequality is satisfied, sincere voting is a BNE
