1
Preference Elicitation in Multiagent Domains:
Mechanism Design, Voting and Stable Matching
Craig Boutilier
Department of Computer Science
University of Toronto
2
The Preference Bottleneck in AI
Decisions on behalf of individuals (groups, organizations)
• match individuals to desired products, services, information,
people, behaviors, courses of action
Decision theory provides foundations for automated
decision support systems
• actions, outcomes, dynamics, utilities: MEU
But what is the objective function?
• user preferences (or utilities) are often unknown
• vary much more widely than dynamics
Consider applications in your area of research where the dynamics are stable, but goals or preferences vary
3
The Preference Bottleneck
Two difficult questions faced in decision analysis:
• Decomposition, natural representation of preferences
• Assessing precise tradeoffs (cognitive cost)
Other difficult questions:
• what preference info is relevant to the task at hand?
• when is the elicitation effort worth the improvement it offers in
terms of decision quality?
• what decision criterion to use given partial utility info?
4
Product Configuration
Luggage Capacity? Two Door? Cost?
Engine Size? Color? Options?
5
COACH*
POMDP for prompting Alzheimer’s patients
• solved using factored models, value-directed compression of
belief space
Reward function (patient/caregiver preferences)
• indirect assessment (observation, policy critique)
6
Active Collaborative Filtering*
Probabilistic model assumed (MCVQ, naïve Bayes)
• expected utility (given uncertain preferences) for specific recommendations
Active querying to produce better suggestions
• which new ratings would (in expectation) improve expected value of best suggestions the most?
• computed using EVOI
• offline bounding of effect on posterior to prune queries allows real-time response
7
Combinatorial Auctions
Expressive bidding in auctions becoming common
• combinatorial bids, side-constraints, discount schedules, etc.
• direct expression of utility/cost: economic efficiency
Advances in winner determination
• determine least-cost allocation of business to bidders
8
Non-price Preferences
A and B for $12000. C and D for $5000…
A for $10000.
B and D for $5000 if A; B and D for $7000 if not A...
Joe
Hank
etc…
A, C to Fred. B, D, G to Frank. F, H, K to Joe…
Cost: $57,500.
That gives too much business
to Joe!!
9
Non-price Preferences
Winner determination algs. minimize cost alone
• but preferences for non-price attributes play key role
• Some typical attributes in sourcing:
percentage volume business to specific supplier
average quality of product, delivery on time rating
geographical diversity of suppliers
number of winners (too few, too many), …
Clear utility function involved
• difficult to articulate precise tradeoff weights
• “What would you pay to reduce %volumeJoe by 1%?”
10
Manual Scenario Navigation*
Current practice: manual scenario navigation
• impose constraints on winning allocation
• re-run winner determination
• new allocation satisfying constraint: higher cost
not a hard constraint!
• assess tradeoff and repeat (often hundreds of times)
until satisfied with some allocation
Here’s a new allocation with less business to Joe.
Cost is now: $62,000.
12
Bargaining for a Car
Luggage Capacity? Two Door? Cost?
Engine Size? Color? Options?
[Figure: money ($$) changing hands among agents in a bargaining scenario]
Social Choice
13
Winner determination with non-price attributes:
• % volume business to specific supplier
• average quality of product, delivery on time rating
• geographical diversity of suppliers
• number of winners (too few, too many), …
14
Overview Section 1: Decision Theory and Basics of Preference Elicitation
Section 2: Regret-based and Polyhedral Methods
• computational motivations, imprecise utility functions
• minimax regret
• polyhedral conjoint analysis, volumetric methods (if time)
[May skip due to time] Section 3: Elicitation in Mechanism Design
• basics of mechanism design, incentives, VCG
• partial revelation mechanisms
Section 4: Elicitation in Voting
• basics of social choice, voting
• minimax regret and incremental vote elicitation
Section 5: Elicitation in Stable Matching
• basics of stable matching problems
• preference elicitation in stable matching
15
Why preferences?
Natural question: why not specify behavior with goals?
Preferences: coffee > OJ > tea
• Natural goal: coffee
but what if unavailable? requires a 30 minute wait? …
• allows alternatives to be explored in face of costs, infeasibility,…
16
Preference Orderings
Assume (finite) outcome set X (states, products, etc.)
Preference ordering over X: x1 x2, x1 > x2, x1 ~ x2, …
• must be: (a) transitive; (b) connected (orderable)
• i.e., a total preorder
Why connected? Why transitive?
• e.g., money pump
Consider the planning problem: what if uncertainty?
[Figure: outcomes arranged in a preference ordering: x1 > x2 > x3 > … > xn]
Uncertainty in Decision Outcomes
What if:
• 2% chance no coffee made (30 min delay)? 10%? 20%? 95%?
• robot has charge to check only one possibility
• 5% chance of damage in coffee room, 1% at OJ vending machine
17
18
Preference over Lotteries
If there’s uncertainty in choice outcomes, a preference ordering ≻ over outcomes is not enough
A simple lottery over X has form:
[ (p1 ,x1), (p2 ,x2), …, (pn ,xn) ]
where pi ≥ 0 and Σi pi = 1
A compound lottery allows outcomes to be lotteries:
[ (p1 ,l1), (p2 ,l2), …, (pn ,ln) ]
• outcomes are just trivial lotteries; restrict to finite compounding
19
Constraints on Lotteries*
Continuity:
• If x1 > x2 > x3 then ∃p s.t. [(p,x1), (1-p,x3)] ~ x2
Substitutability:
• If x1 ~ x2 then [(p,x1), (1-p,x3)] ~ [(p,x2), (1-p,x3)]
Monotonicity:
• If x1 ≻ x2 and p ≥ q then [(p,x1), (1-p,x2)] ≽ [(q,x1), (1-q,x2)]
Reduction of Compound Lotteries (“no fun gambling”):
• [ (p, [(q,x1), (1-q,x2)] ), (1-p, [(q′,x3), (1-q′,x4)]) ]
  ~ [ (pq, x1), (p(1-q), x2), ((1-p)q′, x3), ((1-p)(1-q′), x4) ]
Nontriviality:
• x⊤ ≻ x⊥ (best outcome strictly preferred to worst)
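The reduction-of-compound-lotteries axiom above can be checked mechanically. A minimal sketch (the probabilities p, q, q′ and outcome labels are illustrative, not from the slides):

```python
from fractions import Fraction

def flatten(lottery):
    """Reduce a finitely compound lottery to an equivalent simple lottery.
    A lottery is a list of (probability, outcome) pairs; an outcome is
    either an atomic label or itself a lottery (a list)."""
    probs = {}
    for p, out in lottery:
        if isinstance(out, list):            # sub-lottery: recurse and rescale
            for q, x in flatten(out):
                probs[x] = probs.get(x, Fraction(0)) + p * q
        else:                                # atomic outcome: trivial lottery
            probs[out] = probs.get(out, Fraction(0)) + p
    return [(p, x) for x, p in sorted(probs.items())]

# The compound lottery from the slide, with illustrative probabilities:
p, q, qp = Fraction(1, 2), Fraction(1, 4), Fraction(1, 3)
compound = [(p, [(q, 'x1'), (1 - q, 'x2')]),
            (1 - p, [(qp, 'x3'), (1 - qp, 'x4')])]
# flatten gives [(pq, x1), (p(1-q), x2), ((1-p)q', x3), ((1-p)(1-q'), x4)]
```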
20
Implications of Properties on ≻
Since ≻ is transitive and connected: representable by an ordinal
value function V(x)
With constraints on lotteries: we can construct a utility
function U(l) ∈ ℝ s.t. U(l1) ≥ U(l2) iff l1 ≽ l2
• where U([ (p1,x1), …, (pn,xn) ]) = Σi pi U(xi)
• famous result of Ramsey, von Neumann & Morgenstern, Savage
• Exercise: prove existence of such a utility function
Thus knowing U(xi) for each outcome allows tradeoffs to
be made over uncertain courses of action (lotteries)
Principle of Maximum Expected Utility (MEU)
• the utility of a choice is the expected utility of its outcomes
• appropriate choice is that with maximum expected utility
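The MEU principle is easy to sketch in code. The utilities and action lotteries below are illustrative stand-ins for the coffee/OJ/tea example, not values from the slides:

```python
def expected_utility(lottery, u):
    """EU of a simple lottery [(p_i, x_i), ...] given outcome utilities u."""
    return sum(p * u[x] for p, x in lottery)

def meu_choice(choices, u):
    """MEU: pick the action whose outcome lottery has maximum expected utility."""
    return max(choices, key=lambda a: expected_utility(choices[a], u))

# Hypothetical utilities and action-outcome lotteries:
u = {'coffee': 1.0, 'OJ': 0.6, 'tea': 0.2}
choices = {'go_to_coffee_room': [(0.9, 'coffee'), (0.1, 'tea')],
           'go_to_vending_machine': [(1.0, 'OJ')]}
best = meu_choice(choices, u)   # coffee room: EU 0.92 vs 0.6 for the machine
```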
21
Some Discussion Points**
Utility function existence: proof is straighforward
Utility function for > over lotteries is not unique:
• any positive affine transformation of U induces same ordering >
• normalization in range [0,1] common
Ordinal preferences “easy” too elicit (if X small)
• cardinal utilities trickier for people: almost an “art form” in D.A.
Outcome space often factored: exponential size
• requires techniques of multiattribute utility theory MAUT
Expected utility accounts for risk: inherent in preferences
over lotteries
• see utility of money
22
Risk profiles and Utility of money**
What would you choose?
• (a) $100,000 or (b) [(.5, $200,000), (.5, 0) ]
• what if (b) was $250K, $300K, $400K, $1M; p = .6, .7, .9, .999, …
• generally, U(EMV(l)) > EU(l)  (EMV = expected monetary value)
Utility of money is nonlinear: e.g., U($100K) > .5U($200K)+.5U($0)
Certainty equivalent of l: the sure amount CE with U(CE) = EU(l), i.e., CE = U⁻¹(EU(l))
[Figure: concave utility-of-money curve over $0–$200K, marking U($0), EU(lottery), U($100K), U($200K). For many people, CE ≈ $40K. Note: the 2nd $100K is “worth less” than the 1st $100K]
23
Risk attitudes**
Risk Premium: EMV(l) – CE(l)
• how much of the EMV will I give up to remove the risk?
Risk averse:
• decision maker has positive risk premium; U(money) is concave
Risk neutral:
• decision maker has zero risk premium; U(money) is linear
Risk seeking:
• decision maker has negative risk premium; U(money) is convex
Most people are risk averse
• this explains insurance
• often risk seeking in negative range
• linear a good approx in small ranges
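The certainty equivalent and risk premium are straightforward to compute once a utility curve is fixed. A sketch assuming a concave u(m) = √m (one hypothetical risk-averse profile; a more concave u would bring the CE closer to the slide’s ≈$40K):

```python
import math

def certainty_equivalent(lottery, u, u_inv):
    """CE of a lottery: the sure amount with the same utility, U(CE) = EU(l)."""
    eu = sum(p * u(x) for p, x in lottery)
    return u_inv(eu)

# A risk-averse agent with concave utility u(m) = sqrt(m):
u, u_inv = math.sqrt, lambda v: v * v
lottery = [(0.5, 200_000), (0.5, 0)]

emv = sum(p * x for p, x in lottery)          # expected monetary value: 100,000
ce = certainty_equivalent(lottery, u, u_inv)  # 50,000 under sqrt utility
risk_premium = emv - ce                       # positive => risk averse
```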
24
St. Petersburg Paradox**
How much would you pay to play this game?
• A coin is tossed until it falls heads. If heads first occurs on the
Nth toss, you get $2^N
• Most people will pay about $2-$20
EMV = Σn≥1 (1/2^n) · 2^n = Σn≥1 1 = ∞
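The divergent EMV is easy to verify numerically, and a concave utility of money (here log utility, one classical resolution of the paradox) values the game at only a few dollars, consistent with the $2–$20 most people offer:

```python
import math

# P(first head on toss n) = 1/2**n, payoff $2**n, so each toss contributes $1:
def truncated_emv(max_tosses):
    return sum((1 / 2**n) * 2**n for n in range(1, max_tosses + 1))

# EMV grows without bound in the number of tosses allowed...
assert truncated_emv(100) == 100.0

# ...but a log-utility agent's certainty equivalent is about $4:
eu_log = sum((1 / 2**n) * math.log(2**n) for n in range(1, 200))
ce = math.exp(eu_log)   # exp(2 ln 2) = 4
```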
25
Allais’ Paradox**
Situation 1: choose either
• (1) $1M, Prob=1.00
• (2) $5M, Prob=0.10; $1M, Prob=0.89; nothing, Prob=0.01
Situation 2: choose either
• (3) $1M, Prob=0.11; nothing, Prob=0.89
• (4) $5M, Prob=0.10; nothing, Prob=0.90
Most people: (1) > (2) and (4) > (3)
Paradox: no way to assign utilities to monetary outcomes
that conforms to expected utility theory and the stated
preferences (violates substitutability)
• possible explanation: regret
…and the survey says**
Situation 1:
• (1)>(2): 11 (37%)
• (2)>(1): 19 (63%)
Situation 2:
• (3)>(4): 2 (7%)
• (4)>(3): 27 (93%)
26
Allais’ Paradox: The Paradox**
Situation 1: choose either
• (1) $1M, Prob=1.00
equiv: ($1M 0.89; $1M 0.11)
• (2) $5M, Prob=0.10; $1M, Prob=0.89; nothing, Prob=0.01
• So if (1)>(2), by subst: $1M > ($5M 10/11; nothing 1/11)
Situation 2: choose either
• (3) $1M, Prob=0.11; nothing, Prob=0.89
• (4) $5M, Prob=0.10; nothing, Prob=0.90
equiv: nothing 0.89; $5M 0.10; nothing 0.01
• So if (4)>(3), by subst: ($5M 10/11; nothing 1/11) > $1M
27
28
Ellsberg Paradox**
Urn with 30 red balls, 60 yellow or black balls; well mixed
Situation 1: choose either
• (1) $100 if you draw a red ball
• (2) $100 if you draw a black ball
Situation 2: choose either
• (3) $100 if you draw a red or yellow ball
• (4) $100 if you draw a black or yellow ball
Most people: (1) > (2) and (4) > (3)
Paradox: no way to assign utilities (all the same) and
beliefs about yellow/black proportions that conforms to
expected utility theory
• possible explanation: ambiguity aversion
29
Utility Representations
Utility function u: X → [0,1]
• decisions induce distribution over outcomes
• or we simply choose an outcome (no uncertainty), but with constraints on outcomes
If X is combinatorial, sequential, etc.:
• representing, eliciting u difficult in explicit form
Some structural form usually assumed:
• u parameterized compactly (weight vector w)
• e.g., linear/additive, generalized additive models
Representations for qualitative preferences, too
• e.g., CP-nets, TCP-nets, etc. [BBDHP03, BDS05]
Configuration Problems
• Configuration variables
X = {X1 … Xn}
• Constraints C over X
• Xf : set of feasible outcomes
• Utility function u: X → [0,1]: user’s strength of preference
• Optimal decision x* = argmax { u(x) : x ∈ Xf }
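For a small configuration problem the optimal decision can be found by brute force. A sketch with hypothetical domains, a hypothetical constraint, and illustrative additive utilities:

```python
import itertools

# Toy configuration problem (domains, constraint, and utilities are hypothetical):
domains = {'Color': ['red', 'blue'],
           'Doors': ['2', '4'],
           'Power': ['150', '280']}

def feasible(x):
    # constraint C: say, no red 2-door with 280hp exists
    return not (x['Color'] == 'red' and x['Doors'] == '2' and x['Power'] == '280')

def u(x):
    # an additive utility over the three attributes
    v = {'red': 1.0, 'blue': 0.7, '2': 1.0, '4': 0.8, '150': 0.0, '280': 0.7}
    return 0.2 * v[x['Color']] + 0.3 * v[x['Doors']] + 0.5 * v[x['Power']]

# Xf: feasible outcomes; x* = argmax { u(x) : x in Xf }
X_f = [dict(zip(domains, vals))
       for vals in itertools.product(*domains.values())
       if feasible(dict(zip(domains, vals)))]
x_star = max(X_f, key=u)
```

Real configuration problems replace this enumeration with CSP or MIP solvers, since Xf is exponentially large.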
30
[Figure: example configuration space.
Color: cherryRed, metallicBlue, grey
#Doors: Coupe/2, Sedan/4, Hatch/5, Wagon/5
Power: 150hp, 280hp, 350hp
Constraints: Power > 280hp → FuelCons > 8l/100km; Make=Lexus & Power > 200hp → AutoTrans; …
Flat utility vector: car1: 0.29, car2: 1.0, car3: 0.85, car4: 0.96, …]
31
Flat vs. Structured Utility Representation
Naïve representation: vector of values
• e.g., car7:1.0, car15:0.92, car3:0.85, …, car22:0.0
Impractical for combinatorial domains
• e.g., can’t enumerate exponentially many cars, nor expect user
to assess them all (choose among them)
Instead we try to exploit independence of user
preferences and utility for different attributes
• the relative preference/utility of one attribute is independent of
the value taken by (some) other attributes
Assume X ⊆ Dom(X1) × Dom(X2) × … × Dom(Xn)
• e.g., car7: Color=red, Doors=2, Power=320hp, LuggageCap=0.52m3
32
Preferential, Utility Independence**
X and Y = V-X are preferentially independent if:
• x1y1 ≽ x2y1 iff x1y2 ≽ x2y2 (for all x1, x2, y1, y2)
• e.g., Color: red>blue regardless of value of Doors, Power, LugCap
• conditional P.I. given set Z: definition is straightforward
X and Y = V-X are utility independent if:
• l1(X, y1) ≽ l2(X, y1) iff l1(X, y2) ≽ l2(X, y2) (for all y1, y2, all distr. l1, l2)
• e.g., preference for lottery(Red,Green,Blue) does not vary with
value of Doors, Power, LugCap
implies existence of a “utility” function over local (sub)outcomes
• conditional U.I. given set Z: definition is straightforward
33
Additive Utility Functions
Additive representations commonly used [KR76]
• breaks exponential dependence on number of attributes
• use sum of local utility functions ui over attributes
• or (normalized) local value functions vi plus scaling factors λi
This will make elicitation much easier
Color   u1      Drs     u2      Pwr    u3
red     1.0     2       1.0     350    1.0
blue    0.7     4       0.8     280    0.7
grey    0.0     hatch   0.2     150    0.0
                wag’n   0.0
λ1 = 0.2,  λ2 = 0.3,  λ3 = 0.5
u(red, 2dr, 280hp) = 0.2·1.0 + 0.3·1.0 + 0.5·0.7 = 0.85
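The additive computation above can be sketched directly, using the local value tables and scaling factors from this slide:

```python
def additive_utility(outcome, local_values, weights):
    """Additive model: u(x) = sum_i lambda_i * v_i(x_i)."""
    return sum(weights[attr] * local_values[attr][val]
               for attr, val in outcome.items())

# Local value functions and scaling factors from the slide:
v = {'Color': {'red': 1.0, 'blue': 0.7, 'grey': 0.0},
     'Doors': {'2': 1.0, '4': 0.8, 'hatch': 0.2, 'wagon': 0.0},
     'Power': {'350': 1.0, '280': 0.7, '150': 0.0}}
lam = {'Color': 0.2, 'Doors': 0.3, 'Power': 0.5}

u = additive_utility({'Color': 'red', 'Doors': '2', 'Power': '280'}, v, lam)
# 0.2*1.0 + 0.3*1.0 + 0.5*0.7 = 0.85, matching the slide
```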
34
Additive Utility Functions
An additive representation of u exists iff decision maker is
indifferent between any two lotteries where the marginals
over each attribute are identical
• l1(X) ~ l2(X) whenever l1(Xi) = l2(Xi) for all Xi
35
Generalized Additive Utility
Generalized additive models are more flexible: “interdependent value additivity” [Fishburn67], GAI [BG95]
• assume (overlapping) set of m subsets of vars X[j]
• use sum of local utility functions uj over attributes
This will make elicitation much easier
Color   Drs     u1      Pwr    Drs    u2
red     2       1.0     350    2      1.0
blue    4       0.9     350    4      0.7
red     4       0.6     280    2      0.65
blue    2       0.4     280    4      0.55
λ1 = 0.4,  λ2 = 0.6
u(red, 2dr, 280hp) = 0.4·1.0 + 0.6·0.65 = 0.79
36
GAI Utility Functions
A GAI representation of u exists iff decision maker is
indifferent between any two lotteries where the marginals
over each factor are identical
• l1(X) ~ l2(X) whenever l1(X[i]) = l2(X[i]) for all i
37
Basic Elicitation: Flat Representation
“Typical” approach to assessment
• normalization: set best outcome utility to 1.0; worst to 0.0
• standard gamble queries: ask user for the probability p at which
indifference holds between x and SG(p) = [(p, best), (1-p, worst)]
• e.g., car3 ~ <0.85, car7; 0.15, car22 >
38
Basic Elicitation: Flat Representation
SG queries: require precise numerical assessments
Bound queries: fix p, ask if x preferred to SG(p)
• yes/no response: places (lower/upper) bound on utility
• easier to answer, much less info (narrows down interval)
Simple binary search can be used to identify utility u(x)
• but how much precision is needed in practice?
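The binary search over bound queries can be sketched as follows, with a simulated user standing in for the real respondent (the true utility 0.85 and precision are illustrative):

```python
def elicit_utility(answer, precision=0.01):
    """Locate u(x) in [0,1] by binary search with bound queries.
    `answer(p)` plays the user's role: returns True iff x is preferred
    to the standard gamble [(p, best), (1-p, worst)], i.e. iff u(x) >= p."""
    lo, hi = 0.0, 1.0
    queries = 0
    while hi - lo > precision:
        p = (lo + hi) / 2
        if answer(p):       # "yes" raises the lower bound
            lo = p
        else:               # "no" lowers the upper bound
            hi = p
        queries += 1
    return (lo + hi) / 2, queries

# Simulated user whose true u(x) = 0.85:
est, n = elicit_utility(lambda p: 0.85 >= p)
# 7 yes/no queries suffice for 0.01 precision, since 2**-7 < 0.01
```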
39
Elicitation: Additive Models (Classical)
First: assess local value functions with local SG queries
• calibrates on [0,1]
For instance,
• ask for best value of Color (say, red ), worst value (say, grey)
• then ask a local standard gamble for each remaining Color to
assess its local value
blue ~ <0.85, red; 0.15, grey >
green ~ <0.67, red; 0.33, grey >, …
Bound queries can be asked as well
• only refine intervals on local utility
40
Elicitation: Additive Models
Second: assess scaling factors with “global” queries
• define reference outcome
e.g., could be best global outcome, or any salient outcome
e.g., user’s current car: (red, 2door, 150hp, 0.35m3)
• define the outcome obtained by setting Xj to its best value, others to reference values
e.g., for doors: (red, 4door, 150hp, 0.35m3)
• compute scaling factor
• assess the 2n utility values with (global) SG queries
Altogether: gives us full utility function
41
Elicitation: GAI Models (Classical)
Assessment is subtle (won’t get into gory details)
• overlap of factors a key issue [F67,GP04,DB05]
• cannot rely on purely local queries: values cannot be fixed without
reference to others!
• seemingly “different” local prefs correspond to same u
u(Color,Doors,Power) = u1(Color,Doors) + u2(Doors,Power)
u(red,2door,280hp) = u1(red,2door) + u2(2door,280hp)
u(red,4door,280hp) = u1 (red,4door) + u2(4door,280hp)
[Figure: example utility values illustrating that different local decompositions (e.g., 6 + 4 vs. 9 + 1) can represent the same global utilities]
42
Local Queries [Braziunas, Boutilier UAI05]
We wish to avoid queries on whole outcomes
• can’t be purely local; but condition on a subset of reference values
Conditioning set Ci for factor ui(Xi) :
• vars (excl. Xi) in any factor uk(Xk) where Xi ∩ Xk ≠ ∅
• setting Ci to reference values renders Xi independent of remaining
variables
e.g., Power=280hp shields <Color,Door> from any other vars
• Define local best/worst for ui assuming Ci set at reference levels
• Ask SG queries relative to local best/worst with Ci fixed
e.g., fix Power=280hp and ask SG queries on <Color,Door>
conditioned on 280hp
43
Local Queries [BB05]**
Theorem: If for some y (where Y =X - Xi - C(Xi) )
then for all y’
Hence we can legitimately ask local queries:
Conditioning Sets**
44
[Figure: GAI factor graph over factors BCD, ABC, FGH, EF, DE, FGJ, annotated with conditioning sets fixed at reference values: AE = a0e0, BCF = b0c0f0, D = d0, EH = e0h0, DGHJ = d0g0h0j0, EJ = e0j0]
45
Local Standard Gamble Queries*
Local standard gamble queries for each factor
• use “best” and “worst” local outcomes x⊤[i], x⊥[i]―conditioned on default
values of conditioning set
e.g., x⊤[1] = abcd0 for factor ABC; x⊥[1] = ~(abc)d0
• SG queries on other parameters relative to these
• gives local value function v(x[i]) (e.g., v(ABC) )
Can use bound queries as well
But local VFs not enough: must calibrate
• requires global scaling
46
Global Scaling*
Assess scaling factors with “global” queries
• exactly as with additive models
• define reference outcome
• define the outcome obtained by setting X[j] to its best value, others to reference values
• compute scaling factor
• assess the 2n utility values with (global) SG queries
• can use bound queries as well
47
Elicitation: Beyond the Classical View
The classic view involving standard gambles is difficult:
• large number of parameters to assess (structure helps)
• unreasonable precision required (SGQs)
• queries over full outcomes difficult (structure helps)
• cost (cognitive, communication, computational, revelation) may
outweigh benefit
can often make optimal decisions without full utility information
General approach to practical, automated elicitation
• cognitively plausible forms of interaction
• incremental elicitation until decision possible that is good enough
• collaborative models to allow generalization across users
48
Beyond Standard Gamble Queries
Bound queries
• a boolean version of a (global/local) SG query
• global: “Do you prefer x to [(p, x⊤), (1-p, x⊥)]?”
• local: “Do you prefer x[k] to [(p, x⊤[k]), (1-p, x⊥[k])]?”
need to fix reference values Ck if using GAI model
• response tightens bound on specific utility parameter
Comparison queries (is x preferred to x’ ?)
• global: “Do you prefer x to x’?”
• local: “Do you prefer x[k] to x’[k] ?”
• impose linear constraints on parameters
Σk uk(x[k]) > Σk uk(x′[k])
• interpretation is straightforward
49
Other Modes of Interaction
Stated choice (global or local)
• choose xi from set {x1, …, xk}
• imposes k−1 linear constraints on utility parameters
Ranking alternatives (global or local)
• order set {x1, …, xk}: similar
Graphical manipulation of parameters
• bound queries: allow tightening of bound (user controlled)
generally must show implications of moves made
• approximate valuations: user-controlled precision
useful in quasi-linear settings
Passive observation/revealed preference
• if choice x is made in context c, then x is at least as preferred as all
other available alternatives
Active, but indirect assessment
• e.g., dynamically generate Web page, with k links
• assume response model: Pr(linkj | u)
Global Comparison Query (GCQ)
50
Local sorting
51
[Screenshot: local sorting interface showing factor attributes and reference values (local context)]
Anchor bound query (ABQ)
52
[Screenshot: ABQ interface showing reference values and factor attributes]
Local Bound Query (LBQ)
53
[Screenshot: LBQ interface with reference values (local context), a 0–100 scale with bounds 0 and 100, current value 70, and Bin 2 spanning 0–70]
54
A General Framework for Elicitation and
Interactive Decision Making
B: beliefs about user’s utility function u
Opt(B): “optimal” decision given incomplete, noisy, and/or imprecise beliefs about u
Repeat until B meets some termination condition:
• ask user some query (propose some interaction) q
• observe user response r
• update B given r
Return/recommend Opt(B)
Will discuss this in depth over the rest of course
• think about this: what are some appropriate termination criteria?
55
Cognitive Biases: Anchoring**
Decision makers susceptible to context in assessing
preferences (and other relevant info, like probabilities)
Anchoring: assessment of utility dependent on arbitrary
influences
Classic experiment [ALP03]:
• (business execs) write last 2 digits of SSN on piece of paper
• place bids in mock auction for wine, chocolate
• those with SSN>50 submitted bids 60-120% higher than SSN<50
Often explained by focus of attention plus adjustment
• holds for estimation of probabilities (Tversky, Kahneman estimate
of # African countries), numerical quantities, …
How should this impact the design of elicitation methods?
56
Cognitive Biases: Framing**
How questions are framed is critical
Classic Tversky, Kahneman experiment (1981); disease predicted to
kill 600 people, choose vaccination program
• Choose between:
Program A: "200 people will be saved"
Program B: "there is a one-third probability that 600 people will be
saved, and a two-thirds probability that no people will be saved"
• Choose between:
Program C: "400 people will die"
Program D: "there is a one-third probability that nobody will die, and
a two-thirds probability that 600 people will die"
• 72 percent prefer A over B; 78 percent prefer D over C
• Notice that A and C are equivalent, as are B and D
How should this impact design of elicitation schemes?
57
Cognitive Biases: Endowment Effect**
People become “attached” to their possessions
• e.g., experiment of Kahneman, et al. 1990
Randomly assign subjects as buyers, sellers
• sellers given a coffee mug (sells for $6); all can examine closely
• sellers asked: “at what price would you sell?”
• buyers asked: “at what price would you buy?”
• median asking price: $5.79; median offer price: $2.25
would expect these to be identical given random assignment to groups
• if sellers are given tokens with a monetary value (can be used later
to buy mugs/chocolate in bookstore), no difference between offers
and ask prices
How should this impact the design of elicitation methods?
References
P. C. Fishburn. Interdependence and additivity in multivariate, unidimensional expected utility theory. International Economic Review, 8:335–342, 1967.
P. C. Fishburn. Utility Theory for Decision Making. Wiley, New York, 1970.
R. L. Keeney and H. Raiffa. Decisions with Multiple Objectives: Preferences and Value Trade-offs. Wiley, New York, 1976.
F. Bacchus and A. Grove. Graphical models for preference and utility. In Proc. of UAI-95, pp. 3–10, 1995.
J. von Neumann and O. Morgenstern. Theory of Games and Economic Behavior. Princeton University Press, Princeton, 1944.
L. Savage. The Foundations of Statistics. Wiley, New York, 1954.
C. Gonzales and P. Perny. GAI networks for utility elicitation. In Proc. of KR-04, pp. 224–234, Whistler, BC, 2004.
D. Braziunas and C. Boutilier. Local utility elicitation in GAI models. In Proc. of UAI-05, pp. 42–49, Edinburgh, 2005.
D. Braziunas and C. Boutilier. Minimax regret based elicitation of generalized additive utilities. In Proc. of UAI-07, 2007.
D. Kahneman, J. L. Knetsch, and R. H. Thaler. Experimental tests of the endowment effect and the Coase theorem. Journal of Political Economy, 98(6), 1990.
A. Tversky and D. Kahneman. The framing of decisions and the psychology of choice. Science, 211, 1981.
D. Ariely, G. Loewenstein, and D. Prelec. Coherent arbitrariness: Stable demand curves without stable preferences. Quarterly Journal of Economics, 118(1):73–105, 2003.
58
59
Fishburn’s Decomposition [F67]
Define reference outcome:
For any x, let x[I] be restriction of x to vars I, with
remaining replaced by reference values:
Utility of x can be written [Fishburn67]
• sum of utilities of certain related “key” outcomes
60
Key Outcome Decomposition
Example: GAI over I={ABC}, J={BCD}, K={DE}
u(x) = u(x[I]) + u(x[J]) + u(x[K])
     − u(x[I∩J]) − u(x[I∩K]) − u(x[J∩K])
     + u(x[I∩J∩K])
u(abcde) = u(x[abc]) + u(x[bcd]) + u(x[de])
        − u(x[bc]) − u(x[∅]) − u(x[d])
        + u(x[∅])
(here I∩K = ∅, and the two u(x[∅]) terms cancel)
u(abcde) = u(abcd0e0) + u(a0bcde0) + u(a0b0c0de)
- u(a0bcd0e0) - u(a0b0c0de0)
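The key-outcome decomposition can be verified numerically. A sketch using hypothetical random factor tables for a GAI utility over binary variables A..E with factors I={ABC}, J={BCD}, K={DE}:

```python
import itertools
import random

random.seed(0)
# Hypothetical local utility tables for factors ABC, BCD, DE:
u1 = {k: random.random() for k in itertools.product((0, 1), repeat=3)}
u2 = {k: random.random() for k in itertools.product((0, 1), repeat=3)}
u3 = {k: random.random() for k in itertools.product((0, 1), repeat=2)}

def u(x):
    a, b, c, d, e = x
    return u1[(a, b, c)] + u2[(b, c, d)] + u3[(d, e)]

REF = (0, 0, 0, 0, 0)   # reference outcome x0 = a0 b0 c0 d0 e0

def proj(x, keep):
    """x[I]: keep vars at positions in `keep`, reset the rest to reference values."""
    return tuple(x[i] if i in keep else REF[i] for i in range(5))

def key_outcome_decomposition(x):
    # u(x[I]) + u(x[J]) + u(x[K]) - u(x[I∩J]) - u(x[J∩K]); the empty-intersection
    # terms u(x[∅]) cancel, as on the slide (A..E are positions 0..4)
    return (u(proj(x, {0, 1, 2})) + u(proj(x, {1, 2, 3})) + u(proj(x, {3, 4}))
            - u(proj(x, {1, 2})) - u(proj(x, {3})))

# The decomposition reproduces u(x) exactly on every outcome:
for x in itertools.product((0, 1), repeat=5):
    assert abs(u(x) - key_outcome_decomposition(x)) < 1e-9
```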
61
Canonical Decomposition [F67]
This leads to a canonical decomposition of u; e.g., for I={ABC}, J={BCD}, K={DE}:
u(abcde) = u(abcd0e0)
         + u(a0bcde0) − u(a0bcd0e0)
         + u(a0b0c0de) − u(a0b0c0de0)
i.e., u(abcde) = u1(abc) + u2(bcd) + u3(de), with
u1(abc) = u(abcd0e0),  u2(bcd) = u(a0bcde0) − u(a0bcd0e0),  u3(de) = u(a0b0c0de) − u(a0b0c0de0)
62
Local Queries: Comparison*
63
Local Query: Bound*
64
Local Query: Bound*
65
Global Query: Anchor Comparison*
66
Global Query: Anchor Bound*
67
Weight-bound Manipulation*
Exploit feedback to encourage sharper bounds
Display implications of bound refinement on pairwise max regret between the minimax-optimal allocation and the adversary’s choice
Need real-time updates, so can’t compute a new minimax regret at each step
1
Preference Elicitation in Multiagent Domains:
Mechanism Design, Voting and Stable Matching
Craig Boutilier
Department of Computer Science
University of Toronto
2
Overview
Section 1: Decision Theory and Basics of Preference Elicitation
Section 2: Regret-based and Polyhedral Methods
• computational motivations, imprecise utility functions
• minimax regret
• polyhedral conjoint analysis, volumetric methods (if time)
[May skip due to time] Section 3: Elicitation in Mechanism Design
• basics of mechanism design, incentives, VCG
• partial revelation mechanisms
Section 4: Elicitation in Voting
• basics of social choice, voting
• minimax regret and incremental vote elicitation
Section 5: Elicitation in Stable Matching
• basics of stable matching problems
• preference elicitation in stable matching
3
A General Framework for Elicitation and
Interactive Decision Making
B: beliefs about user’s utility function u
Opt(B): “optimal” decision given incomplete, noisy, and/or imprecise
beliefs about u
Repeat until B meets some termination condition
• ask user some query (propose some interaction) q
• observe user response r
• update B given r
Return/recommend Opt(B)
4
Utility Function Uncertainty
General approaches to representation/decisions
• Strict uncertainty models
nonprobabilistic (regret-based)
probabilistic (non-Bayesian)
• Bayesian models
Key components
• decision criterion: what decision given beliefs B?
requires effective inference (error metrics)
• effective update (function of queries/interaction)
• elicitation strategy: which query to ask next?
• termination condition: when is decision “good enough”
5
Strict Utility Function Uncertainty
User’s utility parameters w unknown
• u(x; w) linear in utility parameters w
Assume feasible set W
• W defined by linear constraints on w
• Polytope induced by query responses
How should one make a decision? elicit info?
• regret-based approaches
• polyhedral approaches (and other heuristics)
W: polytope of feasible utility parameters, e.g.,
  u1(red) + u2(4dr) + u3(150hp) > 0.4
  u1(blue) + u2(4dr) + u3(280hp) > u1(red) + u2(4dr) + u3(150hp)
6
• Regret of x under w:
  R(x, w) = max_{x′ ∈ Xf} u(x′; w) − u(x; w)
• Max regret of x under W:
  MR(x, W) = max_{w ∈ W} R(x, w)
• Minimax regret; optimal option:
  MMR(W) = min_{x ∈ Xf} MR(x, W);  x*_W = argmin_{x ∈ Xf} MR(x, W)
[Figure: polytope W; option x incurs regret R(x, w) against x′ at w and R(x, w′) at w′; the minimax-optimal x* incurs R(x*, w′) against x′]
Minimax Regret
7
Minimax Regret: An Example
Simple example to contrast minimax regret with maximin
Maximin recommends D3 (too cautious?)
MMR recommends D2
• might be worse than D3, but never by more than a little

        U1   U2   U3   Min   MR
D1      8    2    1    1     5
D2      7    7    1    1     1
D3      2    2    2    2     6
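The table’s regret values can be recomputed directly (here the three utility functions U1–U3 stand in for the feasible set W):

```python
# Utility of each decision under the three feasible utility functions U1-U3:
U = {'D1': [8, 2, 1], 'D2': [7, 7, 1], 'D3': [2, 2, 2]}

def max_regret(x):
    """MR(x, W): worst-case loss vs. the best alternative, over w in {U1, U2, U3}."""
    return max(max(U[xp][w] for xp in U) - U[x][w] for w in range(3))

mr = {x: max_regret(x) for x in U}                  # {'D1': 5, 'D2': 1, 'D3': 6}
mmr_choice = min(U, key=max_regret)                 # 'D2': minimax regret of 1
maximin_choice = max(U, key=lambda x: min(U[x]))    # 'D3': the cautious pick
```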
8
Why Minimax Regret?
Minimizes regret in presence of adversary
• provides a bound on worst-case loss (cf. maximin)
• robustness in the face of utility function uncertainty
In contrast to Bayesian methods:
• useful when priors not readily available
• can be more tractable; see [CKP00/02, Bou02]
• user unwilling to “leave money on the table” [BSS04]
• preference aggregation settings [BSS04]
• effective elicitation even if priors available [WB03]
9
Example Domains
Product configuration: GAI models
• product constraints (feasibility: CSP)
• product database (feasibility: elements of DB)
Winner determination in procurement: additive
• feasibility: solution to combinatorial allocation problem
Resource allocation in autonomic computing
• utility function can only be sampled
• continuous action/outcome space
• sequential (MDP) extensions
Travel planning, mechanism design (auctions and
bargaining), social choice, etc.
10
Computing Minimax Regret
Difficulties computing minimax regret:
• minimax (integer) program with quadratic objective
General approach:
• Benders decomposition and constraint generation to break up the minimax program
• various encoding tricks to linearize quadratic terms
  details and formulation depend on domain
MMR(W) = min_{x ∈ Xf} max_{x′ ∈ Xf} max_{w ∈ W} (w·x′ − w·x)
Convert MMR to a (linear) IP with infinitely many constraints,
then simplify to a linear IP with finitely many constraints
• Here: V(W) are the vertices of polytope W and x*(w) is the optimal
configuration for utility function w
Still (potentially) exponentially many constraints
11
min_{x ∈ Xf, δ} δ   s.t.   δ ≥ Σi wi x′i − Σi wi xi,   ∀ x′ ∈ Xf, w ∈ W
min_{x ∈ Xf, δ} δ   s.t.   δ ≥ w·x*(w) − w·x,   ∀ w ∈ V(W)
MMR: Benders Reformulation
Repeatedly solve
  min_{x ∈ Xf, δ} δ   s.t.   δ ≥ Σi wi x′i − Σi wi xi,   ∀ (x′, w) ∈ Gen
• Let solution be x* with objective value δ*
Compute MR(x*, W) of solution x*: MR = r, with witness (x′′, w′′)
If r > δ*, add (x′′, w′′) to Gen and repeat; else terminate
  note: (x′′, w′′) is the maximally violated constraint
MMR: Constraint Generation
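The constraint-generation loop can be sketched over a finite option set, with an enumerated vertex set standing in for V(W); real implementations solve the master and subproblem as MIPs rather than by enumeration:

```python
def solve_mmr_by_generation(X, vertices, u, eps=1e-9):
    """Constraint generation for minimax regret over a finite option set X.
    `vertices` enumerates the extreme points of the utility polytope W;
    u(x, w) gives the utility of option x under utility function w."""
    gen = [(X[0], vertices[0])]                         # seed constraint set Gen
    while True:
        # Master problem: min over x of the regret implied by Gen alone
        x_star = min(X, key=lambda x: max(u(xp, w) - u(x, w) for xp, w in gen))
        delta = max(u(xp, w) - u(x_star, w) for xp, w in gen)
        # Subproblem: exact max regret MR(x*, W), with witness (x'', w'')
        r, witness = max(((u(xp, w) - u(x_star, w), (xp, w))
                          for xp in X for w in vertices), key=lambda t: t[0])
        if r <= delta + eps:                            # no violated constraint left
            return x_star, r
        gen.append(witness)                             # add maximally violated one

# On the D1-D3 example from the earlier minimax-regret slide:
table = {'D1': [8, 2, 1], 'D2': [7, 7, 1], 'D3': [2, 2, 2]}
x_opt, mmr = solve_mmr_by_generation(
    list(table), [0, 1, 2], lambda x, w: table[x][w])
# x_opt = 'D2' with minimax regret 1
```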
13
Computing Max Regret*
Objective is naturally quadratic
Since factor attribute instantiations are discrete:
• quadratic terms: products of binary and continuous vars
• linearized by introducing auxiliary variables: induces a (linear) MIP
14
MMR Solution Time: Real estate (20vars, 2-5
values/var, 29 GAI factors,160 parameters)
15
Random Problems (Varying Size)
Randomly generated problems
• for fixed number of vars (up to 5 values per var), factors; randomly set
vars within factors and the utility bounds for each parameter
• running time grows exponentially with problem size
• number of generated constraints stays small (47 avg. for 30 vars)
16
Anytime Performance
(*Key to Good Elicitation Performance)
17
Application to GAI Models
Similar techniques can be used as with linear models
• but details vary somewhat
Model is especially simple if only local bound queries
ignoring conditioning sets [BPPS03; BPPS06]
• justifiable if compared to common numeraire (money)
For computation with scaling factors and conditioning
variables, see [BB07]
18
Other Generalizations*
Formulations are much simpler with upper and lower
bounds on weights (hyper-rectangular W)
Nonlinearly definable features
Anytime speed ups
• use approximate solutions (small duality gap) in CG
• use approximate solutions for query generation
e.g. early termination of CG
• warm start (see elicitation)
Nonlinear utility for specific features
19
Regret-based Elicitation [BSS04, BPPS-05,06]
Minimax optimal solution may not be satisfactory
Improve quality by asking queries
• new bounds on utility model parameters
Which queries to ask?
• what will reduce regret most quickly?
• myopically? sequentially?
Closed form solution seems infeasible for sequential case
• to date we’ve looked at heuristic elicitation
• computing myopically optimal queries is often feasible, but
heuristics are cheaper and seem to work as well
20
Query Types
Recall query types (both local and global variants)
• Comparison queries (is x preferred to x’ ?)
Σk fk(x[k]) > Σk fk(x′[k])
global or local (with conditioning set fixed)
• Bound queries (is fk(x[k]) > v ?)
global or local (with conditioning set fixed)
Each imposes linear constraints
21
Elicitation Strategies (Bound): Simple GAI
Halve Largest Gap (HLG)
• ask if parameter with largest gap > midpoint
• MMR(U) ≤ maxgap(U); hence n·log(maxgap(U)/ε) queries
needed to reduce regret to ε
• bound is tight
• like polyhedral-based conjoint analysis [THS04]
[Figure: upper/lower bound intervals for utility parameters f1(a,b), …, f2(b,c), …]
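The HLG strategy can be sketched over a set of parameter intervals; the bounds below are hypothetical:

```python
def halve_largest_gap(intervals):
    """HLG: pick the parameter with the widest [lo, hi] interval and
    query whether it exceeds the interval's midpoint."""
    param = max(intervals, key=lambda k: intervals[k][1] - intervals[k][0])
    lo, hi = intervals[param]
    return param, (lo + hi) / 2

def update(intervals, param, midpoint, answer_yes):
    """A yes answer raises the lower bound; a no answer lowers the upper bound."""
    lo, hi = intervals[param]
    intervals[param] = (midpoint, hi) if answer_yes else (lo, midpoint)

# Hypothetical bounds on two GAI parameters:
bounds = {'f1(a,b)': (0.2, 0.9), 'f2(b,c)': (0.4, 0.6)}
param, mid = halve_largest_gap(bounds)   # f1(a,b) has the larger gap; mid = 0.55
update(bounds, param, mid, answer_yes=True)
```

Each answer halves the chosen interval, which is what yields the logarithmic query bound above.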
22
Elicitation Strategies (Bound): Simple GAI
Current Solution (CS)
• only ask about parameters of the optimal solution x* or the regret-maximizing witness xw
• intuition: focus on parameters that contribute to regret
reducing u.b. on xw or increasing l.b. on x* helps
• use early stopping to get regret bounds (CS-5sec)
[Figure: upper/lower bound intervals for utility parameters f1(a,b), …, f2(b,c), …]
23
Elicitation Strategies (Bound): Simple GAI**
Optimistic
• query largest-gap parameter in optimistic soln xo
Pessimistic
• query largest-gap parameter in pessimistic soln xp
Optimistic-pessimistic (OP)
• query largest-gap parameter in xo or xp
Most uncertain state (MUS)
• query largest-gap parameter in most uncertain soln xmu
CS needs minimax optimization; HLG needs no optimization; others require standard optimization
None except CS knows what MMR is (termination is problematic)
24
Results (Small Rand, Unif)
10 vars, ≤ 5 values each; 10 factors, at most 3 vars. Users drawn using uniform prior over parameters (45 trials); Gaussian priors similar.
25
Results (Car Rental, Unif)
26 vars; 61 billion configs. 36 factors, at most 5 vars; 150 parameters. Users drawn using uniform prior over parameters (45 trials); Gaussian priors similar.
26
Results (Real Estate, Unif)
20 vars; 47 million configs. 29 factors, at most 5 vars; 100 parameters. Users drawn using uniform prior over parameters (45 trials); Gaussian priors similar.
27
Results (Large Rand, Unif)
25 vars, ≤ 5 values each; 20 factors, at most 3 vars. Users drawn using uniform prior over parameters (45 trials); Gaussian priors similar.
Elicitation Strategies (Comparison)
Comparison queries can be generated using CSS too
• HLG is harder to generalize to comparisons (see polyhedral)
CSS: ask user to compare minimax optimal solution x*
with regret-maximizing witness xw
easy to prove this query is never “vacuous”
28
29
Summary of Results
CS works best on test problems
• time bounds (CS-5): little impact on query quality
• always know max regret (or bound) on solution
• time bound adjustable (use bounds, not time)
OP competitive on most problems
• computationally faster (e.g., 0.1s vs 14s on RealEst)
• no regret computed so termination decisions harder
Other strategies less promising (incl. HLG)
30
Interpretation
Provable regret reduced very quickly
• true regret faster (often to optimality)
• CS focuses on relevant parameters
Seems like a lot of bound queries
• problems very large (several hundred dimensions)
• several hundred queries quite reasonable for high stakes
domains (cf. manual scenario navigation!)
Comparison queries can work much better empirically
(see CA results [BSS04])
Apartment Search with Minimax Regret
Are users comfortable with minimax regret?
Study with UofT students
• search subset of student housing DB (100 apts) for rental
• GAI model over 9 variables, 7 factors
• queries generated using CSS (bound, anchor, local, global)
various conditions: GAI, no context, additive
continue until MMR=0 or user terminates (“happy”)
• post-search: let user search through entire DB to find best 10 or
so apartments
31
32
Apartment Search (DB= 100, 9 attr, 6 factors)
User Study Goals
Aim: test the efficacy, comprehension, acceptability of
MMR-based recommendation
• User comments and ratings
• Decision quality and time to decision
Evaluate
• GAI model vs. additive model
• significance of local context
• different query strategies (mix of types vs. GCQP)
Assess query costs • Time
• Perceived difficulty
33
User study design
40 participants, randomly assigned to 6 different
subgroups
Task 1: search for an apartment in Toronto using UTPref
to search through a DB of 100 apartments
Task 2: evaluate recommended apartment and UTPref
experience
34
Recommended apartment
“Final list” of highest-rated and low-regret apartments
Results: overall evaluation
46
(1 = strongly disagree, 7 = strongly agree)        Average  Median
I found this application easy to use                  6.35      7
I am satisfied with recommendation                    5.35      5
I fully understood all questions                      6.30      6
Some questions were too hard                          1.65      2
The task took too much time                           2.23      2
Results: recommendation quality
47
Results: time
UTPref recommendation process
• Average 481s (scalable)
Linear search through database (100 appts.) with
“support”
• Average 708s (not scalable)
• have familiarity from phase 1
48
Results: comparison of subgroups
GAI vs. additive models
• GAI performs better (rank, qrank) [no statistical significance]
Mix of queries vs. global comparison only
• GCQP is faster
• mix of queries: better quality results [no statistical significance]
• GAI with and without local context
• no detectable difference
49
Summary of Results
Qualitative Results:
• system-recommended apartment almost always in top ten
• if MMR-apartment not top ranked, error (how much more is top
apartment worth) tends to be very small : median $45
• very few queries/interactions needed (8-40); time taken roughly
1/3 of that of searching through DB with our tools
• user feedback: comfortable with queries, MMR, felt search was
efficient
50
51
Regret-Based Methods: Summary
Minimax regret is a valuable means to make decisions on
behalf of others in the presence of utility function uncertainty
• requires no prior over utility function
Computationally effective means to solve many
interesting classes of problems
• works well compared to ACA/PACE methods
Heuristic elicitation methods appear to work well
52
How to improve?
GAI, MMR have some nice properties
• does not assume restrictive additive model
• all queries are semantically sound/motivated
• MMR drives elicitation effectively, offers guarantees
prior-free, but defaults/priors could be exploited
• flexible forms of interaction (costs can be used)
So how do we move from this to the “vision” of the truly
intelligent agents (e.g., travel agent, real estate agent)?
References F. Bacchus and A. Grove. Graphical models for preference and utility. UAI-95, pp.3–10, 1995.
A. Ben-Tal, A. Nemirovski. Robust solutions of uncertain linear programs. Operations Res. Letters, 25:1–13, 1999.
C. Boutilier, F. Bacchus, and R. I. Brafman. UCP-Networks: A directed graphical representation of conditional
utilities. In Proc. of UAI-01, pp.56–64, Seattle, 2001.
C. Boutilier, R. Das, J. O. Kephart, G. Tesauro and W. E. Walsh. Cooperative Negotiation in Autonomic Systems
using Incremental Utility Elicitation. UAI-03, Acapulco, pp.89–97 (2003).
C. Boutilier, R. Patrascu, P. Poupart, and D. Schuurmans. Constraint-based optimization and utility elicitation using
the minimax decision criterion. Artificial Intelligence, 170(8–9):686–713, 2006.
C. Boutilier, T. Sandholm, and R. Shields. Eliciting bid taker non-price preferences in (combinatorial) auctions. In
Proc. of AAAI-04, pp.204–211, San Jose, CA, 2004.
D. Braziunas and C. Boutilier. Minimax regret based elicitation of generalized additive utilities. UAI-07, 2007
D. Braziunas and C. Boutilier. Assessing Regret-based Preference Elicitation with the UTPREF Recommendation
System. In Proc. of ACM EC-10, 2010.
Vijay S. Iyengar, Jon Lee, and Murray Campbell. Q-Eval: Evaluating multiple attribute items using queries. Third
ACM Conference on Electronic Commerce, pages 144–153, Tampa, FL, 2001.
S. Ghosh, J. Kalagnanam. Polyhedral Sampling for Multiattribute Preference Elicitation. Fifth ACM Conference on
Electronic Commerce, pages 256-257, San Diego, 2003.
P. Kouvelis and G. Yu. Robust Discrete Optimization and Its Applications. Kluwer, Dordrecht, 1997.
R. Patrascu, C. Boutilier, R. Das, J. O. Kephart, G. Tesauro and W. E. Walsh. New Approaches to Optimization and
Utility Elicitation in Autonomic Computing. AAAI-05, pp.140–145, Pittsburgh (2005).
A. Salo and R. P. Hämäläinen. Preference ratios in multiattribute evaluation (PRIME)–elicitation and decision
procedures under incomplete information. IEEE Trans. on Systems, Man and Cybernetics, 31(6):533–545, 2001.
L. Savage. The Foundations of Statistics. Wiley, NY, 1954.
Olivier Toubia, John Hauser, and Duncan Simester. Polyhedral methods for adaptive choice-based conjoint
analysis. Technical Report 4285-03, Sloan School of Management, MIT, Cambridge, 2003.
T. Wang and C. Boutilier. Incremental utility elicitation with the minimax regret decision criterion. In Proc. of IJCAI-
03, pp.309–316, Acapulco, 2003.
53
54
Decision Problem: Constraint Optimization
Standard constraint satisfaction problem (CSP):
• outcomes over variables X = {X1 … Xn}
• constraints C over X : feasible decisions/outcomes
generally compact, e.g., X1 ∧ X2 ⇒ ¬X3
e.g., Power > 280hp ∧ Make = BMW ⇒ FuelEff > 9.5l/100km
e.g., Volume(Supplier27) > $10,000,000
Feasible solution: a satisfying variable assignment
Constraint-based/combinatorial optimization:
• add to C a utility function u: Dom(X) → R / [0,1]
• u parameterized compactly (weight vector w)
e.g., linear/additive, generalized additive models
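At toy scale, this setup can be brute-forced directly; the boolean encoding, predicate constraints and additive utility in this sketch are illustrative simplifications:

```python
from itertools import product

# Toy constraint-based optimization: boolean variables, constraints as
# predicates over an assignment tuple, additive (linear) utility.

def best_feasible(n_vars, constraints, weights):
    """Enumerate assignments; return the feasible one with max utility."""
    best, best_u = None, float("-inf")
    for x in product([0, 1], repeat=n_vars):
        if all(c(x) for c in constraints):
            u = sum(w * xi for w, xi in zip(weights, x))
            if u > best_u:
                best, best_u = x, u
    return best, best_u

# e.g., the constraint X1 & X2 => not X3
cons = [lambda x: not (x[0] and x[1]) or not x[2]]
```

Real problems use CP/IP solvers rather than enumeration, but the objective structure is the same.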
55
Polyhedral Conjoint Analysis* [THS04]
CA approach to marketing, product design
• often “unconstrained” design
• adaptive CA: queries depend on previous responses
Polyhedral adaptive method: FastPACE [THS04]
Assume additive utility, discrete attributes
Query types (global outcomes)
• Choice scenarios: pick/rank from a list
• Metric paired comparisons
A “much/somewhat/barely” better than B?
• Induce linear constraints on utility space
56
FastPACE: Decisions*
Polyhedron P of feasible utility
vectors (given prior responses)
Analytic center (AC) of P
• maximizes geometric mean of
distances to each facet
• relatively easy to compute
Treat AC as “consensus” utility
function (approx. average assuming
uniform distr. over U)
Decision: max wrt AC
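The analytic centre maximizes the sum of log-distances to the facets; a crude grid-search sketch over the unit square makes this concrete (real implementations use Newton's method, and this 2D setup is purely illustrative):

```python
import math

def analytic_center_2d(halfspaces, grid=200):
    """Approximate the analytic centre of {w : a·w <= b for all (a, b)}
    inside the unit square by maximizing the sum of log slacks on a grid."""
    best, best_val = None, float("-inf")
    for i in range(1, grid):
        for j in range(1, grid):
            w = (i / grid, j / grid)
            slacks = [b - (a[0] * w[0] + a[1] * w[1]) for a, b in halfspaces]
            if min(slacks) <= 0:      # outside the polytope
                continue
            val = sum(math.log(s) for s in slacks)
            if val > best_val:
                best, best_val = w, val
    return best

# The unit square itself, as four halfspaces a·w <= b
square = [((1, 0), 1.0), ((-1, 0), 0.0), ((0, 1), 1.0), ((0, -1), 0.0)]
```

For the symmetric square the centre coincides with the centroid; query responses add halfspaces that shift it.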
57
FastPACE: Elicitation Principles*
Principles:
• volume of region where each choice
dominates others should be roughly
equal (“balanced”)
• avoid “short axis” cuts: get precision
where it’s needed most
reduce perimeter as well as volume
Note: exactly HLG strategy if
restricted to local (axis parallel)
queries
58
FastPACE: Elicitation Heuristic*
Compute AC c of P
Find tightest bounding ellipsoid of P
centered at c
• finding longest axes in P too hard
Find intersection (ui) of each axis with
boundary of P
Compute product profiles for each ui
that will induce good cuts
59
FastPACE: Computing Profiles*
Fix some constant m
Compute xi by solving:
• max xi · ui  s.t.  xi · c = m
• indifference curve between xi and xj will
(approx) pass thru c
Choice query: choose from one of the
computed profiles
• response will induce one of small
regions
60
Alternatives*
GK04:
• assume “uniform distribution” over U within P
• use Markov chain sampling procedure to estimate centroid (avg utility function) and “longest axis” (queries)
ILC04:
• use bounding box rather than bounding ellipsoid
Summary of PACE:
• OK in practice, small dimensions
• downsides: decision criterion; doesn’t exploit feasibility (volume based); only global queries
61
Why Minimax Regret?*
Appealing decision criterion for strict uncertainty
• contrast maximin, etc.
• not often used for utility uncertainty [BBB01,HS010]
62
Reverse Combinatorial Auctions
Buyer: desires collection of items G
Sellers: offer bids ⟨bi, pi⟩, where bi ⊆ G
• possibly side constraints (seller, buyer)
Feasible allocation: subset B′ ⊆ B covering G
let X denote the set of feasible allocations
Winner determination: least-cost allocation
63
Non-price Preferences
A and B for $12000. C and D for $5000…
A for $10000.
B and D for $5000 if A; B and D for $7000 if not A...
Joe
Hank
etc…
A, C to Fred. B, D, G to Frank. F, H, K to Joe…
Cost: $57,500.
That gives too much business
to Joe!!
64
Utility for Non-price Attributes
Assume utility bearing features F = { f1 ,…,fk }
• e.g., num-winners, avg. quality, suppliers from region R,…
Assume utility u for allocation is linear
Utility function u: non-negative weight vector: w = w1,…,wk
• WD algorithms can be used directly
Assumptions (can be relaxed):
• linear, independent utility for features
• linearly definable features
65
Automated Scenario Navigation
Given partial utility info, I suggest allocation x (least max-regret). It could be up to $8000 from optimal. Accept?
No, that’s too much potential error.
OK. Let’s refine your utility function: would you prefer x ($, %Joe, AvgQual) or
x’ ($’, %Joe’, AvgQual’)?
I definitely prefer x’.
OK. I suggest allocation x’’ (least max regret). It could be up to $2500 from
optimal. Accept?
66
Robust Winner Determination:
Computing Minimax Regret
Now assume:
• allocation features f1, …, fk
• linear utility
• unknown weight vector w, but we know it lies in polytope W
Minimax regret optimal allocation wrt W:
67
Benders Reformulation
Initial formulation: minimax IP, quadratic objective
Linear IP formulation (infinitely many constraints)
Linear IP formulation (finitely many constraints)
68
Constraint Generation
Avoid W-vertex enumeration: constraint generation
Let Gen = {(x′, w)} for some feasible x′, w ∈ W
• solve the relaxed minimax problem over Gen only
let the solution be x* with objective value δ*
• compute max regret MR(x*, W) of solution x*
solution has max regret r, witness (x′′, w′′)
• if r > δ*, add (x′′, w′′) to Gen and repeat; else terminate
note: (x′′, w′′) is the maximally violated constraint
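The inner subproblem of this loop, computing MR(x, W), can be sketched for the special case of simple box bounds on the weights, where it has a closed form per feature (this toy version enumerates outcomes rather than calling an IP solver; all names are illustrative):

```python
# Minimax regret with linear utility w·f(x) and box-bounded weights
# lo[k] <= w[k] <= hi[k]: the adversary sets each weight independently.

def pmr(x, y, feats, lo, hi):
    """Pairwise max regret of choosing x over y: max over w in the box
    of w·(feats[y] - feats[x])."""
    total = 0.0
    for k in range(len(lo)):
        d = feats[y][k] - feats[x][k]
        total += (hi[k] if d > 0 else lo[k]) * d
    return total

def max_regret(x, outcomes, feats, lo, hi):
    return max(pmr(x, y, feats, lo, hi) for y in outcomes)

def minimax_regret(outcomes, feats, lo, hi):
    return min(outcomes, key=lambda x: max_regret(x, outcomes, feats, lo, hi))
```

For general polytopes W the adversary's problem becomes an LP, which is where the constraint generation above earns its keep.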
69
Details of Linearization**
Discrete features in objective modeled as follows:
• e.g.,
Replace quadratic term wi Iij(x’) by new variable zij(x’)
Constrain (where mi ≤wi ≤ Mi are bounds on weights)
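A McCormick-style linearization is standard for a product of a binary indicator and a bounded continuous variable; the following constraints are a reconstruction consistent with the slide's description (the original formulas were lost in transcription), with $I_{ij}(x')$ binary and $m_i \le w_i \le M_i$:

```latex
m_i\, I_{ij}(x') \;\le\; z_{ij}(x') \;\le\; M_i\, I_{ij}(x'), \qquad
w_i - M_i\,\bigl(1 - I_{ij}(x')\bigr) \;\le\; z_{ij}(x') \;\le\; w_i - m_i\,\bigl(1 - I_{ij}(x')\bigr).
```

When $I_{ij}(x') = 1$ these force $z_{ij}(x') = w_i$; when $I_{ij}(x') = 0$ they force $z_{ij}(x') = 0$, exactly replacing the quadratic term.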
70
Constraint Generation Performance
Procurement: 10 suppliers, 50 items, 500 bids
• six features (#suppliers in each of five regions, overall)
Avg. WD Sol’n Time: 19 sec
Avg. Max Regret and True Regret as function
of number of comparison queries: 40 items, 400 bidders, 7 features (100 instances)
72
Number of comparison queries to reach max
regret zero: 40 items, 400 bidders, 7 features (100
instances)
73
1
Preference Elicitation in Multiagent Domains:
Mechanism Design, Voting and Stable Matching
Craig Boutilier
Department of Computer Science
University of Toronto
2
Overview
Section 1: Decision Theory
• Basics, Axioms, Multiattribute utility, Preference Representations
Section 2: Basics of Preference Elicitation
• Queries, revealed preference, gambles, MAUT methods
Section 3: Regret-based and Polyhedral Methods
• computational motivations, imprecise utility functions
• minimax regret
• polyhedral conjoint analysis, volumetric methods
Section 4 (optional): Bayesian methods, preference aggregation
• generic Bayesian model, classification-based approach
• aggregation methods, collaborative filtering (brief overview)
Section 5: Elicitation in Mechanism Design
• basics of mechanism design, incentives, VCG
• partial revelation mechanisms
Section 6: Elicitation in Social Choice: Voting and Stable Matching
• basics of social choice, voting, stable matching problems
• minimax regret and incremental vote elicitation
• preference elicitation in stable matching
3
Product Configuration
Luggage Capacity? Two Door? Cost?
Engine Size? Color? Options?
4
Bargaining for a Car
Luggage Capacity? Two Door? Cost?
Engine Size? Color? Options?
5
Mechanism Design
Incentive to misreport preferences places us in the realm
of mechanism design
Design protocol for interacting agents
• maximize some objective assuming self-interest
• generally, a social choice function (e.g., efficiency) that picks
outcome based on agent preferences
• e.g., auctions, bargaining, network protocols, facility location, …
Revelation principle
• focus on direct, incentive compatible mechanisms
• e.g., famous mechanisms like VCG
• these require each agent to reveal their entire utility function to
the mechanism
6
The Preference Bottleneck
Full utility revelation problematic as we’ve seen
• computational, cognitive, communication costs
• often most of this information is not relevant
• preference elicitation tackles this in the single-agent case
elicit only “relevant” information
tradeoff decision quality with elicitation effort
Can we apply the same ideas in multiagent settings?
Key issue: must address the issue of incentives
• may be in an agent’s interest to lie about its true preferences
Example: First Price Auction
Auction off a single good
• accept (sealed) bids from each potential buyer
“I am willing to buy this for X”
• winner is highest bidder
• Winner pays her bid
Clear incentive to hedge your bid
• you should not state what you are truly willing to pay
• bid should depend on your beliefs of others’ valuations
means you won’t generally be efficient
there are exceptions (e.g., common priors under fairly general
assumptions)
7
Inefficiency of First Price Auction
8
Agent A: value $25,000; believes B’s value is around $17K; bids $17K+δ (say, $18K)
Agent B: value $22,000; believes A’s value is around $19K; bids $19K+δ (say, $20K)
Resulting outcome is inefficient:
• B wins the auction at $20K, realizing surplus of $2K
• if A had bid $20K+ε, A would have won with surplus $5K−ε
• A could have paid B $2,500 not to bid and all would be better off
Example: Second Price Auction
Auction off a single good
• accept (sealed) bids from each potential buyer
“I am willing to buy this for X”
• winner is highest bidder
• Winner pays the second highest bid
No incentive to hedge your bid
• you should state what you are truly willing to pay
• this is independent of what others bid
• consequence: winner will be bidder with highest valuation (i.e.,
efficient outcome)
9
Second Price (Vickrey) Auction
Why not bid more than true valuation?
• suppose your bid is highest?
• suppose your bid is not highest?
Why not bid less than true valuation?
• suppose your bid is highest?
• suppose your bid is not highest?
Basic principle:
• charge person based on externality they imposed on system
(other bidders)
• what would the best outcome have been had the winner not
participated?
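The externality principle behind the second-price auction fits in a few lines; the dictionary interface here is an illustrative assumption:

```python
# Sealed-bid second-price (Vickrey) auction: the highest bidder wins
# and pays the second-highest bid, i.e. the value the runner-up loses
# because the winner participated.

def vickrey(bids):
    """bids: {bidder: bid}. Returns (winner, price paid)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price
```

Because the price is independent of the winner's own bid, bidding one's true value is a dominant strategy.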
10
11
Preference Elicitation in Mechanism Design
Considerable work on preference elicitation in auctions,
combinatorial auctions (Sandholm, Nisan, Parkes, etc.)
• sequentially elicit enough info for optimal winner determination
and VCG payments
• e.g., A prefers XY most, B prefers WZ most; no need to
determine their preferences for other bundles or precise values
Drawbacks
• not general (methods focus specifically on auctions, CAs)
• usually sequential, not one-shot
• requires determining optimal allocation (not an approximation);
hence can’t offer savings in general (Nisan&Segal 05)
12
Preference Elicitation in Mechanism Design
Move beyond this: allow approximately optimal allocation
• adopts the view of work in single-agent preference elicitation
specifically, regret-based allocation models we just studied
• difficulties: dominant strategy implementation not possible
(Roberts 1979; Fadel&Segal 2005)
• we go for approximate, (ex-post or DS) implementation
• provide a general view of the problem of (one-shot and
sequential) partial revelation mechanisms
13
Overview (Hyafil and Boutilier 2006, 2007)
General framework for PRM design
One-shot mechanisms
• generalize VCG to PRMs (approximate efficiency/regret)
• new payment schemes to induce approximate IC and IR in
dominant strategy implementation (DSE if zero regret)
• further optimization of secondary objectives
• algorithm to design partial types
Sequential mechanisms
• slightly different model, but similar results
• algorithm to design query strategy
Viewpoint: why approximate incentives are useful
14
Basic Mechanism Design Setup
Choice of x from outcomes X
• e.g., car1 = <red,2door,280hp,Audi>, car2, car3, …
Agents 1..n: type ti ∈ Ti and valuation vi(x, ti)
• e.g., tbuyer = [car1:$25072, car2:$14991, car3:$17623…]
Type vectors: t ∈ T and t−i ∈ T−i
15
Basic Mechanism Design Setup
Goal: optimize social choice function f: T → X
• e.g., social welfare SW(x,t) = Σi vi(x, ti)
• car-seller pair that maximizes surplus
Assume payments and quasi-linear utility:
• ui(x, pi, ti) = vi(x, ti) − pi
Our focus: SW maximization, quasi-linear utility with the possibility of
payments
16
Mechanism Design
A mechanism m consists of three components:
• actions Ai
• allocation function O: A → X
• payment functions pi : A → R
m induces a Bayesian game
• m implements social choice function f if, in equilibrium,
it induces each agent i to play a strategy σi such that
O(σ(t)) = f(t) for all t ∈ T
• note the dependence on the type of equilibrium
e.g., dominant strategy, ex post, Bayes Nash
17
A Simple Mechanism: Allocate an Object
Each agent has three types: values object either 1, 2 or 3
What is best strategy for each agent?
Outcome function (columns: Agent A’s action; rows: Agent B’s):
        Rt      Lft     Both
Rt      ½ ; ½   A       A
Lft     B       ½ ; ½   A
Both    B       B       ½ ; ½
Payment function, paid by winner:
        Rt    Lft   Both
Rt      1     1     1
Lft     1     2     2
Both    1     2     3
18
Direct Mechanisms
A direct mechanism is one where Ai =Ti
A direct mechanism is incentive compatible if truth-telling
is an equilibrium strategy for each agent
Revelation principle: if there is a mechanism that
implements social choice function f, then there is a direct,
incentive compatible mechanism that implements f
The revelation principle has placed the focus on mechanisms
where agents reveal their utility functions
Note: famous theorem of Gibbard-Satterthwaite ensures
implementation of arbitrary SCFs not possible in general
19
A Direct Mechanism: Allocate an Object
What is best strategy for each agent?
Outcome function (columns: Agent A’s report; rows: Agent B’s):
        1       2       3
1       ½ ; ½   A       A
2       B       ½ ; ½   A
3       B       B       ½ ; ½
Payment function, paid by winner:
        1     2     3
1       1     1     1
2       1     2     2
3       1     2     3
20
Groves Schemes
For example, Groves scheme:
• elicit all agent utility functions
• determine/select efficient (SWM) allocation
• charge agents using the following payment function:
Groves implements SWM in dominant strategies
• transfer hi (arbitrary) doesn’t depend on i’s report
• so i influences utility by revealing utility function that maximizes
her own utility for allocation x* plus social welfare of other
agents; i.e., (full) social welfare
• since allocation rule maximizes SW, i should report true
type/utility
VCG Mechanism
VCG is a specific Groves scheme where the function hi is
given by the social welfare received by the other agents had i
not been present
• hi(t−i) = SW(x*−i(t−i); t−i)
This “Clarke payment” ensures individual rationality (no
agent has an incentive not to participate) as well as incentive
compatibility in dominant strategies
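For a finite outcome space, the Groves allocation and Clarke pivot payments can be sketched directly (the dictionary-of-valuations interface is an assumption for illustration):

```python
# VCG: pick the welfare-maximizing outcome; each agent pays the welfare
# the others would have received without her, minus the welfare the
# others actually receive (the Clarke pivot term h_i).

def vcg(outcomes, valuations):
    """valuations: {agent: {outcome: value}}. Returns (x*, payments)."""
    def sw(x, agents):
        return sum(valuations[i][x] for i in agents)

    agents = list(valuations)
    x_star = max(outcomes, key=lambda x: sw(x, agents))
    payments = {}
    for i in agents:
        others = [j for j in agents if j != i]
        x_wo_i = max(outcomes, key=lambda x: sw(x, others))  # best without i
        payments[i] = sw(x_wo_i, others) - sw(x_star, others)
    return x_star, payments
```

For a single object this reduces to the second-price auction: the winner pays the runner-up's value, losers pay nothing.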
21
Partial Revelation
Expecting users/agents to reveal their entire utility
function to a mechanism is simply unrealistic
• computational, cognitive, communication, revelation costs
• generally don’t need all that information to determine a good
outcome (e.g., maximizing social welfare)
Can we adapt Groves/VCG to partial revelation or
incremental utility elicitation mechanisms?
A stumbling block: incentives!
22
23
The Generality of Groves Schemes
Strong results: Groves is basically the “only choice” for
dominant strategy implementation
Roberts (1979): only social choice functions
implementable in dominant strategies are affine welfare
maximizers (if all valuations possible)
• Fadel & Segal 2005 extend somewhat
Green and Laffont (1977): must use Groves payments
to implement affine maximizers
Implications for partial revelation? Coming…
24
Partial Revelation Mechanisms
Full revelation unappealing for many reasons
A partial type is any subset θi ⊆ Ti
A one-shot (direct) partial revelation mechanism
• each agent reports a partial type θi ∈ Θi
• typically Θi partitions the type space, but this is not required
A truthful strategy: ti ∈ θi(ti) for all ti ∈ Ti
Goal: minimize revelation, computation, communication
by suitable choice of partial types
Intuitive Illustration
Ask buyer for bounds on valuations of specific cars
• what would you pay for a Toyota?
< $20K, [$20-22K], [$22-23K], [$23-25K], [$25-30K], > $30K
• what would you pay for an Audi?
< $32K, [$32-36K], [$36-45K], > $45K
Can be broken up by attributes, local factors, pairwise
comparisons, etc.
Queries can be asked in sequence (needn’t express type
in one shot)
• “one shot” means that the queries one asks cannot be influenced
by the responses to previous queries
• “sequential” means it can (and changes equilibrium properties)
25
26
Implications of Roberts/Green-Laffont
Partial revelation means we can’t generally maximize
social welfare
• must allocate under type uncertainty (as in single-agent case)
• also cannot compute exact Groves payments
But if SCF is not an affine maximizer, or if we don’t use
Groves payments, we can’t expect dominant strategy
implementation in general!
What are some solutions?
• incremental and “hope for” less than full elicitation
• relax conditions on Roberts results
• relax the solution concept and hope for intuitive results
27
Existing Work on PRMs
Bisection auction (GHMV-02) and incremental elicitation in
CAs (Sandholm et al., Nisan, Parkes, etc.)
• require enough revelation to determine optimal outcome and (to
ensure incentives) to determine VCG payments
• incremental (so ex post rather than DSE)
Priority games (Blumrosen&Nisan 02)
• genuinely partial and approximate efficiency
• but very restricted valuation space
Ascending (and other) auction designs are PRMs to
some extent, but not fully
• most agents must still determine exact valuation (up to
precisions of bid increment)
28
Regret-based PRMs [HB06,07a,07b]
In any PRM, how is allocation to be chosen?
Let’s use minimax regret
• pairwise regret of x vs. x̂ wrt partial type vector θ:
R(x, x̂, θ) = max over t ∈ θ of SW(x̂, t) − SW(x, t)
• max regret MR(x, θ) = max over x̂ of R(x, x̂, θ);
minimax regret MMR(θ) = min over x of MR(x, θ)
• x*(θ) is the minimax optimal decision for θ
A regret-based PRM: O(θ) = x*(θ) for all θ
29
Regret-based Allocations
buyer
loveAudi,hateToyota
loveAudi,ToyotaOK
likeAudi,loveToyota
likeAudi,ToyotaOK
seller
lotsAudi,fewToyota
someAudi,fewToyota
someAudi,someToy
Match Toyota? SW-Toy: low; SW-Audi: high; MaxRegret: high
Match Audi? SW-Toy: medium; SW-Audi: OK; MaxRegret: low
30
Regret-based PRMs: Efficiency
Efficiency not possible with PRMs (unless MR=0)
• but bounds are quite obvious
Prop: If MR(x*(θ), θ) ≤ ε for all θ ∈ Θ, then any regret-
based PRM m is ε-efficient for truthtelling agents.
• thus we can tradeoff efficiency for elicitation effort
• but how do we ensure truthfulness?
31
Regret-based PRMs: Incentives
Can generalize Groves payments
• let fi(θi) be an arbitrary type in θi
Thm: Let m be a regret-based PRM with partial types Θ
and a partial Groves payment scheme. If MR(x*(θ), θ) ≤ ε
for all θ ∈ Θ, then m is ε-dominant strategy incentive
compatible.
In other words, no agent can gain more than ε by
misreporting their preferences (no matter what other
agents do)
32
Regret-based PRMs: Rationality
Can generalize Clarke payments as well
Thm: Let m be a regret-based PRM with partial types Θ
and a partial Clarke payment scheme. If MR(x*(θ), θ) ≤ ε
for all θ ∈ Θ, then m is ε-ex post individually rational.
In other words, no agent can gain more than ε by
abstaining from participation
• a Clarke-style regret-based PRM gives approximate efficiency,
approximate IC (dominant) and approximate IR (ex post)
33
Approximate Incentives and IR
Natural to trade off efficiency for elicitation effort
Is approximate IC acceptable?
• note that computing a good “lie” can be difficult
• if incentive to deviate from truth is small enough, then formal,
approximate IC ensures practical, exact IC
Is approximate IR acceptable?
• determining value of nonparticipation very difficult; if potential
gain of withdrawing is small enough, then m is “practically” IR
Thus regret-based PRMs offer scope for tradeoffs
• as long as we can find a good set of partial types
34
Computing MMR Allocations
Minimax regret optimization nontrivial
• connected to work in robust optimization (see previous Sections)
• especially difficult for large, multiattribute models
Use methods for (single-agent) PE/optimization
• exploit generalized additive independence
• assume partial types represented by linear constraints over
utility function parameters
• optimization as a semi-infinite IP using constraint generation
(Benders-style decomposition)
35
Partial Type Optimization
Designing PRM: must pick partial types
• we focus on bounds on utility parameters
Here’s a simple greedy approach
• Let Θ be the current set of partial type vectors (initially {T})
• Let θ = (θ1,…,θi,…,θn) be the partial type vector in Θ with greatest
MMR
• Choose agent i and a suitable split of partial type θi into θ′i and θ′′i
• Replace every vector containing θi by the pair obtained by substituting θ′i and θ′′i
• Repeat until the regret bound ε is acceptable
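The greedy refinement loop can be sketched abstractly; representing a partial type vector as an opaque object with caller-supplied regret and split functions is an illustrative simplification of the slide's procedure:

```python
# Greedy partial-type refinement: repeatedly split the partial type
# vector with greatest minimax regret until all regrets are <= eps.

def greedy_split(type_vectors, regret, split, eps):
    """regret(tv) -> float; split(tv) -> (tv1, tv2). Returns refined list."""
    while True:
        worst = max(type_vectors, key=regret)
        if regret(worst) <= eps:
            return type_vectors
        a, b = split(worst)
        type_vectors.remove(worst)
        type_vectors.extend([a, b])

# Example instantiation: a partial type is an interval of one utility
# parameter; "regret" is the interval width, and a split halves it.
def width(iv):
    return iv[1] - iv[0]

def halve(iv):
    mid = (iv[0] + iv[1]) / 2.0
    return (iv[0], mid), (mid, iv[1])
```

In the real design problem, regret comes from a minimax optimization and splits bound a chosen agent's utility parameter.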
36
The Mechanism Tree
37
A More Refined Approach
Simple model has drawbacks
• exponential blowup (“uniform” partitioning)
• a split of θi useful for reducing regret in one partial type vector θ
is nonetheless applied at all partial type vectors
Refinement
• apply split only at leaves where it is “useful”
keeps tree from blowing up, saves computation
• new splits traded off against “cached” splits
• once done, use either uniform/variable resolution types for each
agent
38
Uniform vs. Variable Resolution
39
Heuristic for Choosing Splits
Adopt variant of current solution strategy
Let θ be the PTV with max MMR
• optimal solution x*, regret-maximizing witness xw
• only split on parameters of utility functions of optimal solution x*
or regret-maximizing witness xw
• intuition: focus on parameters that contribute to regret
reducing u.b. on xw or increasing l.b. on x* helps
• pick agent-parameter pair with largest gap
40
Suggestive Empirical Results
To illustrate:
• use only naïve (split) algorithm
• single buyer, single seller
• 16 goods specified by 4 boolean variables
• valuation/cost given by GAI model
two factors, two vars each (buyer/seller factors are different)
thus 16 values/costs specified by 8 parameters
no constraints on feasible allocations
41
Suggestive Empirical Results
42
Interpretation of Results
Initial Regret: 50%-146% of optimal SW
• reduced to 20%-56% with 11 bits (regret-based splits)
reduced to 30%-86% (uniform)
• good anytime behavior
• savings relative to uniform:
5.5 bits vs. 11 to reach worst-case regret of 90
6.5 bits vs. 11 to reach average-case regret of 70
11 bits of communication
• 0.7 bits per good/item; 1.4 bits per utility function parameter
43
Sequential PRMs
Optimization of one-shot PRMs unable to exploit
conditional “queries”
• e.g., if seller cost of x greater than $, needn’t ask you for your
valuation of x
Sequential PRMs
• incrementally elicit partial type information
• apply similar heuristics for designing query policy
• incentive properties somewhat weaker: opportunity to
manipulate payments by altering the query path
thus additional criteria can be used to optimize
44
Sequential PRMs: Definition*
Set of queries Qi
• response r ∈ Ri(qi) interpreted as partial type θi(r) ⊆ Ti
• history h: sequence of query-response pairs possibly followed by
allocation (terminal)
Sequential mechanism m maps:
• nonterminal histories to queries/allocations
• terminal histories to set of payment functions pi
Revealed partial type θi(h): intersection of the θi(r) for responses r in h
m is partial revelation if there exists a realizable terminal h s.t.
θi(h) admits more than one type ti
45
Sequential PRMs: Properties*
Strategy σi(hi, qi, ti) selects responses
• σi is truthful if ti ∈ θi(σi(hi, qi, ti))
• truthful strategies must be history independent
(Deterministic) strategy profile induces history h
• if h is terminal, then quasi-linear utility realized
• if history is unbounded, then assume utility = 0
Regret-based PRM defined as in one-shot
• payment schemes vary a bit
46
Max VCG Payment Scheme*
Assume terminal history h
• let θ be the revealed PTV at h, and x*(θ) the allocation
Max VCG payment scheme:
• where VCG payment is:
47
Incentive Properties*
Suppose we elicit type info until the MMR allocation has max
regret δ, and we use “max VCG”
Define:
Thm: m is δ-efficient, δ-ex post IR and (δ + e(x*(θ)))-ex
post IC
• weaker results due to possible payment manipulation
48
Elicitation Approaches*
Standard max-regret-based approaches
• give us bounds δ on efficiency, but no a priori bounds on e
Two-phase (2P): akin to existing schemes
• once δ is small enough (e.g., 0), elicit additional payment
information until max e is small enough
Neither reduces manipulability directly
• allocation chosen to minimize SW-loss
• payments chosen to reduce manipulability
• sum provides upper bound on manipulability
• but allocation choice influences manipulability as well
49
Direct Manipulability Measures*
Greatest gain an agent can realize by lying:
• difference of “best case” and actual utility
• mechanism is α-manipulable iff this gain is at most α
Thm: If m is α-manipulable with partial VCG payments,
then m is α-efficient, α-ex post IR and α-ex post IC.
50
Manipulability Reduction*
Direct optimization
• ask queries that directly reduce a bound
• can be formulated as regret-style optimization
• analogous query strategies possible, but don’t work very well
Hybrid approach to elicitation
51
Suggestive Empirical Results*
Similar bargaining problem (larger)
1 buyer, 2 sellers
• 13 factors per agent, 1-4 vars/factor, 2-9 values/var
• 825 parameters per value/cost model
Compare
• 2-phase approach (2P): plot δ + e(x*(θ))
• α-2-phase (α2P): same, but plot α
• a hybrid approach: query parameters that have potential to
reduce both SW-regret and manipulability regret
52
Empirical Results (40 random agent profiles)
Hybrid (CH) summary:
• mnp (manipulability bound) = 0 after 95 queries
• regret = 0 after 71 queries
• only 8% of parameters queried (avg)
• 92% of utility uncertainty (perimeter) remains (cf. 64% with "halving" in the same number of queries… and far from zero regret)
53
Summary
Partial revelation mechanisms important, especially in
multiattribute, combinatorial domains
Formalization of PRMs and regret-based design
• approximate efficiency requires relaxation of DS/IC
• our approach offers bounds on efficiency, IC, IR
• more importantly, design framework allows tradeoffs
Mechanism optimization techniques
• so far fairly crude, but encouraging empirically
• leverages existing regret-based optimization
• much to be done!
References
L. Blumrosen and N. Nisan. Auctions with severely bounded communication. FOCS-02, 2002.
W. Conen, T. Sandholm. Partial-revelation VCG mechanisms for combinatorial auctions. AAAI-02.
N. Hyafil and C. Boutilier. Partial Revelation Automated Mechanism Design. AAAI-07, pp.72–78,
Vancouver (2007).
N. Hyafil and C. Boutilier. Mechanism Design with Partial Revelation. IJCAI-07, pp.1333–1340,
Hyderabad, India (2007).
N. Hyafil and C. Boutilier. Regret-based Incremental Partial Revelation Mechanisms. AAAI-06,
pp.672–678, Boston (2006).
N. Hyafil and C. Boutilier. Regret Minimizing Equilibria and Mechanisms for Games with Strict Type
Uncertainty. UAI-04, pp.268–277, Banff, AB (2004).
A. Mas-Colell, M. D. Whinston, and J. R. Green. Microeconomic Theory. Oxford University Press,
New York, 1995.
N. Nisan and A. Ronen. Computationally feasible VCG mechanisms. ACM EC-00, 242-252, 2000.
N. Nisan and I. Segal. The communication requirements of efficient allocations and supporting
prices. J. Econ. Th., 2005.
T. Sandholm and C. Boutilier. Preference Elicitation in Combinatorial Auctions, in Combinatorial
Auctions, P. Cramton, Y. Shoham and R. Steinberg (eds.), MIT Press, pp.233–264, Jan. 2006.
T. Sandholm, V. Conitzer, and C. Boutilier. Automated design of multistage mechanisms. IJCAI-07,
Hyderabad, India, 2007.
M. Zinkevich, A. Blum, and T. Sandholm. On polynomial-time preference elicitation with value
queries. ACM EC-03, San Diego, 2003.
54
1
Preference Elicitation in Multiagent Domains:
Mechanism Design, Voting and Stable Matching
Craig Boutilier
Department of Computer Science
University of Toronto
2
Overview
Section 1: Decision Theory and Basics of Preference Elicitation
Section 2: Regret-based and Polyhedral Methods
• computational motivations, imprecise utility functions
• minimax regret
• polyhedral conjoint analysis, volumetric methods (if time)
[May skip due to time] Section 3: Elicitation in Mechanism Design
• basics of mechanism design, incentives, VCG
• partial revelation mechanisms
Section 4: Elicitation in Voting
• basics of social choice, voting
• minimax regret and incremental vote elicitation
Section 5: Elicitation in Stable Matching
• basics of stable matching problems
• preference elicitation in stable matching
Social Choice
Social choice: study of collective decision making
Aggregation of individual preferences determines a consensus outcome for some population
• Political representatives, committees, public projects,…
• Studied for millennia, formally for centuries
3
Why Computational Social Choice
Computational models/tradeoffs inherently interesting
• Winner determination, manipulation, approximations,
computational/communication complexity
Decision making in multiagent systems
Preference and rank learning in machine learning
• Ready availability of preference data of millions of individuals
• Web search data, ratings data in recommender systems, …
• Often implicit; but explicit preferences available at low cost
4
Our Agenda
The move to lower-stakes, complex domains makes new
demands on social choice
• New models and decision criteria reflecting new uses
Focus today: minimizing amount of information needed to
come to good consensus choice
• Robust decision making with partial rankings/votes
• Incremental elicitation of voter preferences
• Incremental elicitation of preferences in stable matching
• Exploiting distributional information to make decisions and
minimize expected elicitation effort
• Learning probabilistic models of population preferences
5
Social Choice: Basic Framework
Alternative set A = {a1, …, am }
Voters N = {1..n}, each with preferences over A
Vote vi of voter i: a linear ordering (permutation) of A
Profile is collection of votes v = (v1, …, vn )
Consensus winner: alternative maximizing “consensus”
6
Voting Rules
Voting rule r: V →A selects a winner given a profile
Plurality: winner a with most 1st-place votes
• voters needn’t provide full ranking
Positional scoring: Assign score α to each rank position
with α(1) ≥ α(2) ≥ … ≥ α(m)
• Borda count well-known: α = <m,m-1,...,1>
• Winner: a with max sum of scores: ∑i α(vi (a))
• Plurality, k-approval, k-veto special cases
Maxmin Fairness (egalitarian):
• Score of a is mini { m − vi(a) }
• Choose a with highest score
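The rules above can be sketched in a few lines. The three-voter profile and score vectors here are illustrative; Borda is written as ⟨m−1, …, 0⟩, a shift of ⟨m, …, 1⟩ that picks the same winner.

```python
# Illustrative profile (assumed): 3 voters rank alternatives a, b, c, best first.
profile = [["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"]]
alts = ["a", "b", "c"]
m = len(alts)

def positional_winner(profile, alpha):
    # score of a: sum over voters of alpha[rank of a] (rank 0 = top)
    score = {a: sum(alpha[v.index(a)] for v in profile) for a in alts}
    return max(alts, key=score.get), score

borda = list(range(m - 1, -1, -1))   # <m-1, ..., 0>
plurality = [1] + [0] * (m - 1)      # 1 point for a first-place vote

print(positional_winner(profile, borda))      # a wins with Borda score 5
print(positional_winner(profile, plurality))  # a wins (two first-place votes)

def egalitarian_winner(profile):
    # maxmin fairness: score of a is its worst positional (Borda) score
    score = {a: min(m - 1 - v.index(a) for v in profile) for a in alts}
    return max(alts, key=score.get)
```

On this profile the egalitarian winner is also a: it is the only alternative no voter ranks last.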
7
Score-based Voting Rules
Many other rules: Copeland, maximin, Bucklin, etc.
Most voting rules have natural scoring functions
s(a, v) measures “quality” of alternative a given profile v
Rule r is consistent with s iff
r(v) ∈ argmax { s(a, v) : a ∈ A }
8
Vote Elicitation [Lu, Boutilier, IJCAI-11]
Use of complex (rank-based) voting schemes rare
• Cognitive complexity, communication costs
Elicitation of partial votes could ease this burden
• Find relevant comparisons… or even approximate winners
Voting Protocol with Approximation: Ask a few
queries of voters: if close enough, stop; otherwise ask a
few more; continue until satisfied
Theoretically, relevance won’t save much:
• Comm. compl. Ω(nm log m) for Borda, etc. [CS ACM-EC-05]
• This doesn’t mean practical savings are not possible!
9
Partial Vote Profiles
Partial vote pi of voter i: consistent set of pairwise
comparisons of the form aj ≻ ak
• Captures most natural constraints: paired comparisons, top-k, etc.
Partial profile p = (p1, …, pn)
Completions C(pi), C(p) : set of votes extending pi , p
10
Robust Winner Determination
In general, may want to decide given a partial profile
• Robustness criteria rarely discussed in social choice
We propose minimax regret to determine winners
11
MMR(p) = mina maxw maxv∈C(p) s(w, v) − s(a, v)
(adversarial choice of witness w and completion v; best-response choice of a)
Minimax Regret: Illustration (Borda)
Proposed Winner: Tennis
Minimax Regret: Illustration (Borda)
Borda Score(Tennis) = 2
Borda Score (Park) = 4
Max Regret(Tennis) = 2 (4-2)
Proposed Winner: Tennis
Proposed Winner: Pool
Minimax Regret: Illustration (Borda)
Borda Score(Tennis) = 2
Borda Score (Park) = 4
Max Regret(Tennis) = 2 (4-2)
Proposed Winner: Tennis
Proposed Winner: Pool
Borda Score(Tennis) = 6
Borda Score (Pool) = 0
Max Regret(Pool) = 6 (6-0)
Minimax Optimal: Tennis
Minimax Regret: 2
Why Minimax Regret*
Rationale is same as in single-agent decision problems
MMR offers a natural robustness criterion
• candidate with tightest error bounds (loss wrt optimal)
• provably optimal if MMR=0
Contrast with maximin
• provides quality guarantee, not optimality guarantee
Contrast with Bayesian methods:
• need for a prior
• no (worst-case) guarantees
• computationally difficult (even to approximate)
15
Properties of Minimax Regret Solution**
MMR(p) = 0 iff the MMR winner a*p is a necessary co-winner
Obs: MMR computation at least as hard as NecCo-Win
Obs: MMR-winner may not be a possible winner
• In fact, all possible winners may have high max regret
16
Assume 2-approval:
• Only a, c are PWs: one
has score at least 2k+1,
while b has score 2k
• MR(b) = k+1
• MR(a) = MR(c) = 2k+1
MR of a, c twice that of b
Computing Minimax Regret
MMR for many problems often specified as an IP
• Problematic for voting: too many voters/variables
Instead, compute PMR(a, w, p) for all m² pairs (a, w)
• Then MMR(p) = mina maxw PMR (a, w, p)
PMR can be computed in polytime for many rules
• find worst case completion of each voter’s partial vote pi ; can
usually be done independently for each voter
Xia, Conitzer (AAAI08) use similar ideas for necessary winners
• we illustrate with the Borda rule
17
PMR(a, w) table (rows: proposed a; columns: witness w; last column: MR):
      w=a  w=b  w=c | MR
a      0    2    2  |  2
b      2    0    6  |  6
c      5    3    0  |  5
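The decomposition MMR(p) = mina maxw PMR(a, w, p) can be checked by brute force on tiny instances: complete each voter's partial vote independently by enumerating its consistent rankings. The two-voter partial profile below is an illustrative assumption; Borda scoring is used.

```python
from itertools import permutations

alts = ["a", "b", "c"]
m = len(alts)

def completions(partial):
    # all rankings (best first) consistent with a set of pairs (x, y), x > y
    return [r for r in permutations(alts)
            if all(r.index(x) < r.index(y) for x, y in partial)]

def borda(alt, ranking):
    return m - 1 - ranking.index(alt)

def pmr(a, w, profile):
    # pairwise max regret decomposes: complete each partial vote independently
    return sum(max(borda(w, r) - borda(a, r) for r in completions(p))
               for p in profile)

def minimax_regret(profile):
    mr = {a: max(pmr(a, w, profile) for w in alts if w != a) for a in alts}
    a_star = min(mr, key=mr.get)
    return a_star, mr[a_star]

# Assumed partial profile: voter 1 has revealed only a > b, voter 2 nothing.
profile = [{("a", "b")}, set()]
print(minimax_regret(profile))  # ('a', 3)
```

Enumerating completions is exponential in m, of course; the polytime case analysis in the following slides is what makes this practical.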
Computing Minimax Regret**
We illustrate with Borda (positional) scoring
• Positional: additively decomposable: s(a, v) = ∑i s(a, vi )
• Thus PMR decomposable: complete each pi independently
18
Computing Minimax Regret
Fix a partial vote: the proposed alternative a and adversarial
witness w stand in one of only three relations in it
19
Computing Minimax Regret
Case 2: Maximize PMR(a,w) in only "one" way:
PMR(a,w) = |B′ ∪ F ∪ E ∪ U| + 1 = m − (|A ∪ W| + 1)
Computing Minimax Regret**
Case 1: Maximize PMR(a,w) in only “one” way:
21
PMR(a,w) = ‒(|B| + 1)
Computing Minimax Regret**
Case 3: Maximize PMR(a,w) in only “one” way:
PMR(a,w) = |F ∪ E ∪ U| + 1
Computing Minimax Regret
Similar analysis: other positional scoring rules
Similar approach for non-decomposable scoring rules
Max regret computation is polytime for:
• Positional scoring rules
• Egalitarian (maxmin fairness)
• Bucklin
• Maximin
23
Regret-based Vote Elicitation
If MMR(p) too high, refine knowledge of voter
preferences
Current Solution Strategy (CSS):
• Use MMR solution (a*,w) to generate query: if we don’t reduce
PMR(a*,w), MMR will not be reduced
• So find some voter i with vote pi and ask query with potential
to reduce advantage of w over a* in C(pi)
• For each voter, queries considered depend on structural
properties of partial vote (whether Case 1, 2, 3; and size of sets)
24
Regret-based Vote Elicitation
Case 2: four reasonable query types
• a ≻ f for some f ∈ F (max potential: f at "top" of large group)
• a ≻ u for some u ∈ U (max potential: u at "top" of large group)
• e ≻ w for some e ∈ E (max potential: e at "bottom" of large group)
• u ≻ w for some u ∈ U (max potential: u at "bottom" of large group)
Note: if MMR > 0, one of U, E, F is nonempty
for some voter (or the analogous sets in cases 1, 3)
25
Vote Elicitation: Experiments*
Intuitions behind pairwise CSS can be generalized to top-
t queries (only pick voter, not alternative pair)
Compare CSS to two strategies
• Volumetric: choose voter/candidate-pair which introduces
greatest number of new paired comparisons
• Rand: random voter/candidate pair
27
Vote Elicitation: Sushi
28
Sushi: 5000 rankings of 10 varieties of sushi
Vote Elicitation: Dublin North 2002
29
Irish: 2002 electoral data (Dublin North); 3662 rankings over 12 candidates
Mallows Models
Let d(r, σ) denote the Kendall-tau distance
• number of pairwise inversions (swaps) between r and σ
Let σ be some central/modal ranking
Mallows φ-model (with dispersion φ ∈ (0,1]) specifies P(r) ∝ φ^d(r,σ)
If φ = 1, P is uniform (impartial culture); as φ → 0, P concentrates on σ
Unimodal nature of the model is inflexible, but mixtures of
Mallows models can reasonably capture certain types of
population preferences
30
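The Mallows model above can be sketched directly for small m by normalizing φ^d(r,σ) over all rankings (the alternatives and dispersion value are illustrative):

```python
from itertools import permutations

def kendall_tau(r, sigma):
    # number of pairwise inversions between rankings r and sigma
    pos = {x: i for i, x in enumerate(sigma)}
    return sum(1 for i in range(len(r)) for j in range(i + 1, len(r))
               if pos[r[i]] > pos[r[j]])

def mallows_pmf(sigma, phi):
    # P(r) proportional to phi ** d(r, sigma); brute-force normalization
    rankings = list(permutations(sigma))
    weights = [phi ** kendall_tau(r, sigma) for r in rankings]
    z = sum(weights)
    return {r: w / z for r, w in zip(rankings, weights)}

pmf = mallows_pmf(("a", "b", "c"), phi=0.5)
print(pmf[("a", "b", "c")])  # ~0.381: the modal ranking is most probable
```

Setting phi=1.0 recovers the uniform (impartial culture) distribution, and smaller phi concentrates mass on the modal ranking, as the slide states.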
Vote Elicitation: Mallows
31
Mallows: 100 random rankings over 20 items; vary
dispersion φ
Summary of Results
MMR=0 after k paired comparisons per voter
• Sushi: CSS 11.82; Vol 20.64; Rand 20.63; MergeSort 25
• Irish: CSS 18.57; Vol 31.82; Rand 31.22; MergeSort 33
MMR=0 after k top-t queries per voter
• Sushi: CSS 3.40; Vol 4.18; Rand 5.50
• Irish: CSS 5.47; Vol 6.91; Rand 8.38
Anytime performance better for CSS as well
• E.g., reach 18% of initial regret on Irish data set after
only 5.82 queries (vs. 25.77 Vol; 24.03 Rand)
32
Voting: Summary
Robust optimization using MMR easy for several
important voting rules, easy to approximate well
CSS using MMR offers an effective form of elicitation
• Tends to be more effective when preferences correlated
• Anytime profile attractive for approximating winners
Issues:
• Multi-round, purely interactive nature
• Exploiting probabilistic info, priors
33
34
Overview
Section 1: Decision Theory and Basics of Preference Elicitation
Section 2: Regret-based and Polyhedral Methods
• computational motivations, imprecise utility functions
• minimax regret
• polyhedral conjoint analysis, volumetric methods (if time)
[May skip due to time] Section 3: Elicitation in Mechanism Design
• basics of mechanism design, incentives, VCG
• partial revelation mechanisms
Section 4: Elicitation in Voting
• basics of social choice, voting
• minimax regret and incremental vote elicitation
Section 5: Elicitation in Stable Matching
• basics of stable matching problems
• preference elicitation in stable matching
Stable Marriage Problem
Classic two-sided matching problem
Set of n men M, n women W
• Each m has preference order ≻m over W
• Each w has preference order ≻w over M
• Variants: acceptability, ties, many-to-one, non-bipartite …
Aim: find a stable matching μ of men-women:
• A pair (m,w) blocks μ if w≻m μ(m) and m≻w μ(w)
• A matching is stable iff no pair blocks it
35
Gale-Shapley Algorithm (Male Proposing)
Each man maintains a list of women he has not yet proposed to
Each woman records the best proposal so far
At stage k:
• Each unengaged man proposes to the most preferred woman he has not yet proposed to
• Each woman accepts her most preferred proposal (only if better than the best prior proposal)
• If a man's proposal is accepted, he becomes engaged; if engaged but then rejected, he becomes unengaged
When no man is able to propose, we are done
36
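The proposal rounds above can be sketched as a standard male-proposing Gale-Shapley; the toy preference lists are assumptions:

```python
def gale_shapley(men_prefs, women_prefs):
    # men_prefs[m]: list of women, most preferred first (similarly women_prefs)
    rank = {w: {m: i for i, m in enumerate(p)} for w, p in women_prefs.items()}
    next_idx = {m: 0 for m in men_prefs}  # next woman each man will propose to
    engaged = {}                          # woman -> man
    free = list(men_prefs)
    while free:
        m = free.pop()
        w = men_prefs[m][next_idx[m]]
        next_idx[m] += 1
        if w not in engaged:
            engaged[w] = m                        # w accepts her first proposal
        elif rank[w][m] < rank[w][engaged[w]]:    # w prefers the new proposer
            free.append(engaged[w])
            engaged[w] = m
        else:
            free.append(m)                        # rejected; m stays free
    return {m: w for w, m in engaged.items()}

# Assumed toy instance: both men rank w1 first; the women disagree on the men.
men = {"m1": ["w1", "w2"], "m2": ["w1", "w2"]}
women = {"w1": ["m2", "m1"], "w2": ["m1", "m2"]}
print(gale_shapley(men, women))  # m1 matched to w2, m2 to w1
```

Here w1 keeps m2 (her favorite) and rejects m1, who then proposes to and matches with w2; the result is stable.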
Properties of Gale-Shapley
Always returns a stable matching
Requires at most O(n²) proposals and rounds
Matching is male-optimal, female-pessimal
Truthful for men (not women)
With ties: several forms of stability
With ties or incomplete lists separately: polytime
With ties and incomplete lists: NP-complete
37
Preference Bottleneck
Gale-Shapley usually viewed as an "algorithm"
• input: complete preference lists; output: a matching
• tremendous burden: lots of irrelevant preference info(!)
• comparison, interview, communication costs, etc.
It can be used directly as an elicitation scheme
• only ask specific queries: who's next? whom do you prefer?
• can reduce the burden on users
• question: is it effective as an elicitation scheme?
38
Illustration: Fully Correlated Preferences
GS elicitation performs poorly with "identical" preferences
• If real preferences are correlated, similar issues arise
A "binary search" would be much more effective…
• … if we knew the targets
We’ll use minimax-regret to guide this search
39
Regret-based Elicitation, Matching [Drummond, B. IJCAI-13]
Robust matching with partial preferences
• useful for partial info, low-stakes domains
• define max-regret of matching, or degree of instability
• computation of matchings with minimax-regret
Preference elicitation
• use regret-based solution to determine user queries
• compare to GS: number of queries, rounds, cognitive costs
Related work:
• Rastegari, et al. ACM EC’13
• Stochastic matching processes (e.g., Biro, Norman 2012)
40
Partial Preferences
Partial preferences: set of pairwise comparisons
• partial preference Pq for 𝑞 ∈ 𝑀 ∪ 𝑊, partial profile P
• completions C(Pq ) and C(P) defined as usual
• will sometimes use partitioned preferences
41
Degree of Instability
With partial preferences, can't generally guarantee stability
Degree of stability: max incentive for a couple to deviate
• use Borda score for utility: sq(r, ≻q) = n − rank(r, ≻q)
• Pairwise regret: incentive for q to drop r, defect with r′:
  PWRegret(q, r′, r, ≻q) = sq(r′, ≻q) − sq(r, ≻q)
Instability of (m, w) in μ: regret of the least willing blocking partner:
  Inst(m, w, μ, ≻m, ≻w) = min[ PWRegret(m, w, μ(m), ≻m), PWRegret(w, m, μ(w), ≻w) ]
Instability of a matching: maximum instability over all pairs:
  Inst(μ, ≻) = max(m,w) Inst(m, w, μ, ≻m, ≻w)
(figure: example pair whose improvements are 2 and 1, so Inst = 1)
42
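The instability definitions above can be computed directly for complete preferences. The toy preference lists are assumptions; here m1 and w2 form a blocking pair, each gaining 1 Borda point by defecting, so Inst = 1.

```python
def borda_score(q, r, prefs):
    # s_q(r) = n - rank(r) in q's list (index 0 = most preferred => score n)
    return len(prefs[q]) - prefs[q].index(r)

def pw_regret(q, r_new, r_cur, prefs):
    # incentive for q to drop current partner r_cur and defect with r_new
    return borda_score(q, r_new, prefs) - borda_score(q, r_cur, prefs)

def instability(mu, prefs_m, prefs_w):
    # max over all pairs (m, w) of the *least* willing partner's incentive
    mu_w = {w: m for m, w in mu.items()}
    return max(min(pw_regret(m, w, mu[m], prefs_m),
                   pw_regret(w, m, mu_w[w], prefs_w))
               for m in prefs_m for w in prefs_w)

# Assumed instance: m1 and w2 prefer each other to their assigned partners.
prefs_m = {"m1": ["w2", "w1"], "m2": ["w1", "w2"]}
prefs_w = {"w1": ["m1", "m2"], "w2": ["m1", "m2"]}
mu = {"m1": "w1", "m2": "w2"}
print(instability(mu, prefs_m, prefs_w))  # 1: (m1, w2) is a blocking pair
```

Swapping the partners gives Inst = 0, i.e., a stable matching in the usual sense, matching the slide's claim that MMR ≤ 0 implies stability.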
Robust Matching: Minimizing Max Regret
Given partial profile P, minimize worst-case instability
over all possible realizations of preferences
Max regret MR(𝜇, 𝑷):
• max possible incentive for some couple to defect (unravel)
Minimax regret MMR(𝑷) : tightest stability guarantee
• If MMR(𝑷) ≤ 0 then 𝜇∗(P) is stable in usual sense
• Full information: MMR( ≻) ≤ 0
43
Computing MMR
Given a (partitioned) partial profile: MMR is NP-complete
• reduction from weakly stable matching with incomplete lists and ties
Can be formulated as a mixed integer program (MIP)
• uses pairwise max regret terms for each couple
• PMR terms computed using same completion methods as with voting
O(n3) preference completions, each solvable in polytime
• but the MIP cannot be usefully relaxed (fractional solutions much better)
• Practically solvable for roughly n=30
44
Heuristic Computation of MMR
Goal is elicitation, so approximate MMR acceptable
Computing MR(P, μ): polytime for fixed μ, P
Simple heuristic: Partial Preference GS (PPGS)
• select some completion ≻ from C(P)
• run GS to compute stable matching μ for ≻
• compute MR(P, μ): upper bound on MMR(P)
Selection methods (all feasible in our settings):
• uniform at random: one, or best of k, sampled completions;
fixed at outset or reselected at each round
• maximum likelihood completion (given a distribution)
45
Elicitation: Regret-based Halving
• Like a binary search for (approx.) rank of matched partner
• We use current MMR-matching at each point to estimate block
of matched partner, generate query
46
Elicitation: Regret-based Halving
Compute MMR matching μ for current profile
If non-zero, identify pair(s) (m,w) that determine MR(μ)
• PMR(w, m, μ(w)) = MR(μ) and PMR(m, w, μ(m)) ≥ MR(μ) (or vice
versa)
Query one of these:
• if lower regret partner m has w, μ(m) in same block, ask to split
• else, if w has m, μ(w) in same block, ask to split
• else, ask each to split largest block (other schemes possible)
47
Performance, Cognitive Cost
Compare RBH scheme to GS (as elicitation method)
• measure number of queries
• measure number of rounds
• measure cognitive cost: number, difficulty of binary comparisons
(figure: scale of binary comparisons from easier to harder)
• Luce-Shepard Choice Model
• threshold τ= 5
• temperature γ = 0.5
48
Mallows, n=250 (30 runs): Queries until MR=0
49
Mallows, n=250 (20 runs): Queries until MR=0
50
Mallows, n=250 (20 runs): Anytime, ϕ =0.2
51
Cognitive Costs
Mallows, n=250; Correlated, ϕ= 0.2
• RBH: Proposers 53.8, Acceptors 53.7; Average 53.8
• GS: Proposers 1830.9, Acceptors 13.1; Average 922
• GS: Proposers sort a priori: 310.3; Average 161.7
Mallows, n=250; impartial culture, ϕ= 1.0
• RBH: Average 57.8
• GS: Proposers 121.1, Acceptors 0.43; Average 60.8
52
Riffle Models n=250 (20 runs)*
Two types of men (women)
• Distinct Mallows model for each type: determines ranking within type
• Type rankings interleaved: each woman biased towards Type 1 with
probability p (drawn from a 2-component Gaussian mixture)
53
MovieLens, n=250 (20 runs)*
Generate preferences for partners based on similarity of ratings vectors
• map real-valued “affinity” scores into rankings of partners
• two different processes: unnormalized (more correlated), normalized (less)
• Cognitive costs
• U-MM, RBH: Proposers 55.3, Acceptors 55.4; Average 55.4
• U-MM, GS: Proposers 788.5, Acceptors 3.56; Average 396
• N-MM, RBH: Proposers 58.8, Acceptors 58.6; Average 58.7
• N-MM, GS: Proposers 250.0, Acceptors 1.35; Average 125.7
54
Stable Matching: Summary
Robust optimization using MMR hard in principle, easy to
approximate well
RBH using MMR effective form of elicitation
• Especially effective (cf. GS) when preferences correlated
• Anytime profile attractive for approximate stability
Other measures of instability, other quality measures
Other forms of queries
Variants: stable roommates, many-to-one, etc.
55
Probabilistic Models**
Fairly effective models and techniques for learning
probabilistic models of preferences/rankings from
data
Elicitation: exploit distributional information
• Average case query complexity [Oren, Filmus, B, IJCAI-13]
• Pure Bayesian optimization or expected max regret
• Generate samples, optimize “batch protocols” [Lu, B, ADT-11]
Sample-complexity: generate low-MR alternative with
high prob.
• More general distribution-sensitive elicitation schemes
56
Single vs. Multi-round Elicitation**
Fully sequential elicitation often not practical
• Tradeoff: quality, information elicited, rounds/interruption
see Kalech et al. [JAAMAS 2011]
Reduce interruption cost by using coarser “rounds”
• E.g., ask each voter for their top k candidates
• Stop if MMR low enough
• Otherwise select a few voters and ask for their next k’
candidates; etc.
Suitable choice of k balances the three criteria
57
Optimizing Single-round Protocols [Lu, B. ADT-11]**
General framework for addressing tradeoffs
Focus on optimizing single-round protocols
• for one round of elicitation, what is the tradeoff between
information elicited (k) and minimax regret?
Requires a probabilistic model Pr of voter preferences
• weak guarantees otherwise (hard to predict MMR)
Our goal: find minimal k s.t. Pr(MMR ≤ ε) > 1 − δ
• ε: regret tolerance
• δ: confidence
58
Exploiting Distribution: Sampling**
Many models of ranking distributions:
• Mallows, Plackett-Luce, Bradley-Terry, impartial culture, …
• in principle, can derive analytical results for each
We propose an empirical (sampling) methodology
• sample t vote profiles
(from a learned model, a generative process, or subsampled data sets)
• compute MMR for each profile and for each k < m − 1
• use empirical distribution over MMR to determine suitable k
(achieving desired MMR ≤ ε with desired probability > 1 − δ)
59
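The last step of the methodology reduces to a quantile test over the sampled MMR values; a minimal sketch, where the sampled values below are illustrative stand-ins (one list per candidate k, one value per sampled profile):

```python
# Hypothetical sampled data: mmr_by_k[k] = MMR values, one per sampled
# profile, when each voter reveals only her top-k candidates.
mmr_by_k = {
    1: [9, 7, 8, 6, 9],
    2: [4, 3, 5, 2, 4],
    3: [1, 0, 2, 0, 1],
    4: [0, 0, 0, 0, 0],
}

def min_k(mmr_by_k, eps, delta):
    # smallest k whose empirical frequency of MMR <= eps is at least 1 - delta
    for k in sorted(mmr_by_k):
        samples = mmr_by_k[k]
        frac = sum(v <= eps for v in samples) / len(samples)
        if frac >= 1 - delta:
            return k
    return None

print(min_k(mmr_by_k, eps=1, delta=0.25))  # 3: 4 of 5 samples have MMR <= 1
```

The sample-complexity bounds on the next slides then say how many profiles t are needed for this empirical estimate to be trustworthy.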
MMR Histograms: Mallows (m=10, n=1000, φ=0.6, Borda)**
60
MMR Confidence Plot: Mallows (m=10, n=100, φ=0.6, Borda)**
61
Sample Complexity**
One may use methodology purely heuristically
• actual MMR (after elicitation) can suggest further queries
Theoretical sample complexity bounds possible
• assume sampling accuracy ξ and sampling confidence η
• with t sampled profiles (t determined by ξ and η)
• output the minimum k satisfying the empirical MMR condition
62
MMR Histograms: Sushi Data Set (50 samples, 100 voters)**
63
MMR Histograms: Dublin Data Set (73 samples, 50 voters)**
64
Learning Probabilistic Models [Lu, B. ICML-11]**
Where do probabilistic models come from?
• can be learned from sample/survey/historical data
• two key difficulties: inference and learning
Much research in stats, psychometrics, ML, etc., but learning Mallows models from pairwise evidence has been ignored
Inference task: given paired comparisons (partial vote)
pi, what is the posterior over i's ranking: P(r | pi; σ, φ)?
Learning task: given partial profile p = (p1, …, pn), what
is max likelihood Mallows model/mixture?
• Solvable by EM if you can solve the inference task
65
Social Networks as Preference Source**
Valuable source of preference data: probabilistic models
of preference correlation on networks?
• impact on elicitation could be immense
• both for individual or social choice problems
Social networks shape behavior
• Homophily well-documented
• Often claimed that preferences are correlated,
but there is less evidence to this effect
Social Choice on Social Networks**
Many social choice problems occur in a network context
• e.g., externalities in assignment (BGM EC-12), matching (BLCHW10),
voting (ABKLT EC-12), coalition formation (BL11)
Voting with empathetic preferences [Saheli-Abari, B. 12]
• utility trades off intrinsic and empathetic preference
• e.g., casual group decision, elections, supply chain, …
Many new elicitation, optimization challenges
(figure: network of voters with rankings and edge weights)
Fixed-point solution (à la PageRank):
simple weighted voting scheme
Next Steps
Just a starting point: learning, probabilistic models,
decision-theoretic optimization for effective elicitation
and decision making in social choice settings
• Move toward behavioural SC, connections to social media
Next steps
• Sophisticated, distribution-aware elicitation schemes
• Learning other distributional models (e.g., Plackett-Luce)
• Distributions over multi-attribute preference domains
• Exploiting social media: networks, CF, sentiment, …
• Computation, elicitation in combinatorial domains
• New analyses of manipulation
• Other social choice problems: matching; multi-
winner/segmentation; allocation; etc.
68
69
Recap
Preference bottleneck a key challenge in AI and decision
support
Fortunately, good decisions can often be made with very
little preference/utility information
• quantitative approaches important if you want to accurately
assess the impact of approximation, and to trade off elicitation
effort against decision quality
• however, key is to avoid unnecessary precision
Some Key Issues
Key viewpoints:
• strict uncertainty, probabilistic methods, Bayesian methods
• incentive issues in multiagent interactions
Key issues:
• interaction and query costs, passive vs. active assessment
• user controlled exploration vs. system-generated interaction
• overcoming cognitive biases (framing, anchoring, thresholding…)
• active elicitation in collaborative models
• vague, subjective, user-specified features
• fundamental vs. means objectives (Keeney)
• dialog-based approaches, linguistic cues
• inconsistency management, sensitivity analysis
• transient, nonstationary, context-specific preferences
• multi-source preference data integration
70
References
C. Boutilier, I. Caragiannis, S. Haber, T. Lu, A. Procaccia and O. Sheffet. Optimal Social Choice
Functions: A Utilitarian View. Thirteenth ACM Conference on Electronic Commerce (EC'12),
Valencia, Spain, pp.723-740 (2012).
Y. Chevaleyre, U. Endriss, J. Lang, and N. Maudet. A short introduction to computational social
choice. SOFSEM-07, pp.51–69, Harrachov, Czech Republic, 2007.
V. Conitzer and T. Sandholm. Vote elicitation: Complexity and strategy-proofness. AAAI-02, 2002.
V. Conitzer and T. Sandholm. Communication complexity of common voting rules. EC'05, 2005.
J. Drummond, C. Boutilier. Elicitation and Approximately Stable Matching with Partial Preferences.
23rd International Joint Conference on Artificial Intelligence (IJCAI-13), pp.97-105, Beijing (2013).
W. Gaertner. A Primer in Social Choice Theory. Oxford University Press, USA, 2006.
M. Kalech, S. Kraus, G. A. Kaminka, and C.V. Goldman. Practical voting rules with partial
information. J. of Autonomous Agents and Multi-Agent Systems, 22(1):151–182, 2011.
T. Lu and C. Boutilier. Robust Approximation and Incremental Elicitation in Voting Protocols. Proc.
of IJCAI-11, pp.287-293, Barcelona (2011).
T. Lu and C. Boutilier. Learning Mallows Models with Pairwise Preferences. Proc. of ICML 2011,
pp.145-152, Bellevue, WA (2011).
T. Lu, C. Boutilier. Vote Elicitation with Probabilistic Preference Models: Empirical Estimation and
Cost Tradeoffs. 2nd Conf. on Algorithmic Decision Theory (ADT-11), Piscataway, NJ, pp.134–149 (2011).
T. Lu, C. Boutilier. Budgeted Social Choice: A Framework for Multiple Recommendations in
Consensus Decision Making. 11th ACM Conf. on Elec. Comm. (EC'10), 263--274, Boston (2010).
C. Mallows. Non-null ranking models. Biometrika:44, pages 114–130, 1957.
J. Marden. Analyzing and modeling rank data. Chapman and Hall, 1995.
L. Xia and V. Conitzer. Determining possible and necessary winners under common voting rules
given partial orders. AAAI-08, pp. 202–207, Chicago, 2008.
71