1
Preference Elicitation in Multiagent Domains:
Mechanism Design, Voting and Stable Matching
Craig Boutilier
Department of Computer Science
University of Toronto
2
The Preference Bottleneck in AI
Decisions on behalf of individuals (groups, organizations)
• match individuals to desired products, services, information,
people, behaviors, courses of action
Decision theory provides foundations for automated
decision support systems
• actions, outcomes, dynamics, utilities: MEU
But what is the objective function?
• user preferences (or utilities) are often unknown
• vary much more widely than dynamics
Consider applications in your area of research where the dynamics are stable, but goals or preferences vary
3
The Preference Bottleneck
Two difficult questions faced in decision analysis:
• Decomposition, natural representation of preferences
• Assessing precise tradeoffs (cognitive cost)
Other difficult questions:
• what preference info is relevant to the task at hand?
• when is the elicitation effort worth the improvement it offers in
terms of decision quality?
• what decision criterion to use given partial utility info?
4
Product Configuration
Luggage Capacity? Two Door? Cost?
Engine Size? Color? Options?
5
COACH*
POMDP for prompting Alzheimer’s patients
• solved using factored models, value-directed compression of
belief space
Reward function (patient/caregiver preferences)
• indirect assessment (observation, policy critique)
6
Active Collaborative Filtering*
Probabilistic model assumed (MCVQ, naïve Bayes)
• expected utility (given uncertain preferences) for specific recommendations
Active querying to produce better suggestions
• which new ratings would (in expectation) improve expected value of best suggestions the most?
• computed using EVOI
• offline bounding of effect on posterior to prune queries allows real-time response
7
Combinatorial Auctions
Expressive bidding in auctions becoming common
• combinatorial bids, side-constraints, discount schedules, etc.
• direct expression of utility/cost: economic efficiency
Advances in winner determination
• determine least-cost allocation of business to bidders
8
Non-price Preferences
A and B for $12000. C and D for $5000…
A for $10000.
B and D for $5000 if A; B and D for $7000 if not A...
Joe
Hank
etc…
A, C to Fred. B, D, G to Frank. F, H, K to Joe…
Cost: $57,500.
That gives too much business
to Joe!!
9
Non-price Preferences
Winner determination algs. minimize cost alone
• but preferences for non-price attributes play key role
• Some typical attributes in sourcing:
percentage volume business to specific supplier
average quality of product, delivery on time rating
geographical diversity of suppliers
number of winners (too few, too many), …
Clear utility function involved
• difficult to articulate precise tradeoff weights
• “What would you pay to reduce %volumeJoe by 1%?”
10
Manual Scenario Navigation*
Current practice: manual scenario navigation
• impose constraints on winning allocation
• re-run winner determination
• new allocation satisfying constraint: higher cost
not a hard constraint!
• assess tradeoff and repeat (often hundreds of times)
until satisfied with some allocation
Here’s a new allocation with less business to Joe.
Cost is now: $62,000.
12
Bargaining for a Car
Luggage Capacity? Two Door? Cost?
Engine Size? Color? Options?
[Figure: money ($$) changing hands among agents in a bargaining scenario]
Social Choice
13
Winner determination with non-price attributes:
• % volume business to specific supplier
• average quality of product, delivery on time rating
• geographical diversity of suppliers
• number of winners (too few, too many), …
14
Overview Section 1: Decision Theory and Basics of Preference Elicitation
Section 2: Regret-based and Polyhedral Methods
• computational motivations, imprecise utility functions
• minimax regret
• polyhedral conjoint analysis, volumetric methods (if time)
[May skip due to time] Section 3: Elicitation in Mechanism Design
• basics of mechanism design, incentives, VCG
• partial revelation mechanisms
Section 4: Elicitation in Voting
• basics of social choice, voting
• minimax regret and incremental vote elicitation
Section 5: Elicitation in Stable Matching
• basics of stable matching problems
• preference elicitation in stable matching
15
Why preferences?
Natural question: why not specify behavior with goals?
Preferences: coffee > OJ > tea
• Natural goal: coffee
but what if unavailable? requires a 30 minute wait? …
• allows alternatives to be explored in face of costs, infeasibility,…
16
Preference Orderings
Assume (finite) outcome set X (states, products, etc.)
Preference ordering over X: x1 x2, x1 > x2, x1 ~ x2, …
• must be: (a) transitive; (b) connected (orderable)
• i.e., a total preorder
Why connected? Why transitive?
• e.g., money pump
Consider the planning problem: what if uncertainty?
[Figure: outcomes arranged in a preference ordering: x1 > x2 > x3 > … > xn]
Uncertainty in Decision Outcomes
What if:
• 2% chance no coffee made (30 min delay)? 10%? 20%? 95%?
• robot has charge to check only one possibility
• 5% chance of damage in coffee room, 1% at OJ vending machine
17
18
Preference over Lotteries
If there’s uncertainty in choice outcomes, a preference ordering ≻ over outcomes is not enough
A simple lottery over X has form:
[ (p1 ,x1), (p2 ,x2), …, (pn ,xn) ]
where pi ≥ 0 and Σi pi = 1
A compound lottery allows outcomes to be lotteries:
[ (p1 ,l1), (p2 ,l2), …, (pn ,ln) ]
• outcomes are just trivial lotteries; restrict to finite compounding
19
Constraints on Lotteries*
Continuity:
• If x1 > x2 > x3 then ∃p s.t. [(p,x1), (1-p,x3)] ~ x2
Substitutability:
• If x1 ~ x2 then [(p,x1), (1-p,x3)] ~ [(p,x2), (1-p,x3)]
Monotonicity:
• If x1 ≻ x2 and p ≥ q then [(p,x1), (1-p,x2)] ≽ [(q,x1), (1-q,x2)]
Reduction of Compound Lotteries (“no fun gambling”):
• [ (p, [(q,x1), (1-q,x2)] ), (1-p, [(q′,x3), (1-q′,x4)]) ]
  ~ [ (pq, x1), (p(1-q), x2), ((1-p)q′, x3), ((1-p)(1-q′), x4) ]
Nontriviality:
• x⊤ ≻ x⊥ (best outcome strictly preferred to worst)
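The reduction-of-compound-lotteries axiom above can be checked mechanically. A minimal sketch (the probabilities p, q, q′ and outcome labels are illustrative, not from the slides):

```python
from fractions import Fraction

def flatten(lottery):
    """Reduce a finitely compound lottery to an equivalent simple lottery.
    A lottery is a list of (probability, outcome) pairs; an outcome is
    either an atomic label or itself a lottery (a list)."""
    probs = {}
    for p, out in lottery:
        if isinstance(out, list):            # sub-lottery: recurse and rescale
            for q, x in flatten(out):
                probs[x] = probs.get(x, Fraction(0)) + p * q
        else:                                # atomic outcome: trivial lottery
            probs[out] = probs.get(out, Fraction(0)) + p
    return [(p, x) for x, p in sorted(probs.items())]

# The compound lottery from the slide, with illustrative probabilities:
p, q, qp = Fraction(1, 2), Fraction(1, 4), Fraction(1, 3)
compound = [(p, [(q, 'x1'), (1 - q, 'x2')]),
            (1 - p, [(qp, 'x3'), (1 - qp, 'x4')])]
# flatten gives [(pq, x1), (p(1-q), x2), ((1-p)q', x3), ((1-p)(1-q'), x4)]
```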
20
Implications of Properties on ≻
Since ≻ is transitive and connected: representable by an ordinal
value function V(x)
With constraints on lotteries: we can construct a utility
function U(l) ∈ ℝ s.t. U(l1) ≥ U(l2) iff l1 ≽ l2
• where U([ (p1,x1), …, (pn,xn) ]) = Σi pi U(xi)
• famous result of Ramsey, von Neumann & Morgenstern, Savage
• Exercise: prove existence of such a utility function
Thus knowing U(xi) for each outcome allows tradeoffs to
be made over uncertain courses of action (lotteries)
Principle of Maximum Expected Utility (MEU)
• the utility of a choice is the expected utility of its outcomes
• appropriate choice is that with maximum expected utility
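The MEU principle is easy to sketch in code. The utilities and action lotteries below are illustrative stand-ins for the coffee/OJ/tea example, not values from the slides:

```python
def expected_utility(lottery, u):
    """EU of a simple lottery [(p_i, x_i), ...] given outcome utilities u."""
    return sum(p * u[x] for p, x in lottery)

def meu_choice(choices, u):
    """MEU: pick the action whose outcome lottery has maximum expected utility."""
    return max(choices, key=lambda a: expected_utility(choices[a], u))

# Hypothetical utilities and action-outcome lotteries:
u = {'coffee': 1.0, 'OJ': 0.6, 'tea': 0.2}
choices = {'go_to_coffee_room': [(0.9, 'coffee'), (0.1, 'tea')],
           'go_to_vending_machine': [(1.0, 'OJ')]}
best = meu_choice(choices, u)   # coffee room: EU 0.92 vs 0.6 for the machine
```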
21
Some Discussion Points**
Utility function existence: proof is straighforward
Utility function for > over lotteries is not unique:
• any positive affine transformation of U induces same ordering >
• normalization in range [0,1] common
Ordinal preferences “easy” too elicit (if X small)
• cardinal utilities trickier for people: almost an “art form” in D.A.
Outcome space often factored: exponential size
• requires techniques of multiattribute utility theory MAUT
Expected utility accounts for risk: inherent in preferences
over lotteries
• see utility of money
22
Risk profiles and Utility of money**
What would you choose?
• (a) $100,000 or (b) [(.5, $200,000), (.5, 0) ]
• what if (b) was $250K, $300K, $400K, $1M; p = .6, .7, .9, .999, …
• generally, U(EMV(l)) > EU(l)  (EMV = expected monetary value)
Utility of money is nonlinear: e.g., U($100K) > .5U($200K)+.5U($0)
Certainty equivalent of l: the sure amount CE with U(CE) = EU(l), i.e., CE = U⁻¹(EU(l))
[Figure: concave utility-of-money curve over $0–$200K, marking U($0), EU(lottery), U($100K), U($200K). For many people, CE ≈ $40K. Note: the 2nd $100K is “worth less” than the 1st $100K]
23
Risk attitudes**
Risk Premium: EMV(l) – CE(l)
• how much of the EMV will I give up to remove the risk?
Risk averse:
• decision maker has positive risk premium; U(money) is concave
Risk neutral:
• decision maker has zero risk premium; U(money) is linear
Risk seeking:
• decision maker has negative risk premium; U(money) is convex
Most people are risk averse
• this explains insurance
• often risk seeking in negative range
• linear a good approx in small ranges
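The certainty equivalent and risk premium are straightforward to compute once a utility curve is fixed. A sketch assuming a concave u(m) = √m (one hypothetical risk-averse profile; a more concave u would bring the CE closer to the slide’s ≈$40K):

```python
import math

def certainty_equivalent(lottery, u, u_inv):
    """CE of a lottery: the sure amount with the same utility, U(CE) = EU(l)."""
    eu = sum(p * u(x) for p, x in lottery)
    return u_inv(eu)

# A risk-averse agent with concave utility u(m) = sqrt(m):
u, u_inv = math.sqrt, lambda v: v * v
lottery = [(0.5, 200_000), (0.5, 0)]

emv = sum(p * x for p, x in lottery)          # expected monetary value: 100,000
ce = certainty_equivalent(lottery, u, u_inv)  # 50,000 under sqrt utility
risk_premium = emv - ce                       # positive => risk averse
```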
24
St. Petersburg Paradox**
How much would you pay to play this game?
• A coin is tossed until it falls heads. If heads first occurs on the
Nth toss, you get $2^N
• Most people will pay about $2-$20
EMV = Σn≥1 (1/2^n) · 2^n = Σn≥1 1 = ∞
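The divergent EMV is easy to verify numerically, and a concave utility of money (here log utility, one classical resolution of the paradox) values the game at only a few dollars, consistent with the $2–$20 most people offer:

```python
import math

# P(first head on toss n) = 1/2**n, payoff $2**n, so each toss contributes $1:
def truncated_emv(max_tosses):
    return sum((1 / 2**n) * 2**n for n in range(1, max_tosses + 1))

# EMV grows without bound in the number of tosses allowed...
assert truncated_emv(100) == 100.0

# ...but a log-utility agent's certainty equivalent is about $4:
eu_log = sum((1 / 2**n) * math.log(2**n) for n in range(1, 200))
ce = math.exp(eu_log)   # exp(2 ln 2) = 4
```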
25
Allais’ Paradox**
Situation 1: choose either
• (1) $1M, Prob=1.00
• (2) $5M, Prob=0.10; $1M, Prob=0.89; nothing, Prob=0.01
Situation 2: choose either
• (3) $1M, Prob=0.11; nothing, Prob=0.89
• (4) $5M, Prob=0.10; nothing, Prob=0.90
Most people: (1) > (2) and (4) > (3)
Paradox: no way to assign utilities to monetary outcomes
that conforms to expected utility theory and the stated
preferences (violates substitutability)
• possible explanation: regret
…and the survey says**
Situation 1:
• (1)>(2): 11 (37%)
• (2)>(1): 19 (63%)
Situation 2:
• (3)>(4): 2 (7%)
• (4)>(3): 27 (93%)
26
Allais’ Paradox: The Paradox**
Situation 1: choose either
• (1) $1M, Prob=1.00
equiv: ($1M 0.89; $1M 0.11)
• (2) $5M, Prob=0.10; $1M, Prob=0.89; nothing, Prob=0.01
• So if (1)>(2), by subst: $1M > ($5M 10/11; nothing 1/11)
Situation 2: choose either
• (3) $1M, Prob=0.11; nothing, Prob=0.89
• (4) $5M, Prob=0.10; nothing, Prob=0.90
equiv: nothing 0.89; $5M 0.10; nothing 0.01
• So if (4)>(3), by subst: ($5M 10/11; nothing 1/11) > $1M
27
28
Ellsberg Paradox**
Urn with 30 red balls, 60 yellow or black balls; well mixed
Situation 1: choose either
• (1) $100 if you draw a red ball
• (2) $100 if you draw a black ball
Situation 2: choose either
• (3) $100 if you draw a red or yellow ball
• (4) $100 if you draw a black or yellow ball
Most people: (1) > (2) and (4) > (3)
Paradox: no way to assign utilities (all the same) and
beliefs about yellow/black proportions that conforms to
expected utility theory
• possible explanation: ambiguity aversion
29
Utility Representations
Utility function u: X → [0,1]
• decisions induce distribution over outcomes
• or we simply choose an outcome (no uncertainty), but with constraints on outcomes
If X is combinatorial, sequential, etc.:
• representing, eliciting u difficult in explicit form
Some structural form usually assumed:
• u parameterized compactly (weight vector w)
• e.g., linear/additive, generalized additive models
Representations for qualitative preferences, too
• e.g., CP-nets, TCP-nets, etc. [BBDHP03, BDS05]
Configuration Problems
• Configuration variables
X = {X1 … Xn}
• Constraints C over X
• Xf : set of feasible outcomes
• Utility function u: X → [0,1]: user’s strength of preference
• Optimal decision x* = argmax { u(x) : x ∈ Xf }
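For a small configuration problem the optimal decision can be found by brute force. A sketch with hypothetical domains, a hypothetical constraint, and illustrative additive utilities:

```python
import itertools

# Toy configuration problem (domains, constraint, and utilities are hypothetical):
domains = {'Color': ['red', 'blue'],
           'Doors': ['2', '4'],
           'Power': ['150', '280']}

def feasible(x):
    # constraint C: say, no red 2-door with 280hp exists
    return not (x['Color'] == 'red' and x['Doors'] == '2' and x['Power'] == '280')

def u(x):
    # an additive utility over the three attributes
    v = {'red': 1.0, 'blue': 0.7, '2': 1.0, '4': 0.8, '150': 0.0, '280': 0.7}
    return 0.2 * v[x['Color']] + 0.3 * v[x['Doors']] + 0.5 * v[x['Power']]

# Xf: feasible outcomes; x* = argmax { u(x) : x in Xf }
X_f = [dict(zip(domains, vals))
       for vals in itertools.product(*domains.values())
       if feasible(dict(zip(domains, vals)))]
x_star = max(X_f, key=u)
```

Real configuration problems replace this enumeration with CSP or MIP solvers, since Xf is exponentially large.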
30
[Figure: example configuration space.
Color: cherryRed, metallicBlue, grey
#Doors: Coupe/2, Sedan/4, Hatch/5, Wagon/5
Power: 150hp, 280hp, 350hp
Constraints: Power > 280hp → FuelCons > 8l/100km; Make=Lexus & Power > 200hp → AutoTrans; …
Flat utility vector: car1: 0.29, car2: 1.0, car3: 0.85, car4: 0.96, …]
31
Flat vs. Structured Utility Representation
Naïve representation: vector of values
• e.g., car7:1.0, car15:0.92, car3:0.85, …, car22:0.0
Impractical for combinatorial domains
• e.g., can’t enumerate exponentially many cars, nor expect user
to assess them all (choose among them)
Instead we try to exploit independence of user
preferences and utility for different attributes
• the relative preference/utility of one attribute is independent of
the value taken by (some) other attributes
Assume X ⊆ Dom(X1) × Dom(X2) × … × Dom(Xn)
• e.g., car7: Color=red, Doors=2, Power=320hp, LuggageCap=0.52m3
32
Preferential, Utility Independence**
X and Y = V-X are preferentially independent if:
• x1y1 ≽ x2y1 iff x1y2 ≽ x2y2 (for all x1, x2, y1, y2)
• e.g., Color: red>blue regardless of value of Doors, Power, LugCap
• conditional P.I. given set Z: definition is straightforward
X and Y = V-X are utility independent if:
• l1(X, y1) ≽ l2(X, y1) iff l1(X, y2) ≽ l2(X, y2) (for all y1, y2, all distr. l1, l2)
• e.g., preference for lottery(Red,Green,Blue) does not vary with
value of Doors, Power, LugCap
implies existence of a “utility” function over local (sub)outcomes
• conditional U.I. given set Z: definition is straightforward
33
Additive Utility Functions
Additive representations commonly used [KR76]
• breaks exponential dependence on number of attributes
• use sum of local utility functions ui over attributes
• or (normalized) local value functions vi plus scaling factors λi
This will make elicitation much easier
Color   u1      Drs     u2      Pwr    u3
red     1.0     2       1.0     350    1.0
blue    0.7     4       0.8     280    0.7
grey    0.0     hatch   0.2     150    0.0
                wag’n   0.0
λ1 = 0.2,  λ2 = 0.3,  λ3 = 0.5
u(red, 2dr, 280hp) = 0.2·1.0 + 0.3·1.0 + 0.5·0.7 = 0.85
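The additive computation above can be sketched directly, using the local value tables and scaling factors from this slide:

```python
def additive_utility(outcome, local_values, weights):
    """Additive model: u(x) = sum_i lambda_i * v_i(x_i)."""
    return sum(weights[attr] * local_values[attr][val]
               for attr, val in outcome.items())

# Local value functions and scaling factors from the slide:
v = {'Color': {'red': 1.0, 'blue': 0.7, 'grey': 0.0},
     'Doors': {'2': 1.0, '4': 0.8, 'hatch': 0.2, 'wagon': 0.0},
     'Power': {'350': 1.0, '280': 0.7, '150': 0.0}}
lam = {'Color': 0.2, 'Doors': 0.3, 'Power': 0.5}

u = additive_utility({'Color': 'red', 'Doors': '2', 'Power': '280'}, v, lam)
# 0.2*1.0 + 0.3*1.0 + 0.5*0.7 = 0.85, matching the slide
```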
34
Additive Utility Functions
An additive representation of u exists iff decision maker is
indifferent between any two lotteries where the marginals
over each attribute are identical
• l1(X) ~ l2(X) whenever l1(Xi) = l2(Xi) for all Xi
35
Generalized Additive Utility
Generalized additive models are more flexible: “interdependent value additivity” [Fishburn67], GAI [BG95]
• assume (overlapping) set of m subsets of vars X[j]
• use sum of local utility functions uj over attributes
This will make elicitation much easier
Color   Drs     u1      Pwr    Drs    u2
red     2       1.0     350    2      1.0
blue    4       0.9     350    4      0.7
red     4       0.6     280    2      0.65
blue    2       0.4     280    4      0.55
λ1 = 0.4,  λ2 = 0.6
u(red, 2dr, 280hp) = 0.4·1.0 + 0.6·0.65 = 0.79
36
GAI Utility Functions
A GAI representation of u exists iff decision maker is
indifferent between any two lotteries where the marginals
over each factor are identical
• l1(X) ~ l2(X) whenever l1(X[i]) = l2(X[i]) for all i
37
Basic Elicitation: Flat Representation
“Typical” approach to assessment
• normalization: set best outcome utility to 1.0; worst to 0.0
• standard gamble queries: ask user for the probability p at which
indifference holds between x and SG(p) = [(p, best), (1-p, worst)]
• e.g., car3 ~ <0.85, car7; 0.15, car22 >
38
Basic Elicitation: Flat Representation
SG queries: require precise numerical assessments
Bound queries: fix p, ask if x preferred to SG(p)
• yes/no response: places (lower/upper) bound on utility
• easier to answer, much less info (narrows down interval)
Simple binary search can be used to identify utility u(x)
• but how much precision is needed in practice?
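The binary search over bound queries can be sketched as follows, with a simulated user standing in for the real respondent (the true utility 0.85 and precision are illustrative):

```python
def elicit_utility(answer, precision=0.01):
    """Locate u(x) in [0,1] by binary search with bound queries.
    `answer(p)` plays the user's role: returns True iff x is preferred
    to the standard gamble [(p, best), (1-p, worst)], i.e. iff u(x) >= p."""
    lo, hi = 0.0, 1.0
    queries = 0
    while hi - lo > precision:
        p = (lo + hi) / 2
        if answer(p):       # "yes" raises the lower bound
            lo = p
        else:               # "no" lowers the upper bound
            hi = p
        queries += 1
    return (lo + hi) / 2, queries

# Simulated user whose true u(x) = 0.85:
est, n = elicit_utility(lambda p: 0.85 >= p)
# 7 yes/no queries suffice for 0.01 precision, since 2**-7 < 0.01
```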
39
Elicitation: Additive Models (Classical)
First: assess local value functions with local SG queries
• calibrates on [0,1]
For instance,
• ask for best value of Color (say, red ), worst value (say, grey)
• then ask a local standard gamble for each remaining Color to
assess its local value
blue ~ <0.85, red; 0.15, grey >
green ~ <0.67, red; 0.33, grey >, …
Bound queries can be asked as well
• only refine intervals on local utility
40
Elicitation: Additive Models
Second: assess scaling factors with “global” queries
• define reference outcome
e.g., could be best global outcome, or any salient outcome
e.g., user’s current car: (red, 2door, 150hp, 0.35m3)
• define the outcome obtained by setting Xj to its best value, others to reference values
e.g., for doors: (red, 4door, 150hp, 0.35m3)
• compute scaling factor
• assess the 2n utility values with (global) SG queries
Altogether: gives us full utility function
41
Elicitation: GAI Models (Classical)
Assessment is subtle (won’t get into gory details)
• overlap of factors a key issue [F67,GP04,DB05]
• cannot rely on purely local queries: values cannot be fixed without
reference to others!
• seemingly “different” local prefs correspond to same u
u(Color,Doors,Power) = u1(Color,Doors) + u2(Doors,Power)
u(red,2door,280hp) = u1(red,2door) + u2(2door,280hp)
u(red,4door,280hp) = u1 (red,4door) + u2(4door,280hp)
[Figure: example utility values illustrating that different local decompositions (e.g., 6 + 4 vs. 9 + 1) can represent the same global utilities]
42
Local Queries [Braziunas, Boutilier UAI05]
We wish to avoid queries on whole outcomes
• can’t be purely local; but condition on a subset of reference values
Conditioning set Ci for factor ui(Xi) :
• vars (excl. Xi) in any factor uk(Xk) where Xi ∩ Xk ≠ ∅
• setting Ci to reference values renders Xi independent of remaining
variables
e.g., Power=280hp shields <Color,Door> from any other vars
• Define local best/worst for ui assuming Ci set at reference levels
• Ask SG queries relative to local best/worst with Ci fixed
e.g., fix Power=280hp and ask SG queries on <Color,Door>
conditioned on 280hp
43
Local Queries [BB05]**
Theorem: If for some y (where Y =X - Xi - C(Xi) )
then for all y’
Hence we can legitimately ask local queries:
Conditioning Sets**
44
[Figure: GAI factor graph over factors BCD, ABC, FGH, EF, DE, FGJ, annotated with conditioning sets fixed at reference values: AE = a0e0, BCF = b0c0f0, D = d0, EH = e0h0, DGHJ = d0g0h0j0, EJ = e0j0]
45
Local Standard Gamble Queries*
Local standard gamble queries for each factor
• use “best” and “worst” local outcomes x⊤[i], x⊥[i]―conditioned on default
values of conditioning set
e.g., x⊤[1] = abcd0 for factor ABC; x⊥[1] = ~(abc)d0
• SG queries on other parameters relative to these
• gives local value function v(x[i]) (e.g., v(ABC) )
Can use bound queries as well
But local VFs not enough: must calibrate
• requires global scaling
46
Global Scaling*
Assess scaling factors with “global” queries
• exactly as with additive models
• define reference outcome
• define the outcome obtained by setting X[j] to its best value, others to reference values
• compute scaling factor
• assess the 2n utility values with (global) SG queries
• can use bound queries as well
47
Elicitation: Beyond the Classical View
The classic view involving standard gambles is difficult:
• large number of parameters to assess (structure helps)
• unreasonable precision required (SGQs)
• queries over full outcomes difficult (structure helps)
• cost (cognitive, communication, computational, revelation) may
outweigh benefit
can often make optimal decisions without full utility information
General approach to practical, automated elicitation
• cognitively plausible forms of interaction
• incremental elicitation until decision possible that is good enough
• collaborative models to allow generalization across users
48
Beyond Standard Gamble Queries
Bound queries
• a boolean version of a (global/local) SG query
• global: “Do you prefer x to [(p, x⊤), (1-p, x⊥)]?”
• local: “Do you prefer x[k] to [(p, x⊤[k]), (1-p, x⊥[k])]?”
need to fix reference values Ck if using GAI model
• response tightens bound on specific utility parameter
Comparison queries (is x preferred to x’ ?)
• global: “Do you prefer x to x’?”
• local: “Do you prefer x[k] to x’[k] ?”
• impose linear constraints on parameters
Σk uk(x[k]) > Σk uk(x′[k])
• interpretation is straightforward
49
Other Modes of Interaction
Stated choice (global or local)
• choose xi from set {x1, …, xk}
• imposes k−1 linear constraints on utility parameters
Ranking alternatives (global or local)
• order set {x1, …, xk}: similar
Graphical manipulation of parameters
• bound queries: allow tightening of bound (user controlled)
generally must show implications of moves made
• approximate valuations: user-controlled precision
useful in quasi-linear settings
Passive observation/revealed preference
• if choice x is made in context c, then x is at least as preferred as all
other available alternatives
Active, but indirect assessment
• e.g., dynamically generate Web page, with k links
• assume response model: Pr(linkj | u)
Global Comparison Query (GCQ)
50
Local sorting
51
[Screenshot: local sorting interface showing factor attributes and reference values (local context)]
Anchor bound query (ABQ)
52
[Screenshot: ABQ interface showing reference values and factor attributes]
Local Bound Query (LBQ)
53
[Screenshot: LBQ interface with reference values (local context), a 0–100 scale with bounds 0 and 100, current value 70, and Bin 2 spanning 0–70]
54
A General Framework for Elicitation and
Interactive Decision Making
B: beliefs about user’s utility function u
Opt(B): “optimal” decision given incomplete, noisy, and/or imprecise beliefs about u
Repeat until B meets some termination condition:
• ask user some query (propose some interaction) q
• observe user response r
• update B given r
Return/recommend Opt(B)
Will discuss this in depth over the rest of course
• think about this: what are some appropriate termination criteria?
55
Cognitive Biases: Anchoring**
Decision makers susceptible to context in assessing
preferences (and other relevant info, like probabilities)
Anchoring: assessment of utility dependent on arbitrary
influences
Classic experiment [ALP03]:
• (business execs) write last 2 digits of SSN on piece of paper
• place bids in mock auction for wine, chocolate
• those with SSN>50 submitted bids 60-120% higher than SSN<50
Often explained by focus of attention plus adjustment
• holds for estimation of probabilities (Tversky, Kahneman estimate
of # African countries), numerical quantities, …
How should this impact the design of elicitation methods?
56
Cognitive Biases: Framing**
How questions are framed is critical
Classic Tversky, Kahneman experiment (1981); disease predicted to
kill 600 people, choose vaccination program
• Choose between:
Program A: "200 people will be saved"
Program B: "there is a one-third probability that 600 people will be
saved, and a two-thirds probability that no people will be saved"
• Choose between:
Program C: "400 people will die"
Program D: "there is a one-third probability that nobody will die, and
a two-thirds probability that 600 people will die"
• 72 percent prefer A over B; 78 percent prefer D over C
• Notice that A and C are equivalent, as are B and D
How should this impact design of elicitation schemes?
57
Cognitive Biases: Endowment Effect**
People become “attached” to their possessions
• e.g., experiment of Kahneman, et al. 1990
Randomly assign subjects as buyers, sellers
• sellers given a coffee mug (sells for $6); all can examine closely
• sellers asked: “at what price would you sell?”
• buyers asked: “at what price would you buy?”
• median asking price: $5.79; median offer price: $2.25
would expect these to be identical given random assignment to groups
• if sellers are given tokens with a monetary value (can be used later
to buy mugs/chocolate in bookstore), no difference between offers
and ask prices
How should this impact the design of elicitation methods?
References
P. C. Fishburn. Interdependence and additivity in multivariate, unidimensional expected utility theory. International Economic Review, 8:335–342, 1967.
P. C. Fishburn. Utility Theory for Decision Making. Wiley, New York, 1970.
R. L. Keeney and H. Raiffa. Decisions with Multiple Objectives: Preferences and Value Trade-offs. Wiley, New York, 1976.
F. Bacchus and A. Grove. Graphical models for preference and utility. In Proc. of UAI-95, pp. 3–10, 1995.
J. von Neumann and O. Morgenstern. Theory of Games and Economic Behavior. Princeton University Press, Princeton, 1944.
L. Savage. The Foundations of Statistics. Wiley, New York, 1954.
C. Gonzales and P. Perny. GAI networks for utility elicitation. In Proc. of KR-04, pp. 224–234, Whistler, BC, 2004.
D. Braziunas and C. Boutilier. Local utility elicitation in GAI models. In Proc. of UAI-05, pp. 42–49, Edinburgh, 2005.
D. Braziunas and C. Boutilier. Minimax regret based elicitation of generalized additive utilities. In Proc. of UAI-07, 2007.
D. Kahneman, J. L. Knetsch, and R. H. Thaler. Experimental tests of the endowment effect and the Coase theorem. Journal of Political Economy, 98(6), 1990.
A. Tversky and D. Kahneman. The framing of decisions and the psychology of choice. Science, 211, 1981.
D. Ariely, G. Loewenstein, and D. Prelec. Coherent arbitrariness: Stable demand curves without stable preferences. Quarterly Journal of Economics, 118(1):73–105, 2003.
58
59
Fishburn’s Decomposition [F67]
Define reference outcome:
For any x, let x[I] be restriction of x to vars I, with
remaining replaced by reference values:
Utility of x can be written [Fishburn67]
• sum of utilities of certain related “key” outcomes
60
Key Outcome Decomposition
Example: GAI over I={ABC}, J={BCD}, K={DE}
u(x) = u(x[I]) + u(x[J]) + u(x[K])
     − u(x[I∩J]) − u(x[I∩K]) − u(x[J∩K])
     + u(x[I∩J∩K])
u(abcde) = u(x[abc]) + u(x[bcd]) + u(x[de])
        − u(x[bc]) − u(x[∅]) − u(x[d])
        + u(x[∅])
(here I∩K = ∅, and the two u(x[∅]) terms cancel)
u(abcde) = u(abcd0e0) + u(a0bcde0) + u(a0b0c0de)
- u(a0bcd0e0) - u(a0b0c0de0)
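The key-outcome decomposition can be verified numerically. A sketch using hypothetical random factor tables for a GAI utility over binary variables A..E with factors I={ABC}, J={BCD}, K={DE}:

```python
import itertools
import random

random.seed(0)
# Hypothetical local utility tables for factors ABC, BCD, DE:
u1 = {k: random.random() for k in itertools.product((0, 1), repeat=3)}
u2 = {k: random.random() for k in itertools.product((0, 1), repeat=3)}
u3 = {k: random.random() for k in itertools.product((0, 1), repeat=2)}

def u(x):
    a, b, c, d, e = x
    return u1[(a, b, c)] + u2[(b, c, d)] + u3[(d, e)]

REF = (0, 0, 0, 0, 0)   # reference outcome x0 = a0 b0 c0 d0 e0

def proj(x, keep):
    """x[I]: keep vars at positions in `keep`, reset the rest to reference values."""
    return tuple(x[i] if i in keep else REF[i] for i in range(5))

def key_outcome_decomposition(x):
    # u(x[I]) + u(x[J]) + u(x[K]) - u(x[I∩J]) - u(x[J∩K]); the empty-intersection
    # terms u(x[∅]) cancel, as on the slide (A..E are positions 0..4)
    return (u(proj(x, {0, 1, 2})) + u(proj(x, {1, 2, 3})) + u(proj(x, {3, 4}))
            - u(proj(x, {1, 2})) - u(proj(x, {3})))

# The decomposition reproduces u(x) exactly on every outcome:
for x in itertools.product((0, 1), repeat=5):
    assert abs(u(x) - key_outcome_decomposition(x)) < 1e-9
```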
61
Canonical Decomposition [F67]
This leads to a canonical decomposition of u; e.g., for I={ABC}, J={BCD}, K={DE}:
u(abcde) = u(abcd0e0)
         + u(a0bcde0) − u(a0bcd0e0)
         + u(a0b0c0de) − u(a0b0c0de0)
i.e., u(abcde) = u1(abc) + u2(bcd) + u3(de), with
u1(abc) = u(abcd0e0),  u2(bcd) = u(a0bcde0) − u(a0bcd0e0),  u3(de) = u(a0b0c0de) − u(a0b0c0de0)
62
Local Queries: Comparison*
63
Local Query: Bound*
64
Local Query: Bound*
65
Global Query: Anchor Comparison*
66
Global Query: Anchor Bound*
67
Weight-bound Manipulation*
Exploit feedback to encourage sharper bounds
Display implications of bound refinement on pairwise max regret between the minimax-optimal allocation and the adversary’s choice
Need real-time updates, so can’t compute a new minimax regret at each step
1
Preference Elicitation in Multiagent Domains:
Mechanism Design, Voting and Stable Matching
Craig Boutilier
Department of Computer Science
University of Toronto
2
Overview
Section 1: Decision Theory and Basics of Preference Elicitation
Section 2: Regret-based and Polyhedral Methods
• computational motivations, imprecise utility functions
• minimax regret
• polyhedral conjoint analysis, volumetric methods (if time)
[May skip due to time] Section 3: Elicitation in Mechanism Design
• basics of mechanism design, incentives, VCG
• partial revelation mechanisms
Section 4: Elicitation in Voting
• basics of social choice, voting
• minimax regret and incremental vote elicitation
Section 5: Elicitation in Stable Matching
• basics of stable matching problems
• preference elicitation in stable matching
3
A General Framework for Elicitation and
Interactive Decision Making
B: beliefs about user’s utility function u
Opt(B): “optimal” decision given incomplete, noisy, and/or imprecise
beliefs about u
Repeat until B meets some termination condition
• ask user some query (propose some interaction) q
• observe user response r
• update B given r
Return/recommend Opt(B)
4
Utility Function Uncertainty
General approaches to representation/decisions
• Strict uncertainty models
nonprobabilistic (regret-based)
probabilistic (non-Bayesian)
• Bayesian models
Key components
• decision criterion: what decision given beliefs B?
requires effective inference (error metrics)
• effective update (function of queries/interaction)
• elicitation strategy: which query to ask next?
• termination condition: when is decision “good enough”
5
Strict Utility Function Uncertainty
User’s utility parameters w unknown
• u(x; w) linear in utility parameters w
Assume feasible set W
• W defined by linear constraints on w
• Polytope induced by query responses
How should one make a decision? elicit info?
• regret-based approaches
• polyhedral approaches (and other heuristics)
W: polytope of feasible utility parameters, e.g.,
  u1(red) + u2(4dr) + u3(150hp) > 0.4
  u1(blue) + u2(4dr) + u3(280hp) > u1(red) + u2(4dr) + u3(150hp)
6
• Regret of x under w:
  R(x, w) = max_{x′ ∈ Xf} u(x′; w) − u(x; w)
• Max regret of x under W:
  MR(x, W) = max_{w ∈ W} R(x, w)
• Minimax regret; optimal option:
  MMR(W) = min_{x ∈ Xf} MR(x, W);  x*_W = argmin_{x ∈ Xf} MR(x, W)
[Figure: polytope W; option x incurs regret R(x, w) against x′ at w and R(x, w′) at w′; the minimax-optimal x* incurs R(x*, w′) against x′]
Minimax Regret
7
Minimax Regret: An Example
Simple example to contrast minimax regret with maximin
Maximin recommends D3 (too cautious?)
MMR recommends D2
• might be worse than D3, but never by more than a little

        U1   U2   U3   Min   MR
D1      8    2    1    1     5
D2      7    7    1    1     1
D3      2    2    2    2     6
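The table’s regret values can be recomputed directly (here the three utility functions U1–U3 stand in for the feasible set W):

```python
# Utility of each decision under the three feasible utility functions U1-U3:
U = {'D1': [8, 2, 1], 'D2': [7, 7, 1], 'D3': [2, 2, 2]}

def max_regret(x):
    """MR(x, W): worst-case loss vs. the best alternative, over w in {U1, U2, U3}."""
    return max(max(U[xp][w] for xp in U) - U[x][w] for w in range(3))

mr = {x: max_regret(x) for x in U}                  # {'D1': 5, 'D2': 1, 'D3': 6}
mmr_choice = min(U, key=max_regret)                 # 'D2': minimax regret of 1
maximin_choice = max(U, key=lambda x: min(U[x]))    # 'D3': the cautious pick
```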
8
Why Minimax Regret?
Minimizes regret in presence of adversary
• provides a bound on worst-case loss (cf. maximin)
• robustness in the face of utility function uncertainty
In contrast to Bayesian methods:
• useful when priors not readily available
• can be more tractable; see [CKP00/02, Bou02]
• user unwilling to “leave money on the table” [BSS04]
• preference aggregation settings [BSS04]
• effective elicitation even if priors available [WB03]
9
Example Domains
Product configuration: GAI models
• product constraints (feasibility: CSP)
• product database (feasibility: elements of DB)
Winner determination in procurement: additive
• feasibility: solution to combinatorial allocation problem
Resource allocation in autonomic computing
• utility function can only be sampled
• continuous action/outcome space
• sequential (MDP) extensions
Travel planning, mechanism design (auctions and
bargaining), social choice, etc.
10
Computing Minimax Regret
Difficulties computing minimax regret:
• minimax (integer) program with quadratic objective
General approach:
• Benders decomposition and constraint generation to break up the minimax program
• various encoding tricks to linearize quadratic terms
  details and formulation depend on domain
MMR(W) = min_{x ∈ Xf} max_{x′ ∈ Xf} max_{w ∈ W} (w·x′ − w·x)
Convert MMR to a (linear) IP with infinitely many constraints,
then simplify to a linear IP with finitely many constraints
• Here: V(W) are the vertices of polytope W and x*(w) is the optimal
configuration for utility function w
Still (potentially) exponentially many constraints
11
min_{x ∈ Xf, δ} δ   s.t.   δ ≥ Σi wi x′i − Σi wi xi,   ∀ x′ ∈ Xf, w ∈ W
min_{x ∈ Xf, δ} δ   s.t.   δ ≥ w·x*(w) − w·x,   ∀ w ∈ V(W)
MMR: Benders Reformulation
Repeatedly solve
  min_{x ∈ Xf, δ} δ   s.t.   δ ≥ Σi wi x′i − Σi wi xi,   ∀ (x′, w) ∈ Gen
• Let solution be x* with objective value δ*
Compute MR(x*, W) of solution x*: MR = r, with witness (x′′, w′′)
If r > δ*, add (x′′, w′′) to Gen and repeat; else terminate
  note: (x′′, w′′) is the maximally violated constraint
MMR: Constraint Generation
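The constraint-generation loop can be sketched over a finite option set, with an enumerated vertex set standing in for V(W); real implementations solve the master and subproblem as MIPs rather than by enumeration:

```python
def solve_mmr_by_generation(X, vertices, u, eps=1e-9):
    """Constraint generation for minimax regret over a finite option set X.
    `vertices` enumerates the extreme points of the utility polytope W;
    u(x, w) gives the utility of option x under utility function w."""
    gen = [(X[0], vertices[0])]                         # seed constraint set Gen
    while True:
        # Master problem: min over x of the regret implied by Gen alone
        x_star = min(X, key=lambda x: max(u(xp, w) - u(x, w) for xp, w in gen))
        delta = max(u(xp, w) - u(x_star, w) for xp, w in gen)
        # Subproblem: exact max regret MR(x*, W), with witness (x'', w'')
        r, witness = max(((u(xp, w) - u(x_star, w), (xp, w))
                          for xp in X for w in vertices), key=lambda t: t[0])
        if r <= delta + eps:                            # no violated constraint left
            return x_star, r
        gen.append(witness)                             # add maximally violated one

# On the D1-D3 example from the earlier minimax-regret slide:
table = {'D1': [8, 2, 1], 'D2': [7, 7, 1], 'D3': [2, 2, 2]}
x_opt, mmr = solve_mmr_by_generation(
    list(table), [0, 1, 2], lambda x, w: table[x][w])
# x_opt = 'D2' with minimax regret 1
```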
13
Computing Max Regret*
Objective is naturally quadratic
Since factor attribute instantiations are discrete:
• quadratic terms: products of binary and continuous vars
• linearized by introducing auxiliary variables: induces a (linear) MIP
14
MMR Solution Time: Real estate (20vars, 2-5
values/var, 29 GAI factors,160 parameters)
15
Random Problems (Varying Size)
Randomly generated problems
• for fixed number of vars (up to 5 values per var), factors; randomly set
vars within factors and the utility bounds for each parameter
• running time grows exponentially with problem size
• number of generated constraints stays small (47 avg. for 30 vars)
16
Anytime Performance
(*Key to Good Elicitation Performance)
17
Application to GAI Models
Similar techniques can be used as with linear models
• but details vary somewhat
Model is especially simple if only local bound queries
ignoring conditioning sets [BPPS03; BPPS06]
• justifiable if compared to common numeraire (money)
For computation with scaling factors and conditioning
variables, see [BB07]
18
Other Generalizations*
Formulations are much simpler with upper and lower
bounds on weights (hyper-rectangular W)
Nonlinearly definable features
Anytime speed ups
• use approximate solutions (small duality gap) in CG
• use approximate solutions for query generation
e.g. early termination of CG
• warm start (see elicitation)
Nonlinear utility for specific features
19
Regret-based Elicitation [BSS04, BPPS-05,06]
Minimax optimal solution may not be satisfactory
Improve quality by asking queries
• new bounds on utility model parameters
Which queries to ask?
• what will reduce regret most quickly?
• myopically? sequentially?
Closed form solution seems infeasible for sequential case
• to date we’ve looked at heuristic elicitation
• computing myopically optimal queries is often feasible, but
heuristics are cheaper and seem to work as well
20
Query Types
Recall query types (both local and global variants)
• Comparison queries (is x preferred to x’ ?)
Σk fk(x[k]) > Σk fk(x′[k])
global or local (with conditioning set fixed)
• Bound queries (is fk(x[k]) > v ?)
global or local (with conditioning set fixed)
Each imposes linear constraints
21
Elicitation Strategies (Bound): Simple GAI
Halve Largest Gap (HLG)
• ask if parameter with largest gap > midpoint
• MMR(U) ≤ maxgap(U); hence n·log(maxgap(U)/ε) queries
needed to reduce regret to ε
• bound is tight
• like polyhedral-based conjoint analysis [THS04]
[Figure: upper/lower bound intervals for utility parameters f1(a,b), …, f2(b,c), …]
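The HLG strategy can be sketched over a set of parameter intervals; the bounds below are hypothetical:

```python
def halve_largest_gap(intervals):
    """HLG: pick the parameter with the widest [lo, hi] interval and
    query whether it exceeds the interval's midpoint."""
    param = max(intervals, key=lambda k: intervals[k][1] - intervals[k][0])
    lo, hi = intervals[param]
    return param, (lo + hi) / 2

def update(intervals, param, midpoint, answer_yes):
    """A yes answer raises the lower bound; a no answer lowers the upper bound."""
    lo, hi = intervals[param]
    intervals[param] = (midpoint, hi) if answer_yes else (lo, midpoint)

# Hypothetical bounds on two GAI parameters:
bounds = {'f1(a,b)': (0.2, 0.9), 'f2(b,c)': (0.4, 0.6)}
param, mid = halve_largest_gap(bounds)   # f1(a,b) has the larger gap; mid = 0.55
update(bounds, param, mid, answer_yes=True)
```

Each answer halves the chosen interval, which is what yields the logarithmic query bound above.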
22
Elicitation Strategies (Bound): Simple GAI
Current Solution (CS)
• only ask about parameters of the optimal solution x* or the regret-maximizing witness xw
• intuition: focus on parameters that contribute to regret
reducing u.b. on xw or increasing l.b. on x* helps
• use early stopping to get regret bounds (CS-5sec)
[Figure: upper/lower bound intervals for utility parameters f1(a,b), …, f2(b,c), …]
23
Elicitation Strategies (Bound): Simple GAI**
Optimistic
• query largest-gap parameter in optimistic soln xo
Pessimistic
• query largest-gap parameter in pessimistic soln xp
Optimistic-pessimistic (OP)
• query largest-gap parameter in xo or xp
Most uncertain state (MUS)
• query largest-gap parameter in most uncertain soln xmu
CS needs minimax optimization; HLG needs no optimization; others require standard optimization
None except CS knows what MMR is (termination is problematic)
24
Results (Small Rand, Unif)
10 vars, ≤ 5 values each; 10 factors, at most 3 vars. Users drawn using uniform prior over parameters (45 trials); Gaussian priors similar.
25
Results (Car Rental, Unif)
26 vars; 61 billion configs. 36 factors, at most 5 vars; 150 parameters. Users drawn using uniform prior over parameters (45 trials); Gaussian priors similar.
26
Results (Real Estate, Unif)
20 vars; 47 million configs. 29 factors, at most 5 vars; 100 parameters. Users drawn using uniform prior over parameters (45 trials); Gaussian priors similar.
27
Results (Large Rand, Unif)
25 vars, ≤ 5 values each; 20 factors, at most 3 vars. Users drawn using uniform prior over parameters (45 trials); Gaussian priors similar.
Elicitation Strategies (Comparison)
Comparison queries can be generated using CSS too
• HLG is harder to generalize to comparisons (see polyhedral)
CSS: ask user to compare minimax optimal solution x*
with regret-maximizing witness xw
easy to prove this query is never “vacuous”
28
29
Summary of Results
CS works best on test problems
• time bounds (CS-5): little impact on query quality
• always know max regret (or bound) on solution
• time bound adjustable (use bounds, not time)
OP competitive on most problems
• computationally faster (e.g., 0.1s vs 14s on RealEst)
• no regret computed so termination decisions harder
Other strategies less promising (incl. HLG)
30
Interpretation
Provable regret reduced very quickly
• true regret faster (often to optimality)
• CS focuses on relevant parameters
Seems like a lot of bound queries
• problems very large (several hundred dimensions)
• several hundred queries quite reasonable for high stakes
domains (cf. manual scenario navigation!)
Comparison queries can work much better empirically
(see CA results [BSS04])
Apartment Search with Minimax Regret
Are users comfortable with minimax regret?
Study with UofT students
• search subset of student housing DB (100 apts) for rental
• GAI model over 9 variables, 7 factors
• queries generated using CSS (bound, anchor, local, global)
various conditions: GAI, no context, additive
continue until MMR=0 or user terminates (“happy”)
• post-search: let user search through entire DB to find best 10 or
so apartments
31
32
Apartment Search (DB= 100, 9 attr, 6 factors)
User Study Goals
Aim: test the efficacy, comprehension, acceptability of
MMR-based recommendation
• User comments and ratings
• Decision quality and time to decision
Evaluate
• GAI model vs. additive model
• significance of local context
• different query strategies (mix of types vs. GCQP)
Assess query costs • Time
• Perceived difficulty
33
User study design
40 participants, randomly assigned to 6 different
subgroups
Task 1: search for an apartment in Toronto using UTPref
to search through a DB of 100 apartments
Task 2: evaluate recommended apartment and UTPref
experience
34
Recommended apartment
“Final list” of highest-rated and low-regret apartments
Results: overall evaluation
46
(1 = strongly disagree, 7 = strongly agree)        Average  Median
I found this application easy to use                  6.35      7
I am satisfied with recommendation                    5.35      5
I fully understood all questions                      6.30      6
Some questions were too hard                          1.65      2
The task took too much time                           2.23      2
Results: recommendation quality
47
Results: time
UTPref recommendation process
• Average 481s (scalable)
Linear search through database (100 appts.) with
“support”
• Average 708s (not scalable)
• have familiarity from phase 1
48
Results: comparison of subgroups
GAI vs. additive models
• GAI performs better (rank, qrank) [no statistical significance]
Mix of queries vs. global comparison only
• GCQP is faster
• mix of queries: better quality results [no statistical significance]
• GAI with and without local context
• no detectable difference
49
Summary of Results
Qualitative Results:
• system-recommended apartment almost always in top ten
• if MMR-apartment not top ranked, error (how much more is top
apartment worth) tends to be very small : median $45
• very few queries/interactions needed (8-40); time taken roughly
1/3 of that of searching through DB with our tools
• user feedback: comfortable with queries, MMR, felt search was
efficient
50
51
Regret-Based Methods: Summary
Minimax regret is a valuable means to make decisions on
behalf of others in the presence of utility function uncertainty
• requires no prior over utility function
Computationally effective means to solve many
interesting classes of problems
• works well compared to ACA/PACE methods
Heuristic elicitation methods appear to work well
52
How to improve?
GAI, MMR have some nice properties
• does not assume restrictive additive model
• all queries are semantically sound/motivated
• MMR drives elicitation effectively, offers guarantees
prior-free, but defaults/priors could be exploited
• flexible forms of interaction (costs can be used)
So how do we move from this to the “vision” of the truly
intelligent agents (e.g., travel agent, real estate agent)?
References F. Bacchus and A. Grove. Graphical models for preference and utility. UAI-95, pp.3–10, 1995.
A. Ben-Tal, A. Nemirovski. Robust solutions of uncertain linear programs. Operations Res. Letters, 25:1–13, 1999.
C. Boutilier, F. Bacchus, and R. I. Brafman. UCP-Networks: A directed graphical representation of conditional
utilities. In Proc. of UAI-01, pp.56–64, Seattle, 2001.
C. Boutilier, R. Das, J. O. Kephart, G. Tesauro and W. E. Walsh. Cooperative Negotiation in Autonomic Systems
using Incremental Utility Elicitation. UAI-03, Acapulco, pp.89–97 (2003).
C. Boutilier, R. Patrascu, P. Poupart, and D. Schuurmans. Constraint-based optimization and utility elicitation using
the minimax decision criterion. Artificial Intelligence, 170(8–9):686–713, 2006.
C. Boutilier, T. Sandholm, and R. Shields. Eliciting bid taker non-price preferences in (combinatorial) auctions. In
Proc. of AAAI-04, pp.204–211, San Jose, CA, 2004.
D. Braziunas and C. Boutilier. Minimax regret based elicitation of generalized additive utilities. UAI-07, 2007
D. Braziunas and C. Boutilier. Assessing Regret-based Preference Elicitation with the UTPREF Recommendation
System. In Proc. of ACM EC-10, 2010.
Vijay S. Iyengar, Jon Lee, and Murray Campbell. Q-Eval: Evaluating multiple attribute items using queries. Third
ACM Conference on Electronic Commerce, pages 144–153, Tampa, FL, 2001.
S. Ghosh, J. Kalagnanam. Polyhedral Sampling for Multiattribute Preference Elicitation. Fifth ACM Conference on
Electronic Commerce, pages 256-257, San Diego, 2003.
P. Kouvelis and G. Yu. Robust Discrete Optimization and Its Applications. Kluwer, Dordrecht, 1997.
R. Patrascu, C. Boutilier, R. Das, J. O. Kephart, G. Tesauro and W. E. Walsh. New Approaches to Optimization and
Utility Elicitation in Autonomic Computing. AAAI-05, pp.140–145, Pittsburgh (2005).
A. Salo and R. P. Hämäläinen. Preference ratios in multiattribute evaluation (PRIME)–elicitation and decision
procedures under incomplete information. IEEE Trans. on Systems, Man and Cybernetics, 31(6):533–545, 2001.
L. Savage. The Foundations of Statistics. Wiley, NY, 1954.
Olivier Toubia, John Hauser, and Duncan Simester. Polyhedral methods for adaptive choice-based conjoint
analysis. Technical Report 4285-03, Sloan School of Management, MIT, Cambridge, 2003.
T. Wang and C. Boutilier. Incremental utility elicitation with the minimax regret decision criterion. In Proc. of IJCAI-
03, pp.309–316, Acapulco, 2003.
53
54
Decision Problem: Constraint Optimization
Standard constraint satisfaction problem (CSP):
• outcomes over variables X = {X1 … Xn}
• constraints C over X : feasible decisions/outcomes
generally compact, e.g., X1 ∧ X2 ⇒ ¬X3
e.g., Power > 280hp ∧ Make = BMW ⇒ FuelEff > 9.5l/100km
e.g., Volume(Supplier27) > $10,000,000
Feasible solution: a satisfying variable assignment
Constraint-based/combinatorial optimization:
• add to C a utility function u: Dom(X) → R / [0,1]
• u parameterized compactly (weight vector w)
e.g., linear/additive, generalized additive models
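At toy scale, this setup can be brute-forced directly; the boolean encoding, predicate constraints and additive utility in this sketch are illustrative simplifications:

```python
from itertools import product

# Toy constraint-based optimization: boolean variables, constraints as
# predicates over an assignment tuple, additive (linear) utility.

def best_feasible(n_vars, constraints, weights):
    """Enumerate assignments; return the feasible one with max utility."""
    best, best_u = None, float("-inf")
    for x in product([0, 1], repeat=n_vars):
        if all(c(x) for c in constraints):
            u = sum(w * xi for w, xi in zip(weights, x))
            if u > best_u:
                best, best_u = x, u
    return best, best_u

# e.g., the constraint X1 & X2 => not X3
cons = [lambda x: not (x[0] and x[1]) or not x[2]]
```

Real problems use CP/IP solvers rather than enumeration, but the objective structure is the same.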
55
Polyhedral Conjoint Analysis* [THS04]
CA approach to marketing, product design
• often “unconstrained” design
• adaptive CA: queries depend on previous responses
Polyhedral adaptive method: FastPACE [THS04]
Assume additive utility, discrete attributes
Query types (global outcomes)
• Choice scenarios: pick/rank from a list
• Metric paired comparisons
A “much/somewhat/barely” better than B?
• Induce linear constraints on utility space
56
FastPACE: Decisions*
Polyhedron P of feasible utility
vectors (given prior responses)
Analytic center (AC) of P
• maximizes geometric mean of
distances to each facet
• relatively easy to compute
Treat AC as “consensus” utility
function (approx. average assuming
uniform distr. over U)
Decision: max wrt AC
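The analytic centre maximizes the sum of log-distances to the facets; a crude grid-search sketch over the unit square makes this concrete (real implementations use Newton's method, and this 2D setup is purely illustrative):

```python
import math

def analytic_center_2d(halfspaces, grid=200):
    """Approximate the analytic centre of {w : a·w <= b for all (a, b)}
    inside the unit square by maximizing the sum of log slacks on a grid."""
    best, best_val = None, float("-inf")
    for i in range(1, grid):
        for j in range(1, grid):
            w = (i / grid, j / grid)
            slacks = [b - (a[0] * w[0] + a[1] * w[1]) for a, b in halfspaces]
            if min(slacks) <= 0:      # outside the polytope
                continue
            val = sum(math.log(s) for s in slacks)
            if val > best_val:
                best, best_val = w, val
    return best

# The unit square itself, as four halfspaces a·w <= b
square = [((1, 0), 1.0), ((-1, 0), 0.0), ((0, 1), 1.0), ((0, -1), 0.0)]
```

For the symmetric square the centre coincides with the centroid; query responses add halfspaces that shift it.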
57
FastPACE: Elicitation Principles*
Principles:
• volume of region where each choice
dominates others should be roughly
equal (“balanced”)
• avoid “short axis” cuts: get precision
where it’s needed most
reduce perimeter as well as volume
Note: exactly HLG strategy if
restricted to local (axis parallel)
queries
58
FastPACE: Elicitation Heuristic*
Compute AC c of P
Find tightest bounding ellipsoid of P
centered at c
• finding longest axes in P too hard
Find intersection (ui) of each axis with
boundary of P
Compute product profiles for each ui
that will induce good cuts
59
FastPACE: Computing Profiles*
Fix some constant m
Compute xi by solving:
• max xi · ui  s.t.  xi · c = m
• indifference curve between xi and xj will
(approx) pass thru c
Choice query: choose from one of the
computed profiles
• response will induce one of small
regions
60
Alternatives*
GK04:
• assume “uniform distribution” over U within P
• use Markov chain sampling procedure to estimate centroid (avg utility function) and “longest axis” (queries)
ILC04:
• use bounding box rather than bounding ellipsoid
Summary of PACE:
• OK in practice, small dimensions
• downsides: decision criterion; doesn’t exploit feasibility (volume based); only global queries
61
Why Minimax Regret?*
Appealing decision criterion for strict uncertainty
• contrast maximin, etc.
• not often used for utility uncertainty [BBB01,HS010]
62
Reverse Combinatorial Auctions
Buyer: desires collection of items G
Sellers: offer bids ⟨bi, pi⟩, where bi ⊆ G
• possibly side constraints (seller, buyer)
Feasible allocation: subset B′ ⊆ B covering G
let X denote the set of feasible allocations
Winner determination: least-cost allocation
63
Non-price Preferences
A and B for $12000. C and D for $5000…
A for $10000.
B and D for $5000 if A; B and D for $7000 if not A...
Joe
Hank
etc…
A, C to Fred. B, D, G to Frank. F, H, K to Joe…
Cost: $57,500.
That gives too much business
to Joe!!
64
Utility for Non-price Attributes
Assume utility bearing features F = { f1 ,…,fk }
• e.g., num-winners, avg. quality, suppliers from region R,…
Assume utility u for allocation is linear
Utility function u: non-negative weight vector: w = w1,…,wk
• WD algorithms can be used directly
Assumptions (can be relaxed):
• linear, independent utility for features
• linearly definable features
65
Automated Scenario Navigation
Given partial utility info, I suggest allocation x (least max-regret). It could be up to $8000 from optimal. Accept?
No, that’s too much potential error.
OK. Let’s refine your utility function: would you prefer x ($, %Joe, AvgQual) or
x’ ($’, %Joe’, AvgQual’)?
I definitely prefer x’.
OK. I suggest allocation x’’ (least max regret). It could be up to $2500 from
optimal. Accept?
66
Robust Winner Determination:
Computing Minimax Regret
Now assume:
• allocation features f1, …, fk
• linear utility
• unknown weight vector w, but we know it lies in polytope W
Minimax regret optimal allocation wrt W:
67
Benders Reformulation
Initial formulation: minimax IP, quadratic objective
Linear IP formulation (infinitely many constraints)
Linear IP formulation (finitely many constraints)
68
Constraint Generation
Avoid W-vertex enumeration: constraint generation
Let Gen = {(x′, w)} for some feasible x′, w ∈ W
• solve the relaxed minimax problem over Gen only
let the solution be x* with objective value δ*
• compute max regret MR(x*, W) of solution x*
solution has max regret r, witness (x′′, w′′)
• if r > δ*, add (x′′, w′′) to Gen and repeat; else terminate
note: (x′′, w′′) is the maximally violated constraint
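The inner subproblem of this loop, computing MR(x, W), can be sketched for the special case of simple box bounds on the weights, where it has a closed form per feature (this toy version enumerates outcomes rather than calling an IP solver; all names are illustrative):

```python
# Minimax regret with linear utility w·f(x) and box-bounded weights
# lo[k] <= w[k] <= hi[k]: the adversary sets each weight independently.

def pmr(x, y, feats, lo, hi):
    """Pairwise max regret of choosing x over y: max over w in the box
    of w·(feats[y] - feats[x])."""
    total = 0.0
    for k in range(len(lo)):
        d = feats[y][k] - feats[x][k]
        total += (hi[k] if d > 0 else lo[k]) * d
    return total

def max_regret(x, outcomes, feats, lo, hi):
    return max(pmr(x, y, feats, lo, hi) for y in outcomes)

def minimax_regret(outcomes, feats, lo, hi):
    return min(outcomes, key=lambda x: max_regret(x, outcomes, feats, lo, hi))
```

For general polytopes W the adversary's problem becomes an LP, which is where the constraint generation above earns its keep.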
69
Details of Linearization**
Discrete features in objective modeled as follows:
• e.g.,
Replace quadratic term wi Iij(x’) by new variable zij(x’)
Constrain (where mi ≤wi ≤ Mi are bounds on weights)
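A McCormick-style linearization is standard for a product of a binary indicator and a bounded continuous variable; the following constraints are a reconstruction consistent with the slide's description (the original formulas were lost in transcription), with $I_{ij}(x')$ binary and $m_i \le w_i \le M_i$:

```latex
m_i\, I_{ij}(x') \;\le\; z_{ij}(x') \;\le\; M_i\, I_{ij}(x'), \qquad
w_i - M_i\,\bigl(1 - I_{ij}(x')\bigr) \;\le\; z_{ij}(x') \;\le\; w_i - m_i\,\bigl(1 - I_{ij}(x')\bigr).
```

When $I_{ij}(x') = 1$ these force $z_{ij}(x') = w_i$; when $I_{ij}(x') = 0$ they force $z_{ij}(x') = 0$, exactly replacing the quadratic term.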
70
Constraint Generation Performance
Procurement: 10 suppliers, 50 items, 500 bids
• six features (#suppliers in each of five regions, overall)
Avg. WD Sol’n Time: 19 sec
Avg. Max Regret and True Regret as function
of number of comparison queries: 40 items, 400 bidders, 7 features (100 instances)
72
Number of comparison queries to reach max
regret zero: 40 items, 400 bidders, 7 features (100
instances)
73
1
Preference Elicitation in Multiagent Domains:
Mechanism Design, Voting and Stable Matching
Craig Boutilier
Department of Computer Science
University of Toronto
2
Overview
Section 1: Decision Theory
• Basics, Axioms, Multiattribute utility, Preference Representations
Section 2: Basics of Preference Elicitation
• Queries, revealed preference, gambles, MAUT methods
Section 3: Regret-based and Polyhedral Methods
• computational motivations, imprecise utility functions
• minimax regret
• polyhedral conjoint analysis, volumetric methods
Section 4 (optional): Bayesian methods, preference aggregation
• generic Bayesian model, classification-based approach
• aggregation methods, collaborative filtering (brief overview)
Section 5: Elicitation in Mechanism Design
• basics of mechanism design, incentives, VCG
• partial revelation mechanisms
Section 6: Elicitation in Social Choice: Voting and Stable Matching
• basics of social choice, voting, stable matching problems
• minimax regret and incremental vote elicitation
• preference elicitation in stable matching
3
Product Configuration
Luggage Capacity? Two Door? Cost?
Engine Size? Color? Options?
4
Bargaining for a Car
Luggage Capacity? Two Door? Cost?
Engine Size? Color? Options?
5
Mechanism Design
Incentive to misreport preferences places us in the realm
of mechanism design
Design protocol for interacting agents
• maximize some objective assuming self-interest
• generally, a social choice function (e.g., efficiency) that picks
outcome based on agent preferences
• e.g., auctions, bargaining, network protocols, facility location, …
Revelation principle
• focus on direct, incentive compatible mechanisms
• e.g., famous mechanisms like VCG
• these require each agent to reveal their entire utility function to
the mechanism
6
The Preference Bottleneck
Full utility revelation problematic as we’ve seen
• computational, cognitive, communication costs
• often most of this information is not relevant
• preference elicitation tackles this in the single-agent case
elicit only “relevant” information
tradeoff decision quality with elicitation effort
Can we apply the same ideas in multiagent settings?
Key issue: must address the issue of incentives
• may be in an agent’s interest to lie about its true preferences
Example: First Price Auction
Auction off a single good
• accept (sealed) bids from each potential buyer
“I am willing to buy this for X”
• winner is highest bidder
• Winner pays her bid
Clear incentive to hedge your bid
• you should not state what you are truly willing to pay
• bid should depend on your beliefs of others’ valuations
means you won’t generally be efficient
there are exceptions (e.g., common priors under fairly general
assumptions)
7
Inefficiency of First Price Auction
8
Agent A: value $25,000; believes B’s value is around $17K; bids $17K+δ (say, $18K)
Agent B: value $22,000; believes A’s value is around $19K; bids $19K+δ (say, $20K)
Resulting outcome is inefficient:
• B wins the auction at $20K, realizing surplus of $2K
• if A had bid $20K+ε, A would have won with surplus $5K−ε
• A could have paid B $2,500 not to bid and all would be better off
Example: Second Price Auction
Auction off a single good
• accept (sealed) bids from each potential buyer
“I am willing to buy this for X”
• winner is highest bidder
• Winner pays the second highest bid
No incentive to hedge your bid
• you should state what you are truly willing to pay
• this is independent of what others bid
• consequence: winner will be bidder with highest valuation (i.e.,
efficient outcome)
9
Second Price (Vickrey) Auction
Why not bid more than true valuation?
• suppose your bid is highest?
• suppose your bid is not highest?
Why not bid less than true valuation?
• suppose your bid is highest?
• suppose your bid is not highest?
Basic principle:
• charge person based on externality they imposed on system
(other bidders)
• what would the best outcome have been had the winner not
participated?
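The externality principle behind the second-price auction fits in a few lines; the dictionary interface here is an illustrative assumption:

```python
# Sealed-bid second-price (Vickrey) auction: the highest bidder wins
# and pays the second-highest bid, i.e. the value the runner-up loses
# because the winner participated.

def vickrey(bids):
    """bids: {bidder: bid}. Returns (winner, price paid)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price
```

Because the price is independent of the winner's own bid, bidding one's true value is a dominant strategy.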
10
11
Preference Elicitation in Mechanism Design
Considerable work on preference elicitation in auctions,
combinatorial auctions (Sandholm, Nisan, Parkes, etc.)
• sequentially elicit enough info for optimal winner determination
and VCG payments
• e.g., A prefers XY most, B prefers WZ most; no need to
determine their preferences for other bundles or precise values
Drawbacks
• not general (methods focus specifically on auctions, CAs)
• usually sequential, not one-shot
• requires determining optimal allocation (not an approximation);
hence can’t offer savings in general (Nisan&Segal 05)
12
Preference Elicitation in Mechanism Design
Move beyond this: allow approximately optimal allocation
• adopts the view of work in single-agent preference elicitation
specifically, regret-based allocation models we just studied
• difficulties: dominant strategy implementation not possible
(Roberts 1979; Fadel&Segal 2005)
• we go for approximate, (ex-post or DS) implementation
• provide a general view of the problem of (one-shot and
sequential) partial revelation mechanisms
13
Overview (Hyafil and Boutilier 2006, 2007)
General framework for PRM design
One-shot mechanisms
• generalize VCG to PRMs (approximate efficiency/regret)
• new payment schemes to induce approximate IC and IR in
dominant strategy implementation (DSE if zero regret)
• further optimization of secondary objectives
• algorithm to design partial types
Sequential mechanisms
• slightly different model, but similar results
• algorithm to design query strategy
Viewpoint: why approximate incentives are useful
14
Basic Mechanism Design Setup
Choice of x from outcomes X
• e.g., car1 = <red,2door,280hp,Audi>, car2, car3, …
Agents 1..n: type ti ∈ Ti and valuation vi(x, ti)
• e.g., tbuyer = [car1:$25072, car2:$14991, car3:$17623…]
Type vectors: t ∈ T and t−i ∈ T−i
15
Basic Mechanism Design Setup
Goal: optimize social choice function f: T → X
• e.g., social welfare SW(x,t) = Σi vi(x, ti)
• car-seller pair that maximizes surplus
Assume payments and quasi-linear utility:
• ui(x, pi, ti) = vi(x, ti) − pi
Our focus: SW maximization, quasi-linear utility with the possibility of
payments
16
Mechanism Design
A mechanism m consists of three components:
• actions Ai
• allocation function O: A → X
• payment functions pi : A → R
m induces a Bayesian game
• m implements social choice function f if, in equilibrium,
it induces each agent i to play a strategy σi such that
O(σ(t)) = f(t) for all t ∈ T
• note the dependence on the type of equilibrium
e.g., dominant strategy, ex post, Bayes Nash
17
A Simple Mechanism: Allocate an Object
Each agent has three types: values object either 1, 2 or 3
What is best strategy for each agent?
Outcome function (columns: Agent A’s action; rows: Agent B’s):
        Rt      Lft     Both
Rt      ½ ; ½   A       A
Lft     B       ½ ; ½   A
Both    B       B       ½ ; ½
Payment function, paid by winner:
        Rt    Lft   Both
Rt      1     1     1
Lft     1     2     2
Both    1     2     3
18
Direct Mechanisms
A direct mechanism is one where Ai =Ti
A direct mechanism is incentive compatible if truth-telling
is an equilibrium strategy for each agent
Revelation principle: if there is a mechanism that
implements social choice function f, then there is a direct,
incentive compatible mechanism that implements f
The revelation principle has placed the focus on mechanisms
where agents reveal their utility functions
Note: famous theorem of Gibbard-Satterthwaite ensures
implementation of arbitrary SCFs not possible in general
19
A Direct Mechanism: Allocate an Object
What is best strategy for each agent?
Outcome function (columns: Agent A’s report; rows: Agent B’s):
        1       2       3
1       ½ ; ½   A       A
2       B       ½ ; ½   A
3       B       B       ½ ; ½
Payment function, paid by winner:
        1     2     3
1       1     1     1
2       1     2     2
3       1     2     3
20
Groves Schemes
For example, Groves scheme:
• elicit all agent utility functions
• determine/select efficient (SWM) allocation
• charge agents using the following payment function:
Groves implements SWM in dominant strategies
• transfer hi (arbitrary) doesn’t depend on i’s report
• so i influences utility by revealing utility function that maximizes
her own utility for allocation x* plus social welfare of other
agents; i.e., (full) social welfare
• since allocation rule maximizes SW, i should report true
type/utility
VCG Mechanism
VCG is a specific Groves scheme where the function hi is
given by the social welfare received by the other agents had i
not been present
• hi(t−i) = SW(x*−i(t−i); t−i)
This “Clarke payment” ensures individual rationality (no
agent has an incentive not to participate) as well as incentive
compatibility in dominant strategies
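For a finite outcome space, the Groves allocation and Clarke pivot payments can be sketched directly (the dictionary-of-valuations interface is an assumption for illustration):

```python
# VCG: pick the welfare-maximizing outcome; each agent pays the welfare
# the others would have received without her, minus the welfare the
# others actually receive (the Clarke pivot term h_i).

def vcg(outcomes, valuations):
    """valuations: {agent: {outcome: value}}. Returns (x*, payments)."""
    def sw(x, agents):
        return sum(valuations[i][x] for i in agents)

    agents = list(valuations)
    x_star = max(outcomes, key=lambda x: sw(x, agents))
    payments = {}
    for i in agents:
        others = [j for j in agents if j != i]
        x_wo_i = max(outcomes, key=lambda x: sw(x, others))  # best without i
        payments[i] = sw(x_wo_i, others) - sw(x_star, others)
    return x_star, payments
```

For a single object this reduces to the second-price auction: the winner pays the runner-up's value, losers pay nothing.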
21
Partial Revelation
Expecting users/agents to reveal their entire utility
function to a mechanism is simply unrealistic
• computational, cognitive, communication, revelation costs
• generally don’t need all that information to determine a good
outcome (e.g., maximizing social welfare)
Can we adapt Groves/VCG to partial revelation or
incremental utility elicitation mechanisms?
A stumbling block: incentives!
22
23
The Generality of Groves Schemes
Strong results: Groves is basically the “only choice” for
dominant strategy implementation
Roberts (1979): only social choice functions
implementable in dominant strategies are affine welfare
maximizers (if all valuations possible)
• Fadel & Segal 2005 extend somewhat
Green and Laffont (1977): must use Groves payments
to implement affine maximizers
Implications for partial revelation? Coming…
24
Partial Revelation Mechanisms
Full revelation unappealing for many reasons
A partial type is any subset θi ⊆ Ti
A one-shot (direct) partial revelation mechanism
• each agent reports a partial type θi ∈ Θi
• typically Θi partitions the type space, but this is not required
A truthful strategy: ti ∈ θi(ti) for all ti ∈ Ti
Goal: minimize revelation, computation, communication
by suitable choice of partial types
Intuitive Illustration
Ask buyer for bounds on valuations of specific cars
• what would you pay for a Toyota?
< $20K, [$20-22K], [$22-23K], [$23-25K], [$25-30K], > $30K
• what would you pay for an Audi?
< $32K, [$32-36K], [$36-45K], > $45K
Can be broken up by attributes, local factors, pairwise
comparisons, etc.
Queries can be asked in sequence (needn’t express type
in one shot)
• “one shot” means that the queries one asks cannot be influenced
by the responses to previous queries
• “sequential” means it can (and changes equilibrium properties)
25
26
Implications of Roberts/Green-Laffont
Partial revelation means we can’t generally maximize
social welfare
• must allocate under type uncertainty (as in single-agent case)
• also cannot compute exact Groves payments
But if SCF is not an affine maximizer, or if we don’t use
Groves payments, we can’t expect dominant strategy
implementation in general!
What are some solutions?
• incremental and “hope for” less than full elicitation
• relax conditions on Roberts results
• relax the solution concept and hope for intuitive results
27
Existing Work on PRMs
Bisection auction (GHMV-02) and incremental elicitation in
CAs (Sandholm et al., Nisan, Parkes, etc.)
• require enough revelation to determine optimal outcome and (to
ensure incentives) to determine VCG payments
• incremental (so ex post rather than DSE)
Priority games (Blumrosen&Nisan 02)
• genuinely partial and approximate efficiency
• but very restricted valuation space
Ascending (and other) auction designs are PRMs to
some extent, but not fully
• most agents must still determine exact valuation (up to
precisions of bid increment)
28
Regret-based PRMs [HB06,07a,07b]
In any PRM, how is allocation to be chosen?
Let’s use minimax regret
• pairwise regret of x vs. x̂ wrt partial type vector θ:
R(x, x̂, θ) = max over t ∈ θ of SW(x̂, t) − SW(x, t)
• max regret MR(x, θ) = max over x̂ of R(x, x̂, θ);
minimax regret MMR(θ) = min over x of MR(x, θ)
• x*(θ) is the minimax optimal decision for θ
A regret-based PRM: O(θ) = x*(θ) for all θ
29
Regret-based Allocations
buyer
loveAudi,hateToyota
loveAudi,ToyotaOK
likeAudi,loveToyota
likeAudi,ToyotaOK
seller
lotsAudi,fewToyota
someAudi,fewToyota
someAudi,someToy
Match Toyota? SW-Toy: low; SW-Audi: high; MaxRegret: high
Match Audi? SW-Toy: medium; SW-Audi: OK; MaxRegret: low
30
Regret-based PRMs: Efficiency
Efficiency not possible with PRMs (unless MR=0)
• but bounds are quite obvious
Prop: If MR(x*(θ), θ) ≤ ε for all θ ∈ Θ, then any regret-
based PRM m is ε-efficient for truthtelling agents.
• thus we can tradeoff efficiency for elicitation effort
• but how do we ensure truthfulness?
31
Regret-based PRMs: Incentives
Can generalize Groves payments
• let fi(θi) be an arbitrary type in θi
Thm: Let m be a regret-based PRM with partial types Θ
and a partial Groves payment scheme. If MR(x*(θ), θ) ≤ ε
for all θ ∈ Θ, then m is ε-dominant strategy incentive
compatible.
In other words, no agent can gain more than ε by
misreporting their preferences (no matter what other
agents do)
32
Regret-based PRMs: Rationality
Can generalize Clarke payments as well
Thm: Let m be a regret-based PRM with partial types Θ
and a partial Clarke payment scheme. If MR(x*(θ), θ) ≤ ε
for all θ ∈ Θ, then m is ε-ex post individually rational.
In other words, no agent can gain more than ε by
abstaining from participation
• a Clarke-style regret-based PRM gives approximate efficiency,
approximate IC (dominant) and approximate IR (ex post)
33
Approximate Incentives and IR
Natural to trade off efficiency for elicitation effort
Is approximate IC acceptable?
• note that computing a good “lie” can be difficult
• if incentive to deviate from truth is small enough, then formal,
approximate IC ensures practical, exact IC
Is approximate IR acceptable?
• determining value of nonparticipation very difficult; if potential
gain of withdrawing is small enough, then m is “practically” IR
Thus regret-based PRMs offer scope for tradeoffs
• as long as we can find a good set of partial types
34
Computing MMR Allocations
Minimax regret optimization nontrivial
• connected to work in robust optimization (see previous Sections)
• especially difficult for large, multiattribute models
Use methods for (single-agent) PE/optimization
• exploit generalized additive independence
• assume partial types represented by linear constraints over
utility function parameters
• optimization as a semi-infinite IP using constraint generation
(Benders-style decomposition)
35
Partial Type Optimization
Designing PRM: must pick partial types
• we focus on bounds on utility parameters
Here’s a simple greedy approach
• Let Θ be the current set of partial type vectors (initially {T})
• Let θ = (θ1,…,θi,…,θn) be the partial type vector in Θ with greatest
MMR
• Choose agent i and a suitable split of partial type θi into θ′i and θ′′i
• Replace every vector containing θi by the pair obtained by substituting θ′i and θ′′i
• Repeat until the regret bound ε is acceptable
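The greedy refinement loop can be sketched abstractly; representing a partial type vector as an opaque object with caller-supplied regret and split functions is an illustrative simplification of the slide's procedure:

```python
# Greedy partial-type refinement: repeatedly split the partial type
# vector with greatest minimax regret until all regrets are <= eps.

def greedy_split(type_vectors, regret, split, eps):
    """regret(tv) -> float; split(tv) -> (tv1, tv2). Returns refined list."""
    while True:
        worst = max(type_vectors, key=regret)
        if regret(worst) <= eps:
            return type_vectors
        a, b = split(worst)
        type_vectors.remove(worst)
        type_vectors.extend([a, b])

# Example instantiation: a partial type is an interval of one utility
# parameter; "regret" is the interval width, and a split halves it.
def width(iv):
    return iv[1] - iv[0]

def halve(iv):
    mid = (iv[0] + iv[1]) / 2.0
    return (iv[0], mid), (mid, iv[1])
```

In the real design problem, regret comes from a minimax optimization and splits bound a chosen agent's utility parameter.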
36
The Mechanism Tree
37
A More Refined Approach
Simple model has drawbacks
• exponential blowup (“uniform” partitioning)
• a split of θi useful for reducing regret in one partial type vector θ
is nonetheless applied at all partial type vectors
Refinement
• apply split only at leaves where it is “useful”
keeps tree from blowing up, saves computation
• new splits traded off against “cached” splits
• once done, use either uniform/variable resolution types for each
agent
38
Uniform vs. Variable Resolution
39
Heuristic for Choosing Splits
Adopt variant of current solution strategy
Let θ be the PTV with max MMR
• optimal solution x*, regret-maximizing witness xw
• only split on parameters of utility functions of optimal solution x*
or regret-maximizing witness xw
• intuition: focus on parameters that contribute to regret
reducing u.b. on xw or increasing l.b. on x* helps
• pick agent-parameter pair with largest gap
40
Suggestive Empirical Results
To illustrate:
• use only naïve (split) algorithm
• single buyer, single seller
• 16 goods specified by 4 boolean variables
• valuation/cost given by GAI model
two factors, two vars each (buyer/seller factors are different)
thus 16 values/costs specified by 8 parameters
no constraints on feasible allocations
41
Suggestive Empirical Results
42
Interpretation of Results
Initial Regret: 50%-146% of optimal SW
• reduced to 20%-56% with 11 bits (regret-based splits)
reduced to 30%-86% (uniform)
• good anytime behavior
• savings relative to uniform:
5.5 bits vs. 11 to reach worst-case regret of 90
6.5 bits vs. 11 to reach average-case regret of 70
11 bits of communication
• 0.7 bits per good/item; 1.4 bits per utility function parameter
43
Sequential PRMs
Optimization of one-shot PRMs unable to exploit
conditional “queries”
• e.g., if seller cost of x greater than $, needn’t ask you for your
valuation of x
Sequential PRMs
• incrementally elicit partial type information
• apply similar heuristics for designing query policy
• incentive properties somewhat weaker: opportunity to
manipulate payments by altering the query path
thus additional criteria can be used to optimize
44
Sequential PRMs: Definition*
Set of queries Qi
• response r ∈ Ri(qi) interpreted as partial type θi(r) ⊆ Ti
• history h: sequence of query-response pairs possibly followed by
allocation (terminal)
Sequential mechanism m maps:
• nonterminal histories to queries/allocations
• terminal histories to set of payment functions pi
Revealed partial type θi(h): intersection of the θi(r) for responses r in h
m is partial revelation if there exists a realizable terminal h s.t.
θi(h) admits more than one type ti
45
Sequential PRMs: Properties*
Strategy σi(hi, qi, ti) selects responses
• σi is truthful if ti ∈ θi(σi(hi, qi, ti))
• truthful strategies must be history independent
(Deterministic) strategy profile induces history h
• if h is terminal, then quasi-linear utility realized
• if history is unbounded, then assume utility = 0
Regret-based PRM defined as in one-shot
• payment schemes vary a bit
46
Max VCG Payment Scheme*
Assume terminal history h
• let θ be the revealed PTV at h, and x*(θ) the allocation
Max VCG payment scheme:
• where VCG payment is:
47
Incentive Properties*
Suppose we elicit type info until the MMR allocation has max
regret δ, and we use “max VCG”
Define:
Thm: m is δ-efficient, δ-ex post IR and (δ + e(x*(θ)))-ex
post IC
• weaker results due to possible payment manipulation
48
Elicitation Approaches*
Standard max-regret-based approaches
• give us bounds δ on efficiency, but no a priori bounds on e
Two-phase (2P): akin to existing schemes
• once δ is small enough (e.g., 0), elicit additional payment
information until max e is small enough
Neither reduces manipulability directly
• allocation chosen to minimize SW-loss
• payments chosen to reduce manipulability
• sum provides upper bound on manipulability
• but allocation choice influences manipulability as well
49
Direct Manipulability Measures*
Greatest gain an agent can realize by lying:
• difference of “best case” and actual utility
• mechanism is α-manipulable iff this gain is at most α
Thm: If m is α-manipulable with partial VCG payments,
then m is α-efficient, α-ex post IR and α-ex post IC.
50
Manipulability Reduction*
Direct optimization
• ask queries that directly reduce a bound
• can be formulated as regret-style optimization
• analogous query strategies possible, but don’t work very well
Hybrid approach to elicitation
51
Suggestive Empirical Results*
Similar bargaining problem (larger)
1 buyer, 2 sellers
• 13 factors per agent, 1-4 vars/factor, 2-9 values/var
• 825 parameters per value/cost model
Compare
• 2-phase approach (2P): plot δ + e(x*(θ))
• α-2-phase (α2P): same, but plot α
• a hybrid approach: query parameters that have potential to
reduce both SW-regret and manipulability regret
52
Empirical Results (40 random agent profiles)
Hybrid (CH) summary:
• mnp (manipulability bound) = 0 after 95 queries
• regret = 0 after 71 queries
• only 8% of parameters queried (avg)
• 92% of utility uncertainty (perimeter) remains (cf. 64% with "halving" in the same number of queries… and far from zero regret)
53
Summary
Partial revelation mechanisms important, especially in
multiattribute, combinatorial domains
Formalization of PRMs and regret-based design
• approximate efficiency requires relaxation of DS/IC
• our approach offers bounds on efficiency, IC, IR
• more importantly, design framework allows tradeoffs
Mechanism optimization techniques
• so far fairly crude, but encouraging empirically
• leverages existing regret-based optimization
• much to be done!
References
L. Blumrosen and N. Nisan. Auctions with severely bounded communication. FOCS-02, 2002.
W. Conen, T. Sandholm. Partial-revelation VCG mechanisms for combinatorial auctions. AAAI-02.
N. Hyafil and C. Boutilier. Partial Revelation Automated Mechanism Design. AAAI-07, pp.72–78,
Vancouver (2007).
N. Hyafil and C. Boutilier. Mechanism Design with Partial Revelation. IJCAI-07, pp.1333–1340,
Hyderabad, India (2007).
N. Hyafil and C. Boutilier. Regret-based Incremental Partial Revelation Mechanisms. AAAI-06,
pp.672–678, Boston (2006).
N. Hyafil and C. Boutilier. Regret Minimizing Equilibria and Mechanisms for Games with Strict Type
Uncertainty. UAI-04, pp.268–277, Banff, AB (2004).
A. Mas-Colell, M. D. Whinston, and J. R. Green. Microeconomic Theory. Oxford University Press,
New York, 1995.
N. Nisan and A. Ronen. Computationally feasible VCG mechanisms. ACM EC-00, 242-252, 2000.
N. Nisan and I. Segal. The communication requirements of efficient allocations and supporting
prices. J. Econ. Th., 2005.
T. Sandholm and C. Boutilier. Preference Elicitation in Combinatorial Auctions, in Combinatorial
Auctions, P. Cramton, Y. Shoham and R. Steinberg (eds.), MIT Press, pp.233–264, Jan. 2006.
T. Sandholm, V. Conitzer, and C. Boutilier. Automated design of multistage mechanisms. IJCAI-07,
Hyderabad, India, 2007.
M. Zinkevich, A. Blum, and T. Sandholm. On polynomial-time preference elicitation with value
queries. ACM EC-03, San Diego, 2003.
54
1
Preference Elicitation in Multiagent Domains:
Mechanism Design, Voting and Stable Matching
Craig Boutilier
Department of Computer Science
University of Toronto
2
Overview
Section 1: Decision Theory and Basics of Preference Elicitation
Section 2: Regret-based and Polyhedral Methods
• computational motivations, imprecise utility functions
• minimax regret
• polyhedral conjoint analysis, volumetric methods (if time)
[May skip due to time] Section 3: Elicitation in Mechanism Design
• basics of mechanism design, incentives, VCG
• partial revelation mechanisms
Section 4: Elicitation in Voting
• basics of social choice, voting
• minimax regret and incremental vote elicitation
Section 5: Elicitation in Stable Matching
• basics of stable matching problems
• preference elicitation in stable matching
Social Choice
Social choice: study of collective decision making
Aggregation of individual preferences determines a consensus outcome for some population
• Political representatives, committees, public projects,…
• Studied for millennia, formally for centuries
3
Why Computational Social Choice
Computational models/tradeoffs inherently interesting
• Winner determination, manipulation, approximations,
computational/communication complexity
Decision making in multiagent systems
Preference and rank learning in machine learning
• Ready availability of preference data of millions of individuals
• Web search data, ratings data in recommender systems, …
• Often implicit; but explicit preferences available at low cost
4
Our Agenda
The move to lower-stakes, complex domains makes new
demands on social choice
• New models and decision criteria reflecting new uses
Focus today: minimizing amount of information needed to
come to good consensus choice
• Robust decision making with partial rankings/votes
• Incremental elicitation of voter preferences
• Incremental elicitation of preferences in stable matching
• Exploiting distributional information to make decisions and
minimize expected elicitation effort
• Learning probabilistic models of population preferences
5
Social Choice: Basic Framework
Alternative set A = {a1, …, am }
Voters N = {1..n}, each with preferences over A
Vote vi of voter i: a linear ordering (permutation) of A
Profile is collection of votes v = (v1, …, vn )
Consensus winner: alternative maximizing “consensus”
6
Voting Rules
Voting rule r: V →A selects a winner given a profile
Plurality: winner a with most 1st-place votes
• voters needn’t provide full ranking
Positional scoring: Assign score α to each rank position
with α(1) ≥ α(2) ≥ … ≥ α(m)
• Borda count well-known: α = <m,m-1,...,1>
• Winner: a with max sum of scores: ∑i α(vi (a))
• Plurality, k-approval, k-veto special cases
Maxmin Fairness (egalitarian):
• Score of a is mini { m − vi(a) }
• Choose a with highest score
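The rules above can be sketched in a few lines. The three-voter profile and score vectors here are illustrative; Borda is written as ⟨m−1, …, 0⟩, a shift of ⟨m, …, 1⟩ that picks the same winner.

```python
# Illustrative profile (assumed): 3 voters rank alternatives a, b, c, best first.
profile = [["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"]]
alts = ["a", "b", "c"]
m = len(alts)

def positional_winner(profile, alpha):
    # score of a: sum over voters of alpha[rank of a] (rank 0 = top)
    score = {a: sum(alpha[v.index(a)] for v in profile) for a in alts}
    return max(alts, key=score.get), score

borda = list(range(m - 1, -1, -1))   # <m-1, ..., 0>
plurality = [1] + [0] * (m - 1)      # 1 point for a first-place vote

print(positional_winner(profile, borda))      # a wins with Borda score 5
print(positional_winner(profile, plurality))  # a wins (two first-place votes)

def egalitarian_winner(profile):
    # maxmin fairness: score of a is its worst positional (Borda) score
    score = {a: min(m - 1 - v.index(a) for v in profile) for a in alts}
    return max(alts, key=score.get)
```

On this profile the egalitarian winner is also a: it is the only alternative no voter ranks last.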
7
Score-based Voting Rules
Many other rules: Copeland, maximin, Bucklin, etc.
Most voting rules have natural scoring functions
s(a, v) measures “quality” of alternative a given profile v
Rule r is consistent with s iff
r(v) ∈ argmax { s(a, v) : a ∈ A }
8
Vote Elicitation [Lu, Boutilier, IJCAI-11]
Use of complex (rank-based) voting schemes rare
• Cognitive complexity, communication costs
Elicitation of partial votes could ease this burden
• Find relevant comparisons… or even approximate winners
Voting Protocol with Approximation: Ask a few
queries of voters: if close enough, stop; otherwise ask a
few more; continue until satisfied
Theoretically, relevance won’t save much:
• Comm. compl. Ω(nm log m) for Borda, etc. [CS ACM-EC-05]
• This doesn’t mean practical savings are not possible!
9
Partial Vote Profiles
Partial vote pi of voter i: consistent set of pairwise
comparisons of the form aj ≻ ak
• Captures most natural constraints: paired comparisons, top-k, etc.
Partial profile p = (p1, …, pn)
Completions C(pi), C(p) : set of votes extending pi , p
10
Robust Winner Determination
In general, may want to decide given a partial profile
• Robustness criteria rarely discussed in social choice
We propose minimax regret to determine winners
11
MMR(p) = mina maxw maxv∈C(p) s(w, v) − s(a, v)
(adversarial choice of witness w and completion v; best-response choice of a)
Minimax Regret: Illustration (Borda)
Proposed Winner: Tennis
Minimax Regret: Illustration (Borda)
Borda Score(Tennis) = 2
Borda Score (Park) = 4
Max Regret(Tennis) = 2 (4-2)
Proposed Winner: Tennis
Proposed Winner: Pool
Minimax Regret: Illustration (Borda)
Borda Score(Tennis) = 2
Borda Score (Park) = 4
Max Regret(Tennis) = 2 (4-2)
Proposed Winner: Tennis
Proposed Winner: Pool
Borda Score(Tennis) = 6
Borda Score (Pool) = 0
Max Regret(Pool) = 6 (6-0)
Minimax Optimal: Tennis
Minimax Regret: 2
Why Minimax Regret*
Rationale is same as in single-agent decision problems
MMR offers a natural robustness criterion
• candidate with tightest error bounds (loss wrt optimal)
• provably optimal if MMR=0
Contrast with maximin
• provides quality guarantee, not optimality guarantee
Contrast with Bayesian methods:
• need for a prior
• no (worst-case) guarantees
• computationally difficult (even to approximate)
15
Properties of Minimax Regret Solution**
MMR(p) = 0 iff the MMR winner a*p is a necessary co-winner
Obs: MMR computation at least as hard as NecCo-Win
Obs: MMR-winner may not be a possible winner
• In fact, all possible winners may have high max regret
16
Assume 2-approval:
• Only a, c are PWs: one
has score at least 2k+1,
while b has score 2k
• MR(b) = k+1
• MR(a) = MR(c) = 2k+1
MR of a, c twice that of b
Computing Minimax Regret
MMR for many problems often specified as an IP
• Problematic for voting: too many voters/variables
Instead, compute PMR(a, w, p) for all m² pairs (a, w)
• Then MMR(p) = mina maxw PMR (a, w, p)
PMR can be computed in polytime for many rules
• find worst case completion of each voter’s partial vote pi ; can
usually be done independently for each voter
Xia, Conitzer (AAAI08) use similar ideas for necessary winners
• we illustrate with the Borda rule
17
PMR(a, w) table (rows: proposed a; columns: witness w; last column: MR):
      w=a  w=b  w=c | MR
a      0    2    2  |  2
b      2    0    6  |  6
c      5    3    0  |  5
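The decomposition MMR(p) = mina maxw PMR(a, w, p) can be checked by brute force on tiny instances: complete each voter's partial vote independently by enumerating its consistent rankings. The two-voter partial profile below is an illustrative assumption; Borda scoring is used.

```python
from itertools import permutations

alts = ["a", "b", "c"]
m = len(alts)

def completions(partial):
    # all rankings (best first) consistent with a set of pairs (x, y), x > y
    return [r for r in permutations(alts)
            if all(r.index(x) < r.index(y) for x, y in partial)]

def borda(alt, ranking):
    return m - 1 - ranking.index(alt)

def pmr(a, w, profile):
    # pairwise max regret decomposes: complete each partial vote independently
    return sum(max(borda(w, r) - borda(a, r) for r in completions(p))
               for p in profile)

def minimax_regret(profile):
    mr = {a: max(pmr(a, w, profile) for w in alts if w != a) for a in alts}
    a_star = min(mr, key=mr.get)
    return a_star, mr[a_star]

# Assumed partial profile: voter 1 has revealed only a > b, voter 2 nothing.
profile = [{("a", "b")}, set()]
print(minimax_regret(profile))  # ('a', 3)
```

Enumerating completions is exponential in m, of course; the polytime case analysis in the following slides is what makes this practical.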
Computing Minimax Regret**
We illustrate with Borda (positional) scoring
• Positional: additively decomposable: s(a, v) = ∑i s(a, vi )
• Thus PMR decomposable: complete each pi independently
18
Computing Minimax Regret
Fix a partial vote: the proposed alternative a and adversarial
witness w stand in one of only three relations in it
19
Computing Minimax Regret
Case 2: Maximize PMR(a,w) in only "one" way:
PMR(a,w) = |B′ ∪ F ∪ E ∪ U| + 1 = m − (|A ∪ W| + 1)
Computing Minimax Regret**
Case 1: Maximize PMR(a,w) in only “one” way:
21
PMR(a,w) = ‒(|B| + 1)
Computing Minimax Regret**
Case 3: Maximize PMR(a,w) in only “one” way:
PMR(a,w) = |F ∪ E ∪ U| + 1
Computing Minimax Regret
Similar analysis: other positional scoring rules
Similar approach for non-decomposable scoring rules
Max regret computation is polytime for:
• Positional scoring rules
• Egalitarian (maxmin fairness)
• Bucklin
• Maximin
23
Regret-based Vote Elicitation
If MMR(p) too high, refine knowledge of voter
preferences
Current Solution Strategy (CSS):
• Use MMR solution (a*,w) to generate query: if we don’t reduce
PMR(a*,w), MMR will not be reduced
• So find some voter i with vote pi and ask query with potential
to reduce advantage of w over a* in C(pi)
• For each voter, queries considered depend on structural
properties of partial vote (whether Case 1, 2, 3; and size of sets)
24
Regret-based Vote Elicitation
Case 2: four reasonable query types
• a ≻ f for some f ∈ F (max potential: f at "top" of large group)
• a ≻ u for some u ∈ U (max potential: u at "top" of large group)
• e ≻ w for some e ∈ E (max potential: e at "bottom" of large group)
• u ≻ w for some u ∈ U (max potential: u at "bottom" of large group)
Note: if MMR > 0, one of U, E, F is nonempty
for some voter (or the analogous sets in cases 1, 3)
25
Vote Elicitation: Experiments*
Intuitions behind pairwise CSS can be generalized to top-
t queries (only pick voter, not alternative pair)
Compare CSS to two strategies
• Volumetric: choose voter/candidate-pair which introduces
greatest number of new paired comparisons
• Rand: random voter/candidate pair
27
Vote Elicitation: Sushi
28
Sushi: 5000 rankings of 10 varieties of sushi
Vote Elicitation: Dublin North 2002
29
Irish: 2002 electoral data (Dublin North); 3662 rankings over 12 candidates
Mallows Models
Let d(r, σ) denote the Kendall-tau distance
• number of pairwise inversions (swaps) between r and σ
Let σ be some central/modal ranking
Mallows φ-model (with dispersion φ ∈ (0,1]) specifies P(r) ∝ φ^d(r,σ)
If φ = 1, P is uniform (impartial culture); as φ → 0, P concentrates on σ
Unimodal nature of the model is inflexible, but mixtures of
Mallows models can reasonably capture certain types of
population preferences
30
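The Mallows model above can be sketched directly for small m by normalizing φ^d(r,σ) over all rankings (the alternatives and dispersion value are illustrative):

```python
from itertools import permutations

def kendall_tau(r, sigma):
    # number of pairwise inversions between rankings r and sigma
    pos = {x: i for i, x in enumerate(sigma)}
    return sum(1 for i in range(len(r)) for j in range(i + 1, len(r))
               if pos[r[i]] > pos[r[j]])

def mallows_pmf(sigma, phi):
    # P(r) proportional to phi ** d(r, sigma); brute-force normalization
    rankings = list(permutations(sigma))
    weights = [phi ** kendall_tau(r, sigma) for r in rankings]
    z = sum(weights)
    return {r: w / z for r, w in zip(rankings, weights)}

pmf = mallows_pmf(("a", "b", "c"), phi=0.5)
print(pmf[("a", "b", "c")])  # ~0.381: the modal ranking is most probable
```

Setting phi=1.0 recovers the uniform (impartial culture) distribution, and smaller phi concentrates mass on the modal ranking, as the slide states.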
Vote Elicitation: Mallows
31
Mallows: 100 random rankings over 20 items; vary
dispersion φ
Summary of Results
MMR=0 after k paired comparisons per voter
• Sushi: CSS 11.82; Vol 20.64; Rand 20.63; MergeSort 25
• Irish: CSS 18.57; Vol 31.82; Rand 31.22; MergeSort 33
MMR=0 after k top-t queries per voter
• Sushi: CSS 3.40; Vol 4.18; Rand 5.50
• Irish: CSS 5.47; Vol 6.91; Rand 8.38
Anytime performance better for CSS as well
• E.g., reach 18% of initial regret on Irish data set after
only 5.82 queries (vs. 25.77 Vol; 24.03 Rand)
32
Voting: Summary
Robust optimization using MMR easy for several
important voting rules, easy to approximate well
CSS using MMR offers an effective form of elicitation
• Tends to be more effective when preferences correlated
• Anytime profile attractive for approximating winners
Issues:
• Multi-round, purely interactive nature
• Exploiting probabilistic info, priors
33
34
Overview
Section 1: Decision Theory and Basics of Preference Elicitation
Section 2: Regret-based and Polyhedral Methods
• computational motivations, imprecise utility functions
• minimax regret
• polyhedral conjoint analysis, volumetric methods (if time)
[May skip due to time] Section 3: Elicitation in Mechanism Design
• basics of mechanism design, incentives, VCG
• partial revelation mechanisms
Section 4: Elicitation in Voting
• basics of social choice, voting
• minimax regret and incremental vote elicitation
Section 5: Elicitation in Stable Matching
• basics of stable matching problems
• preference elicitation in stable matching
Stable Marriage Problem
Classic two-sided matching problem
Set of n men M, n women W
• Each m has preference order ≻m over W
• Each w has preference order ≻w over M
• Variants: acceptability, ties, many-to-one, non-bipartite …
Aim: find a stable matching μ of men-women:
• A pair (m,w) blocks μ if w≻m μ(m) and m≻w μ(w)
• A matching is stable iff no pair blocks it
35
Gale-Shapley Algorithm (Male Proposing)
Each man maintains a list of women he has not yet proposed to
Each woman records the best proposal so far
At stage k:
• Each unengaged man proposes to the most preferred woman he has not yet proposed to
• Each woman accepts her most preferred proposal (only if better than the best prior proposal)
• If a man's proposal is accepted, he becomes engaged; if engaged but then rejected, he becomes unengaged
When no man is able to propose, we are done
36
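The proposal rounds above can be sketched as a standard male-proposing Gale-Shapley; the toy preference lists are assumptions:

```python
def gale_shapley(men_prefs, women_prefs):
    # men_prefs[m]: list of women, most preferred first (similarly women_prefs)
    rank = {w: {m: i for i, m in enumerate(p)} for w, p in women_prefs.items()}
    next_idx = {m: 0 for m in men_prefs}  # next woman each man will propose to
    engaged = {}                          # woman -> man
    free = list(men_prefs)
    while free:
        m = free.pop()
        w = men_prefs[m][next_idx[m]]
        next_idx[m] += 1
        if w not in engaged:
            engaged[w] = m                        # w accepts her first proposal
        elif rank[w][m] < rank[w][engaged[w]]:    # w prefers the new proposer
            free.append(engaged[w])
            engaged[w] = m
        else:
            free.append(m)                        # rejected; m stays free
    return {m: w for w, m in engaged.items()}

# Assumed toy instance: both men rank w1 first; the women disagree on the men.
men = {"m1": ["w1", "w2"], "m2": ["w1", "w2"]}
women = {"w1": ["m2", "m1"], "w2": ["m1", "m2"]}
print(gale_shapley(men, women))  # m1 matched to w2, m2 to w1
```

Here w1 keeps m2 (her favorite) and rejects m1, who then proposes to and matches with w2; the result is stable.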
Properties of Gale-Shapley
Always returns a stable matching
Requires at most O(n²) proposals and rounds
Matching is male-optimal, female-pessimal
Truthful for men (not women)
With ties: several forms of stability
With ties or incomplete lists separately: polytime
With ties and incomplete lists: NP-complete
37
Preference Bottleneck
Gale-Shapley usually viewed as an "algorithm"
• input: complete preference lists; output: a matching
• tremendous burden: lots of irrelevant preference info(!)
• comparison, interview, communication costs, etc.
It can be used directly as an elicitation scheme
• only ask specific queries: who's next? whom do you prefer?
• can reduce the burden on users
• question: is it effective as an elicitation scheme?
38
Illustration: Fully Correlated Preferences
GS elicitation performs poorly with "identical" preferences
• If real preferences are correlated, similar issues arise
A "binary search" would be much more effective…
• … if we knew the targets
We’ll use minimax-regret to guide this search
39
Regret-based Elicitation, Matching [Drummond, B. IJCAI-13]
Robust matching with partial preferences
• useful for partial info, low-stakes domains
• define max-regret of matching, or degree of instability
• computation of matchings with minimax-regret
Preference elicitation
• use regret-based solution to determine user queries
• compare to GS: number of queries, rounds, cognitive costs
Related work:
• Rastegari, et al. ACM EC’13
• Stochastic matching processes (e.g., Biro, Norman 2012)
40
Partial Preferences
Partial preferences: set of pairwise comparisons
• partial preference Pq for 𝑞 ∈ 𝑀 ∪ 𝑊, partial profile P
• completions C(Pq ) and C(P) defined as usual
• will sometimes use partitioned preferences
41
Degree of Instability
With partial preferences, can't generally guarantee stability
Degree of stability: max incentive for a couple to deviate
• use Borda score for utility: sq(r, ≻q) = n − rank(r, ≻q)
• Pairwise regret: incentive for q to drop r, defect with r′:
  PWRegret(q, r′, r, ≻q) = sq(r′, ≻q) − sq(r, ≻q)
Instability of (m, w) in μ: regret of the least willing blocking partner:
  Inst(m, w, μ, ≻m, ≻w) = min[ PWRegret(m, w, μ(m), ≻m), PWRegret(w, m, μ(w), ≻w) ]
Instability of a matching: maximum instability over all pairs:
  Inst(μ, ≻) = max(m,w) Inst(m, w, μ, ≻m, ≻w)
(figure: example pair whose improvements are 2 and 1, so Inst = 1)
42
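The instability definitions above can be computed directly for complete preferences. The toy preference lists are assumptions; here m1 and w2 form a blocking pair, each gaining 1 Borda point by defecting, so Inst = 1.

```python
def borda_score(q, r, prefs):
    # s_q(r) = n - rank(r) in q's list (index 0 = most preferred => score n)
    return len(prefs[q]) - prefs[q].index(r)

def pw_regret(q, r_new, r_cur, prefs):
    # incentive for q to drop current partner r_cur and defect with r_new
    return borda_score(q, r_new, prefs) - borda_score(q, r_cur, prefs)

def instability(mu, prefs_m, prefs_w):
    # max over all pairs (m, w) of the *least* willing partner's incentive
    mu_w = {w: m for m, w in mu.items()}
    return max(min(pw_regret(m, w, mu[m], prefs_m),
                   pw_regret(w, m, mu_w[w], prefs_w))
               for m in prefs_m for w in prefs_w)

# Assumed instance: m1 and w2 prefer each other to their assigned partners.
prefs_m = {"m1": ["w2", "w1"], "m2": ["w1", "w2"]}
prefs_w = {"w1": ["m1", "m2"], "w2": ["m1", "m2"]}
mu = {"m1": "w1", "m2": "w2"}
print(instability(mu, prefs_m, prefs_w))  # 1: (m1, w2) is a blocking pair
```

Swapping the partners gives Inst = 0, i.e., a stable matching in the usual sense, matching the slide's claim that MMR ≤ 0 implies stability.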
Robust Matching: Minimizing Max Regret
Given partial profile P, minimize worst-case instability
over all possible realizations of preferences
Max regret MR(𝜇, 𝑷):
• max possible incentive for some couple to defect (unravel)
Minimax regret MMR(𝑷) : tightest stability guarantee
• If MMR(𝑷) ≤ 0 then 𝜇∗(P) is stable in usual sense
• Full information: MMR( ≻) ≤ 0
43
Computing MMR
Given a (partitioned) partial profile: MMR is NP-complete
• reduction from weakly stable matching with incomplete lists and ties
Can be formulated as a mixed integer program (MIP)
• uses pairwise max regret terms for each couple
• PMR terms computed using same completion methods as with voting
O(n3) preference completions, each solvable in polytime
• but the MIP cannot be usefully relaxed (fractional solutions much better)
• Practically solvable for roughly n=30
44
Heuristic Computation of MMR
Goal is elicitation, so approximate MMR acceptable
Computing MR(P, μ): polytime for fixed μ, P
Simple heuristic: Partial Preference GS (PPGS)
• select some completion ≻ from C(P)
• run GS to compute stable matching μ for ≻
• compute MR(P, μ): upper bound on MMR(P)
Selection methods (all feasible in our settings):
• uniform at random: one, or best of k, sampled completions;
fixed at outset or reselected at each round
• maximum likelihood completion (given a distribution)
45
Elicitation: Regret-based Halving
• Like a binary search for (approx.) rank of matched partner
• We use current MMR-matching at each point to estimate block
of matched partner, generate query
46
Elicitation: Regret-based Halving
Compute MMR matching μ for current profile
If non-zero, identify pair(s) (m,w) that determine MR(μ)
• PMR(w, m, μ(w)) = MR(μ) and PMR(m, w, μ(m)) ≥ MR(μ) (or vice
versa)
Query one of these:
• if lower regret partner m has w, μ(m) in same block, ask to split
• else, if w has m, μ(w) in same block, ask to split
• else, ask each to split largest block (other schemes possible)
47
Performance, Cognitive Cost
Compare RBH scheme to GS (as elicitation method)
• measure number of queries
• measure number of rounds
• measure cognitive cost: number, difficulty of binary comparisons
(figure: scale of binary comparisons from easier to harder)
• Luce-Shepard Choice Model
• threshold τ= 5
• temperature γ = 0.5
48
Mallows, n=250 (30 runs): Queries until MR=0
49
Mallows, n=250 (20 runs): Queries until MR=0
50
Mallows, n=250 (20 runs): Anytime, ϕ =0.2
51
Cognitive Costs
Mallows, n=250; Correlated, ϕ= 0.2
• RBH: Proposers 53.8, Acceptors 53.7; Average 53.8
• GS: Proposers 1830.9, Acceptors 13.1; Average 922
• GS: Proposers sort a priori: 310.3; Average 161.7
Mallows, n=250; impartial culture, ϕ= 1.0
• RBH: Average 57.8
• GS: Proposers 121.1, Acceptors 0.43; Average 60.8
52
Riffle Models n=250 (20 runs)*
Two types of men (women)
• Distinct Mallows model for each type: determines ranking within type
• Type rankings interleaved: each woman biased towards Type 1 with
probability p (drawn from a 2-component Gaussian mixture)
53
MovieLens, n=250 (20 runs)*
Generate preferences for partners based on similarity of ratings vectors
• map real-valued “affinity” scores into rankings of partners
• two different processes: unnormalized (more correlated), normalized (less)
• Cognitive costs
• U-MM, RBH: Proposers 55.3, Acceptors 55.4; Average 55.4
• U-MM, GS: Proposers 788.5, Acceptors 3.56; Average 396
• N-MM, RBH: Proposers 58.8, Acceptors 58.6; Average 58.7
• N-MM, GS: Proposers 250.0, Acceptors 1.35; Average 125.7
54
Stable Matching: Summary
Robust optimization using MMR hard in principle, easy to
approximate well
RBH using MMR effective form of elicitation
• Especially effective (cf. GS) when preferences correlated
• Anytime profile attractive for approximate stability
Other measures of instability, other quality measures
Other forms of queries
Variants: stable roommates, many-to-one, etc.
55
Probabilistic Models**
Fairly effective models and techniques for learning
probabilistic models of preferences/rankings from
data
Elicitation: exploit distributional information
• Average case query complexity [Oren, Filmus, B, IJCAI-13]
• Pure Bayesian optimization or expected max regret
• Generate samples, optimize “batch protocols” [Lu, B, ADT-11]
Sample-complexity: generate low-MR alternative with
high prob.
• More general distribution-sensitive elicitation schemes
56
Single vs. Multi-round Elicitation**
Fully sequential elicitation often not practical
• Tradeoff: quality, information elicited, rounds/interruption
see Kalech et al. [JAAMAS 2011]
Reduce interruption cost by using coarser “rounds”
• E.g., ask each voter for their top k candidates
• Stop if MMR low enough
• Otherwise select a few voters and ask for their next k’
candidates; etc.
Suitable choice of k balances the three criteria
57
Optimizing Single-round Protocols [Lu, B. ADT-11]**
General framework for addressing tradeoffs
Focus on optimizing single-round protocols
• for one round of elicitation, what is the tradeoff between
information elicited (k) and minimax regret?
Requires a probabilistic model Pr of voter preferences
• weak guarantees otherwise (hard to predict MMR)
Our goal: find minimal k s.t. Pr(MMR ≤ ε) > 1 − δ
• ε: regret tolerance
• δ: confidence
58
Exploiting Distribution: Sampling**
Many models of ranking distributions:
• Mallows, Plackett-Luce, Bradley-Terry, impartial culture, …
• in principle, can derive analytical results for each
We propose an empirical (sampling) methodology
• sample t vote profiles
(from a learned model, a generative process, or subsampled data sets)
• compute MMR for each profile and for each k < m − 1
• use empirical distribution over MMR to determine suitable k
(achieving desired MMR ≤ ε with desired probability > 1 − δ)
59
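The last step of the methodology reduces to a quantile test over the sampled MMR values; a minimal sketch, where the sampled values below are illustrative stand-ins (one list per candidate k, one value per sampled profile):

```python
# Hypothetical sampled data: mmr_by_k[k] = MMR values, one per sampled
# profile, when each voter reveals only her top-k candidates.
mmr_by_k = {
    1: [9, 7, 8, 6, 9],
    2: [4, 3, 5, 2, 4],
    3: [1, 0, 2, 0, 1],
    4: [0, 0, 0, 0, 0],
}

def min_k(mmr_by_k, eps, delta):
    # smallest k whose empirical frequency of MMR <= eps is at least 1 - delta
    for k in sorted(mmr_by_k):
        samples = mmr_by_k[k]
        frac = sum(v <= eps for v in samples) / len(samples)
        if frac >= 1 - delta:
            return k
    return None

print(min_k(mmr_by_k, eps=1, delta=0.25))  # 3: 4 of 5 samples have MMR <= 1
```

The sample-complexity bounds on the next slides then say how many profiles t are needed for this empirical estimate to be trustworthy.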
MMR Histograms: Mallows (m=10, n=1000, φ=0.6, Borda)**
60
MMR Confidence Plot: Mallows (m=10, n=100, φ=0.6, Borda)**
61
Sample Complexity**
One may use methodology purely heuristically
• actual MMR (after elicitation) can suggest further queries
Theoretical sample complexity bounds possible
• assume sampling accuracy ξ and sampling confidence η
• with t sampled profiles (t determined by ξ and η)
• output the minimum k satisfying the empirical MMR condition
62
MMR Histograms: Sushi Data Set (50 samples, 100 voters)**
63
MMR Histograms: Dublin Data Set (73 samples, 50 voters)**
64
Learning Probabilistic Models [Lu, B. ICML-11]**
Where do probabilistic models come from?
• can be learned from sample/survey/historical data
• two key difficulties: inference and learning
Much research in stats, psychometrics, ML, etc., but learning Mallows models from pairwise evidence has been ignored
Inference task: given paired comparisons (partial vote)
pi, what is the posterior over i's ranking: P(r | pi; σ, φ)?
Learning task: given partial profile p = (p1, …, pn), what
is max likelihood Mallows model/mixture?
• Solvable by EM if you can solve the inference task
65
Social Networks as Preference Source**
Valuable source of preference data: probabilistic models
of preference correlation on networks?
• impact on elicitation could be immense
• both for individual or social choice problems
Social networks shape behavior
• Homophily well-documented
• Often claimed that preferences are correlated,
but there is less evidence to this effect
Social Choice on Social Networks**
Many social choice problems occur in a network context
• e.g., externalities in assignment (BGM EC-12), matching (BLCHW10),
voting (ABKLT EC-12), coalition formation (BL11)
Voting with empathetic preferences [Saheli-Abari, B. 12]
• utility trades off intrinsic and empathetic preference
• e.g., casual group decision, elections, supply chain, …
Many new elicitation, optimization challenges
(figure: network of voters with rankings and edge weights)
Fixed-point solution (à la PageRank):
simple weighted voting scheme
Next Steps
Just a starting point: learning, probabilistic models,
decision-theoretic optimization for effective elicitation
and decision making in social choice settings
• Move toward behavioural SC, connections to social media
Next steps
• Sophisticated, distribution-aware elicitation schemes
• Learning other distributional models (e.g., Plackett-Luce)
• Distributions over multi-attribute preference domains
• Exploiting social media: networks, CF, sentiment, …
• Computation, elicitation in combinatorial domains
• New analyses of manipulation
• Other social choice problems: matching; multi-
winner/segmentation; allocation; etc.
68
69
Recap
Preference bottleneck a key challenge in AI and decision
support
Fortunately, good decisions can often be made with very
little preference/utility information
• quantitative approaches important if you want to accurately
assess the impact of approximation, and to trade off elicitation
effort against decision quality
• however, key is to avoid unnecessary precision
Some Key Issues
Key viewpoints:
• strict uncertainty, probabilistic methods, Bayesian methods
• incentive issues in multiagent interactions
Key issues:
• interaction and query costs, passive vs. active assessment
• user controlled exploration vs. system-generated interaction
• overcoming cognitive biases (framing, anchoring, thresholding…)
• active elicitation in collaborative models
• vague, subjective, user-specified features
• fundamental vs. means objectives (Keeney)
• dialog-based approaches, linguistic cues
• inconsistency management, sensitivity analysis
• transient, nonstationary, context-specific preferences
• multi-source preference data integration
70
References
C. Boutilier, I. Caragiannis, S. Haber, T. Lu, A. Procaccia and O. Sheffet. Optimal Social Choice
Functions: A Utilitarian View. Thirteenth ACM Conference on Electronic Commerce (EC'12),
Valencia, Spain, pp.723-740 (2012).
Y. Chevaleyre, U. Endriss, J. Lang, and N. Maudet. A short introduction to computational social
choice. SOFSEM-07, pp.51–69, Harrachov, Czech Republic, 2007.
V. Conitzer and T. Sandholm. Vote elicitation: Complexity and strategy-proofness. AAAI-02, 2002.
V. Conitzer and T. Sandholm. Communication complexity of common voting rules. EC'05, 2005.
J. Drummond, C. Boutilier. Elicitation and Approximately Stable Matching with Partial Preferences.
23rd International Joint Conference on Artificial Intelligence (IJCAI-13), pp.97-105, Beijing (2013).
W. Gaertner. A Primer in Social Choice Theory. Oxford University Press, USA, 2006.
M. Kalech, S. Kraus, G. A. Kaminka, and C.V. Goldman. Practical voting rules with partial
information. J. of Autonomous Agents and Multi-Agent Systems, 22(1):151–182, 2011.
T. Lu and C. Boutilier. Robust Approximation and Incremental Elicitation in Voting Protocols. Proc.
of IJCAI-11, pp.287-293, Barcelona (2011).
T. Lu and C. Boutilier. Learning Mallows Models with Pairwise Preferences. Proc. of ICML 2011,
pp.145-152, Bellevue, WA (2011).
T. Lu, C. Boutilier. Vote Elicitation with Probabilistic Preference Models: Empirical Estimation and
Cost Tradeoffs. 2nd Conf. on Algorithmic Decision Theory (ADT-11), Piscataway, NJ, pp.134–149 (2011).
T. Lu, C. Boutilier. Budgeted Social Choice: A Framework for Multiple Recommendations in
Consensus Decision Making. 11th ACM Conf. on Elec. Comm. (EC'10), 263--274, Boston (2010).
C. Mallows. Non-null ranking models. Biometrika:44, pages 114–130, 1957.
J. Marden. Analyzing and modeling rank data. Chapman and Hall, 1995.
L. Xia and V. Conitzer. Determining possible and necessary winners under common voting rules
given partial orders. AAAI-08, pp. 202–207, Chicago, 2008.
71