Can Negotiation Breakdown Probabilities of Laissez-Faire Agents Be Derived A Priori?

28 September 2006 p1 r.p.loui AIVR UIUC


R. Loui, R. Ratkowski, J. Rosen

Department of Computer Science and Engineering

Washington University

St. Louis, USA


Trailers/Previews

• Data Mining on OC192 Data Streams (10Gbps)

– A.k.a. "Streaming AI"

– With John Lockwood (from UIUC), PI

– 1M x speedup of classified intelligence task

– Better performance than available software

– Better performance than SVM methods

– Related to FPgrep, FPsed, FPawk patent US 7093023

• Methods, systems, and devices using reprogrammable hardware for high-speed processing of streaming data to find a redefinable pattern and respond thereto

– W/Moshe Looks, papers on hierarchical streaming clustering


Trailers/Previews

• Scripting Languages and The New Programming Pragmatics

– http://www.cse.wustl.edu/~loui/praiseieee.txt

– The real shock is that academia continues to reject the sea change in programming practices brought about by scripting.

– Scripting was not enervating but was actually renewing: programmers who viewed code generation as tedious and tiresome … viewed scripting as rewarding self-expression or recreation.

– I personally believe that CS1 Java is the greatest single mistake in the history of computing curricula.

– Linguists recognize something above syntax and semantics, and they call it "pragmatics". We are entering an era of comparative programming language study when the issues are higher-level, social, and cognitive too.


Trailers/Previews

• Moshe Looks, D.Sc. 06 (expected)

– Externally advised by David Goldberg (UIUC) and Martin Pelikan (formerly UIUC)

– COMPETENT PROGRAM EVOLUTION

– My thesis is that the properties of programs and program spaces can be leveraged as inductive bias to reduce the burden of manual representation-building, leading to competent program evolution.

– The central contributions of this dissertation are

  • a view of the requirements for competent program evolution, and

  • the design of a procedure, meta-optimizing semantic evolutionary search (MOSES)


Trailers/Previews

• Dynamics of Rule Revision and Strategy Revision

– With M. Looks, B. Cynamon (U Chicago), A. Schiller (Princeton)

– A.k.a. Legislature vs. Population games

– H.L.A. Hart: There is a limit, inherent in the nature of language, to the guidance which general language can provide.

– Abridgement = Projection of a veridical utility function

– Scenario extinction

– Sheep, Weasels, and Gorillas


Today's Abstract

We present a different AI model of negotiation where agents are driven by dynamic expectations (there is no solution concept and there is no recursive modeling of beliefs). We require two assumptions to paint a new picture:

(1) there is an empirically observable objective probability of breakdown that is monotonic (at some granularity) in elapsed time since last progress;

(2) there is a nonstandard utility attached to the act of unilateral breakdown: a process utility that models the satisfaction of breaking down on a non-cooperative negotiating partner. This is a procedural fairness adjustment, not the substantive distributive fairness effect that has been trendy in the economics literature.

We observe the variety of behaviors that can be generated by agents constructing action according to such Pessimism-Punishment (PP) negotiation models. We define a laissez-faire path for two PP agents starting in a given position and the proper calibration of their breakdown probabilities conditioned only on position. Finally, we discuss what iterative process could be used to reduce a priori miscalibration of breakdown probabilities.


Negotiation Behavior

• Social Psychology: Dean Pruitt

– Logrolling issues

• Management Sci: Howard Raiffa / Max Bazerman - Margaret Neale

– Integrative agreement

• Law: Roger Fisher – William Ury – Bruce Patton

– Principled negotiation

• Artificial Intelligence:

– Problem solving: R Davis – R Smith / V Lesser

– Shared planning dialogue: S Carberry / G Ferguson – J Allen

– Non-ideal game theory: E Durfee / S Rosenschein / T Sandholm

– Argumentation: K Sycara / S Parsons – C Sierra – N Jennings

• Economics: John Nash / Ariel Rubinstein

– Solution concept

– Equilibrium


Negotiation Behavior: Equilibrium

• Mathematical curiosity (cf. Axelrod)

• Starts with "Solution concept (I or II)": reduction of uncertainty to a distinct outcome

• Epistemologically far-fetched

• Empirically ridiculous

• Philosophically indefensible

• Useless in the design of negotiating agents


Negotiation Behavior: AI

• A place for language / argument

• A place for introspection on utilities

• A range of interesting & reasonable behaviors

• Participation in the process of negotiation:– Exchanging proposals

– Reacting to proposals


My Theory Pt. I

• Observation: parties to a negotiation (can) construct a probability distribution over potential settlements


[Figure sequence: the space of potential settlements at time t. Labels: Party 1's aspiration and Party 2's aspiration; Party 1's proposals at t and Party 2's proposals at t; regions inadmissible (dominated) at t; in black, the admissible settlements at t (probability of agreement is non-zero); the breakdown row and breakdown column, with the point where breakdown would occur (BATNA); 1's security level and 2's security level, beyond which 2 or 1 would rather break down.]

Example from the figure: Eu1|s = 51; Eu2|s = 49α + 54(1-α); Prob(bd) = ?



My Theory Pt. I


• What kind of probability?


My Theory Pt. I

• Observation: parties to a negotiation (can) construct an objective, empirical or epistemological* (NOT SUBJECTIVE*) probability distribution over potential settlements from past experience in similar settings

• OBJECTIVE:

– Constructed from data

– Agree on P, given K

– Not committed to P(a) until queried about P(a)

• SUBJECTIVE:

– "Bayesian" (but not necessarily what AI people mean)

– Can change by new prior, new conditioning, or shift in feelings

– Total and consistent prior to querying


My Theory Pt. I

• Observation: from a probability distribution over potential settlements, there is an expected utility given settlement

• Observation: there is a probability of breakdown p(bd)


[Figure labels: probability of breakdown P(bd); gap]


My Theory Pt. I

• Observation: from a probability distribution (at t) over potential settlements, there is an expected utility given settlement (at t)

• Observation: there is a probability of breakdown p_t(bd)


My Theory Pt. I

• Observation: At t, calculate

1. An expected utility given settlement, Eu_t|s, and

2. An expected utility given continued negotiation, Eu_t = (Eu_t|s)(1 - p_t(bd)) + u(bd) p_t(bd)

• Definition: Rationality requires the agent, at t, to:

1. Extend an offer, o, if Eu_t < u(o)

2. Accept an offer, a, if Eu_t < u(a), a ∈ offers-to-you(t)

3. Break down unilaterally if Eu_t < u(bd)
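The calculation and the three rules can be sketched in a few lines of Python. This is only an illustration: the class and function names are invented here, the numbers are hypothetical, and the rules are read as one-way "if" conditions that license an action, not as a complete policy.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class NegotiationState:
    """Quantities an agent holds at time t (field names are illustrative)."""
    eu_given_settlement: float  # Eu_t|s
    p_breakdown: float          # p_t(bd)
    u_breakdown: float          # u(bd)


def expected_utility(state: NegotiationState) -> float:
    """Eu_t = (Eu_t|s)(1 - p_t(bd)) + u(bd) p_t(bd)."""
    return (state.eu_given_settlement * (1.0 - state.p_breakdown)
            + state.u_breakdown * state.p_breakdown)


def licensed_actions(state: NegotiationState, u_own_offer: float,
                     u_offers_to_you: Sequence[float]) -> List[str]:
    """Actions the three 'if' rules license at time t."""
    eu = expected_utility(state)
    actions = []
    if eu < u_own_offer:                       # rule 1: extend an offer
        actions.append("offer")
    if any(eu < u for u in u_offers_to_you):   # rule 2: accept an offer
        actions.append("accept")
    if eu < state.u_breakdown:                 # rule 3: break down
        actions.append("break down")
    return actions


# Hypothetical numbers: Eu_t|s = 51, p_t(bd) = 0.25, u(bd) = 40.
state = NegotiationState(51.0, 0.25, 40.0)
print(expected_utility(state))                 # 48.25
print(licensed_actions(state, 50.0, [49.0]))   # ['offer', 'accept']
```

With a fixed u(bd) no larger than Eu_t|s, rule 3 never fires in this sketch, since Eu_t is a weighted average of the two and cannot fall below u(bd).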


My Theory Pt. I

• Why not iff?

• Extend an offer, o, if Eu_t < u(o)

• Withhold an offer, o, if Eu_t > u(o)?

• There may be other reasons for acting earlier

• Constructivism:

– Multiple ways of constructing probability

– Multiple ways of deriving/justifying/motivating action


My Theory Pt. I

• Empirical Observation: At sufficient granularity, p(bd) is increasing in the time since last progress


My Theory Pt. I

Pessimism

For sufficiently large Δ, where LP(t_0) denotes last progress at t_0:

p_{t+Δ}(bd | LP(t_0)) > p_t(bd | LP(t_0))

What is progress? A non-trivial offer by the other party.

What does this mean? At some granularity, the past record implies that if there are no offers, the probability of breakdown rises.


My Theory Pt. I

Pessimism

Linear Pessimism: p(bd | NP(t)) = π t

Exponential Pessimism: p(bd | NP(t)) = 1 - e^{-πt}

Delayed Linear Pessimism: p(bd | NP(t)) = π max(0, t - t_0)

Whatever fits the empirical record!
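The three pessimism schedules are easy to write down as functions of t, the elapsed time without progress. In this sketch the function names are made up, and capping the linear forms at 1 is an added assumption so that the result remains a probability.

```python
import math


def linear_pessimism(t: float, pi: float) -> float:
    """p(bd | NP(t)) = pi * t, capped at 1 (the cap is an assumption)."""
    return min(1.0, pi * t)


def exponential_pessimism(t: float, pi: float) -> float:
    """p(bd | NP(t)) = 1 - exp(-pi * t)."""
    return 1.0 - math.exp(-pi * t)


def delayed_linear_pessimism(t: float, pi: float, t0: float) -> float:
    """p(bd | NP(t)) = pi * max(0, t - t0), capped at 1 (cap is an assumption)."""
    return min(1.0, pi * max(0.0, t - t0))


# Tabulate the schedules for a hypothetical rate pi = 0.1 and delay t0 = 5.
for t in range(0, 16, 5):
    print(t,
          round(linear_pessimism(t, 0.1), 3),
          round(exponential_pessimism(t, 0.1), 3),
          round(delayed_linear_pessimism(t, 0.1, 5.0), 3))
```

All three are non-decreasing in t, which is the only property the pessimism assumption requires.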


Pessimism causes Eu to fall

Next offer is made at this time and prob(bd) resets to 0

Expectation starts to fall again


reciprocated offers

offers


Agreement reached as Eu < u1


security

Best offer received

Whenever u(acc) > security, acceptance occurs before breakdown!
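One way to see why this holds under pure pessimism is a short derivation. It assumes that Eu_t|s stays fixed at a value E above both u(acc) and the security level s, and that u(bd) = s with no resentment term. E and s are shorthand introduced here.

```latex
\begin{align*}
Eu_t &= E\,(1 - p_t(bd)) + s\,p_t(bd) = E - (E - s)\,p_t(bd), \\
Eu_t < u(acc) &\iff p_t(bd) > \frac{E - u(acc)}{E - s}, \\
Eu_t < s &\iff p_t(bd) > 1 .
\end{align*}
```

The acceptance threshold is below 1 exactly when u(acc) > s, while the breakdown condition would need a probability above 1. So as p_t(bd) rises, the acceptance rule fires and the breakdown rule never does. With hypothetical values s = 10, u(acc) = 11, and E = 51, acceptance is licensed once p_t(bd) exceeds 40/41, about 0.976.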


security

Best offer received

Would you accept an 11-cent offer if your security were 10 cents?


My Theory Pt. II

• Observation: You wouldn't accept 11¢ over 10¢ security, nor 51¢ over 50¢ security

• Observation: You wouldn't let your kid do it

• Observation: Your mother wouldn't let you do it

• Observation: Your lawyer wouldn't let you do it

• Observation: Your accountant wouldn't let you do it

• Proposition: We shouldn't automate our agents to do it


My Theory Pt. II

• Question: Isn't this an issue of distributive justice?

• Answer: Substantive fairness is trivial to model by transforming utilities

• Observation: There may be a procedural fairness issue


My Theory Pt. II

• Procedural fairness:

– The more the other party withholds progress, the more you will punish

– When the other party resumes cooperation, you are willing to forgo punishment


My Theory Pt. II

Resentment

u(bd) = security + resentment(t)

What is resentment?

1. Dignity

2. Pride

3. Investment in society

4. Protection against non-progressive manipulators

5. A GENUINE source of satisfaction: non-material, transactional, personal(?), transitory(?)


My Theory Pt. II

Resentment

u_t(bd) = security + resentment(t) = u(bd) + r(t)

for NP(t), non-progress for a period t

What is resentment?

6. Attached to a speech/dialogue act: BATNA through breaking down vs. BATNA through agreement

7. A nonstandard utility (process utility)

8. Specific or indifferent (I-bd-you vs. you-bd-me)


My Theory Pt. II

Resentment

Linear resentment: r(t) = ρt

Sigmoid resentment: r(t) = r_max (2/(1 + e^{-ρt}) - 1)

You either feel it or you don't – you can't fake it!
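The two resentment curves, and the breakdown utility they feed into, can be sketched as follows. The function names are illustrative, not taken from any existing implementation.

```python
import math


def linear_resentment(t: float, rho: float) -> float:
    """r(t) = rho * t."""
    return rho * t


def sigmoid_resentment(t: float, rho: float, r_max: float) -> float:
    """r(t) = r_max * (2 / (1 + exp(-rho * t)) - 1); 0 at t = 0, tends to r_max."""
    return r_max * (2.0 / (1.0 + math.exp(-rho * t)) - 1.0)


def breakdown_utility(security: float, resentment: float) -> float:
    """u_t(bd) = security + r(t)."""
    return security + resentment


# Hypothetical parameters: security = 10, rho = 0.5, r_max = 5.
for t in (0, 2, 10):
    r = sigmoid_resentment(t, 0.5, 5.0)
    print(t, round(r, 3), round(breakdown_utility(10.0, r), 3))
```

The sigmoid form is algebraically r_max · tanh(ρt/2), so it starts at zero, grows almost linearly for small t, and saturates at r_max, whereas the linear form grows without bound.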


Eu never falls to u1


Actually accepts because resentment resets with progress

Resentment resets to zero each time there is progress

Nontrivial progress


Resentment might not reset to zero if there is memory

Agent breaks down before accepting


P&P Agents

Pessimism + Punishment

"purely" probabilistic


Variety of Behaviors

• Agent can make a series of offers, respond to offers

• Agent can wait, then offer, accept, or break down

• Agent can accept, offer, or break down immediately

• Agent can offer before accepting and vice versa

• Agent can break down before accepting and vice versa

• Agent can offer before breaking down and vice versa

• Agent can be on path to breakdown, then on path to acceptance

– because received offer changes Eu or resentment

– because extended offer changes Eu

• I wouldn't use this as my agent on ebay quite yet…
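A toy timeline shows how a single agent can land on either side of the accept/break-down divide. The sketch rests on several assumptions not stated on the slides: discrete time steps, no progress by the other party, a fixed Eu|s, linear pessimism capped at 1, linear specific resentment (added to the breakdown utility but not to the u(bd) term inside Eu), and the breakdown rule checked before the acceptance rule within a step.

```python
def first_action(eu_settle: float, security: float, best_offer: float,
                 pi: float, rho: float, horizon: int = 100):
    """Return (t, action) for the first rule that fires while the other side stalls."""
    for t in range(horizon + 1):
        p_bd = min(1.0, pi * t)                       # linear pessimism
        eu = eu_settle * (1.0 - p_bd) + security * p_bd
        if eu < security + rho * t:                   # resentment raises u_t(bd)
            return t, "break down"
        if eu < best_offer:                           # falling Eu meets the best offer
            return t, "accept"
    return None, "wait"


# Hypothetical agent: Eu|s = 60, security = 10, best offer received = 32.
print(first_action(60.0, 10.0, 32.0, pi=0.1, rho=0.0))  # (6, 'accept')
print(first_action(60.0, 10.0, 32.0, pi=0.1, rho=6.0))  # (5, 'break down')
```

With no resentment the agent accepts the standing offer at t = 6. With a steep enough resentment rate the rising breakdown utility overtakes the falling expectation first, and the agent walks away at t = 5.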


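[Figure panels: behavior for low-valued ρ versus high-valued ρ, assuming no progress. Panel titles: linear pessimism / linear specific punishment; linear pessimism / linear indifferent punishment; exponential pessimism / linear indifferent punishment; exponential pessimism / sigmoidal specific punishment. Annotation: rare alternation between breakdown and acceptance.]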

Laissez-Faire Paths


What happens when two P&P agents interact?

[Figure sequence: two P&P agents in the settlement space. Labels: region dominated by BATNA; 1's offers in this round; 2's offer in this round; Eu1 and Eu2; 1's aspiration and 2's aspiration; BATNA = <u1(bd), u2(bd)>; Eu1(t=1) and Eu1(t=2); 1's security + resentment and 2's security + resentment; amount of (specific) resentment; the point at which 1 breaks down.]

The laissez-faire path is <Eu1, Eu2> through time.


Does the starting offer affect the laissez-faire path?

[Figure panels: both are generous at the start; 1 is generous at start, 2 is not; 2 is generous at start, 1 is not.]

Breakdown at t=2 (pure pessimism)

Different laissez-faire paths

Breakdown at t=5 with resentment

All paths lead to breakdown

In a different negotiation, some paths lead to acceptance, some to breakdown

[Figure labels: fixed agent characteristics; varied acceleration of offers.]

A third example where player 1 can guarantee an acceptance outcome with the right initial offers


Controlling the Path

• Definition of rationality

– Does not preclude accelerating offers

• You do not have to accept the laissez-faire outcome

– Steer

– Estimate

– Control

• You can compensate for high aspiration or low security

• You can avoid gaps in the timing or density of offers


An Envelope of Normalcy

Can you keep the path in a narrow envelope?

The axis passes through <uA(bd), uB(bd)>

If so, then agreement is possible.


Is the model reasonable?

• Probability of Reaching an Agreement as a function of pessimism punishment:



• Substantive Utility as a function of pessimism punishment:



• Substantive+Procedural Utility as a function of pessimism punishment:



• Where are the laissez-faire states, in terms of agents' relative power?

When both parties do not have power, negotiation ends.

power = (u_t(bd) - u1)/(Eu_t - u1)


Some Questions

• Can you estimate a ppagent's parameters and manipulate? (answer = mildly)


Some Questions

• Does ppagent parameter selection matter if there is meta-utility on fairness, time, % aspiration, % security?


Start with a game in strategic form, study the laissez-faire outcome for each pair of and


Payoffs adjusted for meta-utility on fairness, time, concessions, aspiration, and security


Some Questions

Can we watch / play?

http://www.cs.wustl.edu/~loui/313f04/project1/select.cgi

http://k9.cs.wustl.edu/~cs313/04loui/select.cgi


Review of Main Points

• Mathematical model can conceive of negotiation as a process to be controlled

• Simple probabilistic approach is appealing but requires non-standard process utility to make sense

• Also requires probability logic

– Constructive

– Objective

• Toward rational emotional agents (Lesser)


Thanks to M. Foltz, V. Reddy, D. Weisberger, I. Figelman, D. Moore, K. Hashimoto, A. Jump, F. Tohme, K. Larson, S. Braynov, J. Nachbar, B. Dheeravongkit, L. Cai, K. Chin, R. Bujans, T. Shen, M. Looks, E. Wofsey, R. Ratkowski, J. Rosen, S. Grubor, K. Ormsby, J. Badino, R. Pless, M.A. Clark.

Research Funding from:

• NSF Information Technology and Organizations Program: Multi-Agent Negotiation, as co-PI with T. Sandholm, July 1997 to July 1999. 9610122.

• NSF Office of Cross-Disciplinary Affairs and Interactive Systems Program: Summer Undergraduate Research Assistants, March 1995 to February 1996. 9415573.

• NSF Office of Cross-Disciplinary Affairs: REU Continuing Award, April 1992 to September 1994. 9123643.


Can Negotiation Breakdown Probabilities

of Laissez-Faire Agents

Be Derived A Priori?




P(bd) Miscalibration?

What is pA(bd) in this state?

Is the empirical record detailed enough to permit conditioning on

<EuA,EuB> state?

<EuA,EuB,tA,tB> state?

<EuA,EuB,tA,tB,parmsA,parmsB>?


P(bd) Miscalibration?

What is %(bd) in this state?

When simulated, averaging over parmsB?

RECALIBRATE w.r.t. population of parmsB

Is p(bd) self-confirming?

No: high p(bd) pushes agent to agreement


Iterative Procedure

1. Given parmsA, including pA(bd)(t)(uA,uB)

2. Start with a population O of parmsB

3. Foreach <uA,uB,tA,tB>:

   1. Sample randomly over O

      1. Simulate outcome(parmsA, parmsB, <uA,uB,tA,tB>)

   2. Calculate %(bd)(uA,uB,tA,tB)

4. Set pA(bd)(tB)(uA,uB) 1/2= %(bd)(uA,uB,tA,tB)

5. Enforce monotonicity of p(bd) in t

6. Repeat at 3

7. Hope for convergence
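The loop can be sketched in Python under heavy simplification. Everything in this sketch is an assumption made for illustration: the state is reduced to elapsed time t alone, parmsB is a single "patience" value (the step at which B walks away), `simulate_breakdown` is a toy stand-in for the real simulator, step 4 is read as plain assignment because the operator on the slide is unclear, and monotonicity is enforced with a running maximum.

```python
import random
from typing import List, Sequence, Tuple


def simulate_breakdown(p_bd: Sequence[float], t_start: int, patience_b: int,
                       eu_settle: float, security: float, offer: float) -> bool:
    """Toy outcome(parmsA, parmsB, state): True if the run ends in breakdown."""
    for t in range(t_start, len(p_bd)):
        if t >= patience_b:
            return True                      # B breaks down first
        eu = eu_settle * (1.0 - p_bd[t]) + security * p_bd[t]
        if eu < offer:
            return False                     # A accepts, so no breakdown
    return True                              # horizon reached without agreement


def recalibrate(p_bd: List[float], population: Sequence[int], samples: int = 200,
                rounds: int = 20, tol: float = 0.02, seed: int = 0,
                **game: float) -> Tuple[List[float], bool]:
    """Steps 3 to 7: re-estimate p(bd) from simulated frequencies until it settles."""
    rng = random.Random(seed)
    for _ in range(rounds):
        new_p = []
        for t in range(len(p_bd)):                                   # step 3
            draws = [rng.choice(population) for _ in range(samples)]
            hits = sum(simulate_breakdown(p_bd, t, b, **game) for b in draws)
            new_p.append(hits / samples)                             # %(bd), step 4
        for t in range(1, len(new_p)):                               # step 5
            new_p[t] = max(new_p[t], new_p[t - 1])
        if max(abs(a - b) for a, b in zip(new_p, p_bd)) < tol:
            return new_p, True                                       # converged
        p_bd = new_p                                                 # step 6
    return p_bd, False                                               # step 7: no luck


p0 = [min(1.0, 0.1 * t) for t in range(12)]      # a priori linear pessimism
population = [3, 5, 8, 13]                       # hypothetical patience values for B
p, converged = recalibrate(p0, population, eu_settle=60.0, security=10.0, offer=32.0)
print([round(x, 2) for x in p], converged)
```

Because a high p(bd) pushes A toward earlier acceptance and therefore lowers the simulated breakdown frequency, the update need not settle. The returned flag reports whether it did within the allotted rounds.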
