General Opponent* Modeling for Improving Agent-Human Interaction
Sarit Kraus, Dept. of Computer Science, Bar-Ilan University
AMEC, May 2010

Page 2:

Motivation

Negotiation is an extremely important form of human interaction.

Page 3:

Computers interacting with people

– Computer persuades human
– Computer has the control
– Human has the control

Page 4:

Page 5:

Culture-sensitive agents

– Development of a standardized agent to be used in collecting data for studies on culture and negotiation
– Buyer/seller PURB agents that negotiate well across cultures

Page 6:

Semi-autonomous cars

Page 7:

Medical applications

Gertner Institute for Epidemiology and Health Policy Research

Page 8:

Automated care-taker

Agent: I scheduled an appointment for you at the physiotherapist this afternoon.
Person: I will be too tired in the afternoon!!!
[The agent tries to reschedule and fails.]
Agent: The physiotherapist has no other available appointments this week. How about resting before the appointment?

Page 9:

Security applications

• Collect
• Update
• Analyze
• Prioritize

Page 10:

General opponent* modeling

People often follow suboptimal decision strategies. Irrationalities are attributed to:
– sensitivity to context
– lack of knowledge of own preferences
– the effects of complexity
– the interplay between emotion and cognition
– the problem of self-control
– bounded rationality

Page 11:

Challenges of human opponent* modeling

– Small number of examples: it is difficult to collect data on people
– Noisy data: people are inconsistent (the same person may act differently), and people are diverse

Page 12:

Agenda

– Multi-attribute multi-round bargaining: KBAgent
– Revelation + bargaining: SIGAL
– Optimization problems: AAT-based learning
– Coordination with people: focal point based learning

Page 13:

QOAgent [LIN08]

– Multi-issue, multi-attribute negotiation with incomplete information
– Domain independent
– Implements several tactics and heuristics, qualitative in nature
– Non-deterministic behavior, also by means of randomization

Played at least as well as people.

Is it possible to improve the QOAgent? Yes, if you have data.

R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry. Negotiating with bounded rational agents in environments with incomplete information using an automated agent. Artificial Intelligence, 172(6-7):823–851, 2008.

Page 14:

KBAgent [OS09]

– Multi-issue, multi-attribute negotiation with incomplete information
– Domain independent
– Implements several tactics and heuristics, qualitative in nature
– Non-deterministic behavior, also by means of randomization
– Uses data from previous interactions

Y. Oshrat, R. Lin, and S. Kraus. Facing the challenge of human-agent negotiations via effective general opponent modeling. In AAMAS, 2009.

Page 15:

Example scenario

Employer and job candidate
– Objective: reach an agreement over hiring terms after a successful interview

Page 16:

General opponent modeling

– Challenge: sparse data from past negotiation sessions between people
– Technique: kernel density estimation (sketched below)
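Since the slide only names the technique, here is a minimal sketch of how kernel density estimation can smooth a sparse sample of past sessions into usable acceptance likelihoods. The utility numbers, bandwidth, and acceptance rule are illustrative assumptions, not the KBAgent's actual implementation.

```python
# Sketch: estimate the density of offer utilities people accepted in a
# small sample of past sessions, using Gaussian kernel density
# estimation (scipy). The data and bandwidth are invented examples.
import numpy as np
from scipy.stats import gaussian_kde

# Utilities (to the opponent) of offers accepted in past human sessions.
accepted_utils = np.array([310.0, 355.0, 370.0, 420.0, 400.0, 390.0])

# KDE smooths the sparse sample into a continuous density.
density = gaussian_kde(accepted_utils, bw_method=0.4)

# Relative likelihood that an offer worth `u` to the opponent is accepted,
# taken here as the density mass at or above u (one simple choice).
u = 380.0
grid = np.linspace(250, 500, 501)
pdf = density(grid)
p_accept = pdf[grid >= u].sum() / pdf.sum()
print(f"estimated acceptance likelihood ~ {p_accept:.2f}")
```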

Page 17:

General opponent modeling

Estimate, for the other party:
– the likelihood that it accepts an offer
– the likelihood that it makes an offer
– its expected average utility

The estimation is done separately for each possible agent type:
– the type of a negotiator is determined using a simple Bayes classifier (toy example below)

Use the estimates for decision making.
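A toy illustration of the type-classification step: a naive Bayes rule that picks the most likely negotiator type from observed actions. The types, actions, and probabilities below are invented for illustration, not taken from the paper.

```python
# Sketch: pick the most likely negotiator type given observed actions,
# via Bayes' rule with conditionally independent action likelihoods.
from math import log

priors = {"competitive": 0.5, "cooperative": 0.5}
# P(action | type), as if estimated from past sessions (made-up numbers).
likelihood = {
    "competitive": {"low_offer": 0.7, "high_offer": 0.1, "accept": 0.2},
    "cooperative": {"low_offer": 0.2, "high_offer": 0.5, "accept": 0.3},
}

def classify(observed_actions):
    # argmax over types of log P(type) + sum of log P(action | type)
    scores = {
        t: log(p) + sum(log(likelihood[t][a]) for a in observed_actions)
        for t, p in priors.items()
    }
    return max(scores, key=scores.get)

print(classify(["low_offer", "low_offer", "accept"]))  # -> competitive
```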

Page 18:

KBAgent as the job candidate

Best result: 20,000, Project manager, with leased car, 20% pension funds, fast promotion, 8 hours.

Offers on the table (KBAgent vs. human):
– 20,000, Team Manager, with leased car, pension: 20%, slow promotion, 9 hours
– 12,000, Programmer, without leased car, pension: 10%, fast promotion, 10 hours
– 20,000, Project manager, without leased car, pension: 20%, slow promotion, 9 hours

Page 19:

KBAgent as the job candidate

Best agreement: 20,000, Project manager, with leased car, 20% pension funds, fast promotion, 8 hours.

Offers on the table (KBAgent vs. human), round 7:
– 20,000, Programmer, with leased car, pension: 10%, slow promotion, 9 hours
– 12,000, Programmer, without leased car, pension: 10%, fast promotion, 10 hours
– 20,000, Team Manager, with leased car, pension: 20%, slow promotion, 9 hours

Page 20:

Experiments

– 172 graduate and undergraduate students in Computer Science
– People were told they might be playing against a computer agent or a person
– Scenarios: Employer-Employee; Tobacco Convention (England vs. Zimbabwe)
– Learned from 20 games of human-human negotiation

Page 21:

Results: comparing the KBAgent to others

Average utility value (std) by player type:

Employer role:
  KBAgent vs. people    468.9 (37.0)
  QOAgent vs. people    417.4 (135.9)
  People vs. people     408.9 (106.7)
  People vs. QOAgent    431.8 (80.8)
  People vs. KBAgent    380.4 (48.5)

Job candidate role:
  KBAgent vs. people    482.7 (57.5)
  QOAgent vs. people    397.8 (86.0)
  People vs. people     310.3 (143.6)
  People vs. QOAgent    320.5 (112.7)
  People vs. KBAgent    370.5 (58.9)

Page 22:

Main results

In comparison to the QOAgent:
– The KBAgent achieved higher utility values than the QOAgent
– More agreements were accepted by people
– The sum of utility values (social welfare) was higher when the KBAgent was involved

The KBAgent achieved significantly higher utility values than people. The results demonstrate the proficiency of the KBAgent's negotiation.

General opponent* modeling improves agent bargaining.

Page 23:

Automated care-taker

Agent: I arranged for you to go to the physiotherapist in the afternoon.
Person: I will be too tired in the afternoon!!!
Agent (thinking): How can I convince him? What argument should I give?

Page 24:

Security applications

How should I convince him to provide me with information?

Page 25:

Argumentation

Which information to reveal?
– Should I tell him that I will lose a project if I don't hire today?
– Should I tell him I was fired from my last job?
– Should I tell her that my leg hurts?
– Should I tell him that we are running out of antibiotics?

Build a game that combines information revelation and bargaining.

Page 26:

Colored Trails (CT)

– An infrastructure for agent design, implementation and evaluation in open environments
– Designed with Barbara Grosz (AAMAS 2004)
– Implemented by the Harvard and BIU teams

Page 27:

An experimental test-bed

Interesting for people to play:
– analogous to task settings
– vivid representation of the strategy space (not just a list of outcomes)

Possible for computers to play.

Can vary in complexity:
– repeated vs. one-shot setting
– availability of information
– communication protocol

Page 28:

Game description

The game is built from phases:
– Revelation phase
– First proposal phase
– Counter-proposal phase

Joint work with Kobi Gal and Noam Peled.

Page 29:

Two boards

Page 30:

Why not equilibrium agents?

– Results from the social sciences suggest people do not follow equilibrium strategies
– Equilibrium-based agents that played against people have failed
– People rarely design agents to follow equilibrium strategies (Sarne et al., AAMAS 2008)
– Equilibrium strategies are usually not cooperative: all lose

Page 31:

Perfect equilibrium agent

Solved using backward induction; no strategic signaling.

Phase two:
– Second proposer: find the most beneficial proposal while the responder's benefit remains positive
– Second responder: accept any proposal which gives it a positive benefit

Page 32:

Perfect equilibrium agent

Phase one (see the sketch below):
– First proposer: propose the opponent's counter-proposal
– First responder: accept any proposal which gives it the same or higher benefit than its counter-proposal

Revelation: reveals in half of the games.

In both boards, the perfect equilibrium (PE) with goal revelation yields lower or equal expected utility than the non-revelation PE.
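A minimal sketch of this backward induction on an invented proposal space; the benefit numbers are placeholders, not the game's actual payoffs.

```python
# Sketch of the backward induction described above, on an invented
# proposal space. Each proposal is (benefit_to_A, benefit_to_B);
# A proposes first, B can counter-propose in phase two.
proposals = [(10, 1), (7, 4), (5, 6), (2, 9)]

# Phase two: B proposes, A responds. A accepts any positive benefit,
# so B picks its best proposal that leaves A a positive benefit.
b_counter = max((p for p in proposals if p[0] > 0), key=lambda p: p[1])

# Phase one: B accepts A's offer only if it matches or beats what B
# would secure by counter-proposing; A therefore offers exactly that.
a_offer = max(
    (p for p in proposals if p[1] >= b_counter[1]),
    key=lambda p: p[0],
)

print("B's counter-proposal:", b_counter)  # (2, 9) in this toy space
print("A's equilibrium offer:", a_offer)   # best for A with B-benefit >= 9
```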

Page 33:

Asymmetric game

Page 34:

Performance

140 students

Page 35:

Benefits diversity

Average proposed benefit to players from first and second rounds

Page 36:

Revelation effect

The effect of revelation on performance:
– Only 35% of the games played by humans included revelation
– Revelation had a significant effect on human performance but not on agent performance
– People were deterred by the strategic machine-generated proposals, which depended heavily on the roles of proposer and responder

Page 37:

SIGAL agent

Agent based on general opponent* modeling:
– Genetic algorithm
– Logistic regression

Page 38:

SIGAL agent: acceptance

– Learns from previous games
– Predicts the acceptance probability of each proposal using logistic regression (sketched below)
– Features (for both players) relating to proposals: benefit, goal revelations, player types, and the benefit difference between rounds 2 and 1
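A sketch of predicting proposal acceptance with logistic regression, using scikit-learn. The feature encoding and training data are invented placeholders, not SIGAL's actual training set.

```python
# Sketch: logistic regression mapping proposal features to an
# acceptance probability, as in the slide's feature list.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns: proposer benefit, responder benefit, revealed goal (0/1),
# responder type (0/1), responder benefit delta (round 2 - round 1).
X = np.array([
    [9, 2, 0, 1, -1],
    [5, 6, 1, 0,  2],
    [7, 4, 0, 0,  0],
    [3, 8, 1, 1,  3],
    [8, 3, 1, 1, -2],
    [4, 7, 0, 0,  1],
])
y = np.array([0, 1, 0, 1, 0, 1])  # 1 = proposal was accepted

model = LogisticRegression().fit(X, y)

# Probability a new proposal is accepted, usable in expected-benefit search.
candidate = np.array([[6, 5, 1, 0, 1]])
print("P(accept) =", model.predict_proba(candidate)[0, 1])
```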

Page 39:

SIGAL agent: counter-proposals

Models the way humans make counter-proposals.

Page 40:

SIGAL agent

Maximizes expected benefit given any state in the game (toy illustration below):
– Round
– Player revelation
– Behavior in round 1
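A toy illustration of the expected-benefit choice: combine each proposal's payoff with its acceptance probability and a fallback value on rejection. All numbers here are invented.

```python
# Sketch: pick the proposal maximizing expected benefit, trading off a
# higher payoff against a lower acceptance probability.
proposals = [(9, 0.2), (6, 0.6), (4, 0.9)]  # (my_benefit, P(opponent accepts))

def expected_benefit(p, fallback=1.0):
    benefit, q = p
    return q * benefit + (1 - q) * fallback  # rejection leaves the fallback

print(max(proposals, key=expected_benefit))  # -> (6, 0.6)
```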

Page 41:

Agent strategies comparison

Agent   Round 1 send         Round 1 receive   Round 2 send   Round 2 receive
EQ      Green:10, Gray:11    Purple:2          Green:2        Purple:10, Gray:11
SIGAL   Green:2              Purple:9          Green:2        Purple:5

Page 42:

SIGAL agent: performance

Page 43:

Agents performance comparison: equilibrium agent vs. SIGAL agent

General opponent* modeling improves agent negotiations.

Page 44:

GENERAL OPPONENT* MODELING IN MAXIMIZATION PROBLEMS

Page 45:

AAT agent

Agent based on general opponent* modeling:
– Decision tree / naïve Bayes
– AAT

Page 46:

Aspiration Adaptation Theory (AAT)

An economic theory of people's behavior (Selten):
– No utility function exists for decisions (!)
– Relative decisions are used instead
– Retreat and urgency are used for goal variables (see the sketch below)

Avi Rosenfeld and Sarit Kraus. Modeling Agents through Bounded Rationality Theories. Proc. of IJCAI 2009; JAAMAS, 2010.
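One loose reading of the retreat/urgency mechanics, as a sketch: the goal variables, urgency order, and feasibility test are invented, and AAT itself is richer than this.

```python
# Sketch of an aspiration-adaptation step: goal variables are ordered
# by urgency; if the current aspiration level is feasible, raise the
# most urgent goal, otherwise retreat on the least urgent one.
def adapt_aspiration(aspiration, urgency_order, feasible):
    """aspiration: dict of goal variable -> level;
    urgency_order: variables from most to least urgent."""
    if feasible(aspiration):
        aspiration[urgency_order[0]] += 1   # ambitious: raise most urgent goal
    else:
        aspiration[urgency_order[-1]] -= 1  # retreat on least urgent goal
    return aspiration

# Toy example: a seller balancing price and speed-of-sale aspirations.
asp = {"price": 5, "speed": 3}
feasible = lambda a: a["price"] + a["speed"] <= 9
print(adapt_aspiration(asp, ["price", "speed"], feasible))  # raises price
```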

Page 47:

Commodity search

Store prices seen so far: 1000

Page 48:

Commodity search

Store prices seen so far: 1000, 900

Page 49:

Commodity search

Store prices seen so far: 1000, 900, 950

If price < 800, buy; otherwise visit 5 stores and buy at the cheapest.
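The quoted rule is simple to state directly in code; a sketch follows, where the thresholds come from the slide and the price stream is an invented example.

```python
# Sketch of the human-like search rule quoted above.
def commodity_search(prices, threshold=800, max_stores=5):
    """Buy immediately below `threshold`; otherwise compare `max_stores`
    stores and buy at the cheapest one visited."""
    visited = []
    for price in prices:
        if price < threshold:
            return price          # good enough: buy on the spot
        visited.append(price)
        if len(visited) == max_stores:
            return min(visited)   # exhausted patience: take the cheapest seen
    return min(visited)           # ran out of stores early

print(commodity_search([1000, 900, 950, 870, 920]))  # -> 870
```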

Page 50:

Results

Using behavioral models in general opponent* modeling is beneficial.

[Chart: "Using AAT to Quickly Learn": correct classification % (roughly 65-83%) for sparse naïve learning vs. sparse AAT-based learning.]

Page 51:

General opponent* modeling in cooperative environments

Page 52:

Coordination with limited communication

Communication is not always possible:
– High communication costs
– Need to act undetected
– Damaged communication devices
– Language incompatibilities

Goal: limited interruption of human activities.

I. Zuckerman, S. Kraus and J. S. Rosenschein. Using Focal Point Learning to Improve Human-Machine Tacit Coordination. JAAMAS, 2010.

Page 53:

Focal points (examples)

Divide £100 into two piles; if your piles are identical to your coordination partner's, you get the £100. Otherwise, you get nothing.

101 equilibria (one for each split (k, 100-k), k = 0, ..., 100).

Page 54:

Focal points (examples)

9 equilibria; 16 equilibria

Page 55:

Focal points

Thomas Schelling (1963): focal points are prominent solutions to tacit coordination games.

Page 56:

Prior work: focal point based coordination for closed environments

Domain-independent rules that automated agents can use to identify focal points:
– Properties: centrality, firstness, extremeness, singularity (see the scoring sketch below)
– Logic-based model
– Decision-theory-based model

Algorithms for agent coordination.

Kraus and Rosenschein, MAAMAW 1992; Fenster et al., ICMAS 1995; Annals of Mathematics and Artificial Intelligence, 2000.
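A sketch of scoring candidate choices by the focal-point properties named on this slide. The numeric encoding and equal weighting are invented for illustration, not the papers' models.

```python
# Sketch: rank options by centrality, firstness, extremeness, singularity.
def focal_score(index, unique, n):
    mid = (n - 1) / 2
    centrality = 1.0 - abs(index - mid) / mid   # 1 at the center, 0 at the ends
    firstness = 1.0 if index == 0 else 0.0      # the first item stands out
    extremeness = 1.0 if index in (0, n - 1) else 0.0
    singularity = 1.0 if unique else 0.0        # one-of-a-kind items stand out
    return centrality + firstness + extremeness + singularity

# Five objects in a row; only the fourth (index 3) is visually unique.
uniques = [False, False, False, True, False]
scores = [focal_score(i, u, len(uniques)) for i, u in enumerate(uniques)]
print(scores)  # [2.0, 0.5, 1.0, 1.5, 1.0]: the first and the unique item stand out
```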

Page 57:

FPL agent

Agent based on general opponent* modeling:
– Decision tree / neural network
– Focal points

Page 58:

FPL agent

Agent based on general opponent* modeling:
– Decision tree / neural network
– Input: the raw data vector is transformed into an FP (focal point) vector, as sketched below

Page 59:

Focal Point Learning

3 experimental domains:

Page 60:

Results (cont.)

"Very similar domain" (VSD) vs. "similar domain" (SD) of the "pick the pile" game.

General opponent* modeling improves agent coordination.

Page 61:

Evaluation of agents (EDA)

Experiments with people are a costly process.

Peer Designed Agents (PDAs): computer agents developed by humans.

Experiment: 300 human subjects, 50 PDAs, 3 EDAs.

Results:
– EDAs outperformed PDAs in the same situations in which they outperformed people
– On average, EDAs exhibited the same measure of generosity

R. Lin, S. Kraus, Y. Oshrat and Y. Gal. Facilitating the Evaluation of Automated Negotiators using Peer Designed Agents. In AAAI 2010.

Page 62:

Conclusions

– Negotiation and argumentation with people are required for many applications
– General opponent* modeling is beneficial: machine learning and behavioral models
– Challenge: how to integrate machine learning and behavioral models