
The Laboratory for Intelligent Processes and Systems
Electrical and Computer Engineering
The University of Texas at Austin
http://www.lips.utexas.edu

Combining Job and Team Selection Heuristics

Chris L. D. Jones and K. Suzanne Barber


Selfish Agents Making Strategic Decisions

Selfish Agents

Each agent's goal is to maximize profit by maximizing its own estimated payoff function


Defining the Environment

Selfish Agents

Dynamic Environments

Non-zero probability that subtasks will be added to or removed from a job

Unenforceable Contracts

Agents face no penalty for de-committing from a team
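To make "dynamic environment" concrete, here is a minimal Python sketch (not from the slides; `mutate_job` and its parameters are illustrative) of a per-round mutation that adds or removes a required subtask with some probability:

```python
import random

def mutate_job(subtasks, all_task_types, dynamicism=0.25, rng=random):
    """Minimal sketch: with probability `dynamicism`, a job's requirements
    change this round, independent of anything the agents do."""
    if rng.random() < dynamicism:
        if subtasks and rng.random() < 0.5:
            subtasks.pop(rng.randrange(len(subtasks)))   # a requirement is dropped
        else:
            subtasks.append(rng.choice(all_task_types))  # a new requirement appears
    return subtasks
```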


Scenario: Freelancers on the Internet


Selfish agents have the opportunity to create a lucrative website

Creating the website involves multiple elements: hardware, software, content, and advertising

However, the lack of a single needed skill can break the entire enterprise

In a dynamic environment, elements of the solution change during development

Because contracts are unenforceable, some freelancers quit without warning


Selfish Agents and Unenforceable Contracts

• Selfish agents exchange goods or services without an enforcement mechanism if exchange parameters are static [Sandholm and Lesser, 1995]

• Selfish agents utilize concepts such as the Core, Kernel, and Shapley value to navigate static coalitional games [Myerson, 1991; Davis et al, 1963; Shapley, 1997]

• Bounded search may be used to search through the static coalition space of selfish agents [Sandholm et al, 1999; Rahwan et al, 2007]

• Agents may form institutions or coalitions based on static common goals [Gaertner et al, 2008]

All approaches rely on valuations and planning associated with static problems


Dynamic Environments and Unenforceable Contracts

• Cooperative agents may be reassigned roles and tasks within a team as circumstances change [Tambe et al, 2000; Nair et al, 2003]

• Cooperative agents work in the highly dynamic RoboCup Rescue domain [Nazemi et al, 2005; Nair et al, 2001; Lau et al, 2005]

• Agents cooperate to solve distributed constraint satisfaction problems [Scerri et al, 2005; Modi et al, 2001]

All approaches rely on cooperative agents maximizing the team's utility over their own


Selfish Agents in a Dynamic Environment

• Selfish agents in dynamic environments utilize contingency contracts to guard against specific events [Raiffa, 1982; Faratin and Klein, 2001]

• Leveled commitment contracts allow agents to leave a team by paying a penalty [Sandholm and Lesser, 1996; Sandholm et al, 1999; Andersson et al, 2001]

• Central fault-tolerance frameworks can enforce penalties on agents that fail to provide contracted services [Smith, 1980; Dellarocas and Klein, 2000; Patel et al, 2005]

All approaches rely on some form of contract enforcement between agents


A Gap in the Prior Work

(Venn diagram: the gap lies at the intersection of Selfish Agents, Dynamic Environments, and Unenforceable Contracts)

The gap: selfish agents in a dynamic environment with unenforceable contracts

• Trust and/or reputation information may be used to find agents less likely to defect [Fullam, 2007]

• Agents may follow societal norms which minimize defection [Oren et al, 2008]


Strategies to Maximize Payoff

Selfish agents in a dynamic environment with unenforceable contracts:

• An agent needs to select a profitable job to work on

• An agent needs to select a capable team to work with

The agent combines a job selection heuristic with a team selection heuristic to form a profit-maximizing strategy


Building Strategies by Combining Heuristics

Combining a job selection heuristic with a team selection heuristic produces a strategy

• Two job selection heuristics and five team selection heuristics give us ten possible strategies

Previous simulation work created multiple classes of agents, each of which executed a different strategy



Job Selection Heuristics

Greedy job heuristic
• Selects the job most profitable to the foreman, taking completed work into account

$$\max_{J_j \in J} \sum_{T_{ij} \in AssignedInstances_{kj}} \left( TaskLength_{ij} - C(T_{ij}) \right)$$

Lean job heuristic
• Selects the job which can be completed the quickest

$$\min_{J_j \in J} \sum_{T_{ij} \in AssignedInstances_{kj}} \left( TaskLength_{ij} - C(T_{ij}) \right)$$
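A minimal Python sketch of the two job heuristics as the formulas above read; the `TaskInstance` and `Job` structures are illustrative stand-ins for the slides' TaskLength and C(T):

```python
from dataclasses import dataclass

@dataclass
class TaskInstance:
    length: int      # TaskLength_ij: rounds of work required
    completed: int   # C(T_ij): rounds already completed

@dataclass
class Job:
    assigned_instances: list  # task instances assigned under this job

def remaining_work(job):
    """Sum over assigned instances of TaskLength_ij - C(T_ij)."""
    return sum(t.length - t.completed for t in job.assigned_instances)

def greedy_job(jobs):
    """Greedy: the job with the most paid work remaining for the foreman."""
    return max(jobs, key=remaining_work)

def lean_job(jobs):
    """Lean: the job that can be completed the quickest."""
    return min(jobs, key=remaining_work)
```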


Team Selection Heuristics

Null team heuristic
• Randomly selects a team for the top-ranked job

Fast team heuristic
• Minimizes time to completion

$$\min_x \; \max_{A_k \in Team_x} \sum_{T_{iw} \in AssignedInstances_{kw}} \left( TaskLength_{iw} - C(T_{iw}) \right)$$

Redundant team heuristic
• Maximizes the number of duplicate skills

$$\max_x \sum_{A_k \in Team_x} \sum_{T_{iw} \in ActiveTasks_w} \begin{cases} 1 & \text{if } T_{iw} \in AgentSkills_k \\ 0 & \text{otherwise} \end{cases}$$

Auxiliary team heuristic
• Maximizes the number of unused skills

$$\max_x \sum_{A_k \in Team_x} \sum_{T_{iw} \notin ActiveTasks_w} \begin{cases} 1 & \text{if } T_{iw} \in AgentSkills_k \\ 0 & \text{otherwise} \end{cases}$$

MinPartner team heuristic
• Minimizes the number of partners

$$\min_x \left| Team_x \right|$$
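A companion sketch of the four scored team heuristics (the Null heuristic simply picks a team at random); `Agent.skills` and `Agent.assigned` are illustrative stand-ins for AgentSkills_k and AssignedInstances_kw, and each score is oriented so that higher is better:

```python
import random
from dataclasses import dataclass, field

@dataclass
class Agent:
    skills: set                                    # AgentSkills_k
    assigned: list = field(default_factory=list)   # AssignedInstances_kw

def null_team(teams, rng=random):
    """Null: a random team from those proposed for the top-ranked job."""
    return rng.choice(teams)

def fast_score(team):
    """Fast: minimize time to completion, i.e. the busiest member's backlog."""
    return -max(sum(t.length - t.completed for t in a.assigned) for a in team)

def redundant_score(team, active_tasks):
    """Redundant: count team skills that duplicate an active task type."""
    return sum(1 for a in team for s in a.skills if s in active_tasks)

def auxiliary_score(team, active_tasks):
    """Auxiliary: count team skills not currently needed (spare capacity)."""
    return sum(1 for a in team for s in a.skills if s not in active_tasks)

def min_partner_score(team):
    """MinPartner: fewer teammates is better."""
    return -len(team)
```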


Benefits of Heuristic-based Strategies

Strategies allow selfish agents to make immediate estimates of how their actions will affect their utility
• Immediate estimates of job worth can be used, since no decommitment penalties are possible
• Estimates of team worth are based on how well teams may adapt to dynamic circumstances

Use agent simulation to test the relative utility of team formation strategies [Jones and Barber, 2007]
• Previous work did not explore different mechanisms for utilizing heuristic information

How should job and team selection heuristics be combined in a strategy?


Approach: Separate Job/Team (SJT) Formation

Foreman agent:
1. Job selection heuristic selects the most attractive jobs
2. Team selection heuristic ranks the most attractive teams for all selected jobs
3. Foreman agent selects the top-ranked team and sends a request to form the team
4. Agents respond to the request based on the job heuristic

Information about the job heuristic value does not affect the team selection process

Therefore, SJT may prefer more robust teams at the expense of agent profit
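As a sketch (assuming generic `job_score` and `team_score` callables and a hypothetical `candidate_teams` generator), SJT ranks teams with no job information at all, which is exactly why it can favor robust teams over profitable ones:

```python
def select_sjt(jobs, candidate_teams, job_score, team_score, n_jobs=3):
    """Separate Job/Team: pick top jobs, then let the team score alone decide."""
    top_jobs = sorted(jobs, key=job_score, reverse=True)[:n_jobs]
    pairs = [(job, team) for job in top_jobs for team in candidate_teams(job)]
    # The job heuristic value is discarded at this point.
    return max(pairs, key=lambda pair: team_score(pair[1]))
```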


Approach: Combined Job/Team (CJT) Formation

Foreman agent:
1. Job selection heuristic selects the most attractive jobs
2. Combined job and team selection heuristics rank the best job/team assignments for all selected jobs
3. Foreman agent selects the top-ranked job/team pairing and sends a request to form the team
4. Agents respond to the request based on the combined job and team heuristics

Normalized heuristics are multiplied together, so that job heuristic information influences the team selection process

Robust teams are therefore not selected at the expense of agent profit
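Under the same hypothetical callables, a sketch of CJT; the slides specify only that normalized heuristics are multiplied, so the min-max normalization here is an assumption:

```python
def normalize(scores):
    """Min-max scale raw heuristic scores into [0, 1] so they can be multiplied."""
    lo, hi = min(scores), max(scores)
    return [0.5 if hi == lo else (s - lo) / (hi - lo) for s in scores]

def select_cjt(jobs, candidate_teams, job_score, team_score, n_jobs=3):
    """Combined Job/Team: job and team information jointly rank each pairing."""
    top_jobs = sorted(jobs, key=job_score, reverse=True)[:n_jobs]
    pairs = [(job, team) for job in top_jobs for team in candidate_teams(job)]
    j = normalize([job_score(job) for job, _ in pairs])
    t = normalize([team_score(team) for _, team in pairs])
    best = max(range(len(pairs)), key=lambda i: j[i] * t[i])
    return pairs[best]
```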


Experimental Parameters

Parameter                                              Value
-----------------------------------------------------  --------------------------
Heuristic usage mechanism                              SJT, CJT
Number of classes                                      10
Agents per class                                       250
Per-round chance of agent acting as foreman            1%
Jobs                                                   1000
|T|                                                    20
|AgentSkills|                                          5
Initial size of |ActiveTasks|                          10
Range of TaskLength                                    1 to 10 rounds
Credit received per round of completed task instance   1
Number of potential teams examined per top-ranked job  15
Dynamicism range                                       0% to 100%, 25% increments
Number of rounds per simulation                        2500
Number of simulations per dynamicism step              20

2500 agents in simulation

Simulation tests increasingly dynamic environments by changing required subtasks
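For reference, the same parameters gathered into one illustrative config record (field names are ours; values are from the table above):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SimulationConfig:
    mechanisms: tuple = ("SJT", "CJT")   # heuristic usage mechanism
    num_classes: int = 10
    agents_per_class: int = 250          # 10 x 250 = 2500 agents total
    foreman_chance: float = 0.01         # per-round chance of acting as foreman
    num_jobs: int = 1000
    num_task_types: int = 20             # |T|
    skills_per_agent: int = 5            # |AgentSkills|
    initial_active_tasks: int = 10       # initial |ActiveTasks|
    task_length_range: tuple = (1, 10)   # rounds
    credit_per_round: int = 1
    teams_per_top_job: int = 15
    dynamicism_steps: tuple = (0.0, 0.25, 0.50, 0.75, 1.0)
    rounds_per_simulation: int = 2500
    simulations_per_step: int = 20
```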


Credit Earned at 0% dynamicism

(Bar chart: credit earned per agent strategy under SJT and CJT. Strategies: GreedyNull (GN), GreedyFast (GF), GreedyRedundant (GR), GreedyAuxiliary (GA), GreedyMinPartners (GMP), LeanNull (LN), LeanFast (LF), LeanRedundant (LR), LeanAuxiliary (LA), LeanMinPartners (LMP).)

CJT agents equal or exceed SJT agents by statistically significant margins


Credit Earned at 25% dynamicism

(Bar chart: credit earned per agent strategy at 25% dynamicism; same ten strategies as above.)

GMP strategy works best in relatively static environments


Credit Earned at 50% dynamicism

(Bar chart: credit earned per agent strategy at 50% dynamicism; same ten strategies as above.)


Credit Earned at 75% dynamicism

(Bar chart: credit earned per agent strategy at 75% dynamicism; same ten strategies as above.)

GA strategy works best in relatively dynamic environments


Credit Earned at 100% dynamicism

(Bar chart: credit earned per agent strategy at 100% dynamicism; same ten strategies as above.)

The CJT advantage over SJT agents continues over all sampled dynamicism values


Jobs Completed at 0% Dynamicism

(Bar chart: percentage of jobs completed per agent strategy at 0% dynamicism; same ten strategies as above.)

CJT likewise has a statistically significant advantage over SJT in percentage of jobs successfully completed


Jobs Completed at 25% Dynamicism

(Bar chart: percentage of jobs completed per agent strategy at 25% dynamicism; same ten strategies as above.)


Jobs Completed at 50% Dynamicism

(Bar chart: percentage of jobs completed per agent strategy at 50% dynamicism; same ten strategies as above.)


Jobs Completed at 75% Dynamicism

(Bar chart: percentage of jobs completed per agent strategy at 75% dynamicism; same ten strategies as above.)


Jobs Completed at 100% Dynamicism

(Bar chart: percentage of jobs completed per agent strategy at 100% dynamicism; same ten strategies as above.)

The CJT advantage over SJT agents in jobs completed continues for all sampled dynamicism values

CJT LA job completion improves markedly


Conclusions and Future Work

Simultaneous usage of job and team selection heuristics improves credit earned and jobs completed

Mechanism works in dynamic environments where thousands of selfish agents work without enforceable contracts

Future work:
• Dynamic weighting of job and team selection heuristics
• Development of a theoretical framework for determining job and team selections


THANK YOU! QUESTIONS?


References

Kraus, S., O. Shehory, et al. (2003). Coalition formation with uncertain heterogeneous information, ACM Press New York, NY, USA: 1-8.

Tambe, M., D. V. Pynadath, et al. (2000). Building dynamic agent organizations in cyberspace. 4: 65-73.

Sandholm, T. W. and V. R. Lesser (1995). Equilibrium Analysis of the Possibilities of Unenforced Exchange in Multiagent Systems, University of Massachusetts at Amherst, Computer Science Dept.

Myerson, R. B. (1991). Game theory: analysis of conflict, Harvard University Press.

Davis, M. and M. Maschler (1963). The Kernel of a Cooperative Game, DTIC Research Report AD0418434.

Shapley, L. S. (1997). A Value for n-Person Games, Princeton University Press.

Sandholm, T., S. Sikka, et al. (1999). Algorithms for optimizing leveled commitment contracts: 535-540.

Rahwan, T., S. D. Ramchurn, et al. (2007). Near-optimal anytime coalition structure generation: 2365-2371.

Gaertner, D., Rodrigez, J. A., et al. (2008). Agreeing on Institutional Goals for Multi-agent Societies.

Nair, R., M. Tambe, et al. (2003). Role allocation and reallocation in multiagent teams: towards a practical analysis, ACM Press New York, NY, USA: 552-559.

Nazemi, E., M. Faradad, et al. (2005). SBCe_Saviour Team Description. Tehran, Iran, Shahid Beheshti University: 6.

Nair, R., T. Ito, et al. (2001). Task Allocation in the RoboCup Rescue Simulation Domain: A Short Note, Springer.

Lau, N., L. P. Reis, et al. (2005). FC Portugal 2005 Rescue Team Description: Adapting Simulated Soccer Coordination Methodologies to the Search and Rescue Domain.

Scerri, P., A. Farinelli, et al. (2005). Allocating tasks in extreme teams, ACM Press New York, NY, USA: 727-734.

Modi, P. J., H. Jung, et al. (2001). A dynamic distributed constraint satisfaction approach to resource allocation, Springer.

Raiffa, H. (1982). The art and science of negotiation, Belknap Press of Harvard University Press, Cambridge, Mass.


References (continued)

Faratin, P. and M. Klein (2001). Automated Contract Negotiation and Execution as a System of Constraints, MIT, Cambridge.

Sandholm, T. W. and V. R. Lesser (1996). Advantages of a leveled commitment contracting protocol: 126-133.

Andersson, M. R. and T. W. Sandholm (2001). Leveled commitment contracts with myopic and strategic agents, Elsevier. 25: 615-640.

Smith, R. G. (1980). The contract net protocol. 29: 1104-1113.

Dellarocas, C. and M. Klein (2000). An experimental evaluation of domain-independent fault handling services in open multi-agent systems: 95-102.

Patel, J., W. T. L. Teacy, et al. (2005). Agent-based virtual organisations for the Grid, IOS Press. 1: 237-249.

Fullam, K. (2007). Learning Trust Decision Strategies in Emerging Reputation Networks.

Oren, N., Luck, M., et al. (2008). An Argumentation Inspired Heuristic for Resolving Normative Conflict.

Jones, C. L. D. and K. S. Barber (2007). Bottom-up Team Formation Strategies in a Dynamic Environment: 60-74.

Jones, C. L. D. and K. Barber (2008). Combining Job and Team Selection Heuristics. 2008 AAMAS Workshop on Coordination, Organization, Institutions and Norms. Lisbon, Portugal, ACM.

Sutton, R. S. and A. G. Barto (1998). Reinforcement Learning: An Introduction, MIT Press.

Klusch, M. and A. Gerber (2002). Dynamic coalition formation among rational agents. 17: 42-47.

Lin, C., S. Hu, et al. (2007). An Anytime Coalition Restructuring Algorithm in an Open Environment, Springer. 4681: 80.


BACKUP SLIDES


Causes of Agent and Job dynamicism

Changes to job requirements
• Bounded rationality
• Incomplete information
• Inherent environmental dynamics

Changes to team membership
• Agent failure
• Agent defection

Changes to current job requirements make the job less attractive

Changes to alternate job requirements make those jobs more attractive

Teammate defection decreases the robustness of the current team and the likelihood of the expected payoff


Domain Assumptions

Agents are multiskilled
• Single-skilled agents would be unable to provide the reserve of redundant skills needed

Contractless environments feature non-transferable utility
• With transferable utility, contracts become possible

Quality and timeliness are not represented
• Both could probably be represented as different types of subtasks, e.g. a subtask requiring 90% QoS is different from a subtask requiring 50% QoS
• Quality and timeliness are both likely to be domain dependent – not a huge difference between 80% and 100% QoS in ditch digging, but a huge difference in brain surgery


Comparison to Trust work

This work is complementary to trust
• Can be used when trust information is unavailable or unreliable
• Can be used when environmental dynamics make agents change their trustworthiness over time

This work is supplementary to trust
• Can be used in addition to trust, to compensate for non-trustworthy agents
• Can be used when a trust system is bootstrapping and agents need to explore a space of likely untrustworthy agents


Foreman flowchart


(Flowchart: foreman behavior. An idle agent waits for an opportunity to become a foreman. Given one, it determines optimal job and team selections and sends offer messages to worker agents. If the needed skills cannot be allocated (or recruited), or team formation fails, the agent returns to idle. Otherwise the foreman works on the job; an agent failure or a new requirement sends it back to recruiting. When the foreman's work is complete and the job is complete, the foreman receives job credit and becomes idle again.)
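A rough translation of the foreman flowchart into code; every name here (`world`, `agent`, and their methods) is hypothetical, and the loop structure follows the boxes above rather than any published implementation:

```python
def foreman_round(agent, world):
    """One pass through the foreman flowchart (all names illustrative)."""
    if not world.foreman_opportunity(agent):
        return                                        # stay idle this round
    job, team = agent.select_job_and_team()           # job + team heuristics
    world.send_offers(agent, job, team)               # offer messages to workers
    if not world.can_allocate_skills(job, team):
        return                                        # recruiting failed: idle
    while not agent.own_work_complete(job):
        agent.work_on(job)
        if world.agent_failed_or_new_requirement(job, team):
            world.send_offers(agent, job, team)       # recruit replacements
            if not world.can_allocate_skills(job, team):
                return                                # job abandoned: idle
    if job.complete():
        agent.receive_credit(job)                     # paid when the job finishes
```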


Worker flowchart


(Flowchart: worker behavior. An idle agent waits until it receives and accepts a new offer, then responds with an accept message. If team formation fails, it returns to idle. Otherwise it works on the job; if the team fails, it returns to idle. When the job completes, the worker receives job credit and becomes idle again.)
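And the matching sketch for the worker flowchart, under the same caveat that every name is illustrative:

```python
def worker_round(agent, world):
    """One pass through the worker flowchart (all names illustrative)."""
    offer = world.next_offer(agent)
    if offer is None or not agent.accepts(offer):
        return                                 # stay idle this round
    agent.respond_accept(offer)                # accept message to the foreman
    if not world.team_formed(offer):
        return                                 # formation failed: back to idle
    job = offer.job
    while not job.complete():
        agent.work_on(job)
        if world.team_failed(job):
            return                             # team broke up: back to idle
    agent.receive_credit(job)
```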