RoboCup: A Case Study in Multiagent System
Vahid Mokhtari
Contents
Trends
What is an Agent?
Multiagent System
Case study in RoboCup
ERBASE2011
Trends in History of Computing
Ubiquity: As processing capability spreads, sophistication becomes ubiquitous.
Interconnection: Computer systems no longer stand alone, but are networked into large distributed systems.
Intelligence: The complexity of tasks that we are capable of automating and delegating to computers has grown steadily.
Delegation: We are giving control to computers, even in safety-critical tasks.
Human-orientation: Programmers conceptualize and implement software in terms of ever higher-level – more human-oriented – abstractions.
From Programming Perspective
Programming has progressed through
• machine code
• assembly language
• machine-independent programming languages
• procedures & functions
• abstract data types
• objects
• to agents
What is an Agent?
“An agent is an encapsulated computer system, situated in some environment, and capable of autonomous action in that environment in order to meet its design objectives”. Mike Wooldridge
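A minimal sketch of this definition, assuming a thermostat-style agent. The names Thermostat, Room, and run, and the temperature dynamics, are hypothetical and chosen only for illustration. The agent is situated in an environment (the room), perceives it, and acts on its own to meet its design objective (the target temperature).

```python
class Thermostat:
    """Hypothetical agent: acts autonomously to reach a target temperature."""

    def __init__(self, target):
        self.target = target  # the design objective

    def act(self, percept):
        # The agent alone decides what to do, based on what it perceives.
        return "heat_on" if percept < self.target else "heat_off"


class Room:
    """Hypothetical environment: heating warms it, otherwise it cools."""

    def __init__(self, temperature):
        self.temperature = temperature

    def percept(self):
        return self.temperature

    def apply(self, action):
        self.temperature += 1.0 if action == "heat_on" else -0.5


def run(agent, env, steps):
    """Sense-act loop: the agent perceives, decides, and changes the environment."""
    history = []
    for _ in range(steps):
        action = agent.act(env.percept())
        env.apply(action)
        history.append((action, env.temperature))
    return history


history = run(Thermostat(target=20.0), Room(temperature=18.0), steps=4)
```

Nothing outside the agent tells it when to switch the heating; the loop only delivers percepts and carries out the actions the agent selects.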
Agent and Environment
Environment
Accessible vs. Inaccessible - can the agent “see” everything?
Deterministic vs. Non-deterministic - do actions have guaranteed effect?
Static vs. Dynamic - does the environment change on its own?
Discrete vs. Continuous - is the number of actions and percepts finite?
What is an Intelligent Agent?
An intelligent agent is a piece of software that is:
• Reactive – responds to changes in its environment
• Proactive – persistently pursues its goals
• Social – interacts with other agents through:
  • Cooperation
  • Coordination
  • Negotiation
Examples of Intelligent Agents
Assistant agent in MS Office
Trading agents
Web spiders
Computer viruses
Characters in computer games
Agents vs. Objects
Agent
• Autonomous - decides for itself whether to act on a request it receives
• Active - can decide when to act and how
• Each agent has its own thread of control
Object
• Non-autonomous - acts whenever one of its methods is invoked
• Passive - has no control over the execution of its methods
• There is a single thread of control in the system
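The contrast in autonomy can be sketched as follows. CounterObject, CounterAgent, and the limit rule are hypothetical names invented for this example, and the sketch deliberately leaves out the separate threads of control. The object executes whatever method is invoked on it; the agent receives a request and decides itself whether to carry it out.

```python
class CounterObject:
    """Plain object: runs the method whenever a caller invokes it."""

    def __init__(self):
        self.value = 0

    def increment(self, amount):
        self.value += amount
        return self.value


class CounterAgent:
    """Hypothetical agent: receives requests and decides whether to act on them."""

    def __init__(self, limit):
        self.value = 0
        self.limit = limit  # the agent's own constraint, unknown to callers

    def request_increment(self, amount):
        # The decision stays with the agent, not with the requester.
        if self.value + amount > self.limit:
            return "refused"
        self.value += amount
        return "done"


obj = CounterObject()
obj_result = obj.increment(100)        # always executed

agent = CounterAgent(limit=10)
first_reply = agent.request_increment(100)   # the agent may say no
second_reply = agent.request_increment(5)
```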
Agent Oriented Architectures
First Classification
• Reactive – receives input, processes it and produces an output.
• Deliberative – has an internal view of its environment and is able to follow its own plans.
• Hybrid – a mixture of reactive and deliberative that follows its own plans, but also sometimes reacts directly to external events without deliberation.
BDI Agents Architecture
• Beliefs: represent the informational state of the agent.
• Desires: represent objectives or situations that the agent would like to accomplish, for example: find the best price, go to the party, or become rich.
• Intentions: represent the specific goal (or set of goals) that the agent commits to.
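A rough sketch of how beliefs, desires, and intentions might fit together in code. This is not a full BDI interpreter: the Desire and BDIAgent classes, the priority numbers, and the feasibility tests are all assumptions made for illustration. Deliberation here simply commits to the highest-priority desire that the agent believes is currently achievable.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Desire:
    goal: str
    priority: int                       # hypothetical ranking of desires
    feasible: Callable[[dict], bool]    # is the goal achievable given the beliefs?


@dataclass
class BDIAgent:
    beliefs: dict = field(default_factory=dict)
    desires: list = field(default_factory=list)
    intentions: list = field(default_factory=list)

    def perceive(self, percept: dict) -> None:
        # Beliefs: the agent's informational state, updated from percepts.
        self.beliefs.update(percept)

    def deliberate(self) -> None:
        # Intentions: the desire(s) the agent commits to right now.
        candidates = [d for d in self.desires if d.feasible(self.beliefs)]
        if candidates:
            best = max(candidates, key=lambda d: d.priority)
            self.intentions = [best.goal]
        else:
            self.intentions = []


agent = BDIAgent(desires=[
    Desire("become rich", 3, lambda b: b.get("has_investment_plan", False)),
    Desire("go to the party", 2, lambda b: b.get("invited", False)),
    Desire("find the best price", 1, lambda b: True),
])
agent.perceive({"invited": True})
agent.deliberate()
```

With these beliefs the agent commits to "go to the party": "become rich" has a higher priority but is not believed to be achievable yet.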
Agent Types (1)
Collaborative Agents
• emphasise autonomy and cooperation with other agents
• negotiate with their peers to reach mutually acceptable agreements during co-operative problem solving
Mobile Agents
• roam wide area networks
• interact with foreign hosts
• perform tasks on behalf of their owners
• return ‘home’
• collaborative agents that move across networks
Agent Types (2)
Interface Agents
• emphasise autonomy and learning in order to perform tasks for their owners
• support and provide proactive assistance to a user using a particular application
• limited cooperation with other agents
• interface agent = personal assistant = personal digital assistant = personal agent
Information Agents
• manage, manipulate or collate information from many distributed sources on WANs
• information agents = internet agents
Multiagent System (MAS)
A Multiagent System is one that consists of a number of agents, which interact with one another. To interact successfully, they require the ability to cooperate, coordinate, and negotiate with each other.
The Fully General Multiagent System
Agents model each other’s goals, actions, and domain knowledge, which may differ, and interact directly
Why MAS?
• Some domains require it
• Parallelism
• Robustness
• Scalability
• Simpler programming
• To study intelligence
• Geographic distribution
• Cost effectiveness
MAS Research Area
Distributed Computing: Processors share data, but not control. Focus on low-level parallelization and synchronization.
Distributed AI: Control as well as data is distributed. Focus on problem solving, communication, and coordination.
• Distributed Problem Solving (DPS): Task decomposition and/or solution synthesis.
• Multiagent Systems (MAS): Behavior coordination or behavior management.
MAS Taxonomy
Degree of Heterogeneity and Communication [Peter Stone, 2000]
• Heterogeneity: the degree to which different agents play different roles
• Communication: the degree to which the agents communicate
Issues in Building MAS
Agent’s Issues
• Reactive vs. Deliberative agents
• Local vs. Global perspective
• Benevolence vs. Competitiveness
Homogeneous Non-Communicating Multiagent Systems
Several different agents with identical structure (sensors, effectors, domain knowledge, and decision functions).
Different sensor inputs and effector outputs.
Agents are situated differently in the environment and make their own decisions regarding which actions to take.
Heterogeneous Non-Communicating Multiagent Systems
Agents are situated differently in the environment
Different sensory inputs and different actions
Homogeneous Communicating Multiagent Systems
Agents are identical except that they are situated differently in the environment
Agents can communicate with each other directly
Heterogeneous Communicating Multiagent Systems
Different sensory data, goals, actions, and domain knowledge
Learning Opportunities
Problems and Issues
• Modeling other agents
• Agent Cooperation
• Collective intelligence
• Agent interaction
• Teamwork, formation, coordination
• Negotiation
• Reactive vs. deliberative approaches
• Agreement Technologies
• Multiagent Reasoning, Planning, Adaptation
• Resource Management
Importance of MAS
Research in “Distributed AI” started over 30 years ago, but only in the mid-1990s did it become a major research trend in AI.
Now the main conference (AAMAS) attracts around 800 submissions each year, of which 20-25% are accepted. In addition, there are dozens of smaller workshops and conferences.
It is a large, young, and dynamic research community.
RoboCup
Case study in Multiagent System
What is RoboCup?
RoboCup is an international research and education initiative, attempting to foster Artificial Intelligence and Robotics research by providing a standard problem where a wide range of technologies can be integrated and examined.
Real-time sensor fusion
Reactive behavior
Strategy acquisition
Learning
Real-time planning
Multi-agent systems
Context recognition
Vision
Strategic decision-making
Motor control
Intelligent robot control
and many more
Domain Characteristics
Environment: Dynamic
State Change: Real time
Info. Accessibility: Incomplete
Sensor Readings: Non-symbolic
Control: Distributed
The Standard Problem
RoboCup Soccer
The game of football, where the research goals concern cooperative multi-robot and multi-agent systems in dynamic adversarial environments. All robots in this league are fully autonomous.
RoboCup Rescue
A project to promote research and development in disaster rescue at various levels, involving multi-agent teamwork coordination and physical robotic agents for search and rescue.
RoboCup Soccer
Distributed multiagent domain
Teammates and adversaries
Partial world view
Noisy sensors and actuators
Real-time
Applied Machine Learning
The general aim of Machine Learning is to produce intelligent programs, often called agents, through a process of learning and evolving.
Algorithm types
• Supervised learning
• Unsupervised learning
• Reinforcement learning
Reinforcement Learning
Reinforcement Learning (RL) is an area of machine learning, concerned with how an agent ought to take actions in an environment so as to maximize some notion of cumulative reward.
• RL lends itself wonderfully to agent-oriented systems, where agents “have explicit goals, can sense aspects of their environments, and can choose actions to influence their environments”.
The Agent-Environment Interface
Agent and environment interact at discrete time steps: $t = 0, 1, 2, \dots$
• Agent observes state at step $t$: $s_t \in S$
• Produces action at step $t$: $a_t \in A(s_t)$
• Gets resulting reward: $r_{t+1} \in \mathbb{R}$
• And resulting next state: $s_{t+1}$
General Process of RL
The "cause and effect" idea in RL
• The agent observes an input state.
• An action is determined by a decision making function (policy).
• The action is performed.
• The agent receives a scalar reward or reinforcement from the environment.
• Information about the reward given for that state / action pair is recorded.
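The five steps above can be sketched as a loop. The corridor environment, the random placeholder policy, and the names environment_step, policy, and run_episode are assumptions made for this example; no learning takes place yet, the loop only shows where each step happens.

```python
import random

ACTIONS = ("left", "right")


def environment_step(state, action):
    """Hypothetical toy environment: cells 0..4 in a row, reward 1.0 on reaching cell 4."""
    next_state = max(0, min(4, state + (1 if action == "right" else -1)))
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward


def policy(state, rng):
    """Placeholder decision function: picks an action uniformly at random."""
    return rng.choice(ACTIONS)


def run_episode(max_steps=100, seed=0):
    rng = random.Random(seed)
    state = 0                                   # 1. observe the input state
    experience = []
    for _ in range(max_steps):
        action = policy(state, rng)             # 2. the policy determines an action
        next_state, reward = environment_step(state, action)  # 3. perform it, 4. receive reward
        experience.append((state, action, reward))             # 5. record the outcome
        state = next_state
        if state == 4:                          # goal cell reached, episode ends
            break
    return experience


experience = run_episode()
```

A learning algorithm would replace the random policy with one that uses the recorded rewards to prefer better actions.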
Action Selection Policies
ε-greedy
• Most of the time the action with the highest estimated reward, called the greediest action, is chosen. With a small probability ε, an action is selected at random instead.
ε-soft
• The best action is selected with probability 1 - ε, and the rest of the time a random action is chosen uniformly.
softmax
• Softmax assigns a rank or weight to each of the actions according to its action-value estimate. A random action is selected with regard to the weight associated with each action, meaning the worst actions are unlikely to be chosen.
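A small sketch of these policies over a list of action-value estimates. The function names are hypothetical. The softmax weights here use a Boltzmann distribution with a temperature parameter, which is one common choice and an assumption of this example; the slide does not say which weighting is meant. As described above, ε-greedy and ε-soft follow the same selection rule, so one function covers both.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else the greediest one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_probabilities(q_values, temperature=1.0):
    """Boltzmann weights: higher estimates get exponentially more probability."""
    highest = max(q_values)  # subtracted for numerical stability
    exps = [math.exp((q - highest) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_select(q_values, temperature=1.0, rng=random):
    """Draw one action index according to the softmax weights."""
    probs = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


estimates = [0.1, 0.9, 0.3]
greedy_choice = epsilon_greedy(estimates, epsilon=0.0)   # never explores
weights = softmax_probabilities(estimates)
```

Unlike ε-greedy, which treats all non-greedy actions alike when it explores, softmax still prefers the second-best action over the worst one.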
Exploration and Exploitation
Exploiting (Off-policy)
• The action selection policy is so strict that it always chooses the action that has produced the most reward so far.
Exploring (On-policy)
• The policy is relaxed enough to try other actions, which may produce a better reward.
Subtask of RoboCup Soccer
Keepaway Soccer [UT Austin Villa, 2005]
• Keepaway, in which one team, the keepers, tries to maintain possession of the ball within a limited region, while the opposing team, the takers, attempts to gain possession.
SARSA (State-Action-Reward-State-Action)
SARSA is a learning algorithm in the reinforcement learning area of machine learning. It is an on-policy learning method that learns state-action values (Q-values).
$Q_{t+1}(s_t, a_t) \leftarrow Q_t(s_t, a_t) + \alpha [r_{t+1} + \gamma Q_t(s_{t+1}, a_{t+1}) - Q_t(s_t, a_t)]$
• $Q_t(s_t, a_t)$: old value
• $\alpha$ ($0 \le \alpha \le 1$): learning rate
• $\gamma$ ($0 \le \gamma \le 1$): discount factor
• $r_{t+1}$: reward
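As a worked example of the update, here is a one-line function for the SARSA rule with a learning rate (alpha) and a discount factor (gamma). The function name and the numbers are made up for illustration.

```python
def sarsa_update(q_old, reward, q_next, alpha, gamma):
    """New estimate = old value + alpha * (reward + gamma * next value - old value)."""
    return q_old + alpha * (reward + gamma * q_next - q_old)


# Old value 2.0, reward 1.0, value of the next state-action pair 3.0:
# 2.0 + 0.5 * (1.0 + 1.0 * 3.0 - 2.0) = 3.0
new_q = sarsa_update(q_old=2.0, reward=1.0, q_next=3.0, alpha=0.5, gamma=1.0)
```

With a learning rate of 0 the old value is kept unchanged; with a learning rate of 1 it is replaced entirely by the reward plus the discounted next value.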
SARSA Algorithm
Procedural steps
• Initialize the Q-values table, Q(s, a).
• Observe the current state, s.
• Choose an action, a, for that state based on one of the action selection policies.
• Take the action, and observe the reward, r, as well as the new state, s'.
• Choose the next action, a', for the new state using the same action selection policy.
• Update the Q-value for the state-action pair (s, a) using the observed reward and the Q-value of the next pair (s', a'), according to the update formula.
• Set the state to the new state and the action to the next action, and repeat the process until a terminal state is reached.
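A compact sketch of tabular SARSA on a toy problem. This is not the Keepaway setup: the five-cell corridor, the reward of -1 per step, and the names step, choose_action, and sarsa are assumptions made for illustration. Because SARSA is on-policy, the update uses the Q-value of the next action the policy actually chooses, as in the SARSA update formula.

```python
import random
from collections import defaultdict

ACTIONS = ("left", "right")
GOAL = 4


def step(state, action):
    """Hypothetical corridor: cells 0..4, each move costs 1 unless it reaches the goal."""
    next_state = max(0, min(GOAL, state + (1 if action == "right" else -1)))
    done = next_state == GOAL
    reward = 0.0 if done else -1.0
    return next_state, reward, done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def sarsa(episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)                       # Q-table, all values start at 0
    for _ in range(episodes):
        state = 0
        action = choose_action(q, state, epsilon, rng)
        done = False
        while not done:
            next_state, reward, done = step(state, action)
            next_action = choose_action(q, next_state, epsilon, rng)
            # Terminal states have no future value.
            target = reward if done else reward + gamma * q[(next_state, next_action)]
            q[(state, action)] += alpha * (target - q[(state, action)])
            state, action = next_state, next_action
    return q


q = sarsa()
```

After training, comparing q[(s, "right")] with q[(s, "left")] for each cell s shows which direction the learned values favour.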
Mapping Keepaway to SARSA
Episode
• An episode begins when the player is first asked to make a decision and ends when possession of the ball is lost by the keepers.
Actions
• HoldBall()
• PassBall(k)
• GetOpen()
• GoToBall()
• BlockPass(k)
Reward
• The reward $r_i$ is the number of primitive time steps that elapsed while following action $a_{i-1}$: $r_i = t_i - t_{i-1}$.
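A short illustration of this reward definition. The decision times below are invented, and keepaway_rewards is a hypothetical helper. Each reward is the gap between two consecutive decision points, so the undiscounted sum of rewards equals the time elapsed between the first and last decision point of the episode.

```python
def keepaway_rewards(decision_times):
    """Reward for each decision: time steps elapsed since the previous decision."""
    return [t - t_prev for t_prev, t in zip(decision_times, decision_times[1:])]


# Hypothetical time steps at which a keeper was asked to decide,
# with the last entry marking the loss of possession.
decision_times = [0, 3, 4, 9, 15]
rewards = keepaway_rewards(decision_times)   # [3, 1, 5, 6]
episode_length = sum(rewards)                # 15
```

Maximizing the cumulative reward therefore amounts to keeping possession for as long as possible.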
Hand-coded Algorithm
The keepers’ policy space
Result
Learned Agents: keepers hold the ball for about 12 seconds on average
Hand-coded Agents: keepers hold the ball for about 8.2 seconds on average
Summary
Programmers would like to develop software in more human-oriented (agent-oriented) terms
An agent is an intelligent software program which is situated, autonomous, flexible, and robust
A Multiagent System comprises a number of agents that are able to cooperate, coordinate, and negotiate with each other
RoboCup is an international research and education initiative that fosters Artificial Intelligence and Robotics research through a standard problem
Reinforcement Learning is a machine learning approach that is well suited to agent-oriented environments
References
An Introduction to MultiAgent Systems
• Michael Wooldridge
Reinforcement Learning: An Introduction
• Richard S. Sutton and Andrew G. Barto
Other Articles
RoboCup Publications
• http://www.robocup.org/category/papers-publications/
Peter Stone
• http://www.cs.utexas.edu/~pstone/
Thanks For Your Attention