Agent-Oriented Techniques for Programming Robots

Hans-Dieter Burkhard, Humboldt University Berlin

H.D. Burkhard, HU Berlin: AOT for Programming Robots, Durres, Sept. 10, 2008

What is an Agent?

Someone who acts autonomously on behalf of others

• Sales agent
• Insurance agent
• Undercover agent
• …

Software Agents

• Assistance systems
• Search engines
• ChatterBots
• …


Open Systems

Definition (Hewitt)

• Continuous availability
• Extensibility
• Decentralized control
• Asynchronous work
• Inconsistent information
• Arm's-length relationships

Consider: P2P

Agents arrived with open systems


What is an Agent?

A program that acts autonomously on behalf of its user

Further attributes: intelligent, social, reactive, proactive, adaptive, …

An agent is a long-running program whose work can be meaningfully described as the autonomous completion of orders or goals while interacting with the environment.

AI as research on intelligent agents.

(cf. Textbook Russell/Norvig: Artificial Intelligence)


Agents (Autonomous Systems) in Real World

• Natural language understanding
• Image interpretation
• Driver assistance systems
• Traffic control
• Space exploration
• Autonomous robots:
  – Service robots
  – Rescue robots
  – Entertainment robots
  – Industrial robots
  – Agricultural robots
  – …


Autonomous Systems in Real World

Robot soccer as testbed

(How to build and program soccer robots?)

Annual world championships and conference

Long-term goal: play like the FIFA champion in 2050

Robot “Vision” from Team Osaka


Chess vs. Soccer

Chess:
• Static
• 3 minutes per move
• Single action
• Single player
• Information:
  • reliable
  • complete

1997: Deep Blue wins against human champion Kasparov

Soccer:
• Dynamic
• Milliseconds
• Sequences of actions
• Team
• Information:
  • unreliable
  • incomplete

Robot “Nao” from Aldebaran


RoboCup

Melbourne 2000 Bremen 2006


Service Robots

“Willie, bring me a beer”

Alternatives:

- from the refrigerator
- from the cellar
- from the neighbor
- from the shop
- from the internet
- …

Which alternative to choose?

What else is needed (glass, …)?


Robot Needs a World Model

Facts about the world
– maps, positions of objects, descriptions, …

Methods for processing sensory inputs
– language processing, image processing

Methods for integrating sensory data
– new world model from old model and new sensory data

Memory of environment: part of the state in the program

“There was a beer in the refrigerator.”


World Model

Problems:

Environment is only partially observable

Observations are uncertain and noisy

Scene interpretation with Bayesian methods, e.g. the probability of being at location s given an observation z: P(s|z) = P(z|s)·P(s) / P(z)
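The Bayes rule for P(s|z) can be applied to a small, discrete set of candidate locations. The following is only a minimal sketch: the room names, the prior, and the sensor likelihoods are hypothetical values chosen for illustration, not data from the talk.

```python
def bayes_update(prior, likelihood):
    """Return the posterior P(s|z) for every location s.

    prior:      dict mapping location s -> P(s)
    likelihood: dict mapping location s -> P(z|s) for the observation z just made
    """
    # P(z) = sum over all s of P(z|s) * P(s), the normalizing constant
    evidence = sum(likelihood[s] * prior[s] for s in prior)
    if evidence == 0:
        raise ValueError("observation has zero probability under the prior")
    return {s: likelihood[s] * prior[s] / evidence for s in prior}


# Hypothetical example: the robot is in one of three rooms and sees a refrigerator.
prior = {"kitchen": 0.2, "cellar": 0.3, "hall": 0.5}
likelihood = {"kitchen": 0.9, "cellar": 0.1, "hall": 0.05}  # P(sees fridge | room)
posterior = bayes_update(prior, likelihood)
most_likely = max(posterior, key=posterior.get)
```

Although the prior favors the hall, the observation shifts most of the belief to the kitchen. Repeating the update with each posterior as the next prior gives a simple way to integrate a sequence of noisy observations.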


World Model

The world model need not be true knowledge, only the belief of the agent.

“Someone took the beer from the refrigerator!”

Plans may fail. Methods for revision are needed.


Memory of Commitments

Tasks/Goals: Desired world states

Plans: sequences of actions

Rationality: Agents should only pursue goals/plans that can be achieved

“Why did I go to the refrigerator?”

Commitments: part of the state in the program


Goal Oriented Agents

Deliberation: Select goal to achieve

e.g. by calculating utilities

Means-ends reasoning: Planning method

e.g. by search in the action space

Rationality needs measures of success/quality/benefits.

“Bounded rationality”: success with respect to the available resources (information, time, …)
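Means-ends reasoning by search in the action space can be illustrated with a breadth-first planner. This is a sketch under stated assumptions: the state encoding `(location, fridge_open, has_bottle)` and the three action functions are hypothetical, and breadth-first search is just one possible search strategy.

```python
from collections import deque


def plan(start, goal_test, actions):
    """Breadth-first search in the action space.

    start:     hashable initial state
    goal_test: function state -> bool
    actions:   dict name -> function(state) -> successor state, or None if not applicable
    Returns the shortest list of action names reaching a goal state, or None.
    """
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if goal_test(state):
            return path
        for name, apply_action in actions.items():
            successor = apply_action(state)
            if successor is not None and successor not in visited:
                visited.add(successor)
                frontier.append((successor, path + [name]))
    return None


# Hypothetical actions; a state is (location, fridge_open, has_bottle).
def go_to_refr(s):
    return ("refr", s[1], s[2]) if s[0] != "refr" else None


def open_refr(s):
    return (s[0], True, s[2]) if s[0] == "refr" and not s[1] else None


def take_bottle(s):
    return (s[0], s[1], True) if s[0] == "refr" and s[1] and not s[2] else None


actions = {"go_to_refr": go_to_refr, "open_refr": open_refr, "take_bottle": take_bottle}
result = plan(("room", False, False), lambda s: s[2], actions)
```

The planner returns `None` when no action sequence reaches the goal, which is the signal that the selected goal is not feasible and that goal selection has to be revised.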


Utility Estimations

Different options o
Achievable by different plans p
With different results r

Value of result r: $v(r)$
Probability of achieving r using plan p: $P(r \mid p)$
Utility of plan p (expectation): $u(p) = \sum_{r \text{ result of } p} P(r \mid p) \cdot v(r)$
Utility of option o: $u(o) = \max\{\, u(p) \mid p \text{ plan for } o \,\}$

Decision process (used for simulated soccer player ATH98):
Estimate utilities for options o
Select best option o as goal g
Build plan p for g
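The utility definitions translate almost directly into code. The sketch below is not the ATH98 implementation; the option names, plan names, probabilities, and values are hypothetical numbers that only illustrate the expectation and the maximum.

```python
def plan_utility(outcomes):
    """Expected utility u(p) of one plan.

    outcomes: list of (probability, value) pairs, one per possible result r,
              i.e. (P(r|p), v(r)).
    """
    return sum(prob * value for prob, value in outcomes)


def option_utility(plans):
    """u(o): the maximum of u(p) over all plans p for option o."""
    return max(plan_utility(outcomes) for outcomes in plans.values())


def select_goal(options):
    """Select the option with the highest utility as the goal."""
    return max(options, key=lambda o: option_utility(options[o]))


# Hypothetical numbers: option -> plan -> list of (P(r|p), v(r))
options = {
    "shoot": {"direct": [(0.3, 10.0), (0.7, -1.0)]},
    "pass": {
        "short": [(0.8, 3.0), (0.2, -2.0)],
        "long": [(0.4, 6.0), (0.6, -2.0)],
    },
}
goal = select_goal(options)
```

Only the goal is selected here; building a concrete plan for it is a separate step, as in the decision process listed on the slide.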


Rationality (Realism): Goals must be feasible

Selection process:

1. Rough estimation (utilities)

2. In case of failure in means-ends reasoning (planning): revision of goal selection


Refinement of Goals

Refinement as iterated decision process:

Long-term goal → intermediate goals → ... → intermediate goals → actions

Analogy: Stack of procedure calls

Least commitment: Specification only as far as necessary.


Maintaining Multiple Goals: BDI-Approach

Belief (world model)

Desire (desirable future world states)

Intentions (world states to be achieved)

Desires may be in conflict

Intentions must not be in conflict (rationality)

Mental states based on models of human acting (especially w.r.t. bounded rationality)

M.E. Bratman: Intentions, Plans, and Practical Reason, Harvard University Press, Massachusetts, 1987.


Adaptation vs. Stability

Conflicts between old intentions and potential new intentions (desires)

Adaptation: always select the best intentions

Stability: continue old intentions

Advantages of stability:

Reliability (important for cooperation)

Reduce overhead for changes

Avoid oscillations

Disadvantages of stability:

Sticking too long to unsatisfactory behavior (fanaticism)

“There is a beer on the table!”


BDI: Screen of Admissibility

Bratman’s solution for conflicts between old and potential new intentions:

Old intentions restrict admissibility of new intentions, i.e. set a filter for

- additional intentions
- refinement of intentions

Efficiency:

Reduce repeated evaluation of adopted intentions.

Bounded Rationality
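The filtering idea can be sketched in a few lines. This is an illustration under a simplifying assumption, not Bratman's formal account: two intentions are taken to conflict exactly when they need the same exclusive resource, and all names and utilities are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intention:
    name: str
    utility: float
    resources: frozenset  # exclusive resources needed (assumed conflict model)


def admissible(candidate, adopted):
    """A candidate passes the screen if it needs no resource of an adopted intention."""
    return all(candidate.resources.isdisjoint(old.resources) for old in adopted)


def adopt(desires, adopted):
    """Extend the adopted intentions with admissible desires, best utility first.

    Adopted intentions are not re-evaluated here; they only act as a filter.
    """
    result = list(adopted)
    for desire in sorted(desires, key=lambda d: d.utility, reverse=True):
        if desire not in result and admissible(desire, result):
            result.append(desire)
    return result


adopted = [Intention("serve beer", 5.0, frozenset({"gripper"}))]
desires = [
    Intention("clean table", 8.0, frozenset({"gripper"})),
    Intention("report status", 1.0, frozenset({"speaker"})),
]
names = [i.name for i in adopt(desires, adopted)]
```

The desire with the highest utility is rejected because it conflicts with an intention that is already adopted. That is the stability side of the trade-off; a purely adaptive agent would drop the old intention instead.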


BDI Agents

BDI architectures widely used

Implementation in different variations

Often only in a simplified manner:

desire = goal

intention = plan

without parallel intentions


Putting Together: Sense-think-act Cycle

Logical ordering of the agent's internal processing

1. Sense (“input”) + perception (interpretation, world model)

2. Think (“decision”: evaluation, planning)

3. Act (“output”)

[Figure: cycle sense → think → act]
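The three steps of the cycle can be written as a sequential loop. The sketch below is only an illustration: the `Environment` class is a hypothetical stand-in for real sensors and actuators, and the decision rule is deliberately trivial.

```python
class Agent:
    """Minimal sense-think-act agent with a persistent world model (belief)."""

    def __init__(self):
        self.world_model = {}

    def sense(self, percept):
        # Perception: integrate new sensory data into the old world model.
        self.world_model.update(percept)

    def think(self):
        # Decision: choose an action based on the current belief.
        return "turn" if self.world_model.get("obstacle_ahead") else "forward"

    def act(self, action, environment):
        # Output: hand the chosen action to the actuators.
        environment.apply(action)


class Environment:
    """Toy environment that replays a list of percepts and logs the actions."""

    def __init__(self, percepts):
        self.percepts = list(percepts)
        self.log = []

    def read_sensors(self):
        return self.percepts.pop(0)

    def apply(self, action):
        self.log.append(action)


def run(agent, environment, cycles):
    for _ in range(cycles):
        agent.sense(environment.read_sensors())  # 1. sense
        action = agent.think()                   # 2. think
        agent.act(action, environment)           # 3. act


env = Environment([{"obstacle_ahead": False}, {"obstacle_ahead": True}, {}])
run(Agent(), env, 3)
```

In the third cycle the sensors deliver nothing new, so the agent still acts on its remembered belief that an obstacle is ahead. This shows the world model as part of the program state, and also why such a belief may be outdated.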


Sense-think-act Cycle

Synchronisation (sequential)

[Figure: timeline of sense, think, and act between input and output]


Sense-think-act Cycle

Synchronisation (concurrent)

[Figure: timeline of sense, think, and act between input and output]


Sense-think-act Cycle

Synchronisation problems

[Figure: timeline of sense, think, and act between input and output, marked with “?”]

For complicated deliberation processes


Different Deliberation Times

Layered architectures with different deliberation cycles, e.g.
- Immediate reactions (avoid obstacles)
- Short-term planning
- Long-term planning

AIBO: 30 images per second, 125 motor commands per second
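One simple way to give layers different deliberation cycles is to run all of them from a single base loop and let each layer fire only every few ticks. This is a sketch under assumptions: the layer names follow the list above, but the periods are hypothetical and not taken from any robot.

```python
class Layer:
    """A layer that is executed once every `period` base ticks."""

    def __init__(self, name, period):
        self.name = name
        self.period = period
        self.runs = 0

    def due(self, tick):
        return tick % self.period == 0

    def step(self):
        # Placeholder for the layer's real work (reacting, planning, ...).
        self.runs += 1


def run_layers(layers, ticks):
    """Call each layer at its own rate within one loop of base ticks."""
    for tick in range(ticks):
        for layer in layers:
            if layer.due(tick):
                layer.step()


layers = [
    Layer("immediate_reactions", 1),    # every tick
    Layer("short_term_planning", 10),   # every 10th tick (assumed)
    Layer("long_term_planning", 100),   # every 100th tick (assumed)
]
run_layers(layers, 200)
run_counts = [layer.runs for layer in layers]
```

A single loop like this avoids concurrency, but a slow layer then delays the fast ones within its tick. Running layers in separate threads removes that delay at the price of the synchronization problems named on the previous slides.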


Structures: Layered Architectures

Synchronization

Conflicts

Concurrency

[Figure: agent with layers 1 to n between sense and act, connected to the environment]


Layered Architectures with Mediator

[Figure: agent with layers 1 to n and a mediator between sense and act, connected to the environment]


1-Pass-Architecture

[Figure: agent with layers 1 to n between sense and act, connected to the environment]


2-Pass-Architecture

[Figure: agent with layers 1 to n between sense and act, connected to the environment]


How to Deal with Dynamic World

Changing situations

Changing expectations

Unexpected situations (e.g. obstacles)

Changing plans

Conflict handling by BDI-approach

Least Commitment: Deliberate as far as necessary

Double pass architecture (DPA)

Plans may fail. Methods for revision are needed.


Option Hierarchies

- Serve beer
  - Get bottle
    - from Refr.
      - Go to Refr.
      - Open Refr.
      - Take Bottle
    - from Shop
      - Get Money
      - Go to Shop
      - Buy Bottle
      - Go home
  - Get glass
  - Open bottle
  - Fill glass
  - Bring Glass

“And-branches”: all suboptions have to be achieved

“Or-branches” (alternatives): one suboption has to be achieved
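An option hierarchy with and-branches and or-branches can be represented as a small tree type. The sketch below covers only the “Get bottle” part of the example. Which nodes are and-nodes and which are or-nodes is an interpretation of the slide, and the set of feasible actions is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Option:
    name: str
    kind: str = "action"  # "and", "or", or "action" (leaf)
    children: list = field(default_factory=list)


def achievable(option, feasible_actions):
    """And-node: all suboptions must be achievable; or-node: at least one."""
    if option.kind == "action":
        return option.name in feasible_actions
    results = [achievable(child, feasible_actions) for child in option.children]
    return all(results) if option.kind == "and" else any(results)


from_refr = Option("from Refr.", "and", [
    Option("Go to Refr."), Option("Open Refr."), Option("Take Bottle"),
])
from_shop = Option("from Shop", "and", [
    Option("Get Money"), Option("Go to Shop"), Option("Buy Bottle"), Option("Go home"),
])
get_bottle = Option("Get bottle", "or", [from_refr, from_shop])

# Hypothetical situation: the refrigerator is empty, so "Take Bottle" is not feasible.
feasible = {"Go to Refr.", "Open Refr.", "Get Money", "Go to Shop", "Buy Bottle", "Go home"}
bottle_possible = achievable(get_bottle, feasible)
```

“Get bottle” stays achievable as long as one alternative is fully feasible, which is what makes or-branches useful for repairing a failed plan.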


Intention Tree

Options may be in different states, e.g.

- intended
- active
- done

[Figure: option tree for “Serve beer” with both alternatives for “Get bottle” (from Refr., from Shop) and their suboptions]


Intention Tree

[Figure: intention tree for “Serve beer” restricted to the alternative “from Refr.” (Go to Refr., Open Refr., Take Bottle); options are marked as intended, active, or done]


Activation Path

Part of intention tree

[Figure: intention tree for “Serve beer” with the alternative “from Refr.” (Go to Refr., Open Refr., Take Bottle); options are marked as intended, active, or done]


Plan Fails

No beer inside

Need for re-deliberation: look for alternatives

[Figure: intention tree for “Serve beer” with the alternative “from Refr.” (Go to Refr., Open Refr., Take Bottle)]


Repair: Intention Tree

Re-deliberation not by chronological backtracking

[Figure: intention tree for “Serve beer” with both alternatives for “Get bottle” (from Refr., from Shop) and their suboptions]


Double Pass Architecture (DPA)

2 passes:

- Deliberation determines the intention tree
  - modification if necessary (re-deliberation)
- Executor works over the intention tree
  - maintains the activation path (top-down processing)
  - controls actuators

Advantages over stack-oriented approaches:

A procedure stack has access only to the most recent call

Implementations: XABSL, DPA
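The interplay of the two passes can be sketched with a toy intention tree. This is not the XABSL or DPA implementation. The states “idle” and “failed”, the `try_action` callback, and the shortened beer example are assumptions made for illustration.

```python
class Node:
    def __init__(self, name, kind="action", children=()):
        self.name = name
        self.kind = kind          # "and", "or", or "action"
        self.children = list(children)
        self.state = "idle"       # idle / intended / active / done / failed
        self.choice = None        # selected alternative of an or-node


def deliberate(node):
    """Deliberation pass: mark intentions and pick one viable alternative per or-node."""
    if node.state in ("done", "failed"):
        return node.state == "done"
    if node.kind == "or":
        node.choice = next((alt for alt in node.children if deliberate(alt)), None)
        ok = node.choice is not None
    elif node.kind == "and":
        ok = all(deliberate(child) for child in node.children)
    else:
        ok = True
    if not ok:
        node.state = "failed"
    elif node.state == "idle":
        node.state = "intended"
    return ok


def execute_step(node, try_action, path):
    """Executor pass: descend along the activation path and run one leaf action."""
    path.append(node.name)
    node.state = "active"
    if node.kind == "action":
        node.state = "done" if try_action(node.name) else "failed"
    elif node.kind == "or":
        # If the chosen alternative fails, the or-node stays active so that
        # the next deliberation pass can select another alternative.
        if execute_step(node.choice, try_action, path) == "done":
            node.state = "done"
    else:  # and-node: continue with the first suboption that is not done yet
        child = next(c for c in node.children if c.state != "done")
        if execute_step(child, try_action, path) == "failed":
            node.state = "failed"
        elif all(c.state == "done" for c in node.children):
            node.state = "done"
    return node.state


def run(root, try_action):
    """Alternate both passes until the root is done or no alternative is left."""
    trace = []
    while root.state != "done" and deliberate(root):
        path = []
        execute_step(root, try_action, path)
        trace.append(path[-1])  # the leaf action at the end of the activation path
    return trace


def leaves(*names):
    return [Node(name) for name in names]


def build_tree():
    return Node("Serve beer", "and", [
        Node("Get bottle", "or", [
            Node("from Refr.", "and", leaves("Go to Refr.", "Open Refr.", "Take Bottle")),
            Node("from Shop", "and", leaves("Get Money", "Go to Shop", "Buy Bottle", "Go home")),
        ]),
        Node("Open bottle"),
    ])


root = build_tree()
trace = run(root, lambda name: name != "Take Bottle")  # no beer inside
```

When “Take Bottle” fails, only the alternative below “Get bottle” is replaced. The rest of the tree keeps its state and the earlier actions are not undone, which is the difference from chronological backtracking on a procedure stack.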


Still: Classical Approach (“Dualism”)

Robot = Agent (Brain) augmented by Sensors + Actuators

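[Figure: the robot consists of sensors, an agent (program), and actuators; the environment feeds the sensors, which provide input to the agent, and the agent's output drives the actuators, which act on the environment]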


Limitations for Complex Actuators

Vehicles have simpler actuation than legged robots

Vehicles:
• Accelerate
• Drive
• Turn
• Stop

Legged robots:
• Coordination of limbs
• Complex kinematics
• Stability maintenance (even in stop state)


Machine Learning

Use “trial and error”.

• Evolutionary algorithms
• Reinforcement learning
• Case-based reasoning
• Neural networks

http://www.robocup.de/AT-Humboldt/simloid-evo.shtml?de


Proprioception: Feeling One's Own Body


Biologically Inspired Robotics

Emergent behavior using situatedness in physical world

Intelligence emerges by “clever connections”

New insights for Artificial Intelligence: Intelligence needs a body for experiencing the real world.

Many sensors

Local processing

Coupling with actuators

Neural Networks


Acceleration Sensors on Our Robots

Accelboards:
• real time (10 ms cycle)
• C/Assembler program
• local processing

[Figure: robot with Accelboards labeled ABHL, ABML, ABAL, ABSR, ABFL, ABAR, ABHR, ABFR]


Recent Experiments

Local control by recurrent neural networks

Networks developed by evolution


See you at RoboCup 2009 in Graz!

Thank you!