DARPA ITO/MARS Project Update Vanderbilt University

DARPA ITO/MARS Project Update Vanderbilt University A Software Architecture and Tools for Autonomous Robots that Learn on Mission K. Kawamura, M. Wilkes, R. A. Peters II, D. Gaines Vanderbilt University Center for Intelligent Systems http://shogun.vuse.vanderbilt.edu/CIS/IRL / 12 January 2000


Page 1: DARPA ITO/MARS Project Update Vanderbilt University

DARPA ITO/MARS Project Update

Vanderbilt University

A Software Architecture and Tools for Autonomous Robots that Learn on Mission

K. Kawamura, M. Wilkes, R. A. Peters II, D. Gaines

Vanderbilt University Center for Intelligent Systems

http://shogun.vuse.vanderbilt.edu/CIS/IRL/

12 January 2000

Page 2: DARPA ITO/MARS Project Update Vanderbilt University

Vanderbilt MARS Team

• Kaz Kawamura, Professor of Electrical & Computer Engineering. MARS responsibility - PI, Integration

• Dan Gaines, Asst. Professor of Computer Science. MARS responsibility - Reinforcement Learning

• Alan Peters, Assoc. Professor of Electrical Engineering. MARS responsibility - DataBase Associative Memory, Sensory EgoSphere

• Mitch Wilkes, Assoc. Professor of Electrical Engineering. MARS responsibility - System Status Evaluation

• Jim Baumann, Nichols Research. MARS responsibility - Technical Consultant

Sponsoring Agency: Army Strategic Defense Command

Page 3: DARPA ITO/MARS Project Update Vanderbilt University

A Software Architecture and Tools for Autonomous Mobile Robots That Learn on Mission

NEW IDEAS:

• Learning with a DataBase Associative Memory

• Sensory EgoSphere

• Attentional Network

• Robust System Status Evaluation

IMPACT:

• Mission-level interaction between the robot and a Human Commander.

• Enable automatic acquisition of skills and strategies.

• Simplify robot training via intuitive interfaces - program by example.

SCHEDULE:

Year 1, Year 2: IMA agents and schema; Learning algorithms; Test Demo; Final Demo; Demo III

GRAPHIC:

[Figure: IMA agent diagram with nodes COMM, LEARNING, CMDR, SQUAD 1, SQUAD 2, ..., SQUAD N, SELF, ENVIR.]

Page 4: DARPA ITO/MARS Project Update Vanderbilt University

Project Goal

Develop a software control system for autonomous mobile robots that can:

1. accept mission-level plans from a human commander,

2. learn from experience to modify existing behaviors or to add new behaviors, and

3. share that knowledge with other robots.

Page 5: DARPA ITO/MARS Project Update Vanderbilt University

Project Approach

• Use the Intelligent Machine Architecture (IMA) to map the problem to a set of agents.

• Develop System Status Evaluation (SSE) for self-diagnosis and to assess task outcomes for learning.

• Develop learning algorithms that use and adapt prior knowledge and behaviors and acquire new ones.

• Develop Sensory EgoSphere, behavior and task descriptions, and memory association algorithms that enable learning on mission.

Page 6: DARPA ITO/MARS Project Update Vanderbilt University

MARS Project: The Robots

• ISAC

• HelpMate

• ATRV-Jr.

Page 7: DARPA ITO/MARS Project Update Vanderbilt University

The IMA Software Agent Structure of a Single Robot

[Figure: agents connected through IMA: Communications Agent, Act./Learning Agent, Commander Agent, Squad Agent 1, Squad Agent 2, ..., Squad Agent n, Self Agent, Environment Agent.]

Page 8: DARPA ITO/MARS Project Update Vanderbilt University

Robust System Status Analysis

• Timing information from communication between components and agents will be used.

• Timing patterns will be modeled.

• Deviations from normal indicate “discomfort.”

• Discomfort measures will be combined to provide system status information.
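The four bullets above can be sketched in a few lines. This is a minimal illustration, not the project's SSE implementation: the Gaussian-style timing model, the `z / (1 + z)` squashing, and the weighted-mean combination are all assumptions, and the names `TimingModel` and `system_status` are hypothetical.

```python
from statistics import mean, stdev


class TimingModel:
    """Models the normal update delay of one component (assumed: mean and std dev)."""

    def __init__(self, normal_delays_ms):
        self.mu = mean(normal_delays_ms)
        self.sigma = stdev(normal_delays_ms)

    def discomfort(self, delay_ms):
        """Score in [0, 1): 0 for a typical delay, approaching 1 for large deviations."""
        z = abs(delay_ms - self.mu) / max(self.sigma, 1e-9)
        return z / (1.0 + z)


def system_status(discomforts, weights=None):
    """Combine per-component discomfort values with a weighted mean (assumed rule)."""
    weights = weights or {name: 1.0 for name in discomforts}
    total = sum(weights[name] for name in discomforts)
    return sum(weights[name] * d for name, d in discomforts.items()) / total


# Usage: train on delays observed during normal operation, then score new delays.
arm = TimingModel([10, 12, 11, 9, 10, 11])
status = system_status({"arm": arm.discomfort(50), "hand": 0.0})
```

A weighted mean keeps the combined status on the same 0-to-1 scale as the individual measures; a maximum over components would be an equally plausible choice if a single failing component should dominate.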

Page 9: DARPA ITO/MARS Project Update Vanderbilt University

What Do We Measure?

• Visual Servoing Component

– error vs. time

• Arm Agent

– error vs. time, proximity to unstable points

• Camera Head Agent

– 3D gaze point vs. time

• Tracking Agent

– target location vs. time

• Vector Signals/Motion Links

– log when data is updated
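Logged update times can be turned into an update-delay histogram with 10 ms bins. The sketch below assumes timestamps in milliseconds and clamps long delays into the last bin; the function names are hypothetical and the logging format is not given in the slides.

```python
from collections import Counter


def update_delays(timestamps_ms):
    """Delays between consecutive logged updates, in milliseconds."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]


def delay_histogram(delays_ms, bin_ms=10, n_bins=100):
    """Count delays per bin; delays beyond the last bin are clamped into it."""
    counts = Counter(min(int(d // bin_ms), n_bins - 1) for d in delays_ms)
    return [counts.get(i, 0) for i in range(n_bins)]


# Usage: timestamps logged each time a vector signal is updated.
hist = delay_histogram(update_delays([0, 12, 25, 31, 1500]))
```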

Page 10: DARPA ITO/MARS Project Update Vanderbilt University

[Figure: four update delay histograms, three titled "Update Delay Histogram (Arm Agent)" and one titled "Update Delay Histogram (Hand Agent)". x-axis: Delay (10ms); y-axis: Frequency.]

Page 11: DARPA ITO/MARS Project Update Vanderbilt University

Commander Interface

Page 12: DARPA ITO/MARS Project Update Vanderbilt University

Commander Interface

Page 13: DARPA ITO/MARS Project Update Vanderbilt University

Commander Interface

Page 14: DARPA ITO/MARS Project Update Vanderbilt University

Obstacle Avoidance

Page 15: DARPA ITO/MARS Project Update Vanderbilt University

Planning/Learning Objectives

• Integrated Learning and Planning

– learn skills, strategies and world dynamics

– handle large state spaces

– transfer learned knowledge to new tasks

– exploit a priori knowledge

• Combine Deliberative and Reactive Planning

– exploit predictive models and a priori knowledge

– adapt given actual experiences

– make cost-utility trade-offs

Page 16: DARPA ITO/MARS Project Update Vanderbilt University

Overview of Approach

Page 17: DARPA ITO/MARS Project Update Vanderbilt University

Example: Different Terrains

Page 18: DARPA ITO/MARS Project Update Vanderbilt University

Generate Abstract Map

• Nodes selected based on learned action models

• Each node represents a navigation skill

Page 19: DARPA ITO/MARS Project Update Vanderbilt University

Generate Plan in Abstract Network

• Plan makes cost-utility trade-offs

• Plans updated during execution
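One way to read the abstract-map and plan slides is as a shortest-path search over a graph whose edge costs come from learned action models, followed by a choice among goals that maximizes utility minus cost. The sketch below makes that reading concrete; the graph, costs, utilities, and the "utility minus cost" rule are illustrative assumptions, not details from the slides (which name a Spreading Activation Network as the planner under evaluation).

```python
import heapq


def cheapest_paths(graph, start):
    """Dijkstra's algorithm: returns {node: (cost, path)} for nodes reachable from start."""
    best = {start: (0.0, [start])}
    frontier = [(0.0, start)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if cost > best[node][0]:
            continue  # stale queue entry
        for nxt, edge_cost in graph.get(node, {}).items():
            new_cost = cost + edge_cost
            if nxt not in best or new_cost < best[nxt][0]:
                best[nxt] = (new_cost, best[node][1] + [nxt])
                heapq.heappush(frontier, (new_cost, nxt))
    return best


def best_plan(graph, start, goal_utility):
    """Pick the reachable goal with the largest utility minus travel cost."""
    paths = cheapest_paths(graph, start)
    scored = [(u - paths[g][0], g) for g, u in goal_utility.items() if g in paths]
    net, goal = max(scored)  # raises ValueError if no goal is reachable
    return goal, paths[goal][1], net


# Hypothetical abstract map: edge costs stand in for learned action-model predictions.
graph = {
    "start": {"road": 2.0, "mud": 1.0},
    "road": {"goal_a": 2.0},
    "mud": {"goal_a": 6.0, "goal_b": 3.0},
}
utility = {"goal_a": 10.0, "goal_b": 8.0}
plan = best_plan(graph, "start", utility)

# Updating the plan during execution: revise a cost from experience and replan.
graph["start"]["road"] = 9.0
revised = best_plan(graph, "start", utility)
```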

Page 20: DARPA ITO/MARS Project Update Vanderbilt University

Planning/Learning Status

• Action Model Learning

– adapted MissionLab to allow experimentation (terrain conditions)

– using regression trees to build action models

• Plan Generation

– developed prototype Spreading Activation Network

– using it to evaluate the potential of SAN for plan generation
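To make "regression trees to build action models" concrete, here is a small pure-Python regression tree that predicts the cost of an action from terrain features. It is a generic textbook construction (split on the threshold that minimizes the summed squared error), not the project's code; the features (roughness, slope) and the sample data are hypothetical.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, targets, min_leaf=2, depth=0, max_depth=3):
    """rows: feature tuples; targets: observed action cost for each row."""
    best = None
    if depth < max_depth and len(rows) >= 2 * min_leaf:
        for f in range(len(rows[0])):
            for threshold in sorted({r[f] for r in rows}):
                left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
                right = [t for r, t in zip(rows, targets) if r[f] > threshold]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                score = sse(left) + sse(right)
                if best is None or score < best[0]:
                    best = (score, f, threshold)
    if best is None or best[0] >= sse(targets):
        return {"leaf": mean(targets)}  # no useful split: predict the mean
    _, f, threshold = best
    lo = [i for i, r in enumerate(rows) if r[f] <= threshold]
    hi = [i for i, r in enumerate(rows) if r[f] > threshold]
    return {
        "feature": f,
        "threshold": threshold,
        "left": build_tree([rows[i] for i in lo], [targets[i] for i in lo],
                           min_leaf, depth + 1, max_depth),
        "right": build_tree([rows[i] for i in hi], [targets[i] for i in hi],
                            min_leaf, depth + 1, max_depth),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's mean cost."""
    while "leaf" not in tree:
        side = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["leaf"]


# Hypothetical samples: (terrain roughness, slope) -> traversal time.
rows = [(0.1, 2.0), (0.2, 5.0), (0.15, 3.0), (0.8, 2.0), (0.9, 4.0), (0.85, 1.0)]
times = [1.0, 1.2, 1.1, 3.0, 3.2, 3.1]
model = build_tree(rows, times)
```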

Page 21: DARPA ITO/MARS Project Update Vanderbilt University

Role of ISAC in MARS

• Inspired by the structure of vertebrate brains

• a fundamental human-robot interaction model

• sensory attention and memory association

• learning sensory-motor coordination (SMC) patterns

• learning the attributes of objects through SMC

ISAC is a testbed for learning complex, autonomous behaviors by a robot under human tutelage.

Page 22: DARPA ITO/MARS Project Update Vanderbilt University

System Architecture

[Figure: system architecture diagram. Labels: Human, Robot, Human Agent, Self Agent, Software System, IMA Primitive Agent (nodes marked "A"), Hardware I/O.]

Page 23: DARPA ITO/MARS Project Update Vanderbilt University

Next Up: Peer Agent

We are currently developing the peer agent.

The peer agent encapsulates the robot’s understanding of and interaction with other (peer) robots.

Page 24: DARPA ITO/MARS Project Update Vanderbilt University

System Architecture: High Level Agents

[Figure: high-level agents: human agent, self agent, peer agents, environment agent, object agents.]

Due to the flat connectivity of IMA primitives, all high level agents can communicate directly if desired.

Page 25: DARPA ITO/MARS Project Update Vanderbilt University

Robot Learning Procedure

• The human programs a task by sequencing component behaviors via speech and gesture commands.

• The robot records the behavior sequence as a finite state machine (FSM) and all sensory-motor time-series (SMTS).

• Repeated trials are run. The human provides reinforcement feedback.

• The robot uses Hebbian learning to find correlations in the SMTS and to delete spurious info.
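The Hebbian step can be illustrated with a basic Hebbian rule: the link between two channels is strengthened whenever they are active together, and weak links are then discarded. This is a sketch under assumptions; the slides do not specify the learning rule, normalization, or threshold, and the channel names are hypothetical. With standardized signals and `rate=1.0`, the accumulated weight equals the Pearson correlation.

```python
from statistics import mean, pstdev


def normalize(series):
    """Standardize a time series to zero mean and unit variance."""
    m, s = mean(series), pstdev(series)
    return [(v - m) / s if s > 0 else 0.0 for v in series]


def hebbian_weights(channels, rate=1.0):
    """channels: {name: time series of equal length}. Returns {(a, b): weight}."""
    z = {name: normalize(series) for name, series in channels.items()}
    names = sorted(z)
    n = len(z[names[0]])
    weights = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            w = 0.0
            for t in range(n):
                # Hebbian update: co-active channels strengthen their link.
                w += rate * z[a][t] * z[b][t] / n
            weights[(a, b)] = w
    return weights


def thin(weights, threshold=0.5):
    """Drop weakly correlated (assumed spurious) channel pairs."""
    return {pair: w for pair, w in weights.items() if abs(w) >= threshold}


# Hypothetical sensory-motor time series recorded during one trial.
smts = {
    "grip_force": [0, 1, 2, 3, 4, 5],
    "arm_torque": [1, 3, 5, 7, 9, 11],
    "noise": [3, -1, 2, -2, 1, 0],
}
kept = thin(hebbian_weights(smts))
```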

Page 26: DARPA ITO/MARS Project Update Vanderbilt University

Robot Learning (cont’d)

• The robot extracts task dependent SMC info from the behavior sequence and the Hebbian-thinned data.

• SMC occurs by associating sensory-motor events with behavior nodes in the FSMs.

• The FSM is transformed into a spreading activation network (SAN).

• The SAN becomes a task record in the DataBase Associative Memory (DBAM) and is subject to further refinements.
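The slides do not say how the FSM is converted into a SAN. One simplified possibility, shown below, turns each FSM transition into a link and spreads activation backward from the goal behavior, so that behaviors closer to the goal receive more activation. The link structure, decay factor, and behavior names are assumptions; full spreading activation networks typically use several link types and also spread activation forward from the current state.

```python
def fsm_to_san(transitions):
    """transitions: list of (behavior, next_behavior). Returns {node: [predecessors]}."""
    links = {}
    for src, dst in transitions:
        links.setdefault(dst, []).append(src)
        links.setdefault(src, [])
    return links


def spread(links, goal, decay=0.5, steps=10):
    """Inject activation at the goal and pass a decayed share back to predecessors."""
    activation = {node: 0.0 for node in links}
    activation[goal] = 1.0
    frontier = [goal]
    for _ in range(steps):
        next_frontier = []
        for node in frontier:
            for pred in links[node]:
                candidate = activation[node] * decay
                if candidate > activation[pred]:
                    activation[pred] = candidate
                    next_frontier.append(pred)
        frontier = next_frontier
    return activation


def select(activation, executable):
    """Choose the currently executable behavior with the highest activation."""
    return max(executable, key=lambda b: activation[b])


# Hypothetical task recorded as an FSM: reach -> grasp -> lift -> hand_over.
san = fsm_to_san([("reach", "grasp"), ("grasp", "lift"), ("lift", "hand_over")])
activation = spread(san, goal="hand_over")
choice = select(activation, ["reach", "grasp"])
```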

Page 27: DARPA ITO/MARS Project Update Vanderbilt University

Human Agent: Human Detection

Page 28: DARPA ITO/MARS Project Update Vanderbilt University

Human Agent: Recognition

Page 29: DARPA ITO/MARS Project Update Vanderbilt University

Human Agent: Face Tracking

Page 30: DARPA ITO/MARS Project Update Vanderbilt University

Schedule

YEAR ONE (months 1-12)

• Requirement Analysis/Concept Development

• IMA (A/C) Deployment for HelpMate

• IMA (A/C) Deployment for ATRV-Jr.

• Robust System Status Analysis

• Reinforcement Learning

• Develop Egosphere and DBAM

• Demo Scenario – Simple HR interaction