Testbed for Integrating and Evaluating Learning Techniques
Integrating Learning in Interactive Gaming Simulators
David W. Aha1 & Matthew Molineaux2
1 Intelligent Decision Aids Group, Navy Center for Applied Research in AI, Naval Research Laboratory; Washington, DC
2 ITT Industries; AES Division; Alexandria, VA
17 November 2004
TIELT
2Testbed for Integrating and Evaluating Learning Techniques
Outline
1. Motivation: Learning in cognitive systems
2. Objectives:
• Encourage machine learning research on complex tasks that require knowledge-intensive approaches
• Provide industry & military with access to the results
3. Design: TIELT functionality & components
4. Example: Knowledge base content
5. Status:
• Implementation & documentation
• Collaborations & events
• Task list
6. Summary
Thanks to our sponsor:
3Testbed for Integrating and Evaluating Learning Techniques
DARPA
Defense Advanced Research Projects Agency (~$2.3B/yr)
IPTO IXO MTO…
Information Processing Technology Office
• Selected previous achievements
– Timesharing, Internet, Email, Speech Understanding, LISP, …
• Current focus: Cognitive Systems
4Testbed for Integrating and Evaluating Learning Techniques
Cognitive Systems
• A cognitive system is one that
– can reason, using substantial amounts of appropriately represented knowledge
– can learn from its experience so that it performs better tomorrow than it did today
– can explain itself and be told what to do
– can be aware of its own capabilities and reflect on its own behavior
– can respond robustly to surprise
“Systems that know what they’re doing”
5Testbed for Integrating and Evaluating Learning Techniques
Anatomy of a Cognitive Agent
[Figure: anatomy of a cognitive agent (Brachman, 2003). The agent mediates between sensors/perception and effectors/action in an external environment, with reactive, deliberative (prediction, planning, other reasoning), and reflective processes; short- and long-term memory (concepts, sentences); communication (language, gesture, image); affect; attention; and learning throughout.]
6Testbed for Integrating and Evaluating Learning Techniques
Learning in Cognitive Systems (Langley & Laird, 2002)
Capability: Knowledge Container(s)
• Recognition & Categorization: Patterns, Pattern recognizer; Categories, Pattern categorizer
• Decision Making & Choice: Space of possible decisions; Decision selector, Conflict resolver; Decision application procedure
• Perception & Situation Assessment: Situation categories, Situation categorization; Information fuser
• Prediction & Monitoring: Environment model; Monitoring focus
• Problem Solving & Planning: Plans, Plan generator (e.g., search method); Plan adaptor
• Reasoning & Belief Maintenance: Beliefs & belief relations; Inferencing knowledge and procedures
• Execution & Action: Action executer; Action utility; Action preconditions; Action effects; Resource allocator
• Interaction & Communication: NL interpretation; Dialogue coordination
• Remembering & Reflection: Recall procedure; Explanation generation

Many opportunities exist for learning in cognitive systems
7Testbed for Integrating and Evaluating Learning Techniques
Problem
Status of Learning in Cognitive Systems
Few deployed cognitive systems integrate techniques that exhibit rapid & enduring learning behavior on complex tasks
– It’s costly to integrate & evaluate embedded learning techniques
Complication
Machine learning (ML) researchers tend to investigate:
¬ Rapid: knowledge-poor algorithms
¬ Enduring: learning over a short time period
¬ Embedded: stand-alone evaluations
8Testbed for Integrating and Evaluating Learning Techniques
TIELT Motivation
We want cognitive agents that learn
• rapidly,
• in context, and
• over the long term.
We have few (if any) of them
9Testbed for Integrating and Evaluating Learning Techniques
TIELT Objective
Encourage research on learning in cognitive systems, with subsequent transition goals
[Figure: ML researchers contribute learning modules to cognitive agents, yielding cognitive agents that learn, which transition to the military and industry.]
10Testbed for Integrating and Evaluating Learning Techniques
Current ML Research Focus
Benchmark studies of multiple algorithms on simple (e.g., supervised) learning tasks from many static datasets
[Figure: an ML researcher runs ML System1…n over Database1…m, producing m results per system, followed by benchmark analysis.]
This was encouraged (in part) by the availability of datasets in a standard (interface) format
11Testbed for Integrating and Evaluating Learning Techniques
Previous API for ML Investigations
[Figure: Databasei → Interface (standard format, e.g., UCI Repository) → Supervised Learning ML Systemj → Decision Systemk]
Inspiration
UC Irvine Repository of Machine Learning (ML) Databases
• An interface for empirical benchmarking studies on supervised learning
• 1525 citations (and many publications use it without citing) since 1986
Limitation
• Only useful for isolated ML studies
• Has not encouraged studies of ML in cognitive systems
12Testbed for Integrating and Evaluating Learning Techniques
Accomplishing TIELT’s Objective
One approach: Shift ML research focus from static datasets to dynamic simulators of rich environments
[Figure: the database-to-ML-system pipeline above (Databasei → Interface (standard format, e.g., UCI Repository of ML Databases) → Supervised Learning ML Systemj → Decision Systemk) is replaced by: Worldi (Simulated/Real) ↔ Sensors/Effectors ↔ Interface (standard API, e.g., TIELT) ↔ Cognitive Learning Decision Systemk with embedded ML Modulej.]
13Testbed for Integrating and Evaluating Learning Techniques
Refining TIELT’s Objective
Objective
Develop a tool for evaluating decision systems in simulators
– Specific support for evaluating learning techniques
– Demonstrate research utility prior to approaching industry/military
Benefits
1. Reduces system-simulator integration costs from m*n to m+n (see next)
2. Permits benchmark studies on selected simulator tasks
3. Encourages study of ML for knowledge-intensive problems
4. Provides support for DARPA Challenge Problems on Cognitive Learning
14Testbed for Integrating and Evaluating Learning Techniques
Reducing Integration Costs
Integrating a simulator & cognitive system: It’s expensive! (time, $)
Simulator1 Cognitive System1
Problem: Prohibitive integration costs retard research progress
m*n integrations
Simulator1
Simulatorm
Cognitive System1
Cognitive Systemn
......
Proposed Solution: Standardize integrations to reduce costs
m+n integrations (e.g., m = 5 simulators and n = 10 cognitive systems require only 5 + 10 = 15 integrations with TIELT, versus 5 × 10 = 50 pairwise)
Simulator1
Simulatorm
Cognitive System1
Cognitive Systemn
TIELT......
15Testbed for Integrating and Evaluating Learning Techniques
What Domain?
Desiderata
1. Available implementations (cheap to acquire & run)
2. Challenging problems for CogSys/ML research
3. Significant interest (academia, military, industry, funding, public)
Simulation Games?
16Testbed for Integrating and Evaluating Learning Techniques
Gaming Genres of Interest (modified from Laird & van Lent, 2001)
• Action. Example: Quake, Unreal. Description: control a character. Sub-genres: 1st vs. 3rd person, solo vs. team play. AI roles: control enemies.
• Role-Playing. Example: Temple of Elemental Evil. Description: be a character (includes puzzle solving, etc.). Sub-genres: solo vs. (massively) multi-player. AI roles: control enemies, partners, and supporting characters.
• Strategy (real-time, discrete). Example: Empire Earth 2, AoE, Civilization. Description: controlling at multiple levels (e.g., strategic, tactical warfare). Sub-genres: God, first-person perspectives. AI roles: control all units and strategic enemies.
• Individual Sports. Example: many (e.g., driving games). Description: individual competition. Sub-genres: 1st vs. 3rd person. AI roles: control enemy.
• Team Sports. Example: Madden NFL Football. Description: act as coach and a key player. AI roles: control units and strategic enemy (i.e., other coach), commentator.
17Testbed for Integrating and Evaluating Learning Techniques
Some Game Environment Challenges
• Significant background knowledge available
– e.g., processes, tasks, objects, actions
– Use: provides opportunities for rapid learning
• Adversarial
• Collaborative
• Multiple reasoning levels (e.g., strategic, tactical)
• Real-time
• Uncertainty (“Fog of War”)
• Noise (e.g., imprecision)
• Relational (e.g., social networks)
• Temporal
• Spatial
18Testbed for Integrating and Evaluating Learning Techniques
Focus: Broad interests
Academia: Learning in Simulation Games
Evidence of commitment
• Interactive Computer Games: Human-Level AI’s Killer Application (Laird & van Lent, AAAI’00 invited talk)
• Meetings:
– AAAI symposia (several in recent years)
– International Conference on Computers and Games
– AAAI’04 Workshop on Challenges in Game AI
– AI in Interactive Digital Entertainment Conference (2005-), …
• New journals focusing on (e.g., real-time) simulation games:
– J. of Game Development
– Int. J. of Intelligent Games and Simulation
• Game engines (e.g., GameBots, ORTS, RoboCup Soccer Server); use of (other) open source engines (e.g., FreeCiv, Stratagus)
• Representation (e.g., Forbus et al., 2001; Houk, 2004; Munoz-Avila & Fisher, 2004)
• Learning opponent unit models (e.g., Laird, 2001; Hill et al., 2002)
• … (see table)
19Testbed for Integrating and Evaluating Learning Techniques
Survey: Selected Previous Work on Learning & Gaming Simulators

Name + Reference | Method | Task | Test Plan & Metrics (independent variables to vary and dependents to measure)
• Learning Performance (Goodman, AAAI’93) | Projective visualization (1 TDIDT per feature cluster) | Predict amount of inflicted damage | Vary training amount & projection length; predict summed pain
• MAYOR (Fasciano, 1996 M.S. thesis) | Case-based planning (plan execution conditions) | Maximize SimCity game score | Online: vary whether learning was used; measure % successful plan executions
• (Fogel et al., CCGFBR’96) | Genetic algorithm (rule learning) | 1x1 tank battles | Vary locations/space of routes; measure damage
• KnoMic (van Lent & Laird, ICML’98) | Production rules (rule conditions & goals) | Racetrack mission for TacAir-SOAR | Measure speed with which KnoMic learned correct control rules
• (Agogino et al., 1999 NPL) | Neuro-evolution (weight & genetic learning) | 30 gold-collecting peons vs. 1 human | Vary learning methodology; measure survival rate of peons
• (Laird, ICAA’01) | SOAR chunking (rule learning) | Predict enemy behavior | None; would focus on speedup
• (Geisler, 2002 M.S. thesis) | NB, TDIDT, BP, ensembles (depends on the method) | 4 simple classification tasks | Vary training set size & #ensembles; measure classification accuracy
• (Bryant & Miikkulainen, CEC’03) | Neuroevolution (NN weights, etc.) | Discrete Legions vs. Barbarians | Offline: vary training set size; measure a game-specific function
• (Chia & Williams, BRIMS’03) | Naïve Bayes (learning to add/delete rules) | 1x1 tank battles | Vary adversarial aggressiveness & whether learning occurs; measure #wins
• (Fagan & Cunningham, ICCBR’03) | Case-based prediction (selecting plans to save) | Predict a player’s action | Vary the #stored plans and the user; measure accuracy & prediction frequency
• (Guestrin et al., IJCAI’03) | Relational MDPs (partition objects) | Beat enemy in 3x3 Freecraft games | Simplistic: one run
• (Sweetser & Dennis, 2003 Ent. Computing: Tech. & Applications) | Advice giving (regression weights) | Just-in-time hints to human player | Vary with vs. without providing hints; measure % hints that were useful
• (Spronck et al., 2004 IJIGS) | Dynamic scripting (rule weights) | Beat NWN AI in simple scenarios | Offline: measure average turning point & speed, effectiveness, robustness, & efficiency
• (Ponsen, 2004 M.S. thesis) | Dynamic scripting & GA for rule learning (rule weights and new rules) | Defeat Wargus opponent | Offline: vary map size, learning algorithm, and opponent control algorithm; measure % wins
• (Ulam et al., AAAI’04 Workshop) | Self-adaptation (task edits) | Defend city (FreeCiv) | Offline: vary trace size; measure % successes
20Testbed for Integrating and Evaluating Learning Techniques
Industry: Learning in Simulation Games
Focus: Increase sales via enhanced gaming experience
• USA: $7B in sales in 2003 (ESA, 2004)
– Strategy games: $0.3B
• Simulators: many! (e.g., SimCity, Quake, SoF, UT)
• Target: control avatars, unit behaviors
Evidence of commitment
• Developers: “keenly interested in building AIs that might learn, both from the player & environment around them” (GDC’03 Roundtable Report)
• Middleware products that support learning (e.g., MASA, SHAI, LearningMachine)
• Long-term investments in learning (e.g., iKuni, Inc.)
• Conferences:
– Game Developer’s Conference
– Computer Game Technology Conference
21Testbed for Integrating and Evaluating Learning Techniques
Industry: Learning in Simulation Games
Some Promising Techniques (Rabin, 2004)
• Belief networks for probabilistic inference
• Decision tree learning
• Genetic algorithms (e.g., for offline parameter tuning)
• Statistical prediction (e.g., using N-grams to predict future events)
• Neural networks (e.g., for offline applications)
• Player modeling (e.g., to regulate game difficulty, model reputation)
• Reinforcement learning
• Weakness modification learning (e.g., don’t repeat failed strategies)
Status
• Few deployed systems have used learning (Kirby, 2004), e.g.:
1. Black & White: on-line, explicit (player immediately reinforces behavior)
2. C&C Renegade: on-line, implicit (agent updates set of legal paths)
3. Re-Volt: off-line, implicit (GA tunes racecar behaviors prior to shipping)
• Problems: performance, constraints (preventing learning “something dumb”), trust in learning system
22Testbed for Integrating and Evaluating Learning Techniques
Military: Learning in Simulation Games
Focus: Training, analysis, & experimentation
• Learning: acquisition of new knowledge or behaviors
• Simulators: JWARS, OneSAF, Full Spectrum Command, etc.
• Target: control strategic opponent or own units
Evidence of commitment
• “Learning is an essential ability of intelligent systems” (NRC, 1998)
• “To realize the full benefit of a human behavior model within an intelligent simulator, …the model should incorporate learning” (Hunter et al., CCGBR’00)
• “Successful employment of human behavior models…requires that [they] possess the ability to integrate learning” (Banks & Stytz, CCGBR’00)
• Conferences: BRIMS, I/ITSEC
Status: No CGF simulator has been deployed with learning (D. Reece, 2003)
Some problems (Petty, CGFBR’01):
• Cost of training phase
• Loss of training control
• Learning non-doctrinal behaviors
• Learning unpredictable behaviors
23Testbed for Integrating and Evaluating Learning Techniques
Analysis: Conclusions
State-of-the-art
1. Research on learning in complex gaming simulators is in its infancy
• Knowledge-poor approaches are limited to simple performance tasks
• Knowledge-intensive approaches require huge knowledge bases, which to date have been manually encoded
2. Existing approaches have many simplifying assumptions
• Scenario limitations (e.g., on number and/or capabilities of adversaries)
• Learning is (usually) performed only off-line
• Learned knowledge is not transferred (e.g., to playing other games)
Significant advances would include:
1. Fast acquisition approaches for a large amount of domain knowledge
• This would enable rapid learning without requiring manual encoding
2. Demonstrations of on-line learning (i.e., within a single simulation run)
3. Increasing knowledge transfer among tasks & simulators over time
• e.g., knowledge of processes, strategies, tasks, roles, objects, & actions
24Testbed for Integrating and Evaluating Learning Techniques
TIELT Specification
1. Simplifies integration & evaluation!
• Learning-embedded decision systems & gaming simulators
• Supports communications, game model, performance task, evaluation
• Free & available
2. Learning foci
• Task (e.g., learn how to execute, or advise on, a task)
• Player (e.g., accept advice, predict a player’s strategies)
• Game (e.g., learn/refine its objects, their relations, & behaviors)
3. Learning methods
• Supervised/unsupervised, immediate/delayed feedback, analytic, active/passive, online/offline, direct/indirect, automated/interactive
• Learning results should be available for inspection
4. Gaming simulators: those with challenging learning tasks
5. Reuse:
• Communications are separated from the game model & performance task
• Provide access to libraries of simulators & decision systems
25Testbed for Integrating and Evaluating Learning Techniques
Distinguishing TIELT
System | Focus | Game Engine(s) | Prominent Feature | Reasoning Activity
• DirectIA (MASA) | AI SDK | FPS, RTS, etc. | Behavior authoring | Sense-act, …
• SimBionic (SHAI) | AI SDK | FPS, etc. | Behavior authoring | Sense-act, …
• FEAR | AI SDK | Quake 2, etc. | Behavior authoring | Sense-act, …
• RoboCup | Research testbed | RoboCup | Soccer game play | Sense-act, coaching, etc.
• GameBots | Research testbed | UT (FPS) | UT game play | Sense-act
• ORTS | Research testbed | RTS games | Hack-free MM RTS | Sense-act, strategy
• TIELT | Research testbed | Several genres | Experimentation for evaluating learning & learned behaviors | Sense-act, advice processing, prediction, model updating, etc.

1. Provides an interface for message-passing interfaces
2. Supports composable system-level interfaces
26Testbed for Integrating and Evaluating Learning Techniques
TIELT: Integration Architecture

[Figure: TIELT sits between a Selected Game Engine (drawn from a Game Engine Library, e.g., Stratagus, Full Spectrum Command, and driven by Game Player(s)) and a Selected Decision System (drawn from a Decision System Library of reasoning systems with embedded learning modules). TIELT’s Internal Communication Modules maintain inspectable Learned Knowledge and expose Evaluation, Prediction, Coordination, and Advice Interfaces. Through TIELT’s User Interface and KB Editors, the TIELT User selects/develops five knowledge bases from Knowledge Base Libraries: Game Model (GM), Game Interface Model (GIM), Decision System Interface Model (DSIM), Agent Description (AD), and Experiment Methodology (EM).]
27Testbed for Integrating and Evaluating Learning Techniques
TIELT’s Knowledge Bases
• Game Model: defines interpretation of the game (e.g., initial state, classes, operators, behaviors (rules)); behaviors could be used to provide constraints on learning
• Game Interface Model: defines communication processes with the game engine
• Decision System Interface Model: defines communication processes with the decision system
• Agent Description: defines what decision tasks (if any) TIELT must support
• Experiment Methodology: defines selected performance tasks (taken from the Game Model description) and the experiment to conduct
28Testbed for Integrating and Evaluating Learning Techniques
TIELT: Supported Performance Tasks
Types of Problem Solving Tasks
• Analysis: classification, diagnosis, decision support
• Synthesis: planning, design (structural, parametric), scheduling

Performance vs. learning tasks
• Performance: application of the learned knowledge (e.g., classification)
• Learning: activity of the learning system (e.g., update weights in a neural net)

TIELT users will define complex, user-configurable performance tasks
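To make the performance/learning split concrete, here is a minimal sketch (ours, not TIELT code) in which the two activities are separate methods of one learner: perform() applies learned knowledge to classify, while learn() updates the weights.

```python
# Minimal sketch: the performance task (classification) vs. the learning
# task (weight updates), illustrated with a perceptron. Names are ours.
class Perceptron:
    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def perform(self, x):
        """Performance task: apply the learned knowledge to classify x."""
        return 1 if sum(wi * xi for wi, xi in zip(self.w, x)) + self.b > 0 else 0

    def learn(self, x, label, rate=0.1):
        """Learning task: update the weights from one labeled example."""
        error = label - self.perform(x)
        self.w = [wi + rate * error * xi for wi, xi in zip(self.w, x)]
        self.b += rate * error
```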
29Testbed for Integrating and Evaluating Learning Techniques
An Example Complex Learning Task
Task description: win a real-time strategy game. This involves several challenging learning tasks.

Subtasks and supporting operations
1. Diagnosis: identify (computer and/or human) opponent strategies & goals
• Classification: opponent recognition
• Recording: actions of opponents and their effects (this repeatedly involves classification)
• Diagnosis: identify goal(s) being solved by these effects
• Classification: identify goal(s), if solved, that prevent opponent goals
2. Planning: select/adapt or create plan to achieve goals and win the game
• Classification: select top-level actions to achieve goals (iteratively identify necessary sub-goals and, finally, primitive actions)
• Design (parametric): identify good initial layout of controllable assets
3. Execute plan
• Recording: collect measures of effectiveness, to provide feedback
• Planning: if needed, re-plan, based on feedback, at Step 2
30Testbed for Integrating and Evaluating Learning Techniques
Use: Controlling a Game Character

[Figure: the integration architecture specialized for character control. The Selected Game Engine sends the raw state to TIELT, which derives a processed state for the Selected Decision System; the decision system returns a decision that TIELT translates into an action for the game engine.]
31Testbed for Integrating and Evaluating Learning Techniques
UT Example: Game Model
Classes
• Player: Team: String; Number: Integer; Position: Location
• Location: x: Integer; y: Integer; z: Integer

State Description
Players: Array[ ] of Player
Self: Player
Score: Integer
…

Operators
Shoot(Player)
  Preconditions: Player.isVisible
  Effects: Player.Health -= rand(10)
MoveTo(Location)
  Preconditions: Location.isReachable()
  Effects: Self.position == Location
…

Rules
GetShotBy(Player)
  Preconditions: Player.hasLineOfSight(Self)
  Effects: Self.Health -= rand(10)
EnemyMovements(Enemy, Location1, Location2)
  Preconditions: Location2.isReachableFrom(Location1), Enemy.position == Location1
  Effects: Enemy.position == Location2
…
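For illustration only (this is not TIELT's actual representation), an operator such as Shoot can be read as a precondition/effect pair over state objects; the Python names below are our own:

```python
import random

class Player:
    def __init__(self):
        self.health = 100
        self.is_visible = False

# Hypothetical encoding of the Shoot operator above: an operator is a
# (precondition, effect) pair over state objects.
def shoot_precondition(target: Player) -> bool:
    return target.is_visible          # Preconditions: Player.isVisible

def shoot_effect(target: Player) -> None:
    target.health -= random.randint(0, 10)  # Effects: Player.Health -= rand(10)

def apply_shoot(target: Player) -> bool:
    """Apply the operator only if its precondition holds."""
    if shoot_precondition(target):
        shoot_effect(target)
        return True
    return False

p = Player()
p.is_visible = True
apply_shoot(p)  # p.health is now reduced by 0-10
```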
32Testbed for Integrating and Evaluating Learning Techniques
UT Example: Game Interface Model
Action Templates
TURN(Pitch: real, Yaw: real, Roll: real)
SETWALK(Walk: boolean) // Start walking or running
RUNTO(Target: integer) // ID of object in world
…

Sensor Templates
CWP(Weapon: integer) // Change Weapon to Weapon with this Id
FLG(Id: integer, Reachable: boolean, State: Symbol <held, dropped, home>)
…

Example interface messages from the GameBots API:
• http://www.planetunreal.com/gamebots/docapi.html

Communication
Medium: TCP/IP, Port 3000
Message Format: <name> {<attr1> <value1>} {<attr2> <value2>} …
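To give a feel for consuming this message format, here is a minimal parser sketch for `<name> {<attr> <value>} …` lines; the sample FLG message is illustrative, not copied from the GameBots docs:

```python
import re

def parse_gamebots_message(line: str):
    """Parse '<name> {<attr> <value>} ...' into (name, {attr: value})."""
    name, *rest = line.strip().split(None, 1)
    attrs = {}
    if rest:
        # Each '{key value}' group becomes one attribute.
        for key, value in re.findall(r"\{(\S+)\s+([^}]*)\}", rest[0]):
            attrs[key] = value
    return name, attrs

# Illustrative message, not copied from the GameBots docs:
name, attrs = parse_gamebots_message("FLG {Id 42} {Reachable True} {State held}")
assert name == "FLG" and attrs["State"] == "held"
```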
33Testbed for Integrating and Evaluating Learning Techniques
UT Example: Decision System Interface Model
Example Template Messages Sent By TIELT
InitializeGameRules(ruleSet: Array[ ] of Rule)
SendStateUpdates(CurrentState: Array[ ] of Object)
LoadScenario(SavedGameFilename: String)
…

Template Messages Received By TIELT
GiveAdvice(AdviceMessage: String)
PerformAction(OperatorName: String, Parameters: Array[ ] of String)
AskForValue(AttributeName: String)
…

Communication
Medium: Standard I/O
Message Format: (<name> <value1> <value2> <value3> …)
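A minimal sketch of emitting messages in the parenthesized format above over standard I/O; the one-message-per-line framing and the example parameters are our assumptions:

```python
import sys

def encode_message(name: str, *values) -> str:
    """Render a message in the '(<name> <value1> <value2> ...)' format."""
    return "(" + " ".join([name, *map(str, values)]) + ")"

def send_message(name: str, *values) -> None:
    # Assumption: messages are exchanged one per line over standard I/O.
    sys.stdout.write(encode_message(name, *values) + "\n")
    sys.stdout.flush()

# e.g., a decision system asking TIELT to perform an operator
# (the operator name and parameters here are hypothetical):
send_message("PerformAction", "MoveTo", "x=12", "y=7")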
34Testbed for Integrating and Evaluating Learning Techniques
UT Example: Agent Description
Think-Act Cycle

[Figure: a simple think-act cycle. The agent asks the decision system “Where do I go?”, then either shoots something (call Shoot operator), picks up a healthpack (call Pickup operator), or goes somewhere else.]
35Testbed for Integrating and Evaluating Learning Techniques
UT Example: Experiment Methodology
Initialization
Game Model: Unreal Tournament.xml
Game Interface: GameBots.xml
Decision System: MyUTBot.xml
Runs: 100
Call slowdown(0.5)

Metrics
FragCount: Self.kills
FragsPerSecond: Self.kills / LengthOfGame
AverageHealth: mean of Self.health

Plot
FragCount vs. Runs
AverageHealth vs. # of players
FragsPerSecond vs. Outdegree of net nodes
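A sketch of how these metrics could be computed from per-run logs; the record fields simply mirror the metric definitions above, and everything else (field names, sample values) is assumed:

```python
# Hypothetical per-run records mirroring the metric definitions above.
runs = [
    {"kills": 12, "game_length_s": 600.0, "health_samples": [100, 80, 65]},
    {"kills": 9,  "game_length_s": 540.0, "health_samples": [100, 55, 70]},
]

def metrics(run):
    return {
        "FragCount": run["kills"],
        "FragsPerSecond": run["kills"] / run["game_length_s"],
        "AverageHealth": sum(run["health_samples"]) / len(run["health_samples"]),
    }

for i, run in enumerate(runs, 1):
    print(i, metrics(run))  # e.g., to plot FragCount vs. run number
```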
36Testbed for Integrating and Evaluating Learning Techniques
Use: Predicting Opponent Actions

[Figure: the same integration architecture, specialized for prediction. The Selected Game Engine sends the raw state to TIELT, which forwards the processed state to the Selected Decision System; the decision system returns a prediction, which TIELT surfaces through its Prediction Interface.]
37Testbed for Integrating and Evaluating Learning Techniques
Use: Updating a Game Model

[Figure: the same integration architecture, specialized for model refinement. The Selected Game Engine sends the raw state to TIELT, which forwards the processed state to the Selected Decision System; the decision system returns edits that TIELT applies to its Game Model.]
38Testbed for Integrating and Evaluating Learning Techniques
TIELT: A Researcher Use Case
[Figure: the integration architecture from the researcher’s perspective, with the Decision System Interface Model as the researcher-supplied knowledge base.]

1. Define/store decision system interface model
2. Select game simulator & interface
3. Select game model
4. Select/define performance task(s)
5. Define/select experiment methodology
6. Run experiments
7. Analyze displayed results
39Testbed for Integrating and Evaluating Learning Techniques
TIELT: A Game Developer Use Case
[Figure: the integration architecture from the game developer’s perspective, with the Game Interface Model and Game Model as the developer-supplied knowledge bases.]

1. Define/store game interface model
2. Define/store game model
3. Select decision system/interface
4. Define performance task(s)
5. Define/select experiment methodology
6. Run experiments
7. Analyze displayed results
40Testbed for Integrating and Evaluating Learning Techniques
TIELT’s Internal Communication Modules

[Figure: internal dataflow. The Selected Game Engine sends percepts to TIELT and receives actions/control. A Model Updater applies the engine state to the Current State (backed by databases of stored state). The Controller coordinates the Learning Translator (Mapper), which sends the translated model (subset) and learning task to the Selected Decision System, and the Action Translator (Mapper), which converts learning outputs into actions or advice (Advice Interface). The Evaluator and Evaluation Interface carry out the performance task defined by the Experiment Methodology. The user edits the Game Model, Agent Description, Game Interface Model, Decision System Interface Model, and Experiment Methodology through their respective editors.]
41Testbed for Integrating and Evaluating Learning Techniques
Sensing the Game State (City placement example, inspired by Alpha Centauri, etc.)

[Figure: the Game Engine sends sensor messages to TIELT, where the Model Updater, guided by the Game Interface Model and Game Model (each user-editable), updates the Current State and notifies the Controller; the Action Translator returns actions to the Game Engine.]

1. In the Game Engine, the game begins; a colony pod is created and placed.
2. The Game Engine sends a “See” sensor message identifying the pod’s location.
3. The Model Updater receives the sensor message and finds the corresponding message template in the Game Interface Model.
4. This message template provides updates (instructions) to the Current State, telling it that there is a pod at the location See describes.
5. The Model Updater notifies the Controller that the See action event has occurred.
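A minimal sketch of this sensing flow; the class names (ModelUpdater, CurrentState) echo the figure, but the code and the template table are our assumptions, not TIELT's API:

```python
# Minimal sketch of the sensing flow; all class and template names here
# are our assumptions, not TIELT's actual API.
class CurrentState:
    def __init__(self):
        self.objects = {}

class ModelUpdater:
    def __init__(self, templates, state, controller):
        self.templates = templates      # message name -> update function
        self.state = state
        self.controller = controller

    def on_sensor_message(self, name, attrs):
        update = self.templates.get(name)    # step 3: find the template
        if update:
            update(self.state, attrs)        # step 4: update the Current State
            self.controller(name, attrs)     # step 5: notify the Controller

def see_template(state, attrs):
    """Record that a pod exists at the location the See message describes."""
    state.objects[attrs["Id"]] = ("pod", attrs["Location"])

state = CurrentState()
updater = ModelUpdater({"See": see_template}, state,
                       controller=lambda name, attrs: print("event:", name))
updater.on_sensor_message("See", {"Id": "pod1", "Location": (3, 5)})  # step 2
```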
42Testbed for Integrating and Evaluating Learning Techniques
Fetching Decisions from the Decision System (City placement example)

[Figure: within TIELT, the Controller and Learning Translator draw on the Decision System Interface Model, Agent Description (each user-editable), and Current State to build messages for the Selected Decision System and its learning modules; outputs return to the Action Translator.]

1. The Controller notifies the Learning Translator that it has received a See message.
2. The Learning Translator finds a city location task, which is triggered by the See message. It queries the Controller for the learning mode, then creates a TestInput message to send to the reasoning system with information on the pod’s location and the map from the Current State.
3. The Learning Translator transmits the TestInput message to the Decision System.
4. The Decision System transmits output to the Action Translator.
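Continuing the sketch, the Learning Translator's job reduces to mapping a triggering event onto a task-specific TestInput message; all names beyond those in the steps above are assumptions:

```python
# Hypothetical continuation of the sensing sketch: map a triggering event
# to a TestInput message for the decision system.
class LearningTranslator:
    def __init__(self, tasks, send_to_decision_system):
        self.tasks = tasks          # trigger message name -> task name
        self.send = send_to_decision_system

    def on_event(self, name, attrs, world_objects, mode="test"):
        task = self.tasks.get(name)                  # step 2: find the task
        if task is not None:
            payload = {"task": task, "mode": mode,
                       "location": attrs["Location"],
                       "map": world_objects}
            self.send(("TestInput", payload))        # step 3: transmit

translator = LearningTranslator({"See": "city-location"},
                                send_to_decision_system=print)
translator.on_event("See", {"Location": (3, 5)}, {"pod1": ("pod", (3, 5))})
```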
43Testbed for Integrating and Evaluating Learning Techniques
Acting in the Game World (City placement example)

[Figure: the Action Translator, guided by the Game Interface Model and Decision System Interface Model (each user-editable) and the Current State, routes actions to the Game Engine, the Advice Interface, or the Prediction Interface.]

1. The Action Translator receives a TestOutput message from the Decision System.
2. The Action Translator finds the TestOutput message template, determines it is associated with the city location task, and builds a MovePod operator (defined by the Current State) with the parameters of TestOutput.
3. The Action Translator determines that the Move action from the Game Interface Model is triggered by the MovePod operator and binds Move using information from MovePod.
4.a. The Game Engine receives Move and updates the game to move the pod toward its destination, or
4.b. the Advice Interface receives Move and displays advice to a human player on what to do next, or
4.c. it makes a Prediction.
44Testbed for Integrating and Evaluating Learning Techniques
Implementation
TIELT Status (November 2004)
• TIELT (v0.5) available
• Features
– Message protocols
• Current: Console I/O, TCP/IP, UDP
• Future: Library calls, HLA interface, RMI (possibly)
– Message content: configurable
• Instantiated templates tell it how to communicate with other modules
– Initialization messages: Start, Stop, Load Scenario, Set Speed
– Game Model representations (w/ Lehigh University)
• Simple programs
• TMK process models
• PDDL (language used in planning competitions)
45Testbed for Integrating and Evaluating Learning Techniques
TIELT Status (November 2004)
Documentation
• TIELT User’s Manual (82 pages)
1. TIELT Overview
2. The TIELT User Interface
3. Scripting in TIELT
4. Theory of the Game Model
5. Communications
6. TMK Models
7. Experiments
• TIELT Tutorial (45 pages)
1. The Game Model
2. The Game Interface Model
3. Decision System Interface Model
4. Agent Description
5. Experiment Methodology
46Testbed for Integrating and Evaluating Learning Techniques
TIELT Status (November 2004)
Access
• TIELT www site (new)
• Selected components
– Documents: documentation, publications, XML spec
– Status
– Forum: a full-featured web forum/bulletin board
– Bug Tracker: TIELT bug/feature tracking facility
– FAQ-o-Matic: questions and problem solutions; user-driven
– Download
47Testbed for Integrating and Evaluating Learning Techniques
TIELT Issues (November 2004)

1. Communication
[Figure: “You Are Here”: TIELT currently communicates via TCP/IP; SWIG is to provide library-call interfacing.]
TIELT is a multilingual application; this provides interfacing with many different games.

2. Resources for learning to use TIELT
• TIELT Scripting syntax highlighting
• Map of TIELT component interactions (thanks, Megan)
• Typed script interface
48Testbed for Integrating and Evaluating Learning Techniques
TIELT Issues (November 2004)

3. Game Model formatting
To no one’s surprise, everyone agrees that TIELT’s Game Model representation is inadequate.
Requests have been made for:
• 3D maps (Quake)
• A different programming language
• A relational operator representation
• Standardized events
“We’re working on it”
49Testbed for Integrating and Evaluating Learning Techniques
TIELT Collaborations (2004-05)

[Figure: collaboration map. Game Library: Empire Earth 2 (Mad Doc), Temple of Elemental Evil (Troika), FreeCiv (NWU), SimCity (ISLE). Platform Library: Stratagus (Lehigh U.), Urban Terror (UT Arlington), Full Spectrum Command/R (USC/ICT), RoboCup (U. Minn-Duluth). Decision System Library learning modules: Soar (U. Michigan), ICARUS (ISLE), DCA (UT Arlington), Neuroevolution (UT Austin), others (many). TIELT’s KB editors, internal communication modules, user interface, and its Prediction, Evaluation, Coordination, and Advice Interfaces mediate among these, driven by the knowledge bases (Game Model, Task Descriptions, Game Interface Model, Decision System Interface Model, Experiment Methodology).]
50Testbed for Integrating and Evaluating Learning Techniques
TIELT Collaboration Projects (2004-05)
Organization | Game Interface and Model | Decision System | Tasks and Evaluation Methodology
• Mad Doc Software | Empire Earth 2 (RTS) | |
• Troika Games | Temple of Elemental Evil (RPG) | |
• ISLE | SimCity (~RTS) | ICARUS | ICARUS w/ FreeCiv, design
• Lehigh U. | Stratagus/Wargus (RTS), and HTN/TMK designs | Case-based planner (CBP) | Wargus/CBP
• NWU | FreeCiv (discrete strategy), and qualitative game representations | |
• U. Michigan | | SOAR | SOAR w/ 2 games (e.g., FSW, ToEE), design
• U. Minnesota-Duluth | RoboCup (team sports) | Advice-taking components | Advice processing
• USC/ICT | Full Spectrum Command (RTS) | | SOAR with FSC
• UT Arlington | Urban Terror (FPS) | DCA (lite version) |
• UT Austin | | Neuroevolution | e.g., Neuroevolution/EE2
51Testbed for Integrating and Evaluating Learning Techniques
Games Being Integrated with TIELT
Category | Gaming Simulator | Genre | Foci | Perspective
Commercial:
1. Empire Earth II (Mad Doc S/W) | RTS | Civilization | God
2. Temple of Elemental Evil (Troika) | Role-playing | Solve quests | 1st person
3. SimCity (ISLE) | RTS | City manager | God
Freeware:
1. FreeCiv (NWU) (~Civilization) | Discrete strategy | Civilization | God
2. Wargus (Lehigh U.) (~Warcraft II) | RTS | Civilization | God
3. Urban Terror (UT Arlington) | FPS | Shooter | 1st person
4. RoboCup Soccer (UW) | Team sports | Team of agents | Behavior designer
Military:
• Full Spectrum Command (USC/Inst. Creative Technologies) | RTS | Leading an Army Light Infantry Company | 1st person
52Testbed for Integrating and Evaluating Learning Techniques
Promising Learning Strategies
Learning Strategy | Description | When to Use | Justification
• Advice Giving | Expert explains how to perform in a given state (this is the only interactive strategy listed here) | Speedup needed & expert is available | Permits quick acquisition of specific and general domain knowledge
• Backpropagation | Trains a 3-layer neural network (NN) of sigmoidal hidden units | Target is a non-linear function; offline training is ok | Many learning tasks are non-linear and some can be performed off-line
• Case-Based Reasoning | Use/adapt solutions from experiences to solve similar problems | Cases complement an incomplete domain model; problem-solving speed is crucial | Quicker to adapt cases than reason from scratch, but requires domain-specific adaptation knowledge
• Chunking | Compile a sequence of steps into a macro | For tasks requiring speedup | Transforms a complex reasoning task into a fast retrieval task
• Dynamic Scripting | RL for tasks with large state spaces that, with domain knowledge, can be collapsed into a smaller set | Small set of states exists, with a set of rules for each | Greatly speeds up the RL approach, but requires analysis of task states
• Evolutionary Computation | Evolutionary (genetic) selection on a population of genomes, where the application dictates their representation | Search space is huge, and training can be done offline | Genome representations can be task specific, so this powerful search method can be tuned for the task
• Meta Reasoning | After a failure, identifies its type & the task that failed, retrieves a task-specific strategy to avoid this failure, and updates its model | To support self-adaptation | Although knowledge intensive, this is an excellent method for changing problem-solving strategies
• Neuroevolution | Uses a separate genetic algorithm population for learning each hidden unit’s weights in a NN | To support cooperating heterogeneous agents | A good offline agent-based learning approach for multi-agent gaming
• Reinforcement Learning (RL) | Reinforce a sequence of decisions after problem solving is completed | Reward is known only after the sequence ends, and blame can be ascribed | Well-understood paradigm for learning action policies (i.e., what action to perform in a given state)
• Relational MDPs | Learn a Markov decision process over objects & their relations using probabilistic relational models | Seeking knowledge transfer (KT) to similar environments | KT is crucial for learning quickly, and feasibly, for some tasks
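As one concrete example from the table, here is a minimal dynamic-scripting sketch: rules carry weights, a script is sampled in proportion to those weights, and the weights are nudged by the episode's reward. This is a simplified, assumption-laden sketch, not Spronck et al.'s exact algorithm; the rule names are hypothetical.

```python
import random

# Simplified dynamic-scripting sketch: a rulebase of weighted rules; a
# script is drawn with probability proportional to rule weights, and
# weights are adjusted by the episode's outcome.
rulebase = {"attack-weakest": 1.0, "defend-base": 1.0, "expand": 1.0}

def select_script(rules, size=2):
    names = list(rules)
    weights = [rules[n] for n in names]
    return random.choices(names, weights=weights, k=size)

def update_weights(rules, script, reward, rate=0.2, floor=0.1):
    for name in script:
        rules[name] = max(floor, rules[name] + rate * reward)

script = select_script(rulebase)
reward = 1.0  # e.g., +1 for a win, -1 for a loss
update_weights(rulebase, script, reward)
```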
53Testbed for Integrating and Evaluating Learning Techniques
TIELT-General Game Player Integration (with Stanford University’s Michael Genesereth)
GGP:
• Logical game formalisms
• Access to remote players
• WWW access

TIELT:
• Experiment design/control capabilities
• Common game engine interface
• Support for several learning approaches

GGP-TIELT:
• Play an entire class of general games as well as TIELT-integrated gaming simulators.
• Compete remotely against reference players and other GGP systems.
• Define evaluation methodologies for learning experimentation.
• Participate in the AAAI’05 GGP Competition.

Integration Architecture
[Figure: the GGP Test Bed and GGP competitors connect over the WWW to TIELT and TIELT-ready GGP competitors, with reference opponents.]
54Testbed for Integrating and Evaluating Learning Techniques
Upcoming Events
1. National Conference on AI (AAAI’05; 24-28 July; Pittsburgh)
– General Game Playing Competition ($10K prize)
2. Int. Joint Conference on AI (IJCAI’05; 30 July-5 August; Edinburgh)
– Workshop: Reasoning, Representation, and Learning in Gaming Simulation Tasks (tentative title)
3. Int. Conference on ML (ICML’05; 7-11 August; Bonn)
– Workshop submission in progress
4. Int. Conference on CBR (ICCBR’05; 23-26 August; Chicago)
– Workshop & Competition: CBR in Games
55Testbed for Integrating and Evaluating Learning Techniques
Summary
TIELT: Mediates between a (gaming) simulator and a learning-embedded decision system
• Goals:
– Simplify running learning experiments with cognitive systems
– Support DARPA challenge problems in learning
• Designed to work with many types of simulators & decision systems

Status:
• TIELT (v0.5 Alpha) completed in 10/04
– User’s Manual, Tutorial, & www site exist
• 10 collaborating organizations (1-year contracts)
– Enhances the probability that TIELT will achieve its goals
• We’re planning several TIELT-related events
56Testbed for Integrating and Evaluating Learning Techniques
Backup Slides
57Testbed for Integrating and Evaluating Learning Techniques
Metrics
Industry perspective
1. Ability to develop learned/learning behaviors of interest
2. Time required to
• develop game interface & model KBs, and
• develop these behaviors
3. Availability of learning-embedded reasoning systems
4. Support for both off-line and on-line learning
Research perspective
1. Time required to develop reasoning interface KB
2. Ability to design/facilitate selected evaluation methodology
3. Expressiveness of KB representation
4. Breadth of learning techniques supported
5. Breadth of learning and performance tasks supported
6. Availability of integrated gaming simulators & challenges
58Testbed for Integrating and Evaluating Learning Techniques
Some Expected User Metrics
Performance tasks
1. Some standards
• e.g., classification accuracy, ROC analyses, precision & recall
2. Decision making speed and accuracy
3. Plan execution quality (e.g., time to execute, mission-specific Measures of Effectiveness)
4. Number of constraint violations
5. Ability to transfer learned knowledge
59Testbed for Integrating and Evaluating Learning Techniques
TIELT: Potential Learning Challenge Problems
1. Learn to win a game (i.e., accomplish an objective)
• e.g., solve a challenging diplomacy task, provide a realistic military training course facing intelligent adversaries, or help users to develop real-time cognitive reasoning skills for a defined role in support of a multi-echelon mission
2. Learn an adversary’s strategy
• e.g., predict a terrorist group’s plan and/or tactics, suggest appropriate responses to prevent adversarial goals, help users identify characteristics of adversarial strategies
3. Learn crucial processes of an environment
• e.g., learn to improve an incorrect/incomplete game model so that it more accurately/reliably defines objects/agents in the game, their behaviors, their capabilities, and their limitations
4. Intelligent situation assessment
• e.g., learn which factors in the simulation require attention to accomplish different types of tasks
60Testbed for Integrating and Evaluating Learning Techniques
Example Game: FreeCiv (Discrete-time strategy)
http://www.freeciv.org
Civilization II (MicroProse)
• Civilization II (1996-): 850K+ copies sold
– PC Gamer: Game of the Year Award winner
– Many other awards
• Civilization series (1991-): introduced the civilization-based game genre
FreeCiv (Civ II clone)
• Open source freeware
• Discrete strategy game
• Goal: defeat opponents, or build a spaceship
• Resource management
– Economy, diplomacy, science, cities, buildings, world wonders
– Units (e.g., for combat)
• Up to 7 opponent civs
• Partial observability
61Testbed for Integrating and Evaluating Learning Techniques
Previous FreeCiv/Learning Research
(Ulam et al., AAAI’04 Workshop on Challenges in Game AI)
• Title: Reflection in Action: Model-Based Self-Adaptation in Game Playing Agents
• Scenarios:
– City defense: defend a city for 3000 years
62Testbed for Integrating and Evaluating Learning Techniques
FreeCiv CP Scenario
General description
• Game initialization: your only unit, a “settler”, is placed randomly on a random world (see Game Options below). Players cyclically alternate play.
• Objective: obtain the highest score, conquer all opponents, or build the first spaceship.
• Scoring: the “basic” goal is to obtain 1000 points. Game options affect the score.
– Citizens: 2 pts per happy citizen, 1 per content citizen
– Advances: 20 pts per World Wonder, 5 per “futuristic” advance
– Peace: 3 pts per turn of world peace (no wars or combat)
– Pollution: -10 pts per square currently polluted
• Top-level tasks (to achieve a high score):
– Develop an economy
– Increase population
– Pursue research advances
– Opponent interactions: diplomacy and defense/combat
Game Option                 | Y1            | Y2           | Y3
World size                  | Small         | Normal       | Large
Difficulty level            | Warlord (2/6) | Prince (3/6) | King (4/6)
#Opponent civilizations     | 5             | 5            | 7
Level of barbarian activity | Low           | Medium       | High
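The scoring rules above map directly onto a simple evaluation function. Here is a minimal illustrative sketch; the GameState fields and function name are our own assumptions, not part of FreeCiv or TIELT:

```python
# Illustrative sketch of the FreeCiv CP scoring rules listed above.
# The GameState fields are hypothetical; the real game state is richer.
from dataclasses import dataclass

@dataclass
class GameState:
    happy_citizens: int
    content_citizens: int
    world_wonders: int
    futuristic_advances: int
    peace_turns: int          # turns of world peace (no wars or combat)
    polluted_squares: int

def basic_score(s: GameState) -> int:
    return (2 * s.happy_citizens
            + 1 * s.content_citizens
            + 20 * s.world_wonders
            + 5 * s.futuristic_advances
            + 3 * s.peace_turns
            - 10 * s.polluted_squares)
```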
63Testbed for Integrating and Evaluating Learning Techniques
FreeCiv CP Information Sources
Concepts in an Initial Knowledge Base
• Resources: Collection and use
  o Food, production, trade (money)
• Terrain:
  o Resources gained per turn
  o Movement requirements
• Units:
  o Type (military, trade, diplomatic, settlers, explorers)
  o Health
  o Combat: Offense & defense
  o Movement constraints (e.g., land, sea, air)
• Government types (e.g., anarchy, despotism, monarchy, democracy)
• Research network: Identifies constraints on what can be studied at any time
• Buildings (e.g., cost, capabilities)
• Cities
  o Population growth
  o Happiness
  o Pollution
• Civilizations (e.g., military strength, aggressiveness, finances, cities, units)
• Diplomatic states & negotiations
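As one concrete (hypothetical) rendering, a few of these concepts could be captured as typed records; the field names below are illustrative assumptions rather than the actual TIELT game-model schema:

```python
# Hypothetical encoding of a few initial-knowledge-base concepts.
# Field names are illustrative; the real TIELT game model differs.
from dataclasses import dataclass
from enum import Enum

class UnitType(Enum):
    MILITARY = "military"
    TRADE = "trade"
    DIPLOMATIC = "diplomatic"
    SETTLER = "settler"
    EXPLORER = "explorer"

@dataclass
class Unit:
    type: UnitType
    health: int
    offense: int
    defense: int
    domains: tuple[str, ...]   # movement constraints, e.g., ("land",)

@dataclass
class Terrain:
    food_per_turn: int
    production_per_turn: int
    trade_per_turn: int
    movement_cost: int
```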
64Testbed for Integrating and Evaluating Learning Techniques
FreeCiv CP Decisions
Civilization decisions
• Choice of government type (e.g., democracy)
• Distribution of income devoted to research, entertainment, and wealth goals
• Strategic decisions affecting other decisions (e.g., coordinated unit movement for trade)
City decisions
• Production choice (i.e., what to create, including city buildings and units)
• Citizen roles (e.g., laborers, entertainers, or specialists), and laborer placement
  – Note: Locations vary in their terrain, which generates different amounts of food, income, and production capability

Unit decisions
• Task (e.g., where to build a city, whether/where to engage in combat, espionage)
• Movement

Diplomacy decisions
• Whether to sign a proffered peace treaty with another civilization
• Whether to offer a gift
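To make the shape of this decision space concrete, here is a hypothetical set of per-turn decision records mirroring the four categories above; all names and fields are illustrative assumptions, not a TIELT interface:

```python
# Hypothetical per-turn decision records for the four categories above.
from dataclasses import dataclass

@dataclass
class CivilizationDecision:
    government: str                     # e.g., "democracy"
    income_split: tuple[int, int, int]  # % to research/entertainment/wealth

@dataclass
class CityDecision:
    production: str                  # building or unit to create
    citizen_roles: dict[str, str]    # citizen -> laborer/entertainer/specialist

@dataclass
class UnitDecision:
    task: str                        # e.g., "build_city", "attack", "espionage"
    destination: tuple[int, int]     # map square to move toward

@dataclass
class DiplomacyDecision:
    sign_peace_treaty: bool
    offer_gift: bool
```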
65Testbed for Integrating and Evaluating Learning Techniques
FreeCiv CP Decision Space
Variables
• Civilization-wide variables
  o N: Number of civilizations encountered
  o D: Number of diplomatic states (that you can have with an opponent)
  o G: Number of government types available to you
  o R: Number of research advances that can be pursued
  o I: Number of partitions of income into entertainment, money, & research
• U: #Units
  o L: Number of locations a unit can move to in a turn
• C: #Cities
  o Z: Number of citizens per city
  o S: Citizen status (i.e., laborer, entertainer, doctor)
  o B: Number of choices for city production
Decision complexity per turn (for a typical game state)
• O(D^N * G * R * I * L^U * (S^Z * B)^C); this ignores both other variables and domain knowledge
  o This becomes large with the number of units and cities
  o Example: N=3; D=5; G=3; R=4; I=10; U=25; L=4; C=8; Z=10; S=3; B=10
  o Size of decision space (i.e., possible next states): 2.5*10^65 (in one turn!)
  o Comparison: Decision space of chess per turn is well below 140 (e.g., 20 at first move)
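As a sanity check, the example numbers can be plugged into the complexity formula; this throwaway sketch (the function name is ours) reproduces the 2.5*10^65 figure:

```python
# Sketch: evaluate the per-turn decision-space estimate
# O(D^N * G*R*I * L^U * (S^Z * B)^C) for given variable settings.
def decision_space(N, D, G, R, I, U, L, C, Z, S, B):
    return (D ** N) * G * R * I * (L ** U) * ((S ** Z) * B) ** C

# Typical game state from the slide: ~2.5e65 possible next states.
print(f"{decision_space(N=3, D=5, G=3, R=4, I=10, U=25, L=4, C=8, Z=10, S=3, B=10):.1e}")
```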
66Testbed for Integrating and Evaluating Learning Techniques
FreeCiv CP: A Simple Example Learning Task
Situation
• We’re England (e.g., London)
• Barbarians are north (in red)
• Two other civs exist
• Our military is weak
What should we do?
• Ally with Wales? If so, how?
• Build a military unit? Which?
• Improve defenses?
• Increase the city’s production rate?
• Build a new city to the south? Where?
• Research “Gun Powder”? Or…?
• Move our diplomat back to London?
• A combination of these?
What information could help with this decision?
• Previous similar experiences
• Generalizations of those experiences
• Similarity knowledge
• Adaptation knowledge
• Opponent model
• Statistics on barbarian strength, etc.
67Testbed for Integrating and Evaluating Learning Techniques
Decision Space Size
Analysis of the Example Learning Task
Situation
• D: 3 (war, neutral, peace)
• N: Only 1 other civilization contacted (i.e., Wales)
• G: 2 government types known
• R: 4 research advances available
• I: 5 partitions of income available
• L: ~14 per unit
• U: 3 units (1 external, 2 in city)
• C: 1 city
  – S: 3 (entertainer, laborer, doctor)
  – Z: 6 citizens
  – B: 5 units/buildings it can produce
• 1.2*10^9 possible next states
• This reduces to ~32 sensible choices after applying some domain knowledge
  – e.g., don’t change diplomatic status now, keep units in the city for defense, don’t change government now (because it’ll slow production), keep the external unit away from danger
Complexity function
• O(D^N * G * R * I * L^U * (S^Z * B)^C)
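Reusing the decision_space sketch from the previous slide with these values reproduces the figure above:

```python
# ~1.2e9 possible next states for the example situation.
print(f"{decision_space(N=1, D=3, G=2, R=4, I=5, U=3, L=14, C=1, Z=6, S=3, B=5):.1e}")
```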
68Testbed for Integrating and Evaluating Learning Techniques
FreeCiv CP: Learning Opportunities
Learn to keep citizens happy
• Citizens in a city who are unhappy will revolt; this temporarily eliminates city production
• Several factors influence happiness (e.g., entertainment, military presence, gov’t type)
Learn to obtain diplomatic advantages
• Countries at war tend to have decreased trade, lose units and cities, etc.
• Diplomats can sometimes obtain peace treaties or otherwise end wars
• Unit movement decisions can also impact opponents’ diplomatic decisions

Learn how to wage war successfully
• Good military decisions can yield new cities/citizens/trade, but losses can be huge
• Unit decisions can benefit from learning tactical coordinated behaviors
• The selection of military unit(s) for a task depends on the opponent’s capabilities

Learn how to increase territory size
• Initially, unexplored areas are unknown; their resources (e.g., gold) cannot be harvested
• Exploration needs to be balanced with security
• City placement decisions influence territory expansion
69Testbed for Integrating and Evaluating Learning Techniques
FreeCiv CP: Example Learned Knowledge
Learn what playing strategy to use in each adversarial situation
[Figure: strategy matrix — rows: Combat Strength Advantage (None, Unfavorable, Favorable); columns: Current Diplomatic Status with Opponent (Allied, Peace, Neutral, Distrustful, War); legend (cell values): Attack, Retreat!, Fortify, Trade, Seek Peace, Bribe]
Strategy to use per adversarial situation
• Situations are defined by relative military strength, diplomatic status, whether the opponent has strong alliances, locations of forces, etc.
• Selecting a good playing strategy depends on many of these variables
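One plausible representation of such learned knowledge is a policy table keyed by (combat-strength advantage, diplomatic status). The sketch below uses the row, column, and legend values from the figure, but the individual cell assignments are invented placeholders, since the slide conveys them only graphically:

```python
# Hypothetical learned policy table for strategy selection. Row/column
# labels and the strategy legend come from the figure; the individual
# cell assignments below are invented placeholders.
ADVANTAGES = ("None", "Unfavorable", "Favorable")
STATUSES = ("Allied", "Peace", "Neutral", "Distrustful", "War")
STRATEGIES = ("Attack", "Retreat!", "Fortify", "Trade", "Seek Peace", "Bribe")

policy = {
    ("Favorable", "War"): "Attack",       # placeholder cell value
    ("Unfavorable", "War"): "Retreat!",   # placeholder cell value
    ("None", "Distrustful"): "Fortify",   # placeholder cell value
    ("None", "Peace"): "Trade",           # placeholder cell value
    # ... a learner would fill all len(ADVANTAGES) x len(STATUSES) cells
}

def select_strategy(advantage, status):
    # Default is arbitrary; a fully learned table would cover every cell.
    return policy.get((advantage, status), "Fortify")
```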
70Testbed for Integrating and Evaluating Learning Techniques
What Techniques Could Learn the Task of Selecting a Playing Strategy?
Meta-reasoning (e.g., Ulam et al., AAAI’04 Wkshp on Challenges in Game AI)
• Requires knowledge on:
  1. Tasks being performed
  2. Types of failures that can occur when performing these tasks
     • T2: Overestimate own strength, underestimate enemy strength, …
     • T3: Incorrect assessment of enemy’s diplomatic status, …
  3. Strategies for adapting these tasks
     • S1: Increase military strength
     • S2: Assess distribution of enemy forces
     • S3: Consider enemy’s diplomatic history
  4. Mapping of failure types in (2) to adaptation strategies in (3)
     • Example: We decided to Attack, but underestimated enemy strength. This failure is indexed to strategy S2, which we’ll apply from now on in T2.
[Diagram: task structure — T1: Determine Playing Strategy decomposes into T2: Assess Military Advantage, T3: Assess Diplomatic Status, and T4: Select Strategy (Attack, Retreat!, Fortify, Trade, Seek Peace, Bribe)]
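A minimal sketch of knowledge items (1)-(4), using the T*/S* labels from the slide; the failure names, data structures, and specific mappings beyond the slide’s Attack example are illustrative assumptions:

```python
# Sketch of the four knowledge items needed for meta-reasoning-based
# learning. T*/S* labels come from the slide; everything else is assumed.
TASKS = {
    "T1": "Determine Playing Strategy",
    "T2": "Assess Military Advantage",
    "T3": "Assess Diplomatic Status",
    "T4": "Select Strategy",
}

FAILURE_TYPES = {
    "T2": ["overestimated_own_strength", "underestimated_enemy_strength"],
    "T3": ["incorrect_enemy_diplomatic_assessment"],
}

ADAPTATION_STRATEGIES = {
    "S1": "Increase military strength",
    "S2": "Assess distribution of enemy forces",
    "S3": "Consider enemy's diplomatic history",
}

# Item (4): a learned index from (task, failure type) to an adaptation.
# E.g., the slide's example: Attack failed because T2 underestimated the
# enemy's strength, so S2 is indexed and applied in T2 from now on.
failure_to_adaptation = {
    ("T2", "underestimated_enemy_strength"): "S2",
    ("T3", "incorrect_enemy_diplomatic_assessment"): "S3",  # assumption
}

def adapt(task_id, failure):
    """Return the adaptation strategy indexed by a diagnosed failure."""
    return failure_to_adaptation.get((task_id, failure))
```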
71Testbed for Integrating and Evaluating Learning Techniques
Challenges for Using Learning via Meta-Reasoning
How can its background knowledge be learned (efficiently)?
• i.e., tasks, failure types, failure adaptation strategies, mappings
• Also, the agent needs to understand how to diagnose an error (i.e., identify which task failed and its failure type)
Can we scale it to more challenging learning problems?
• Currently, it has only been applied to simpler tasks
  – “Defend a City” (in FreeCiv)
• More difficult would be “Play Entire Game”
What if only incomplete background knowledge exists?
• Could complementary learning techniques apply it?
  – e.g., relational MDPs (which handle uncertainty)
• Could learning techniques be used to extend/correct it?
  – e.g., learning from advice, case-based reasoning
72Testbed for Integrating and Evaluating Learning Techniques
Full Spectrum Command & Warrior (http://www.ict.usc.edu/disp.php?bd=proj_games)
Focus: US Army training tools (deployed @ Ft Benning & Afghanistan)
1. Full Spectrum Command (PC-based simulator)
   – Role: Commander of a U.S. Army light infantry Company (120 soldiers)
   – Tasks: Interpret the assigned mission, organize the force, plan strategically, & coordinate the actions of the Company
2. Full Spectrum Warrior (MS Xbox-based simulator)
   – Role: Light infantry squad leader
   – Tasks: Complete assigned missions safely
Organization: USC’s Institute for Creative Technologies
• POC: Michael van Lent (Editor-in-Chief, Journal of Game Development)
• Goal: Develop immersive, interactive, real-time training simulations to help the Army create decision-making & leadership-development tools
73Testbed for Integrating and Evaluating Learning Techniques
METAGAME (Pell, 1992)
Focus: Learn strategies to win any game in a pre-defined category
• Initial category: “Chess-like” games
  – Games are produced by a game generator
• Input: Rules on how to play the game
  – A move grammar is used to communicate actions
• Output (desired): A winning playing strategy
e.g., Knight-Zone Chess
Annual Competition based on METAGAME
• Title: General Game Playing (games.stanford.edu)
• Champion: Michael Genesereth (Stanford U.)
• AAAI’05 Prize: $10K
[Diagram: General Game Playing architecture — a Game Manager maintains Games, Records, Temporary State Data, and Graphics for Spectators; it sends percepts and clocks to each Player, which returns actions]
74Testbed for Integrating and Evaluating Learning Techniques
Collaborator: Mad Doc Software
Summary
• PI: Ron Rosenberg (Producer)
• Experience:
  • Mad Doc is a leader in real-time strategy games; Empire Earth II is expected to sell millions of copies
  • CEO Ian Davis (CMU PhD in Robotics) is a well-known collaborator with the AI research community, and gave an invited presentation at AAAI’04. He will work with Ron on this contract.
• Deliverables: Mad Doc (RTS) game simulator API
  – This will be used by multiple other collaborators
75Testbed for Integrating and Evaluating Learning Techniques
Collaborator: Troika Games
Summary
• PI: Tim Cain, Joint-CEO
• Experience:
  • Troika has outstanding experience with developing state-of-the-art role-playing games, including Temple of Elemental Evil (ToEE)
  • A game developer since 1982, Tim obtained an M.S. with a focus on machine learning at UC Irvine in the late 1980s.
• Deliverables: ToEE (RPG) game simulator API
  – This will be used by some other collaborators (e.g., U. Michigan)
76Testbed for Integrating and Evaluating Learning Techniques
Collaborator: ISLE
Summary
• PIs: Dr. Seth Rogers, Dr. Pat Langley
• Experience:
  • ISLE (Institute for the Study of Learning and Expertise) is known for its ICARUS cognitive architecture, which is distinguished in part by its commitment to ground every symbol in a physical-world object
  • Pat Langley, founder of the journal Machine Learning, is known for his expertise in cognitive architectures and evaluation methodologies for learning systems.
• Deliverables:
  • ICARUS reasoning system API
    – This will also be used by USC/ICT
  • FreeCiv agent (with assistance from NWU) and SimCity agent
  • SimCity (RTS) game simulator API
77Testbed for Integrating and Evaluating Learning Techniques
Collaborator: Lehigh U.
Summary
• PI: Prof. Héctor Muñoz-Avila
• Experience:
  • Héctor is an expert on hierarchical planning technology, and in particular has expertise in case-based planning
  • Collaborating with NRL on TIELT during CY04 on (1) Game Model description representations, (2) a Stratagus/Wargus game simulator API, and (3) feedback on TIELT usage
• Deliverables:
  • Software for translating among Game Model representations
  • Stratagus/Wargus (RTS) game simulator API
    – This may be used by UT Austin
  • Case-based planning reasoning system API
78Testbed for Integrating and Evaluating Learning Techniques
Collaborator: NWU
Summary
• PIs: Prof. Ken Forbus, Prof. Tom Hinrichs
• Experience:
  • Ken is a leading AI/games researcher. He is also the leading worldwide researcher in computational approaches to reasoning by analogy.
  • Ken’s group has extensive experience with qualitative reasoning approaches and with using the FreeCiv gaming simulator.
• Deliverables:
  • FreeCiv (Discrete Strategy) game simulator API
    – This will be used by ISLE
  • Qualitative spatial reasoning system API for FreeCiv
79Testbed for Integrating and Evaluating Learning Techniques
Collaborator: U. Michigan
Summary
• PI: Prof. John Laird
• Experience:
  • John is the best-known AI/games researcher, and has extensive experience with integrating many commercial, freeware, and military game simulators with the Soar cognitive architecture.
• Deliverables:
  • Soar reasoning system API
    – This will be used by USC/ICT
  • Applications of Soar to two game simulators (e.g., ToEE, Wargus)
80Testbed for Integrating and Evaluating Learning Techniques
Collaborator: USC/ICT
Summary
• PI: Dr. Michael van Lent
• Experience:
  • Extensive implementation experience with AI/game research; his PhD advisor was John Laird.
  • Led ICT’s development of Full Spectrum Warrior and Full Spectrum Command (FSC) in collaboration with Quicksilver Software and the Army’s PEO STRI. FSC is deployed at Ft. Benning and in Afghanistan.
  • Editor-in-Chief, Journal of Game Development
• Deliverables:
  • FSC (RTS) game simulator API
  • Applications of FSC with U. Michigan’s Soar and ISLE’s ICARUS
81Testbed for Integrating and Evaluating Learning Techniques
Collaborator: UT Arlington
Summary
• PIs: Prof. Larry Holder, G. Michael Youngblood
• Experience:
  • Larry has extensive experience with developing unsupervised machine learning systems that use relational representations, and has led efforts on developing the D’Artagnan cognitive architecture.
• Deliverables:
  • Urban Terror (FPS) game simulator API
  • D’Artagnan reasoning system API (partial)
82Testbed for Integrating and Evaluating Learning Techniques
Collaborator: UT Austin
Summary
• PI: Prof. Risto Miikkulainen
• Experience:
  • Risto has significant experience with integrating neuro-evolution and similar approaches with game simulators.
  • Collaborating with UT Austin’s Digital Media Laboratory on its development of the NERO (FPS) game simulator
• Deliverables:
  • Knowledge-intensive neuro-evolution reasoning system API
  • Application of this API using other simulators (e.g., FSC, Wargus) and U. Wisconsin’s advice processing module
83Testbed for Integrating and Evaluating Learning Techniques
Collaborator: U. Wisconsin
Summary
• PIs: Prof. Jude Shavlik (UW), Prof. Richard Maclin (U. Minn-Duluth)
• Experience:
  • Jude advised the first significant M.S. thesis on applying machine learning to FPS game simulators (Geisler, 2002)
  • Maclin, who will be on sabbatical at U. Wisconsin during this project, has performed extensive work applying AI techniques (e.g., advice processing) to the RoboCup game simulator
• Deliverables:
  • RoboCup (team sports) game simulator API
  • Advice processing module
  • WWW-based repository for TIELT software components (e.g., APIs)