Adaptive Game AI (based on slides of Pieter Spronck UvT)
Transcript of Adaptive Game AI (based on slides of Pieter Spronck UvT)
![Page 1: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/1.jpg)
Adaptive Game AI
(based on slides of Pieter Spronck UvT)
![Page 2: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/2.jpg)
Goals of Adaptive AI • Self-correction
– Automatic reparation of “exploits” – Online or offline – Direct or indirect
• Creativity – Being able to deal with new situations – Usually online – Direct
• Scalability – Appropriate challenge for novices and experts – Online – Direct or indirect
![Page 3: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/3.jpg)
Is It Needed?
• No benefits? – Then don’t – How smart should an enemy be if he lives for only
15 seconds?
• Can it be faked? – Then fake it
• Will the publisher accept it?
![Page 4: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/4.jpg)
Why the Hesitation?
• Learning the wrong lessons • Difficulty to get desired results
– Suitable fitness function – Increased entertainment
• Difficult to modify, test, debug, and understand • Low efficiency of common techniques
– Neural networks – Evolutionary algorithms
• It does not always make sense
![Page 5: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/5.jpg)
Problem of Complexity
• Huge state-action space
• Uncertainty – Non-deterministic – Incomplete
information – Multiple parallel
agents
• Often real-time
“When dealing with problems such as Stratagus you might as well throw the three chapters on search in my book in the garbage because these are irrelevant.” (Stuart Russell, IJCAI05)
![Page 6: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/6.jpg)
Requirements • Computer-controlled • Online computational requirements
– Speed – Effectiveness – Robustness – Efficiency
• Functional requirements – Clarity – Variety – Consistency – Scalability
![Page 7: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/7.jpg)
Necessities • Use as much prior knowledge as possible
– Improves all computational requirements • Good performance measure (fitness) • Learning by
– Optimisation – Reinforcement – Imitation
• Avoid overfitting • Minimise dependencies between behavioural aspects
– E.g. learning of “kill hotspots” and “weapon preference” • Balance between exploration and exploitation
![Page 8: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/8.jpg)
Elements of Adaptation
• Domain knowledge • Player model
– Explicit or implicit
Game AI
Knowledge Base
Player Model
A fireball is effective when thrown in the
middle of a closely-knit group
of enemies
The opponent always starts a fight with all
enemies huddled together
At the start of a fight, throw a fireball in the middle of the
group of enemies
Game World
Agent
![Page 9: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/9.jpg)
Adaptation Techniques
![Page 10: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/10.jpg)
Decision Trees
• Learn by building and tuning a decision tree • Advantages
– Robust – Simple – Readable
• Disadvantages – Need tuning – Only suitable for relatively simple reasoning
problems • Can be applied both offline and online
– Depending on the precise implementation • Used in Black & White
Eating?
-0.5
inanimate object
animate object
+0.5
-0.9 +0.5
human animal
![Page 11: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/11.jpg)
Neural Networks • Learn by training weights in a simulated brain • Advantages
– Flexible – Can learn any non-linear function
• Disadvantages – Need tuning – Only a few outputs possible – Hard to choose input variables – Slow to train
• For creating new behaviour offline only – Can be further tuned online
• Used in Creatures, Black & White, Colin McRae Rally 2.0
![Page 12: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/12.jpg)
Evolutionary Algorithms • Learning inspired natural selection and natural genetics • Advantages
– Flexible – Robust – Can explore large, noisy search spaces
• Disadvantages – Need tuning – Slow to evolve – Not guaranteed to find a solution
• Offline only • Used in Creatures
![Page 13: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/13.jpg)
Reinforcement Learning • Learning by rewarding good behaviour and punishing bad
behaviour • Advantages
– Simple, elegant – Usually easy to implement – Can solve wide variety of problems – Can find close to optimal solution – Learns during interaction with environment
• Disadvantages – Suitable state-space representation hard to determine
• Can be applied both offline and online – Depending on the precise implementation
![Page 14: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/14.jpg)
Evolutionary Learning for RTS Games
![Page 15: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/15.jpg)
Evolving Winning Strategies
Note: a chromosome represents a strategy
2nd note: This research was performed by Marc Ponsen and Pieter Spronck
![Page 16: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/16.jpg)
Wargus Game States
• The possible tactics during a game mainly depend on available units and technology
• The availability of units and technology depends on the buildings the player possesses
• Therefore, the utility of tactics depends on the available buildings
• The 20 states in Wargus are manually predefined and correspond to the set of buildings a player possesses
![Page 17: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/17.jpg)
Chromosome
Start State 1 State 2 End
State marker Rule x.1 Rule x.2 Rule x.n
State m ⋅⋅⋅
⋅⋅⋅
Chromosome
State
Rule ID Parameter 1 Parameter p ⋅⋅⋅ Rule Parameter 2
Start S C1 ⋅⋅⋅ 2 5 def B S E 8 R 15 B 4 3 S
State number x
1 3 4
Rule 1.1 Rule 1.2 Rule 3.1 Rule 3.2 Rule 3.3
State 1 State 3
![Page 18: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/18.jpg)
Strategy-Genes Translation • Chromosome = {"Start","S",1,"C1",1,9,"attack",
"C1",8,2,"defend","B",2,",…. • Corresponding strategy in Lua:
--We have reached state 1 function() return AiForce(1, {AiSoldier(), 9}) end, function() return AiWaitForce(1) end, function() return AiAttackWithForce(1) end, function() return AiForce(8, {AiSoldier(), 2}) end, function() return AiForceRole(8,"defend" ) end, function() return AiNeed(AiBarracks()) end, function() return AiWait(AiBarracks()) end,
TACTICS FOR STATE 1
![Page 19: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/19.jpg)
Genetic Operators
• State Crossover • Rule Replace Mutation • Rule Biased Mutation • Randomization
Start State 1 State 2 End Parent 1 State 6 State 8 State 12 State 13 State 16 State 19 State 20
Start State 1 State 2 End Child State 6 State 8 State 12 State 13 State 16 State 19 State 20
Start State 1 State 3 End Parent 2 State 4 State 8 State 12 State 13 State 14 State 17 State 20
![Page 20: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/20.jpg)
Tactics
• Two ‘balanced’ tactics – Small Balanced Land Attack (SBLA) – Large Balanced Land Attack (LBLA)
• Two ‘rush’ tactics – Soldier Rush (SR) – Knight Rush (KR)
![Page 21: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/21.jpg)
Fitness Determination
• Based on success against ‘rush’ tactics (soldier rush and knights rush)
• Fitness is based on fraction of military points scored by counter-tactic (win>0.5)
• Goal: target-fitness (SR: 0.75, KR: 0.70) • Abort: target-fitness reached or 5 generations
![Page 22: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/22.jpg)
Evolution Results
Tactic Tests Low High Avg. >250
SR 10 0.73 0.85 0.78 2
KR 10 0.71 0.84 0.75 0
![Page 23: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/23.jpg)
Analysis of Counter-Tactics
• Counter soldier rush with enhanced soldier rush
• Against knights rush – Expand territory when defendable – Quickly develop strong units – Use catapults
![Page 24: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/24.jpg)
Reinforcement Learning with Dynamic Scripting
![Page 25: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/25.jpg)
Dynamic Scripting
computer-controlled team
human- controlled team
Knowledge
Base A
Knowledge
Base B
generate script
generate script
Script A
Script B
script control
script control
human control
human control
weight updates
Combat
![Page 26: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/26.jpg)
Validation Experiments
• How quick is dynamic scripting able to adapt to an unchanging tactic?
• Or, can dynamic scripting quickly force a (human) player to change tactics?
![Page 27: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/27.jpg)
Encounter Setup
• Two parties of four characters – Two fighters – Two wizards
• Each character – Static armament and weaponry – Two (out of three possible) potions – Potion assignment according to script
• Each wizard – Seven (out of 21 possible) spells – Spell assignment according to script
![Page 28: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/28.jpg)
Script
![Page 29: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/29.jpg)
Rulebase
• One rulebase for every opponent – 50 rules for wizards – 20 rules for fighters
• Script extracted from rulebase – 10 rules + 2 static rules for wizards – 5 rules + 1 static rule for fighters
• Selection according to rule weights • Ordering according to priority/weight
![Page 30: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/30.jpg)
Party Fitness
• Indicates how well the party functions as a whole
• Takes into account – Win or loss – Number of surviving party members – Total health of surviving party members
• Is used to decide when one party outperforms the other
![Page 31: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/31.jpg)
Individual Fitness
• Indicates how well an individual functions in the party
• Takes into account – Party fitness – Individual’s health at the end of the encounter – Individual’s time of death – Damage done to enemies
• Determines how the weights in the individual’s rulebase will be changed
![Page 32: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/32.jpg)
Reward/Penalty Determination
• Maximum reward at individual fitness 1
• Maximum penalty at individual fitness 0
• No reward or penalty at fitness b (break-even)
• Linear extrapolation
Rmax
−Pmax
0 b 1
ΔW
F
![Page 33: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/33.jpg)
Tactics
• Basic – Offensive – Disabling – Cursing – Defensive
Composite Random team Random agent Consecutive
![Page 34: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/34.jpg)
Party Fitness Progression
• Absolute
• Average over last 10 encounters
![Page 35: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/35.jpg)
Pmax=30 Pmax=70
Tactic Avg. StD. High Top5 Avg. StD. High Top5
Offensive 58 35 314 155 53 25 120 107
Disabling 12 5 51 31 13 8 79 39
Cursing 137 334 1767 1461 44 50 304 222
Defensive 31 19 93 77 24 15 79 67
Rnd. Team 56 74 595 310 51 65 480 271
Rnd. Agent 53 67 398 289 41 41 251 178
Consecutive 72 100 716 424 52 56 393 238
![Page 36: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/36.jpg)
Turning Points P max=30
0
20
40
60
80
100
120
140
160
180
200
Offensive Disabling Cursing Defensive Rnd.Team Rnd.Agent Consecutive
Turning Points P max=70
0
20
40
60
80
100
120
140
160
180
200
Offensive Disabling Cursing Defensive Rnd.Team Rnd.Agent Consecutive
Low turning points In the dozens instead
of thousands Generated with initial
weights all equal Turning points and
standard deviations even much lower with biased initial weights
Good Results?
![Page 37: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/37.jpg)
Reinforcement Learning with Monte Carlo Control
![Page 38: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/38.jpg)
Q-learning and Games
• Game states do not satisfy the Markov property* – Agents in games cannot fully observe the behaviour of other
agents – Agents are not fully aware of other agents’ actions – State transitions depend on actions of all agents
• Therefore Q-learning and related reinforcement learning algorithms cannot be expected to function well for games (Sutton & Barto, 1998) – We confirmed this for our simulation
* Markov property: the conditional probability distribution of future states of the
process, given the present state, depends only upon the current state
![Page 39: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/39.jpg)
Monte Carlo Control
• Sutton & Barto (1998): Monte Carlo Control is – Particularly suitable for non-Markovian
environments – Able to quickly start exploiting learned behaviour
• Monte Carlo Control should therefore be ideally suitable for learning in games
![Page 40: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/40.jpg)
MC Control Algorithm
• Identify suitable states • Create state-action space • Assign random Q-value to each action in each state • Loop
– Choose actions • Most of the time (e.g., 95%): action with highest Q-value in
state (exploitation) • Else: random action in state (exploration)
– When fitness can be calculated • For each selected action: calculate new Q-value by “averaging
in” the calculated fitness • Converges to Q-values that represent fitness of actions in
defined states
![Page 41: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/41.jpg)
Simulation State-Action Space
• States identified by – #agents alive in friendly team – #agents alive in enemy team
• Total of 16 states (1-4 agents in each team) – Simulation terminates when at least one team has
zero agents • Actions defined as rules from the rulebases
– Removing tests on “initial encounter” – Removing all rules with random behaviour (30%) – Allow agents to select spells “on the fly”
![Page 42: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/42.jpg)
Comparison DS/MCC
0
50
100
150
200
Offensive Disabling Cursing Defensive Rnd.team Rnd.agent Consec.
Monte Carlo Dynamic scripting Biased Rulebases
![Page 43: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/43.jpg)
Properties of MCC
• Disadvantages compared to DS – (Much) higher turning points – Cannot deal with changing tactics
• Advantages over DS – Can discover effective state-action pairs
unforeseen by game developer – Does not need priority mechanism – Will converge in a static environment
![Page 44: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/44.jpg)
Dynamic Player Modeling
![Page 45: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/45.jpg)
Two Forms of Player Modeling
• Implicit – AI adapts while playing against a specific opponent – Example: dynamic scripting
• Explicit – AI creates a model of the opponent, then chooses actions which
it knows work well against such an opponent – Naturally, it can learn those actions, and can still adapt online – Very suitable for multi-player games
![Page 46: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/46.jpg)
Player Model
• Modeling – Player actions – Player preferences
• Model types – Evaluation functions – Neural networks – Finite state machines – Probabilistic models – Case-based models
• Model application – Strong AI – Appropriate AI
Game AI Game World
Human Player
Player Model
Actions Preferences
Style Experiences
Feedback
actions
actions
observations
observations predictions updates
similarity?
![Page 47: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/47.jpg)
Formations
• Cohesive team behaviour – V-shape – Phalanx – Wedge – Shield wall – Etc...
• Should fit the enemy’s tactics
![Page 48: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/48.jpg)
Formations in RTS Games
• Static: each unit assigned a specific slot; or • Emergent: units positioned relative to each other
![Page 49: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/49.jpg)
Dynamic Formations
• General formation framework • Learning algorithm to create suitable formations
– Online adaptation of 8 formation parameters – Offline learning of suitable parameters against player models
• Fast modeling algorithm to determine a model for the current player – Player model based on 3 observations
Note: This research was performed by Marcel van der Heijden, Sander Bakkes, and Pieter Spronck
![Page 50: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/50.jpg)
Dynamic Formation Shape
• Centered around leader • Parameters 1-5
– 1-n lines – 1-m units
per line – α, β, γ
![Page 51: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/51.jpg)
Replacement of Destroyed Units
• Neigbouring units take place of destroyed units – Fast and cheap
![Page 52: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/52.jpg)
Movement
• Parameter 6: speed • Leader moves • Formation moves with leader • Units attempt to stick to their place in the
formation • If units fall seriously behind, the speed of the
formation is adjusted
![Page 53: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/53.jpg)
Target Selection (Parameter 7)
![Page 54: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/54.jpg)
Combat Behaviour (Parameter 8)
• Choice of five – Overrun – keep pushing forward – Hold – stop when opponent is within weapon range – Retreat – hit and run – Bounce – hit an run, but return after half of time-out – Border- stay out of weapon range during time-out
![Page 55: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/55.jpg)
Performance Measures
• Absolute performance: percentage of games won during last 50 games – Sequence of 200 games against each opponent
• Relative performance: fitness value obtained – Range: [-50,50]
• Turning point: win-loss ratio of last 20 games becomes 15-5 or better – Statistically, it is 98% certain that learning
opponent is now better than static opponent
![Page 56: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/56.jpg)
Dynamic Formations w/o Modelling
![Page 57: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/57.jpg)
Player Model in 3 Parameters
• Number of formations – Determined with k-means clustering
• Unit distribution – Proportion of width and height of rectangle
around all units
• Unit distance (density) – Average distance of each unit to its closest
neighbour
![Page 58: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/58.jpg)
Player Modelling During Game
• Measure the three parameters ~100 timesteps after the game started
• Calculate the combined likelihood of observed parameters for each known model (Bayes’ theorem)
• Select the model that best explains the observations – For the first few trials – After that, do regular adaptation
![Page 59: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/59.jpg)
Performance Comparison
No player modelling With player modelling
Opponent Abs.perf. Rel.perf. TP Abs.perf. Rel.perf. TP
Blekinge 100% 37 22 Immediately classified correctly
NUS 12% -31 >200 4% -37 >200
UBC 94% 15 71 Immediately classified correctly
UM 66% 4 >174 84% 9 23
WarsawA 44% -1 >174 66% 5 22
WarsawB 98% 17 57 Immediately classified correctly
![Page 60: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/60.jpg)
Adapting to Entertain
![Page 61: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/61.jpg)
Learning and Enjoying Chess
Deep Blue (1997)
Chess Challenger (1978)
![Page 62: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/62.jpg)
Iida’s Theory of Refinement
DB
Complexity Fairness
Refinement
Noble uncertainty: New tactics are possible
Draw ratio: Matching opponents
Seesaw game: Optimal length of time outcome is uncertain
= 0.07
![Page 63: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/63.jpg)
Enjoying Games
• Computer should be able to play stronger than the human player
• Computer should adapt to the level of skill of the human player
• Computer should constantly offer new challenges
In short: Computer and human increase their playing skill in parallel
![Page 64: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/64.jpg)
• Manual • Coarse • Simple
![Page 65: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/65.jpg)
Automatic Scaling of Game AI
• High-fitness penalising – Award the highest fitness to the “most equal” AI, instead of to
the “best” AI
• Weight clipping – Increase AI variety when the computer plays too well – Decrease AI variety when the computer plays badly
• Top culling – Remove the currently “best” knowledge when the AI plays too
well – Reactivate the “best” knowledge when the AI plays badly
![Page 66: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/66.jpg)
Difficulty Scaling Tests
• Against five different basic tactics • Against three different composite tactics • Against Neverwinter Nights game AI
• Novice tactic
– Imitates novice player – Knows obvious good tactics – Does not know subtle good tactics
![Page 67: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/67.jpg)
Histograms Optimisation
0
5
10
15
0 10 20 30 40 50 60 70 80 90 100
High Fitness Penalising
0
5
10
15
0 10 20 30 40 50 60 70 80 90 100
Weight Clipping
0
5
10
15
0 10 20 30 40 50 60 70 80 90 100
Top Culling
0
5
10
15
0 10 20 30 40 50 60 70 80 90 100
![Page 68: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/68.jpg)
Scaling Results
• Without automatic scaling, dynamic scripting wins against all tactics
• With high-fitness penalising an even game is achieved against 2 out of 8 tactics
• With weight clipping an even game is achieved against 7 out of 8 tactics
• With top culling an even game is achieved against 8 out of 8 tactics, combined with the lowest standard deviation
![Page 69: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/69.jpg)
Difficulty Scaling
• Adaptive effectiveness can be easily converted to difficulty scaling
• Often not useful against strong players – Losing a sense of accomplishment – Best results against novices
• For highest enjoyment – Should the AI play stronger than the human player? – How much stronger? – To what extent does this depend on the type of game? – Is this equal for all human players?
![Page 70: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/70.jpg)
Adapting to the trainee
![Page 71: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/71.jpg)
Serious Gaming
![Page 72: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/72.jpg)
Agent Approach • Agents part of the design process • Reasoning agents • Adapting agents Example: • Trainee is fire commander • 2 fireman agents • 1 agent controlling spreading of fires • 1 victim agent
![Page 73: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/73.jpg)
Adapt the game to the user
Beginner: Frank Expert: Joost
A user only learns if performing on his own level
![Page 74: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/74.jpg)
Current approaches
• Fixed difficulties • Central control or no coordination • Mainly adjust simple subtasks
![Page 75: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/75.jpg)
Dynamic Difficulty Adjustment
• Online adaptation: Continuously balance challenges in the game
with (developing) skills of the trainee
![Page 76: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/76.jpg)
Aspects • User
– Evolving skills (when learning) • Characters
– Characters adapt independently – Characters active for long periods, so, adaptation should
be believable • Keep storyline
– Learning goals have to be maintained! • Adaptation must be coordinated! • Performance can not be measured separately for each
skill and influence of each agent
![Page 77: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/77.jpg)
Adapt the game to the user Use agents to have characters adapt natural and independent
![Page 78: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/78.jpg)
Adapt the game to the user Use agents to have characters adapt natural and independent
![Page 79: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/79.jpg)
Adapt the game to the user Use an agent organization approach to ensure
the goals of the overall system
![Page 80: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/80.jpg)
Story-line • Guarantee certain states are reached • Subtasks defined by scene scripts and
landmarks • Connected by interaction structure
– Describes game progress – Connecting scenes – Tasks in parallel
Start Get to site
Gather info
Secure area
Search building
Evacuate victims
Extinguish fire Clear area End
![Page 81: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/81.jpg)
Agent Implementation • Adaptable BDI-agents (2APL extension)
– Equivalent plans – Preference relation – Environmental information also used
Search building
Evacuate victims
Pairwise search
Parallel search
Thorough serch
One by one
Groupwise
Front/back rooms floors
Front/back parallel Rooms parallel Floors parallel
Top floor first Utility rooms first Living quarter first
![Page 82: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/82.jpg)
Adaptation Engine • Coordinates task difficulty • Check with game model • Combinatorial auction
– User model – Agent preferences
Agent model • 2APL Agent
Agent model • Agent Bidding
• User Model
• Adaptation Engine
Update
Plans Bid
Task Weights Skill Levels
Selection
Preferences & Temination
Scene States Applicable plans
• Game Model
Start Get to site
Gather info
Secure area
Search building
Evacuate victims
Extinguish fire Clear area End
![Page 83: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/83.jpg)
Task Difficulty
• Task dependent on behavior of the agents • Behavior variations are fixed • Estimated by domain expert • Updated by offline learning • Much faster adaptation
0,3 0,5 0,7
![Page 84: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/84.jpg)
Agent Perspective
• Agents Propose actions to adaptation engine at “natural” synchronization points
• Created to facilitate trainee’s objectives (optimize agent behavior relative to trainee’s performance!)
• Not responsible for suitable combination • Conflict:
– Stay as consistent as possible – Propose enough actions
• Adaptation engine can request agents to terminate behavior if necessary for coordination
![Page 85: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/85.jpg)
Conclusion
• Continuous adaptation to the trainee • Agent based approach
– Complex individual behavior and adaptation possible • Agent organisation for coordination
– Balance between individual flexibility and global story line maintaining learning goals
– Constrain based only on global landmarks and learning goals
– Minimal central control for more efficiency and more flexibility
![Page 86: Adaptive Game AI (based on slides of Pieter Spronck UvT)](https://reader031.fdocuments.us/reader031/viewer/2022020213/586ccdde1a28abc7348b7f2f/html5/thumbnails/86.jpg)
Overall conclusions
• Adaptation and learning can take place in many parts and levels of the games
• Traditional techniques are used for tuning parts of the game
• Agent based approach is usefull for adapting the story line of a game to the trainee