Evolving Heuristics for Searching Games

Evolving Heuristics for Searching Games. Evolutionary Computation and Artificial Life. Supervisor: Moshe Sipper. Achiya Elyasaf, June 2010.


Transcript of Evolving Heuristics for Searching Games

Page 1: Evolving Heuristics for Searching Games

Evolving Heuristics for Searching Games

Evolutionary Computation and Artificial Life

Supervisor: Moshe Sipper

Achiya Elyasaf, June 2010

Page 2: Evolving Heuristics for Searching Games

Overview

Searching Games State-Graphs
• Representation
• Uninformed Search
• Heuristics
• Informed Search

Rush Hour
• Domain Specific Heuristic
• Evolving Heuristics
• Coevolving Game Boards
• Results

Freecell
• Domain Specific Heuristic
• Coevolving Game Boards
• Learning Methods
• Results

Page 3: Evolving Heuristics for Searching Games

Searching Games State-Graphs: Representation

Every puzzle/game can be represented as a state graph:

• Single-player games such as puzzles, board games, etc.: every piece move leads to a different state

• Multi-player games such as chess, Robocode, etc.: the positions of the player and the enemy, together with the rest of the parameters (health, shield…), define a state

Page 4: Evolving Heuristics for Searching Games

Searching Games State-Graphs: Representation

Rush Hour:

Page 5: Evolving Heuristics for Searching Games

Searching Games State-Graphs: Representation

Blocksworld:

Page 6: Evolving Heuristics for Searching Games

Searching Games State-Graphs: Uninformed Search

BFS – Exponential in the search depth

DFS – Linear in the length of the current search path. BUT:
• We might “never” track down the right path
• Usually games contain cycles

Iterative Deepening: Combination of BFS & DFS
• In each iteration, a DFS with a depth limit is performed
• The limit grows from one iteration to the next
• Worst case - traverse the entire graph
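The iterative-deepening idea above can be sketched in a few lines of Python. This is a minimal illustration and not code from the talk: the function names, the `max_depth` cap, and the toy number puzzle at the end are all assumptions made for the example.

```python
def depth_limited_dfs(state, is_goal, successors, limit, path):
    """DFS that never extends the current path by more than `limit` moves."""
    if is_goal(state):
        return list(path)
    if limit == 0:
        return None
    for nxt in successors(state):
        if nxt in path:  # games usually contain cycles: skip states already on the path
            continue
        path.append(nxt)
        found = depth_limited_dfs(nxt, is_goal, successors, limit - 1, path)
        path.pop()
        if found is not None:
            return found
    return None


def iterative_deepening(start, is_goal, successors, max_depth=50):
    """Run depth-limited DFS with a limit that grows by one each iteration."""
    for limit in range(max_depth + 1):
        found = depth_limited_dfs(start, is_goal, successors, limit, [start])
        if found is not None:
            return found  # first hit is a shortest path (in number of moves)
    return None


# Hypothetical toy puzzle: reach 10 from 1 using the moves "+1" and "*2".
path = iterative_deepening(1, lambda s: s == 10, lambda s: [s + 1, s * 2])
print(path)
```

Memory stays linear in the current path length, as with DFS, while the growing limit gives the shortest-path behaviour of BFS at the price of re-expanding shallow states.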

Page 7: Evolving Heuristics for Searching Games

Searching Games State-Graphs: Uninformed Search

Most of the game domains are PSPACE-Complete!

Worst case - traverse the entire graph

We need an informed search!

Page 8: Evolving Heuristics for Searching Games

Searching Games State-Graphs: Heuristics

h: states -> Real
• For every state s, h(s) is an estimate of the minimal distance/cost from s to a solution
• If h is perfect, an informed search that tries states with the lowest h-score first will simply stroll to the solution
• A bad heuristic means the search might never reach an answer
• For hard problems, finding h is hard

We need a good heuristic function to guide informed search

Page 9: Evolving Heuristics for Searching Games

Searching Games State-Graphs: Informed Search (Cont.)

IDA*: Iterative Deepening with A*
• The expanded nodes are pushed to the DFS stack by descending heuristic values
• Let g(s_i) be the minimal depth of state s_i: only nodes with f(s) = g(s) + h(s) < depth-limit are visited

Near-optimal solution (depends on the path limit)

The heuristic needs to be admissible
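A compact sketch of IDA* may help make the bound on f(s) = g(s) + h(s) concrete. It follows the common textbook formulation (unit move cost, nodes with f above the current bound are pruned, and the bound is raised to the smallest pruned f), which may differ in detail from the enhanced variant used in the talk. The grid, `WALLS`, `GOAL`, and all function names are hypothetical.

```python
import math


def ida_star(start, is_goal, successors, h):
    """IDA* with unit move cost; returns a path of states or None."""
    path = [start]

    def search(g, bound):
        state = path[-1]
        f = g + h(state)
        if f > bound:
            return f  # pruned: report f as a candidate for the next bound
        if is_goal(state):
            return True
        next_bound = math.inf
        # Try the most promising successors (lowest h) first.
        for nxt in sorted(successors(state), key=h):
            if nxt in path:
                continue
            path.append(nxt)
            result = search(g + 1, bound)
            if result is True:
                return True  # keep the path intact for the caller
            next_bound = min(next_bound, result)
            path.pop()
        return next_bound

    bound = h(start)
    while True:
        result = search(0, bound)
        if result is True:
            return list(path)
        if result == math.inf:
            return None  # nothing left to expand
        bound = result


# Hypothetical 3x3 grid with two wall cells; h is the Manhattan distance.
SIZE, WALLS, GOAL = 3, {(1, 0), (1, 1)}, (2, 0)


def grid_successors(state):
    x, y = state
    steps = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(nx, ny) for nx, ny in steps
            if 0 <= nx < SIZE and 0 <= ny < SIZE and (nx, ny) not in WALLS]


def manhattan(state):
    return abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])


path = ida_star((0, 0), lambda s: s == GOAL, grid_successors, manhattan)
print(len(path) - 1, "moves")
```

Because the Manhattan distance never overestimates on this grid, the heuristic is admissible and the first solution found is a shortest one.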

Page 10: Evolving Heuristics for Searching Games

Overview

Searching Games State-Graphs
• Representation
• Uninformed Search
• Heuristics
• Informed Search

Rush Hour
• Domain Specific Heuristic
• Evolving Heuristics
• Coevolving Game Boards
• Results

Freecell
• Domain Specific Heuristic
• Coevolving Game Boards
• Learning Methods
• Results

Page 11: Evolving Heuristics for Searching Games

Rush Hour: Domain Specific Heuristic

GP-Rush [Hauptman et al, 2009]

Hand-crafted heuristics:
• Goal distance – Manhattan distance
• Blocker estimation – lower bound (admissible)
• Hybrid blockers distance – combine the two above
• Is Move To Secluded – did the car enter a secluded area
• Is Releasing move
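Two of these hand-crafted heuristics can be illustrated with a short sketch. The board encoding below (a list of strings, `.` for an empty cell, `X` for the red car lying horizontally, exit at the right edge of its row) and the function names are assumptions for this example, not the representation used in GP-Rush.

```python
def red_car_row(board):
    """Return the row (a string) that contains the red car 'X'."""
    return next(row for row in board if "X" in row)


def goal_distance(board):
    """Number of cells between the red car's front and the exit."""
    row = red_car_row(board)
    return len(row) - 1 - row.rindex("X")


def blockers_lower_bound(board):
    """Count distinct vehicles standing between the red car and the exit.

    Each of them has to move at least once, so the count never
    overestimates the number of moves still needed.
    """
    row = red_car_row(board)
    ahead = row[row.rindex("X") + 1:]
    return len({cell for cell in ahead if cell != "."})


# Hypothetical 6x6 position: vehicles A and B block the red car's row.
board = [
    "..A...",
    "..A.B.",
    "XXA.B.",
    "....B.",
    "......",
    "......",
]
print(goal_distance(board), blockers_lower_bound(board))
```

How the talk's hybrid heuristic combines the two values is not stated on the slide, so it is left out here.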

Page 12: Evolving Heuristics for Searching Games

GA/GP

For H1, … , Hn – building blocks

How should we choose the fittest heuristic?
• Minimum? Maximum? Linear combination?

GA/GP may be used for:
1. Building new heuristics from existing building blocks
2. Finding weights for each heuristic (for applying linear combination)
3. Finding conditions for applying each
   • Probably, H should fit the stage of the search
   • E.g. “goal” heuristics when assuming we’re close

Page 13: Evolving Heuristics for Searching Games

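GA/GP (Cont.)

[Figure: example GP tree. The root is an If node with three branches. Condition: And over two comparisons, one on H1 and 0.4, the other on H2 and 0.7. True: H3 + (H1 * 0.5). False: H5 * (H1 / 0.1).]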
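A tree of this shape can be evaluated recursively. The sketch below encodes nodes as nested tuples; the encoding, the `evaluate` function, the protected division, and the choice of `<=` and `>=` for the two comparisons (the operators are not legible in the transcript) are all assumptions made for illustration.

```python
import operator

# Function set; "/" is protected so that a zero divisor cannot crash a run.
FUNCTIONS = {
    "+": operator.add,
    "*": operator.mul,
    "/": lambda a, b: a / b if b != 0 else 1.0,
    "<=": operator.le,
    ">=": operator.ge,
    "and": lambda a, b: a and b,
    "or": lambda a, b: a or b,
}


def evaluate(node, values):
    """Evaluate a tree given a dict of heuristic values for one state."""
    if isinstance(node, (int, float)):
        return node  # numeric constant
    if isinstance(node, str):
        return values[node]  # terminal: look up a heuristic value
    op, *args = node
    if op == "if":
        condition, if_true, if_false = args
        chosen = if_true if evaluate(condition, values) else if_false
        return evaluate(chosen, values)
    return FUNCTIONS[op](*(evaluate(arg, values) for arg in args))


# Hypothetical tree: comparison operators are assumed.
tree = (
    "if",
    ("and", ("<=", "H1", 0.4), (">=", "H2", 0.7)),
    ("+", "H3", ("*", "H1", 0.5)),
    ("*", "H5", ("/", "H1", 0.1)),
)

print(evaluate(tree, {"H1": 0.25, "H2": 0.75, "H3": 3.0, "H5": 2.0}))
```

Keeping trees as plain nested tuples makes subtree crossover and mutation simple structural operations, which is one reason this encoding is common in small GP experiments.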

Page 14: Evolving Heuristics for Searching Games

GA/GP (Cont.): Back to Rush Hour

Functions & Terminals:

Conditions
  Terminals: IsMoveToSecluded, isReleasingMove, g, PhaseByDistance, PhaseByBlockers, NumberOfSyblings, DifficultyLevel, BlockersLowerBound, GoalDistance, Hybrid, 0, 0.1, … , 0.9, 1
  Sets: If, AND, OR, ≤, ≥

Results
  Terminals: BlockersLowerBound, GoalDistance, Hybrid, 0, 0.1, … , 0.9, 1
  Sets: +, *

Genetic Operators: Cross-Over & Mutation on trees as Koza describes

Page 15: Evolving Heuristics for Searching Games

GA/GP (Cont.): Policies

Fitness measure? Cross-over? Mutation?

Condition      Result
Condition 1    Heuristics Weights 1
Condition 2    Heuristics Weights 2
...            ...
Condition n    Heuristics Weights n
Default        Heuristics Weights
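One way to read the policy table is as an ordered rule list: the first condition that holds selects a weight vector, and the heuristic value is the weighted sum of the building-block heuristics. The sketch below assumes exactly that reading; the rule format, the feature names borrowed from the Rush Hour terminals, and the numbers are illustrative only.

```python
def select_weights(rules, default_weights, features):
    """Return the weights of the first rule whose condition holds."""
    for condition, weights in rules:
        if condition(features):
            return weights
    return default_weights


def policy_value(rules, default_weights, features):
    """Weighted sum of heuristic values under the selected weights."""
    weights = select_weights(rules, default_weights, features)
    return sum(w * features[name] for name, w in weights.items())


# Hypothetical policy with two conditions and a default row.
rules = [
    (lambda f: f["g"] <= 5,
     {"GoalDistance": 0.5, "BlockersLowerBound": 0.5}),
    (lambda f: f["BlockersLowerBound"] >= 2,
     {"GoalDistance": 0.25, "BlockersLowerBound": 1.0}),
]
default_weights = {"GoalDistance": 1.0}

state_features = {"g": 10, "GoalDistance": 4, "BlockersLowerBound": 2}
print(policy_value(rules, default_weights, state_features))
```

Under this reading, an evolutionary algorithm would search over both the conditions and the weight vectors, with the default row guaranteeing that every state receives a value.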

Page 16: Evolving Heuristics for Searching Games

Co-Evolving Difficult Solvable 8x8 Boards

Our enhanced IDA* search solved over 90% of the 6x6 problems

We wanted to demonstrate our method’s scalability to larger boards

Page 17: Evolving Heuristics for Searching Games

Co-Evolving Difficult Solvable 8x8 Boards

Fitness measure? Cross-over? Mutation?

[Figure: two Rush Hour boards with lettered vehicles]

Page 18: Evolving Heuristics for Searching Games


Rush Hour Results

Average percentage of nodes required to solve test problems, with respect to the number of nodes scanned by a blind search:

Page 19: Evolving Heuristics for Searching Games


Rush Hour Results (Cont.)

Time (in seconds) required to solve problems JAM01 . . . JAM40:

Page 20: Evolving Heuristics for Searching Games

Overview

Searching Games State-Graphs
• Representation
• Uninformed Search
• Heuristics
• Informed Search

Rush Hour
• Domain Specific Heuristic
• Evolving Heuristics
• Coevolving Game Boards
• Results

Freecell
• Domain Specific Heuristic
• Coevolving Game Boards
• Learning Methods
• Results

Page 21: Evolving Heuristics for Searching Games

Freecell: Intro

FreeCell remained relatively obscure until it shipped with Windows 95

The Microsoft 32K set contains 32,000 problems, all solvable except for game #11982, which has eluded solution so far

Page 22: Evolving Heuristics for Searching Games

Freecell: Intro (Cont.)

[Figure: Freecell layout, labeling the Freecells, Foundations, and Cascades]

Page 23: Evolving Heuristics for Searching Games

Freecell: Heuristics

• Lowest card at Foundations
• Number of well placed cards
• Number of cards not at Foundations
• Number of Freecells and free Cascades
• Sum of the Cascades bottom cards
• Highest home card – lowest home card
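A few of these features are simple enough to sketch. The state layout below (a dict mapping each suit to the rank of its top foundation card, a list of freecell slots, a list of cascades) and the function names are assumptions for illustration, and the exact definitions used in the talk may differ.

```python
def cards_not_at_foundations(foundations, total_cards=52):
    """Cards still to be moved home; 0 means the game is solved."""
    return total_cards - sum(foundations.values())


def free_cells_and_cascades(freecells, cascades):
    """Empty freecell slots plus empty cascades (room to manoeuvre)."""
    empty_cells = sum(slot is None for slot in freecells)
    empty_cascades = sum(len(cascade) == 0 for cascade in cascades)
    return empty_cells + empty_cascades


def lowest_home_card(foundations):
    """Rank of the least advanced foundation pile."""
    return min(foundations.values())


def home_card_spread(foundations):
    """Highest home card minus lowest home card."""
    return max(foundations.values()) - min(foundations.values())


# Hypothetical mid-game state; ranks run from 1 (ace) to 13 (king), 0 = empty pile.
foundations = {"spades": 3, "hearts": 1, "diamonds": 0, "clubs": 2}
freecells = [None, (13, "hearts"), None, None]
cascades = [[(5, "clubs")], [], [(9, "spades")], [(4, "hearts")],
            [(7, "diamonds")], [(8, "clubs")], [(6, "spades")], [(2, "hearts")]]

print(cards_not_at_foundations(foundations),
      free_cells_and_cascades(freecells, cascades),
      lowest_home_card(foundations),
      home_card_spread(foundations))
```

Each function returns a single number, so the features can be plugged directly into a weighted sum or used as terminals in an evolved expression.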

Page 24: Evolving Heuristics for Searching Games

Freecell: Learning methods

As opposed to Rush Hour, blind search could not solve even one problem

The best solver to date solves 89% of Microsoft 32K

Reasons:
• High branching factor
• Hard to generate a good heuristic

Page 25: Evolving Heuristics for Searching Games

Freecell: Learning methods

In Rush Hour:
• Hyper-Heuristics population
• Each generation – all individuals solve 5 different randomly selected instances
• Test set – 20% of the problems
• Training set – the rest

In Freecell:
• This method failed

Page 26: Evolving Heuristics for Searching Games

Freecell: Learning methods

First try:

Sort the problems by difficulty

Gradually learn the whole training set

FAILED:
• Days of training
• Overfitting and forgetting

Page 27: Evolving Heuristics for Searching Games

Freecell: Learning methods

Second try:

Co-evolution:
• First population – Hyper-Heuristics
• Second population – Game boards with Hillis “Hall of Fame”

FAILED:
• Ambiguous reason for low fitness

Page 28: Evolving Heuristics for Searching Games

Freecell: Learning methods

Third try:

Co-evolution:
• First population – Hyper-Heuristics
• Second population – Group of 8 game boards

SUCCESS:
• Fast learning process
• No ambiguity
• We create the right competition

Page 29: Evolving Heuristics for Searching Games

Freecell Results

Run                        Node reduction   Time reduction   Solution length reduction   % of solved problems
HSD                        100%             100%             100%                        89%
GA-1                       23%              31%              1%                          71%
GA-2                       23%              30%              -3%                         70%
GP                         -                -                -                           -
Policy                     28%              36%              6%                          74%
GA with Co-Evolution       60%              69%              37%                         98%
Policy with Co-Evolution   59%              69%              30%                         99%