State Space Search

STATE SPACE SEARCH

Nattee Niparnan

OPTIMIZATION EXAMPLE: FINDING MAX VALUE IN AN ARRAY

25 2 34 43 4 9 0 -5 87 0 5 6 1

There are N possible answers: the first element, the second element, the 3rd, the 4th, …

Try all of them and remember the best one

STATE SPACE SEARCH FRAMEWORK

Define the set of admissible solutions
Generate all of them (generating)
For each generated solution
  Test whether it is the one we want, using the "evaluation" function (testing)
  For optimization: remember the best one so far
  For decision: report when we find the "correct" one

VERY IMPORTANT!!

SOLVING PROBLEM BY STATE SPACE SEARCH

1. Define the set of admissible solutions (the search space)
  What is it? How large is it? How do we represent a solution?
2. Determine how to generate all solutions
3. Determine how to check each solution

GENERATING IN STEPS

In most problems, admissible solutions can be generated iteratively, step by step

Examples
  Maximum Sum of Subsequence
  Minimal Spanning Tree
  Pachinko Problem

SEARCH: MAXIMUM SUM OF SUBSEQUENCE

Search space
  Every possible contiguous subsequence
  Described by a pair (starting element, ending element)
Generating all solutions
  Step 1: select the position of the first element
  Step 2: select the position of the last element
Checking each solution
  Sum its elements and test whether the sum is the maximum so far
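The two generating steps and the check above can be sketched as two nested loops. This is only a minimal illustration; the function name `maxSubsequenceSum` and the requirement of a non-empty array are assumptions, not part of the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Brute-force state space search for the maximum sum of a contiguous
// subsequence. A candidate solution is a pair (first, last) of positions.
int maxSubsequenceSum(const std::vector<int>& a) {
    assert(!a.empty());                 // assumption: at least one element
    int best = a[0];                    // best value seen so far
    for (std::size_t first = 0; first < a.size(); ++first) {        // step 1
        int sum = 0;
        for (std::size_t last = first; last < a.size(); ++last) {   // step 2
            sum += a[last];             // sum of a[first..last]
            if (sum > best) best = sum; // evaluation: remember the best
        }
    }
    return best;
}
```

For the array of the first example, `maxSubsequenceSum({25, 2, 34, 43, 4, 9, 0, -5, 87, 0, 5, 6, 1})` evaluates every (first, last) pair and keeps the largest sum.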

SEARCH: MINIMAL SPANNING TREE

Search space
  Every subset of edges
  Described by a set of edges
Generating all solutions
  For every edge, either include it in or exclude it from the set
Checking each solution
  Sum the cost of the edges in the set and test whether it is the minimum so far
  Test whether it is a tree
  Test whether it is connected

SEARCH: MINIMAL SPANNING TREE 2ND ATTEMPT

Search space
  Every subset of edges of size |V|-1
  Described by a set of edges
Generating all solutions
  For every edge, either include it in or exclude it from the set
  Do not select more than |V|-1 edges
Checking each solution
  Sum the cost of the edges in the set and test whether it is the minimum so far
  Test whether it is a tree

SEARCH: MINIMAL SPANNING TREE 3RD ATTEMPT

Search space
  Every subgraph with |V|-1 edges that is a tree
  Described by a set of edges
Generating all solutions
  For every edge in X (as in the "cut property"), either include it in or exclude it from the set
  Update the tree
  Do not select more than |V|-1 edges
Checking each solution
  Sum the cost of the edges in the set and test whether it is the minimum so far

SEARCH: PACHINKO

Search space
  Every path from the top pin to the bottom pin
  Described by a sequence of directions, e.g. (left, left, right, left)
Generating all solutions
  For every level, choose either the left side or the right side
Checking each solution
  Sum the cost along the path and test whether it is the maximum so far

GENERATING ALL POSSIBLE ANSWERS

COMBINATION AND PERMUTATION

In many cases, the set of admissible solutions is a set of "combinations" or "permutations" of something

We need to know how to generate all permutations and combinations

COMBINATION

Given N things
Generate all possible selections of K things from the N things

Ex. N = 3, k = 2

COMBINATION WITH REPLACEMENT

Given N things
Generate all possible selections of K things from the N things
When something is selected, we are permitted to select that thing again (we put the selected thing back in the pool)

Ex. N = 3, k = 2

BREAKING THE PADLOCK

Assume we have four rings
Assume each ring has the following marks
[Figure: the marks on each ring]

We try … then undo the second step and switch to another value

KEY IDEA

A problem consists of several similar steps
  Each step chooses a thing from the pool
We need to remember the things we have done so far

GENERAL FRAMEWORK

Storage
  Holds the generated (partial) solutions
Engine
  Initial step
  1. Get a step that is not complete
  2. Try all possible choices for the next step
  3. Store each newly generated next step

SOLUTION GENERATION

Storage S
S ← ''
While S is not empty
    curr ← S.get
    If curr is the last step            (padlock: if length(curr) == 4)
        evaluate
    Else
        generate all next steps from curr and put them into S
                                        (padlock: for all symbols i: new = curr + i; S.push(new))
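The loop above can be written down almost literally with an explicit stack. The sketch below assumes a padlock whose marks are encoded as the characters '0', '1', …; the name `enumerateKeys` and the returned list (standing in for "evaluate") are illustrative choices, not from the slides.

```cpp
#include <cassert>
#include <stack>
#include <string>
#include <vector>

// Enumerate every key of a padlock with `numRings` rings and `numSymbols`
// marks per ring, using an explicit stack as the storage.
std::vector<std::string> enumerateKeys(int numRings, int numSymbols) {
    assert(numRings >= 0 && numSymbols >= 1 && numSymbols <= 10);
    std::vector<std::string> complete;   // stands in for "evaluate"
    std::stack<std::string> storage;
    storage.push("");                    // initial step: nothing chosen yet
    while (!storage.empty()) {
        std::string curr = storage.top();
        storage.pop();
        if (static_cast<int>(curr.size()) == numRings) {
            complete.push_back(curr);    // last step reached: evaluate it
        } else {
            for (int i = 0; i < numSymbols; ++i) {
                // generate every next step and store it
                storage.push(curr + static_cast<char>('0' + i));
            }
        }
    }
    return complete;
}
```

With four rings and four marks this visits all 4^4 = 256 keys. Because the storage is a stack, the most recently pushed partial key is extended first, so the keys come out in depth-first order.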

SEARCH SPACE

Set of all admissible solutions
E.g., the combination padlock
  Search space = 0000 to 3333

SEARCH SPACE ENUMERATION

Key idea: generate step by step, and undo the previous step when necessary

Usually some data structure implements the storage
  E.g., a stack, either explicit or implicit (the processor stack, i.e., the "recursive" paradigm)
  A queue is also a possible choice

EXAMPLE: PADLOCK

Generate all combinations of the lock key
Represent each mark by an int
  [Figure: the four marks, mapped to 0, 1, 2 and 3]
Step = sequence of selected symbols
  E.g., '01', '0003'

EXAMPLE: PADLOCK

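The process automatically remembers the steps taken so far (in the sol array) and the current choice at each level (int i, on the processor stack)

// num_step = number of rings, num_symbol = marks per ring (globals)
void search(int step, int *sol) {
    if (step < num_step) {
        for (int i = 0; i < num_symbol; i++) {
            sol[step] = i;            // choose symbol i for this ring
            search(step + 1, sol);    // go on to the next ring
        }
    } else {
        check(sol);                   // a complete key: test it
    }
}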

SEARCH TREE

A tree representing every step in the search
Similar to a recursion tree (actually, the two are related)
Node: each step in solution generation
Edge: connects two nodes such that one node is generated by applying one step to the other node

SEARCH TREE

[Figure: search tree of the padlock]
Level 1: 0 1 2 3
Level 2: 00 01 02 03 …
Deeper levels: …

SEARCH TREE

Contents of sol at each step:
0 → 00 → 000 → 0000 (check) → 000 → 0001 (check) → 000 → 0002 (check) → 000 → 0003 (check) → 000 → 00 → 001 → 0010 (check)

8-QUEEN PROBLEM

8-QUEEN PROBLEM

Given a chess board with 8 queens

[Figure: chess board with squares marked X]

8-QUEEN PROBLEM

Try to place the queens so that no queen attacks another

[Figure: chess board with eight queens (Q) placed]

8-QUEEN PROBLEM

Input: none!

Output: every possible placement of 8 queens in which no queen attacks another

SOLVING THE PROBLEM

Define the search space
  What is the search space of this problem?
  How large is it?

Choose an appropriate representation

1ST ATTEMPT

Every possible placement of queens
  Size: 64^8
Representation: a list of queen positions
  E.g., (1,1) (1,2) (2,5) (4,1) (1,2) (3,4) (8,8) (7,6)

This includes overlapping placements!!! (position (1,2) appears twice above)

2ND ATTEMPT

Another representation
  Try to exclude overlapping
Use combination without replacement
  Selecting 8 positions out of 64 positions
  Size: 64! / ((64 - 8)! * 8!)
Implementation: in "generating the next step", check for overlapping

2ND ATTEMPT (FIRST IMPLEMENTATION)

We go over all positions
For each position, we either "choose" or "skip" that position for the queen

[Figure: a 4x4 example board with positions numbered row by row]
 1  2  3  4
 5  6  7  8
 9 10 11 12
13 14 15 16

COMBINATION WITHOUT REPLACEMENT

// mark_on_board is a binary array that indicates whether each position is selected
void e_queen(int step, int *mark_on_board) {
    if (step < 64) {
        mark_on_board[step] = 0;               // skip this position
        e_queen(step + 1, mark_on_board);
        mark_on_board[step] = 1;               // choose this position
        e_queen(step + 1, mark_on_board);
    } else {
        check(mark_on_board);                  // check if OK; it also has to check
                                               // that exactly 8 queens are marked
    }
}

2ND ATTEMPT ISSUE

The generated mark_on_board includes
  000000000000: selects no position (obviously not the answer)
  111111000000: selects 6 positions (obviously not the answer)

We must limit our selection to exactly 8 positions

2ND ATTEMPT (SECOND IMPLEMENTATION)

void e_queen(int step, int *mark_on_board, int chosen) {
    if (step < 64) {
        // number of 0s still possible: (step - chosen) zeros are used so far,
        // and at most 64 - 8 zeros are allowed
        if ((64 - 8) - (step - chosen) > 0) {
            mark_on_board[step] = 0;
            e_queen(step + 1, mark_on_board, chosen);
        }
        if (8 - chosen > 0) {                  // queens still to be placed
            mark_on_board[step] = 1;
            e_queen(step + 1, mark_on_board, chosen + 1);
        }
    } else {
        check(mark_on_board);
    }
}

e_queen(0, mark, 0);

2ND ATTEMPT (SECOND IMPLEMENTATION) REV 2.0

// one = number of 1s still to place, zero = number of 0s still to place
void e_queen(int step, int *mark_on_board, int one, int zero) {
    if (step < 64) {
        if (zero > 0) {
            mark_on_board[step] = 0;
            e_queen(step + 1, mark_on_board, one, zero - 1);
        }
        if (one > 0) {
            mark_on_board[step] = 1;
            e_queen(step + 1, mark_on_board, one - 1, zero);
        }
    } else {
        check(mark_on_board);
    }
}

e_queen(0, mark, 8, 64 - 8);
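The same "count the remaining ones and zeros" idea works for any N and K. The sketch below generalizes the code above under that assumption; `chooseExactly` and `selections` are hypothetical names, and the selections are collected in a list instead of being passed to check().

```cpp
#include <cassert>
#include <vector>

// Mark exactly `one` more positions with 1 and `zero` more positions with 0.
void chooseExactly(int step, int n, int one, int zero,
                   std::vector<int>& mark,
                   std::vector<std::vector<int>>& out) {
    if (step == n) {                 // every position decided: record it
        out.push_back(mark);
        return;
    }
    if (zero > 0) {                  // we may still skip a position
        mark[step] = 0;
        chooseExactly(step + 1, n, one, zero - 1, mark, out);
    }
    if (one > 0) {                   // we may still choose a position
        mark[step] = 1;
        chooseExactly(step + 1, n, one - 1, zero, mark, out);
    }
}

// All ways to select k positions out of n, as 0/1 arrays.
std::vector<std::vector<int>> selections(int n, int k) {
    assert(0 <= k && k <= n);
    std::vector<int> mark(n, 0);
    std::vector<std::vector<int>> out;
    chooseExactly(0, n, k, n - k, mark, out);
    return out;
}
```

Every generated array contains exactly k ones, so no invalid selection such as 000000000000 is ever produced and check() no longer needs to count the queens.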

3RD ATTEMPT

Any better way?
  In each row, there should be only one queen
The problem consists of 8 steps
  Each step places one queen
  Size: 8^8
Representation: a sequence of columns
  E.g., (1,2,3,4,5,6,7,8)

3RD ATTEMPT IMPLEMENTATION

There are eight possible choices in each step
There are eight steps
Very similar to the combination problem

void e_queen(int step, int *queen_pos) {
    if (step < 8) {
        for (int i = 0; i < 8; i++) {
            queen_pos[step] = i;               // queen of row `step` goes to column i
            e_queen(step + 1, queen_pos);
        }
    } else {
        check(queen_pos);
    }
}

4TH ATTEMPT

Queens should not be in the same column
The solution should never have any column repeated
  E.g., (1,2,3,4,5,6,7,1) is bad (column collision)
  (1,1,3,4,5,6,7,5) is bad as well
  (1,2,3,4,5,6,7,8) is good
There should be no duplicate column index!!!

PERMUTATION

Given N symbols
A permutation is an arrangement of the symbols in some order
  E.g., for 1 2 3 4 the permutations are
  1 2 3 4, 1 2 4 3, 1 3 2 4, 1 3 4 2, …, 4 3 2 1
For each step, we have to know which symbols are already used

PERMUTATION

The problem consists of several similar steps
Special condition: symbols never repeat
How to do it?
  Easy way: generate all combinations (as done before) and keep only those in which no symbol repeats
  Better way: remember which symbols are used

PERMUTATION

void search(int step, int *sol) {
    if (step < num_step) {
        for (int i = 0; i < num_symbol; i++) {
            if (not_used(sol, i, step)) {
                sol[step] = i;
                search(step + 1, sol);         // move on to the next step
            }
        }
    } else {
        check(sol);
    }
}

// true if `value` does not appear in sol[0..step-1]
bool not_used(int *sol, int value, int step) {
    for (int i = 0; i < step; i++) {
        if (sol[i] == value) return false;
    }
    return true;
}

PERMUTATION

A more proper way

void search(int step, int *sol, bool *used) {
    if (step < num_step) {
        for (int i = 0; i < num_symbol; i++) {
            if (!used[i]) {
                used[i] = true;                // remember that symbol i is taken
                sol[step] = i;
                search(step + 1, sol, used);
                used[i] = false;               // undo the choice
            }
        }
    } else {
        check(sol);
    }
}
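A self-contained version of the used-array approach might look as follows. It is a sketch only: the permutations are collected in a list instead of being passed to check(), and the names `permute` and `permutations` are illustrative.

```cpp
#include <cassert>
#include <vector>

// Extend the partial permutation sol[0..step-1] in every possible way.
void permute(int step, int n, std::vector<int>& sol, std::vector<bool>& used,
             std::vector<std::vector<int>>& out) {
    if (step == n) {                 // complete permutation
        out.push_back(sol);
        return;
    }
    for (int i = 0; i < n; ++i) {
        if (!used[i]) {
            used[i] = true;          // symbol i is now taken
            sol[step] = i;
            permute(step + 1, n, sol, used, out);
            used[i] = false;         // undo, so other branches may use i
        }
    }
}

// All permutations of the symbols 0..n-1.
std::vector<std::vector<int>> permutations(int n) {
    assert(n >= 0);
    std::vector<int> sol(n, 0);
    std::vector<bool> used(n, false);
    std::vector<std::vector<int>> out;
    permute(0, n, sol, used, out);
    return out;
}
```

Looking up used[i] takes constant time, whereas not_used() rescans the partial solution at every step.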

INCREASING SEQUENCE

Given N
Find every sequence (a1,a2,a3,…,ak) such that
  a1 + a2 + a3 + … + ak = N
  ai > 0
  ai <= aj for all i < j
  ai is an integer

EXAMPLE

N = 4
  1 + 1 + 1 + 1
  1 + 1 + 2
  1 + 3
  2 + 2
  4
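One possible way to generate these sequences step by step is to choose the next term so that it is never smaller than the previous one and never larger than what is left of N. The sketch below assumes that formulation; `extend` and `increasingSequences` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Extend `current` with terms >= `smallest` that sum to `remaining`.
void extend(int remaining, int smallest, std::vector<int>& current,
            std::vector<std::vector<int>>& out) {
    if (remaining == 0) {            // the terms add up to N: a solution
        out.push_back(current);
        return;
    }
    for (int a = smallest; a <= remaining; ++a) {
        current.push_back(a);        // choose the next term
        extend(remaining - a, a, current, out);
        current.pop_back();          // undo the choice
    }
}

// All non-decreasing sequences of positive integers that sum to n.
std::vector<std::vector<int>> increasingSequences(int n) {
    assert(n >= 1);
    std::vector<int> current;
    std::vector<std::vector<int>> out;
    extend(n, 1, current, out);
    return out;
}
```

For N = 4 the sequences are produced in the same order as in the example: 1+1+1+1, 1+1+2, 1+3, 2+2, 4.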

DFS AND BFS ALGORITHM

BREADTH FIRST SEARCH AND DEPTH FIRST SEARCH

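Can we draw a search tree from this pseudo-code?
  Assume that we know how to generate the next steps
Do we know the growth (in terms of structure, not size) of the search tree?

Storage S
S ← ''
While S is not empty
    curr ← S.get
    If curr is the last step
        evaluate
    Else
        generate all next steps from curr and put them into S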


BREADTH FIRST SEARCH AND DEPTH FIRST SEARCH

The search tree grows according to S, and according to how the next steps are generated
  If S is a "stack", the algorithm is called DFS
  If S is a "queue", the algorithm is called BFS


TRIPLE AND HALF PROBLEM

Input: a number

Output: a sequence of * 3 and / 2 operations (integer division) that starts from 1 and evaluates to the given number
  Special version: the number of operations should be minimal

Example
  Input: 10
  Output: 1 * 3 * 3 * 3 * 3 / 2 / 2 / 2
  Input: 31
  Output: 1 * 3 * 3 * 3 * 3 * 3 / 2 / 2 / 2 / 2 / 2 * 3 * 3 / 2

TRIPLE AND HALF PROBLEM

Search space
  Every possible sequence of triple or half
  Described by a sequence of operations, e.g. (T,H,T,H,T)
  Issue: there is no clear limit on the length of the sequence!!
Generating all solutions
  Step: choose the operator, either triple or half
Checking each solution
  Evaluate the expression and test whether it equals the target

SOLUTION (1ST ATTEMPT)

Issues
  Is [10000] enough?
  Repeated solutions?
    1 * 3 * 3 / 2 * 3 / 2 = 6
    1 * 3 * 3 * 3 / 2 / 2 = 6
  When will the program stop?
  Can you draw the search tree?

// Stack is an abstract stack of strings (each push stores a copy)
void DFS(int goal) {
    Stack S;
    S.push("1");
    while (!S.isEmpty()) {
        curr = S.pop();
        if (test(curr) == goal) {          // test() evaluates the expression
            printf("found! %s", curr);
            return;
        }
        char a[10000], b[10000];
        strcpy(a, curr); strcat(a, "*3"); S.push(a);
        strcpy(b, curr); strcat(b, "/2"); S.push(b);
    }
}

SOLUTION (2ND ATTEMPT)

Issue: the path is lost (never mind that for now)

void DFS(int goal) {
    Stack S;                               // stores int
    Hash h;                                // values generated so far
    S.push(1);
    h.add(1);
    while (S.isEmpty() == false) {
        curr = S.pop();
        if (curr == goal) {
            printf("found!\n");
            return;
        }
        if (h.contain(curr * 3) == false) {
            S.push(curr * 3);
            h.add(curr * 3);
        }
        if (h.contain(curr / 2) == false) {
            S.push(curr / 2);
            h.add(curr / 2);
        }
    }
}

SOLUTION (2ND ATTEMPT) BY RECURSION

Issue: the path is lost (never mind that for now)

// h is passed by reference so that every branch sees the values already generated
void DFS(int goal, int curr, Hash &h) {
    if (curr == goal) {
        printf("found\n");
    } else {
        if (!h.contain(curr * 3)) {
            h.add(curr * 3);
            DFS(goal, curr * 3, h);
        }
        if (!h.contain(curr / 2)) {
            h.add(curr / 2);
            DFS(goal, curr / 2, h);
        }
    }
}

SOLUTION (3RD ATTEMPT) BFS

What's the difference?

void BFS(int goal) {
    Queue q;                               // stores int
    Hash h;                                // values generated so far
    q.enq(1);
    h.add(1);
    while (q.isEmpty() == false) {
        curr = q.deq();
        if (curr == goal) {
            printf("found!\n");
            return;
        }
        if (h.contain(curr * 3) == false) {
            q.enq(curr * 3);
            h.add(curr * 3);
        }
        if (h.contain(curr / 2) == false) {
            q.enq(curr / 2);
            h.add(curr / 2);
        }
    }
}
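The "path is lost" issue can be handled by storing, for every generated value, the value it came from and the operation used. The sketch below is one way to do that with standard containers. The function name `tripleAndHalf`, the 'T'/'H' encoding and the `limit` cap (which keeps the search finite by ignoring very large values) are assumptions added for this example.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>

// Shortest sequence of operations ('T' = *3, 'H' = /2 with integer division)
// that turns 1 into `goal`, or "?" if none is found using values <= limit.
std::string tripleAndHalf(long long goal, long long limit = 1000000) {
    assert(goal >= 0 && limit >= 1);
    // parent[x] = (value that x was generated from, operation used)
    std::unordered_map<long long, std::pair<long long, char>> parent;
    std::queue<long long> q;
    q.push(1);
    parent[1] = {1, ' '};                 // the root has no real parent
    while (!q.empty()) {
        long long curr = q.front();
        q.pop();
        if (curr == goal) {               // rebuild the path back to the root
            std::string ops;
            for (long long x = curr; x != 1; x = parent[x].first)
                ops.insert(ops.begin(), parent[x].second);
            return ops;
        }
        const long long next[2] = {curr * 3, curr / 2};
        const char op[2] = {'T', 'H'};
        for (int k = 0; k < 2; ++k) {
            if (next[k] <= limit && parent.find(next[k]) == parent.end()) {
                parent[next[k]] = {curr, op[k]};
                q.push(next[k]);
            }
        }
    }
    return "?";
}
```

Because the queue explores the search tree level by level, the first time the goal is dequeued its path uses the fewest operations among the paths that stay within the cap.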

CONCLUSION

DFS
  Stack-based
  Uses less space
  Might loop forever if the number of steps is not fixed
BFS
  Queue-based
  Uses a lot of space
  Finds the solution nearest to the root node of the search tree

BACKTRACKING AND BRANCH & BOUND

Techniques to reduce enumeration

MAIN IDEA

We should not enumerate partial solutions that will never lead to a solution

We have done that already!!!
  8-queens
  By naive combination, we would have to go through all 64^8 placements
  But with each improvement, we further reduce what we have to do

ANOTHER EXAMPLE: PERMUTATION BY COMBINATION

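[Figure: search tree for sequences of length 3 over the symbols 0, 1, 2]
Level 1: 0 1 2
Level 2: 00 01 02 10 11 12 20 21 22
Level 3: 000 001 002 010 011 012 020 021 022
         100 101 102 110 111 112 120 121 122
         200 201 202 210 211 212 220 221 222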

PARTIAL SOLUTION: BACKTRACKING AND B&B

Partial solution
  A solution that we are still generating
  Not complete
  Should be meaningful in some sense: we can tell something from the partial solution
Backtracking and B&B work on problems that have partial solutions
  They push further the idea of "do not generate something that will not lead to an answer"

BACKTRACKING

If we know, at any step, that the partial solution is not feasible
  Then it is futile to search further along that path

Try drawing the search tree of the 4-queen problem

4-QUEEN

[Figure: part of the 4-queen search tree: the four placements of the first queen, followed by boards with two queens, …]

Should we proceed on these states?

If not, how many states do we save?
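A backtracking version of the queen search refuses to extend a partial placement as soon as two queens attack each other. The following sketch assumes the "one queen per row" representation of the 3rd attempt; `safe`, `place` and `countQueenPlacements` are illustrative names.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// True if a queen at (row, col) is not attacked by the queens in rows 0..row-1.
bool safe(const std::vector<int>& queenPos, int row, int col) {
    for (int r = 0; r < row; ++r) {
        int c = queenPos[r];
        if (c == col || std::abs(c - col) == row - r) return false;
    }
    return true;
}

// Number of ways to complete the partial placement of rows 0..row-1.
int place(int row, int n, std::vector<int>& queenPos) {
    if (row == n) return 1;               // all n queens placed
    int count = 0;
    for (int col = 0; col < n; ++col) {
        if (safe(queenPos, row, col)) {   // backtrack on infeasible states
            queenPos[row] = col;
            count += place(row + 1, n, queenPos);
        }
    }
    return count;
}

// Number of placements of n non-attacking queens on an n x n board.
int countQueenPlacements(int n) {
    assert(n >= 1);
    std::vector<int> queenPos(n, 0);
    return place(0, n, queenPos);
}
```

The check runs while the solution is being generated rather than at the leaves, so every subtree below an infeasible partial placement is skipped.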

SUM OF SUBSET PROBLEM

Input
  An array D of positive integers
  A number K
Output
  A subset of D whose sum is K
Example
  D = {2,5,7,1,3,8}, K = 9
  Solution is {2,7} or {8,1} or {5,3,1}

SUM OF SUBSET

Search space
  Every possible subset of the elements
  Described by a sequence of selection bits, e.g. (1,0,1,0,0,0)
Generating all solutions
  Step: for each item, choose whether or not to select it
Checking each solution
  Sum the selected items and test whether the sum equals the target

SUM OF SUBSET BY DFS

// K = target, D = given array, n = size of the array
void ss(int step, int *sol) {
    if (step < n) {
        sol[step] = 0;
        ss(step + 1, sol);
        sol[step] = 1;
        ss(step + 1, sol);
    } else {
        int sum = 0;
        for (int i = 0; i < n; i++)
            if (sol[i] == 1) sum += D[i];
        if (sum == K) {
            printf("YES!\n");
        }
    }
}

BACKTRACKING FOR SUM OF SUBSET

If the sum of the selected elements is more than K, stop

void ss(int step, int *sol, int total) {
    if (step < n) {
        if (total > K) return;             // backtrack: the sum can only grow
        sol[step] = 0;
        ss(step + 1, sol, total);
        sol[step] = 1;
        ss(step + 1, sol, total + D[step]);
    } else {
        int sum = 0;
        for (int i = 0; i < n; i++)
            if (sol[i] == 1) sum += D[i];
        if (sum == K) {
            printf("YES!\n");
        }
    }
}

BACKTRACKING FOR SUM OF SUBSET

If the sum of the selected elements is more than K, stop

void ss(int step, int *sol, int total) {
    if (step < n) {
        if (total > K) return;             // backtrack: the sum can only grow
        sol[step] = 0;
        ss(step + 1, sol, total);
        sol[step] = 1;
        ss(step + 1, sol, total + D[step]);
    } else {
        if (total == K) {
            printf("YES!\n");
        }
    }
}

We don't really need to compute the sum at the final step, because total already does that for us
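For experimenting with the example, the same search can be written without globals. This is a sketch under the slide's assumption that all elements of D are positive; `countSubsets` is a hypothetical name, and it counts the solutions instead of printing them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of subsets of D[step..] that bring the running `total` to exactly K.
// Assumes all elements of D are positive, so the total can only grow.
int countSubsets(const std::vector<int>& D, int K,
                 std::size_t step = 0, int total = 0) {
    assert(K >= 0);
    if (total > K) return 0;                        // backtrack: overshoot
    if (step == D.size()) return total == K ? 1 : 0;
    return countSubsets(D, K, step + 1, total)              // skip D[step]
         + countSubsets(D, K, step + 1, total + D[step]);   // select D[step]
}
```

For D = {2,5,7,1,3,8} and K = 9 the call `countSubsets({2, 5, 7, 1, 3, 8}, 9)` finds exactly the three subsets listed in the example.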

BRANCH AND BOUND

B&B is for optimization problems
  Consider a maximization problem, for example
Need a bounding heuristic
  Something that can tell us "what is the most that the remaining steps can still add to our solution"
If the value of the current partial solution + the value of the heuristic is less than the value of the best candidate solution found so far
  Stop
A special version of backtracking

BRANCH & BOUND IN OPTIMIZATION PROBLEMS

For many problems, it is possible to assess the goodness of a solution even when the solution is not complete
If we can predict the best value for the remaining steps, then we can use that value to "bound" our search

EXAMPLE

Assume that we have 10 steps
At step 7, the goodness of the partial solution is X
Assume that we know that the remaining steps cannot gain more than Y
If we have already found a solution with a value better than X+Y
  We can simply "bound" the search

KEYS

We must know the so-called "upper bound" of the remaining steps
It should be easy to compute

EXAMPLE

[Figure: search tree; the solutions found so far have values 23, 35 and 2]

Let the value at this point be 10

If we know that this path can never gain more than 13 (which makes 10 + 13 < 35), we can neglect it

KNAPSACK PROBLEM

Given a sack that is able to hold W kg
Given a list of objects
  Each has a weight and a value
Try to pack the objects into the sack so that the total value is maximized

THE PROBLEM

Input
  A number W, the capacity of the sack
  n pairs of weight and price ((w1,p1),(w2,p2),…,(wn,pn))
    wi = weight of the ith item
    pi = price of the ith item
Output
  A subset S of {1,2,3,…,n} such that
    $\sum_{i \in S} p_i$ is maximum
    subject to $\sum_{i \in S} w_i \le W$

KNAPSACK BY SEARCH

// max = best total price found so far (global, initially 0)
void knapsack(int step, int *sol) {
    if (step < n) {
        sol[step] = 0;
        knapsack(step + 1, sol);
        sol[step] = 1;
        knapsack(step + 1, sol);
    } else {
        int sumP = 0;
        int sumW = 0;
        for (int i = 0; i < n; i++)
            if (sol[i] == 1) {
                sumP += p[i];
                sumW += w[i];
            }
        if (sumP > max && sumW <= W) {
            max = sumP;
        }
    }
}

KNAPSACK BY SEARCH REV 2.0

// sumP and sumW carry the price and weight of the items selected so far
void knapsack(int step, int *sol, int sumP, int sumW) {
    if (step < n) {
        sol[step] = 0;
        knapsack(step + 1, sol, sumP, sumW);
        sol[step] = 1;
        knapsack(step + 1, sol, sumP + p[step], sumW + w[step]);
    } else {
        if (sumP > max && sumW <= W) {
            max = sumP;
        }
    }
}

KNAPSACK WITH BACKTRACKING

Knapsack is similar to sum of subset
  In SS, we need the sum of the elements to be equal to K
  In KS, we need the sum of the weights to be at most W
So, obviously, we should backtrack when the weight sum of the selected items is more than W

KNAPSACK WITH BACKTRACKING

void knapsack(int step, int *sol, int sumP, int sumW) {
    if (sumW > W) return;                  // stop when sumW is more than W
    if (step < n) {
        sol[step] = 0;
        knapsack(step + 1, sol, sumP, sumW);
        sol[step] = 1;
        knapsack(step + 1, sol, sumP + p[step], sumW + w[step]);
    } else {
        if (sumP > max && sumW <= W) {
            max = sumP;
        }
    }
}

BRANCH AND BOUND IN KNAPSACK

Assume that we have 10 items (N = 10)
Right now, our max is 100
Right now, we are searching at step = 5
Right now, our sumP = 20

If p[5] + p[6] + … + p[9] is 75, should we proceed?

KNAPSACK WITH BACKTRACKING + B&B

void knapsack(int step, int *sol, int sumP, int sumW) {
    if (sumW > W) return;                  // stop when sumW is more than W
    if (tail[step] + sumP < max) return;   // stop when remaining + sumP is less than the current max
    if (step < n) {
        sol[step] = 0;
        knapsack(step + 1, sol, sumP, sumW);
        sol[step] = 1;
        knapsack(step + 1, sol, sumP + p[step], sumW + w[step]);
    } else {
        if (sumP > max && sumW <= W) {
            max = sumP;
        }
    }
}

// tail[i] = p[i] + p[i+1] + … + p[n-1]; tail has n + 1 entries because it is read at step == n
tail[n] = 0;
for (int i = n - 1; i >= 0; i--) tail[i] = tail[i + 1] + p[i];

KNAPSACK WITH BACKTRACKING + B&B REV 2.0

void knapsack(int step, int *sol, int sumP, int sumW) {
    if (sumW > W) return;
    if (tail[step] + sumP < max) return;
    if (step < n) {
        sol[step] = 1;                     // try selecting the item first
        knapsack(step + 1, sol, sumP + p[step], sumW + w[step]);
        sol[step] = 0;
        knapsack(step + 1, sol, sumP, sumW);
    } else {
        if (sumP > max && sumW <= W) {
            max = sumP;
        }
    }
}

// tail[i] = p[i] + p[i+1] + … + p[n-1]; tail has n + 1 entries because it is read at step == n
tail[n] = 0;
for (int i = n - 1; i >= 0; i--) tail[i] = tail[i + 1] + p[i];

Why?

Does the search tree differ?
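The fragments above rely on globals (n, W, w, p, max, tail). A self-contained sketch of the same idea is given below; the `Instance` struct and the names `explore` and `solveKnapsack` are assumptions made for this example, and prices are assumed to be non-negative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Instance {
    int W;                   // capacity of the sack
    std::vector<int> w, p;   // weight and price of each item
    std::vector<int> tail;   // tail[i] = p[i] + ... + p[n-1]; tail[n] = 0
    int best;                // best total price found so far
};

void explore(Instance& in, std::size_t step, int sumP, int sumW) {
    if (sumW > in.W) return;                      // backtracking: too heavy
    if (sumP + in.tail[step] < in.best) return;   // bound: cannot reach best
    if (step == in.w.size()) {                    // complete, feasible solution
        if (sumP > in.best) in.best = sumP;
        return;
    }
    explore(in, step + 1, sumP + in.p[step], sumW + in.w[step]);  // select
    explore(in, step + 1, sumP, sumW);                            // skip
}

// Best total price of a subset of items whose total weight is at most W.
int solveKnapsack(int W, const std::vector<int>& w, const std::vector<int>& p) {
    assert(w.size() == p.size());
    Instance in;
    in.W = W;
    in.w = w;
    in.p = p;
    in.best = 0;
    in.tail.assign(w.size() + 1, 0);
    for (std::size_t i = w.size(); i-- > 0;)
        in.tail[i] = in.tail[i + 1] + p[i];
    explore(in, 0, 0, 0);
    return in.best;
}
```

Selecting an item before skipping it tends to produce a good value of `best` early, which gives the bound test more chances to cut branches later.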

VARIATION

Rational knapsack
  An object is like a gold bar: we can cut it into pieces that keep the same value/weight ratio
Can be solved by a greedy algorithm
  Sort the objects by value/weight ratio
  Pick objects in that order, best ratio first
  If an object is larger than the remaining capacity, just divide it
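The greedy procedure can be sketched in a few lines. The function name `rationalKnapsack` and the use of `double` for weights and prices are choices made for this example; weights are assumed to be positive.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Best total price when objects may be cut into pieces (weights must be > 0).
double rationalKnapsack(double W, const std::vector<double>& w,
                        const std::vector<double>& p) {
    assert(w.size() == p.size());
    std::vector<std::size_t> order(w.size());
    std::iota(order.begin(), order.end(), 0);
    // Sort by price/weight ratio, best first (compared by cross-multiplying).
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return p[a] * w[b] > p[b] * w[a];
    });
    double value = 0.0;
    double remaining = W;
    for (std::size_t i : order) {
        if (remaining <= 0) break;
        double take = std::min(w[i], remaining);   // cut the object if needed
        value += p[i] * (take / w[i]);
        remaining -= take;
    }
    return value;
}
```

With capacity 50 and objects of weight 10, 20, 30 and price 60, 100, 120, the first two objects are taken whole and two thirds of the last one is added.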

0-1 KNAPSACK WITH B&B

0-1 knapsack is very suitable for B&B
We can calculate the goodness of the partial solution
  Just sum the values of the selected objects
We have fast, good upper bounds (several of them)
  The sum of the remaining unselected objects
  The sum of the remaining unselected objects that do not exceed the capacity
  The solution of the "rational knapsack" for the remaining objects with the remaining capacity

BOUNDING HEURISTIC

Maximization problem (finding the highest-value solution)
  The bound must be higher than (or equal to) the real value (upper bound)
  Good bound: the lowest value that is >= the real value
Minimization problem (finding the lowest-value solution)
  The bound must be lower than (or equal to) the real value (lower bound)
  Good bound: the highest value that is <= the real value
Key
  Never underestimate how good the remaining steps can be!!!
  We stop when current cost + future cost is a sure loser
  The estimate of the future may err on the optimistic side

LEAST COST SEARCH

a.k.a. Best First Search

LEAST COST SEARCH

For optimization problems, a good solution helps backtracking and B&B
  The better the value found so far, the higher the chance that backtracking and B&B can cut branches
Why do DFS or BFS?
  We have an "incentive" to find a "good" solution earlier
  Search toward "promising" solutions: guided search

LEAST COST SEARCH

Use a priority queue in the search framework
  The key value in the PQ is calculated from the partial solution
It is called "Best First Search"
  Guided by the value of the partial solution
  Suffers from a problem similar to BFS (since we need to maintain the queue)

In practice, LC-search always employs B&B and backtracking

LC-SEARCH WITH BRANCH AND BOUND

Use the bounding heuristic for the value in the PQ
  The value stored in the PQ is X + Y
    X = current value of the partial solution
    Y = value from the heuristic for the partial solution
  I.e., use the bound of the total value to guide the search
    For a maximization problem, use the upper bound of the total value
    For a minimization problem, use the lower bound of the total value

With a valid bound, the first complete solution taken from the PQ is guaranteed to be optimal (minimal, for a minimization problem)
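As an illustration of X + Y in a priority queue, the sketch below applies least-cost search to the 0-1 knapsack (a maximization problem), using the sum of the remaining prices as the upper bound Y. The `Node` layout and the names `ByBound` and `bestFirstKnapsack` are assumptions for this example; prices are assumed to be non-negative.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

struct Node {
    int bound;          // X + Y: current price + upper bound of the rest
    std::size_t step;   // number of items already decided
    int sumP, sumW;     // price and weight of the selected items
};

struct ByBound {        // the node with the largest bound comes out first
    bool operator()(const Node& a, const Node& b) const {
        return a.bound < b.bound;
    }
};

int bestFirstKnapsack(int W, const std::vector<int>& w,
                      const std::vector<int>& p) {
    assert(w.size() == p.size());
    const std::size_t n = w.size();
    std::vector<int> tail(n + 1, 0);           // tail[i] = p[i] + ... + p[n-1]
    for (std::size_t i = n; i-- > 0;) tail[i] = tail[i + 1] + p[i];

    std::priority_queue<Node, std::vector<Node>, ByBound> pq;
    pq.push(Node{tail[0], 0, 0, 0});
    while (!pq.empty()) {
        Node cur = pq.top();
        pq.pop();
        if (cur.step == n) return cur.sumP;    // first complete node is optimal
        const std::size_t next = cur.step + 1;
        // Skip the item: always feasible.
        pq.push(Node{cur.sumP + tail[next], next, cur.sumP, cur.sumW});
        // Select the item: only if it still fits (backtracking).
        if (cur.sumW + w[cur.step] <= W) {
            const int np = cur.sumP + p[cur.step];
            pq.push(Node{np + tail[next], next, np, cur.sumW + w[cur.step]});
        }
    }
    return 0;
}
```

A complete node has Y = 0, so when it reaches the top of the queue no other node can promise a higher total. That is why the first complete solution taken out can be returned immediately.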

15 PUZZLE PROBLEM

Given a puzzle board
The objective is to move the pieces around so that the board looks like the figure on the right

15 PUZZLE PROBLEM

Input
  A board (solvable)
Output
  The movement of pieces that solves the puzzle
  Using minimal moves, i.e., trying to minimize the number of moves
It is a minimization problem

15 PUZZLE PROBLEM

Search space
  Every possible movement
  Described by a sequence of moves of the "empty" piece, e.g. (U,D,D,L,U,R,…)
  We don't know the limit on the length of the sequence
Generating all solutions
  Step: choose the direction of the empty piece
Checking each solution
  Simulate the moves and see whether they lead to the solution

BOUNDING HEURISTIC FOR 15-PUZZLE

Number of misplaced pieces
  Obviously a lower bound

[Figure: search tree for a sample board]
Root board (rows, _ = empty piece), Misplace = 12:
   1  _  3  4
   6  2 11 10
   5  8  7  9
  14 12 15 13
Moving the empty piece left: Misplace = 13, Moved = 1
Moving the empty piece right: Misplace = 13, Moved = 1
Moving the empty piece down: Misplace = 11, Moved = 1
   1  2  3  4
   6  _ 11 10
   5  8  7  9
  14 12 15 13
The boards generated from the "down" board are labelled 10 + 2, 11 + 2 and 11 + 2

ANOTHER BOUNDING HEURISTIC

L-1 distance of the misplaced pieces
For example, consider piece #12
  It is at row 4, col 2
  It should be at row 3, col 4
  Distance = 1 row + 2 cols = 1 + 2 = 3
Obviously a lower bound as well
A better lower bound (closer to the actual number of moves)

Current board (rows, _ = empty piece):
   1  _  3  4
   6  2 11 10
   5  8  7  9
  14 12 15 13

Goal board:
   1  2  3  4
   5  6  7  8
   9 10 11 12
  13 14 15  _
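Both heuristics are easy to compute from a board stored row by row. The sketch below assumes that encoding, with 0 for the empty piece; `Board`, `sampleBoard`, `misplaced` and `manhattan` are illustrative names. Note that `misplaced` counts numbered pieces only, while the figure's "Misplace" numbers appear to count the empty square as well.

```cpp
#include <array>
#include <cassert>
#include <cstdlib>

using Board = std::array<int, 16>;   // row-major, 0 = the empty piece

// The "current" board used in the slides.
Board sampleBoard() {
    return Board{{ 1,  0,  3,  4,
                   6,  2, 11, 10,
                   5,  8,  7,  9,
                  14, 12, 15, 13}};
}

// Number of numbered pieces that are not at their goal position.
int misplaced(const Board& b) {
    int count = 0;
    for (int i = 0; i < 16; ++i)
        if (b[i] != 0 && b[i] != i + 1) ++count;
    return count;
}

// Sum over all numbered pieces of the L-1 distance to the goal position.
int manhattan(const Board& b) {
    int total = 0;
    for (int i = 0; i < 16; ++i) {
        if (b[i] == 0) continue;          // the empty piece is not counted
        int goal = b[i] - 1;              // piece k belongs at index k - 1
        total += std::abs(i / 4 - goal / 4) + std::abs(i % 4 - goal % 4);
    }
    return total;
}
```

Piece #12 sits at index 13 (row 4, col 2) and belongs at index 11 (row 3, col 4), so it contributes 1 + 2 = 3 to the L-1 total, as in the worked example.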

CONCLUSION

Search space
  Solution space
  State space tree, search tree
Solution
  Partial solution
  Candidate solution, solution state
Algorithm
  DFS (depth-first search)
  BFS (breadth-first search)
  LCS (least-cost search, a.k.a. best-first search)
  Backtracking
  Branch and bound

The generating procedure gives the "structure" of the search tree

The algorithm gives the "order" in which the search tree is explored (including pruning)
