State Space Search
STATE SPACE SEARCH
Nattee Niparnan
OPTIMIZATION EXAMPLE: FINDING MAX VALUE IN AN ARRAY
25 2 34 43 4 9 0 -5 87 0 5 6 1
There are N possible answers: the first element, the second element, the 3rd, the 4th, …
Try all of them
Remember the best one
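The "try all of them, remember the best one" idea can be sketched in a few lines of C. This is only an illustration; the function name find_max_index is hypothetical and the array is assumed to have at least one element.

```c
#include <stddef.h>

/* Returns the index of the largest element of a[0..n-1].
 * Assumption (not from the slides): n >= 1. */
size_t find_max_index(const int *a, size_t n) {
    size_t best = 0;                  /* best candidate seen so far */
    for (size_t i = 1; i < n; i++) {  /* generate: every position is a candidate answer */
        if (a[i] > a[best]) {         /* test: compare against the best so far */
            best = i;
        }
    }
    return best;
}
```

Called on the 13-element array from the slide, it should return the position of 87.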
STATE SPACE SEARCH FRAMEWORK
Define the set of admissible solutions
Generate all of them (generating)
For each generated solution
    Test whether it is the one we want, using the “evaluation” function (testing)
    For optimization: remember the best one so far
    For decision: report when we find the “correct” one
VERY IMPORTANT!!
SOLVING PROBLEM BY STATE SPACE SEARCH
1. Define the set of admissible solutions (the search space)
    What is it? How large is it? How do we represent a solution?
2. Determine how to generate all solutions
3. Determine how to check each solution
GENERATING IN STEPS
In most problems, admissible solutions can be generated iteratively, step by step
Examples: Maximum Sum of Subsequence, Minimal Spanning Tree, Pachinko Problem
SEARCH: MAXIMUM SUM OF SUBSEQUENCE
Search space: every possible (contiguous) subsequence, described by a pair (starting element, ending element)
Generating all solutions: Step 1: select the position of the first element; Step 2: select the position of the last element
Checking each solution: sum the elements and test whether the sum is the maximum so far
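The two generating steps and the check can be sketched as two nested loops. This is a minimal brute-force sketch, not an efficient algorithm; the name max_subsequence_sum is hypothetical, and it assumes n >= 1 and that a subsequence must contain at least one element.

```c
#include <limits.h>

/* Tries every (first, last) pair and remembers the best sum.
 * Assumptions (not from the slides): n >= 1, empty subsequence not allowed. */
int max_subsequence_sum(const int *a, int n) {
    int best = INT_MIN;
    for (int first = 0; first < n; first++) {     /* step 1: choose the first position */
        int sum = 0;
        for (int last = first; last < n; last++) { /* step 2: choose the last position */
            sum += a[last];                        /* sum of a[first..last] */
            if (sum > best) best = sum;            /* check: is it the maximum so far? */
        }
    }
    return best;
}
```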
SEARCH: MINIMAL SPANNING TREE
Search space: every subset of edges, described by a set of edges
Generating all solutions: for every edge, either include it in the set or exclude it
Checking each solution: sum the cost of the edges in the set and test whether it is the minimum so far
    Test whether it is a tree
    Test whether it is connected
SEARCH: MINIMAL SPANNING TREE 2ND ATTEMPT
Search space: every subset of edges of size |V|-1, described by a set of edges
Generating all solutions: for every edge, either include it in the set or exclude it
    Do not select more than |V|-1 edges
Checking each solution: sum the cost of the edges in the set and test whether it is the minimum so far
    Test whether it is a tree
SEARCH: MINIMAL SPANNING TREE 3RD ATTEMPT
Search space: every subgraph with |V|-1 edges that is a tree, described by a set of edges
Generating all solutions: for every edge in X in the “cut property”, either include it in the set or exclude it
    Update the tree
    Do not select more than |V|-1 edges
Checking each solution: sum the cost of the edges in the set and test whether it is the minimum so far
SEARCH: PACHINKO
Search space: every path from the top pin to the bottom pin, described by a sequence of directions, e.g., (left, left, right, left)
Generating all solutions: for every level, choose either the left side or the right side
Checking each solution: sum the cost along the path and test whether it is the maximum so far
GENERATING ALL POSSIBLE ANSWERS
COMBINATION AND PERMUTATION
In many cases, the set of admissible solutions is a set of “combinations” or “permutations” of something
We need to know how to generate all permutations and combinations
COMBINATION
Given N things, generate all possible selections of K things from the N things
Ex. N = 3, K = 2
COMBINATION WITH REPLACEMENT
Given N things, generate all possible selections of K things from the N things
When something is selected, we are permitted to select it again (we put the selected thing back in the pool)
Ex. N = 3, K = 2
BREAKING THE PADLOCK
Assume we have four rings
Assume each ring has the following marks
We try
…. Undo the second step, switch to another value
KEY IDEA
A problem consists of several similar steps: choosing a thing from the pool
We need to remember the things we’ve done so far
GENERAL FRAMEWORK
Storage
Generated (partial) solution
Engine
Initial Step
1. Get a step that is not complete
2. Try all possible choice in next step
3. Store each newly generated next step
SOLUTION GENERATION
Storage s
s ← ‘’
While s is not empty
    Curr ← s.get
    If Curr is the last step
        evaluate
    Else
        Generate all next steps from Curr
        put them into s
Generating the next steps for the padlock (Curr is the last step if length(curr) == 4):
    For all symbols i
        new = curr + i
        s.put(new)
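The storage-driven loop above can be sketched in C for the padlock with an explicit stack. This is a minimal sketch under assumptions: four rings with four marks each (written as the characters '0'..'3'), and hypothetical names Partial and enumerate_keys. Symbols are pushed in reverse so that keys come out in the order 0000, 0001, …, 3333.

```c
#include <string.h>

#define NUM_STEP   4    /* four rings */
#define NUM_SYMBOL 4    /* four marks per ring, written '0'..'3' */
#define STACK_CAP  64   /* assumption: generous cap; this search never holds more than 13 items */

typedef struct {
    int  len;             /* how many rings are already decided */
    char sym[NUM_STEP];   /* the decided symbols */
} Partial;

/* Writes every complete key into out (up to max_out keys) and
 * returns the number of keys generated. */
int enumerate_keys(char out[][NUM_STEP + 1], int max_out) {
    Partial stack[STACK_CAP];
    int top = 0, count = 0;
    Partial start = {0, {0}};
    stack[top++] = start;                      /* storage starts with the empty step */
    while (top > 0) {
        Partial curr = stack[--top];           /* get a step from the storage */
        if (curr.len == NUM_STEP) {            /* last step: evaluate (here: record) */
            if (count < max_out) {
                memcpy(out[count], curr.sym, NUM_STEP);
                out[count][NUM_STEP] = '\0';
            }
            count++;
        } else {                               /* generate all next steps */
            for (int i = NUM_SYMBOL - 1; i >= 0; i--) {
                Partial next = curr;
                next.sym[next.len++] = (char)('0' + i);
                stack[top++] = next;           /* put them into the storage */
            }
        }
    }
    return count;
}
```

Swapping the array-based stack for a FIFO queue would keep the same set of generated keys but change the order in which they are visited.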
SEARCH SPACE
Set of all admissible solutions
E.g., Combination Padlock: search space = 0000 … 3333
SEARCH SPACE ENUMERATION
Key idea: generate step by step, undo the previous step when necessary
The storage is usually implemented with some data structure, e.g., a stack, either explicit or implicit via the processor stack (i.e., the “recursive” paradigm)
A queue is a possible choice as well
EXAMPLE: PADLOCK
Generate all combinations of the lock key
Represent the key by int: the four marks are numbered 0, 1, 2, 3
Step = sequence of selected symbols, e.g., ‘01’, ‘0003’
EXAMPLE: PADLOCK
The process automatically remembers the steps taken (in the sol array) and the current choice (int i, kept on the stack)
void search(int step, int *sol) {
    if (step < num_step) {
        for (int i = 0; i < num_symbol; i++) {
            sol[step] = i;
            search(step + 1, sol);
        }
    } else {
        check(sol);
    }
}
SEARCH TREE
A tree representing every step in the search
Similar to a recursion tree (actually, the two are related)
Node: each step in solution generation
Edge: connects two nodes such that one node is generated by applying one step to the other node
SEARCH TREE
0 1 2 3
00 01 02 03 …
… … … …… … … …… … … …
SEARCH TREE
Sols in each step: 0, 00, 000, 0000 (check), 000, 0001 (check), 000, 0002 (check), 000, 0003 (check), 000, 00, 001, 0010 (check), …
8-QUEEN PROBLEM
8-QUEEN PROBLEM
Given a chess board with 8 queens
X X X
X X X
X X X X
X X X
X X X
8-QUEEN PROBLEM
Try to place the queens so that they don’t get in the others’ ways
[Figure: chess board with eight queens (Q) placed]
8-QUEEN PROBLEM
Input: None!
Output: Every possible placement of 8 queens in which no queen attacks another
SOLVING THE PROBLEM
Define the search space: what is the search space of this problem? How large is it?
Choose an appropriate representation
1ST ATTEMPT
Every possible placement of queens. Size: 64^8
Representation: a set of queen positions, e.g., (1,1) (1,2) (2,5) (4,1) (1,2) (3,4) (8,8) (7,6)
This includes overlapping placements!!!
2ND ATTEMPT
Another representation: try to exclude overlapping
Use combination without replacement
This is a combination: selecting 8 positions out of 64 positions. Size: 64! / ((64 - 8)! * 8!)
Implementation: in the “generating next step”, check for overlapping
2ND ATTEMPT (FIRST IMPLEMENTATION)
We go over all positions
For each position, we either “choose” or “skip” that position for the queen
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
COMBINATION WITHOUT REPLACEMENT
void e_queen(int step, int *mark_on_board) {
    if (step < 64) {
        mark_on_board[step] = 0;
        e_queen(step + 1, mark_on_board);
        mark_on_board[step] = 1;
        e_queen(step + 1, mark_on_board);
    } else {
        check(mark_on_board);   // check if OK; also has to check whether we mark exactly 8 queens
    }
}
mark_on_board is a binary array that indicates whether each position is selected
2ND ATTEMPT ISSUE
The generated mark_on_board includes
    000000000000: selects no position (obviously not the answer)
    111111000000: selects 6 positions (obviously not the answer)
We must limit our selection to exactly 8
2ND ATTEMPT (SECOND IMPLEMENTATION)
void e_queen(int step, int *mark_on_board, int chosen) {
    if (step < 64) {
        if ((64 - 8) - (step - chosen) > 0) {   // number of possible 0s left
            mark_on_board[step] = 0;
            e_queen(step + 1, mark_on_board, chosen);
        }
        if (8 - chosen > 0) {                   // number of possible 1s left
            mark_on_board[step] = 1;
            e_queen(step + 1, mark_on_board, chosen + 1);
        }
    } else {
        check(mark_on_board);
    }
}
Initial call: e_queen(0, mark, 0);
2ND ATTEMPT (SECOND IMPLEMENTATION) REV 2.0
void e_queen(int step, int *mark_on_board, int one, int zero) {
    if (step < 64) {
        if (zero > 0) {                         // number of possible 0s left
            mark_on_board[step] = 0;
            e_queen(step + 1, mark_on_board, one, zero - 1);
        }
        if (one > 0) {                          // number of possible 1s left
            mark_on_board[step] = 1;
            e_queen(step + 1, mark_on_board, one - 1, zero);
        }
    } else {
        check(mark_on_board);
    }
}
Initial call: e_queen(0, mark, 8, 64 - 8);
3RD ATTEMPT
Any better way? In each row, there should be only one queen
The problem consists of 8 steps: placing each queen
Size: 8^8
Representation: sequence of columns, e.g., (1,2,3,4,5,6,7,8)
3RD ATTEMPT IMPLEMENTATION
There are eight possible choices in each step
There are eight steps
Very similar to the combination problem
void e_queen(int step, int *queen_pos) {
    if (step < 8) {
        for (int i = 0; i < 8; i++) {
            queen_pos[step] = i;
            e_queen(step + 1, queen_pos);
        }
    } else {
        check(queen_pos);
    }
}
4TH ATTEMPT
Queens should not be in the same column: the solution should never have any column repeated
E.g., (1,2,3,4,5,6,7,1) is bad (column collision)
(1,1,3,4,5,6,7,5) is bad as well
(1,2,3,4,5,6,7,8) is good
There should be no duplicate column index!!!
PERMUTATION
Given N symbols, a permutation is the symbols arranged in any order
E.g., 1 2 3 4 gives
1 2 3 4, 1 2 4 3, 1 3 2 4, 1 3 4 2, …, 4 3 2 1
At each step, we have to know which symbols have been used
PERMUTATION
The problem consists of several similar steps, with a special condition: symbols never repeat
How to do it?
Easy way: generate all combinations (as done before), then keep only the ones in which no symbol repeats
Better way: remember which symbols are used
PERMUTATION
void search(int step, int *sol) {
    if (step < num_step) {
        for (int i = 0; i < num_symbol; i++) {
            if (not_used(sol, i, step)) {
                sol[step] = i;
                search(step + 1, sol);   // move on to the next step
            }
        }
    } else {
        check(sol);
    }
}
// true if value does not appear in sol[0..step-1]
bool not_used(int *sol, int value, int step) {
    for (int i = 0; i < step; i++) {
        if (sol[i] == value) return false;
    }
    return true;
}
PERMUTATION
A more proper way
void search(int step, int *sol, bool *used) {
    if (step < num_step) {
        for (int i = 0; i < num_symbol; i++) {
            if (!used[i]) {
                used[i] = true;
                sol[step] = i;
                search(step + 1, sol, used);   // move on to the next step
                used[i] = false;               // undo before trying the next symbol
            }
        }
    } else {
        check(sol);
    }
}
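The slides leave check() undefined. As a hedged sketch of how the permutation generator could be combined with a check for the queens problem: since a permutation of columns already rules out row and column collisions, the check only needs to test diagonals. The names count_queen_solutions, place and no_diagonal_clash are hypothetical, and the board size is a parameter so that the 4-queen case can be tried as well.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX_N 16   /* assumption: upper limit on the board size for this sketch */

/* check(): rows and columns are distinct by construction, so test only diagonals. */
static bool no_diagonal_clash(const int *queen_pos, int n) {
    for (int r1 = 0; r1 < n; r1++)
        for (int r2 = r1 + 1; r2 < n; r2++)
            if (abs(queen_pos[r1] - queen_pos[r2]) == r2 - r1)
                return false;
    return true;
}

/* Generates every permutation of columns (one queen per row) and
 * returns how many of them pass the check. */
static int place(int step, int n, int *queen_pos, bool *used) {
    if (step == n) return no_diagonal_clash(queen_pos, n) ? 1 : 0;
    int found = 0;
    for (int col = 0; col < n; col++) {
        if (!used[col]) {
            used[col] = true;
            queen_pos[step] = col;
            found += place(step + 1, n, queen_pos, used);
            used[col] = false;       /* undo before trying the next column */
        }
    }
    return found;
}

int count_queen_solutions(int n) {
    int  queen_pos[MAX_N];
    bool used[MAX_N] = { false };
    if (n < 1 || n > MAX_N) return 0;
    return place(0, n, queen_pos, used);
}
```

For n = 8 this visits 8! = 40320 complete placements rather than 8^8, which is the saving the 4th attempt is after.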
INCREASING SEQUENCE
Given N, find every sequence (a1, a2, a3, …, ak) such that
    a1 + a2 + a3 + … + ak = N
    ai > 0
    ai <= aj for all i < j
    ai is an integer
EXAMPLE
N = 4
1 + 1 + 1 + 1
1 + 1 + 2
1 + 3
2 + 2
4
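One possible way to generate these sequences step by step is to choose each next element to be at least as large as the previous one and no larger than what is still missing from N. The sketch below is only an illustration; the names generate and count_increasing_sequences and the cap of 64 are assumptions.

```c
#include <stdio.h>

/* Extends seq[0..len-1] with values >= min_part until the remaining sum is 0.
 * Returns the number of complete sequences; prints them if print != 0. */
static int generate(int remaining, int min_part, int *seq, int len, int print) {
    if (remaining == 0) {                       /* last step: a complete sequence */
        if (print) {
            for (int i = 0; i < len; i++)
                printf(i ? " + %d" : "%d", seq[i]);
            printf("\n");
        }
        return 1;
    }
    int count = 0;
    for (int a = min_part; a <= remaining; a++) {   /* all choices for the next step */
        seq[len] = a;
        count += generate(remaining - a, a, seq, len + 1, print);
    }
    return count;
}

/* Assumption: 1 <= n <= 64 so that the fixed-size buffer is enough. */
int count_increasing_sequences(int n, int print) {
    int seq[64];
    if (n < 1 || n > 64) return 0;
    return generate(n, 1, seq, 0, print);
}
```

With print set to a non-zero value and n = 4, the five sequences listed in the example are printed in the same order.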
DFS AND BFS ALGORITHM
BREADTH FIRST SEARCH AND DEPTH FIRST SEARCH
Can we draw a search tree from this pseudo-code? (Assume that we know how to generate the next steps)
Do we know the growth (in terms of structure, not size) of the search tree?
Storage s
s ← ‘’
While s is not empty
    Curr ← s.get
    If Curr is the last step
        evaluate
    Else
        Generate all next steps from Curr
        put them into s
BREADTH FIRST SEARCH AND DEPTH FIRST SEARCH
The search tree grows according to S, and according to how the next steps are generated as well
If S is a “stack”, the algorithm is called DFS
If S is a “queue”, the algorithm is called BFS
Storage s
s ← ‘’
While s is not empty
    Curr ← s.get
    If Curr is the last step
        evaluate
    Else
        Generate all next steps from Curr
        put them into s
TRIPLE AND HALF PROBLEM
Input: a number
Output: a sequence of * 3 and / 2 operations, starting from 1, that evaluates to the given number (/ 2 is integer division, rounding down)
Special version: the number of operations should be minimal
Example:
    Input: 10 Output: 1 * 3 * 3 * 3 * 3 / 2 / 2 / 2 (81 → 40 → 20 → 10)
    Input: 31 Output: 1 * 3 * 3 * 3 * 3 * 3 / 2 / 2 / 2 / 2 / 2 * 3 * 3 / 2
TRIPLE AND HALF PROBLEM
Search space: every possible sequence of triple or half, described by a sequence of operations, e.g., (T,H,T,H,T)
Issue: there is no clear limit on the length of the sequence!!
Generating all solutions: Step: operator; choose either triple or half
Checking each solution: evaluate the expression and test whether it equals the target
SOLUTION (1ST ATTEMPT)
Issues
    Is [10000] enough?
    Repeated solutions? 1 * 3 * 3 / 2 * 3 / 2 → 6 and 1 * 3 * 3 * 3 / 2 / 2 → 6
    When will the program stop?
    Can you draw the search tree?
void DFS(int goal) {
    Stack S;                        // stores strings (assumed to copy the string on push)
    S.push("1");
    while (!S.isEmpty()) {
        char *curr = S.pop();
        if (test(curr) == goal) {   // test() evaluates the expression
            printf("found! %s", curr);
            return;
        }
        char a[10000], b[10000];
        strcpy(a, curr); strcat(a, "*3"); S.push(a);
        strcpy(b, curr); strcat(b, "/2"); S.push(b);
    }
}
SOLUTION (2ND ATTEMPT)
Issue: the path is lost (never mind that for now)
void DFS(int goal) {
    Stack S;    // stores int
    Hash h;     // values that have already been generated
    S.push(1);
    h.add(1);
    while (!S.isEmpty()) {
        int curr = S.pop();
        if (curr == goal) {
            printf("found!\n");
            return;
        }
        if (!h.contain(curr * 3)) { S.push(curr * 3); h.add(curr * 3); }
        if (!h.contain(curr / 2)) { S.push(curr / 2); h.add(curr / 2); }
    }
}
SOLUTION (2ND ATTEMPT) BY RECURSION
Issue: the path is lost (never mind that for now)
void DFS(int goal, int curr, Hash &h) {   // h is shared by all calls, so it is passed by reference
    if (curr == goal) {
        printf("found\n");
    } else {
        if (!h.contain(curr * 3)) { h.add(curr * 3); DFS(goal, curr * 3, h); }
        if (!h.contain(curr / 2)) { h.add(curr / 2); DFS(goal, curr / 2, h); }
    }
}
SOLUTION (3RD ATTEMPT) BFS
What’s the difference?
void BFS(int goal) {
    Queue q;    // stores int
    Hash h;     // values that have already been generated
    q.enq(1);
    h.add(1);
    while (!q.isEmpty()) {
        int curr = q.deq();
        if (curr == goal) {
            printf("found!\n");
            return;
        }
        if (!h.contain(curr * 3)) { q.enq(curr * 3); h.add(curr * 3); }
        if (!h.contain(curr / 2)) { q.enq(curr / 2); h.add(curr / 2); }
    }
}
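The "path is lost" issue can be addressed by remembering, for every generated value, which value it came from. The following is a self-contained sketch under assumptions that are not in the slides: values above a cap (LIMIT) are simply ignored so that plain arrays can stand in for the Queue and Hash, the function name triple_half_bfs is hypothetical, and the operations are reported as a string of 'T' (triple) and 'H' (half).

```c
#define LIMIT 100000               /* assumption: values above this cap are not explored */

static int  parent[LIMIT + 1];     /* value each state was generated from, -1 if unseen */
static char via[LIMIT + 1];        /* operation used to reach the state: 'T' or 'H' */
static int  queue[LIMIT + 1];      /* each value is enqueued at most once */

/* Breadth-first search from 1. Writes the operations into ops (capacity cap,
 * including the terminating NUL) and returns their number, or -1 if the goal
 * is not reached under the cap or ops is too small. */
int triple_half_bfs(int goal, char *ops, int cap) {
    if (goal < 0 || goal > LIMIT) return -1;
    for (int i = 0; i <= LIMIT; i++) parent[i] = -1;
    int head = 0, tail = 0;
    parent[1] = 1;                             /* the root is its own parent */
    queue[tail++] = 1;
    while (head < tail) {
        int curr = queue[head++];
        if (curr == goal) {                    /* walk the parent links back to 1 */
            int len = 0;
            for (int v = curr; v != 1; v = parent[v]) len++;
            if (len >= cap) return -1;
            ops[len] = '\0';
            int k = len;
            for (int v = curr; v != 1; v = parent[v]) ops[--k] = via[v];
            return len;
        }
        int  next[2] = { (curr <= LIMIT / 3) ? curr * 3 : -1, curr / 2 };
        char op[2]   = { 'T', 'H' };
        for (int j = 0; j < 2; j++) {
            int v = next[j];
            if (v >= 0 && parent[v] == -1) {   /* not generated before */
                parent[v] = curr;
                via[v] = op[j];
                queue[tail++] = v;
            }
        }
    }
    return -1;
}
```

Because BFS explores the search tree level by level, the first time the goal is dequeued the recorded path uses the fewest operations among paths that stay under the cap.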
CONCLUSION
DFS: stack-based; uses less space; might loop forever if the number of steps is not fixed
BFS: queue-based; uses a lot of space; finds the solution nearest to the root node of the search tree
BACKTRACKING AND BRANCH & BOUND
Techniques to reduce enumeration
MAIN IDEA
We should not enumerate partial solutions that can never lead to a solution
We have done that already!!! In 8-queens, the naïve enumeration has to go through all 64^8 placements
But, with each improvement, we further reduce what we have to do
ANOTHER EXAMPLE: PERMUTATION BY COMBINATION
[Figure: search tree over the symbols 0, 1, 2, three levels deep]
0 1 2
00 01 02 10 11 12 20 21 22
000 001 002 010 011 012 020 021 022 100 101 102 110 111 112 120 121 122 200 201 202 210 211 212 220 221 222
PARTIAL SOLUTION
BACKTRACKING AND B&B
Partial solution: the solution that we are generating
    Not complete
    Should be apprehensible in some sense: we can make something out of the partial solution
Backtracking and B&B work on problems that have partial solutions
They push the concept of “do not generate something that won’t lead to an answer”
BACKTRACKING
If we know, at any step, that the solution is not feasible, then it is futile to search further along that path
Try drawing the search tree of the 4-queen problem
4-QUEEN
Q Q Q Q
…
…
Should we proceed from these states?
If not, how many states do we save?
SUM OF SUBSET PROBLEM
Input: an array D of positive integers; a number K
Output: a subset of D whose sum is K
Example: D = {2,5,7,1,3,8}, K = 9; a solution is {2,7} or {8,1} or {5,3,1}
SUM OF SUBSET
Search space: every possible subset of elements, described by a sequence of selection bits, e.g., (1,0,1,0,0,0)
Generating all solutions: Step: choose whether or not to select each item
Checking each solution: sum the selected items and test whether the sum equals the target
SUM OF SUBSET BY DFS
K = target, D = given array, n = size of array
void ss(int step, int *sol) {
    if (step < n) {
        sol[step] = 0;
        ss(step + 1, sol);
        sol[step] = 1;
        ss(step + 1, sol);
    } else {
        int sum = 0;
        for (int i = 0; i < n; i++)
            if (sol[i] == 1) sum += D[i];
        if (sum == K) {
            printf("YES!\n");
        }
    }
}
BACKTRACKING FOR SUM OF SUBSET
If the sum of the selected elements is more than K, stop
void ss(int step, int *sol, int total) {
    if (step < n) {
        if (total > K) return;               // backtrack: the sum can only grow
        sol[step] = 0;
        ss(step + 1, sol, total);
        sol[step] = 1;
        ss(step + 1, sol, total + D[step]);
    } else {
        int sum = 0;
        for (int i = 0; i < n; i++)
            if (sol[i] == 1) sum += D[i];
        if (sum == K) {
            printf("YES!\n");
        }
    }
}
BACKTRACKING FOR SUM OF SUBSET
If the sum of the selected elements is more than K, stop
void ss(int step, int *sol, int total) {
    if (step < n) {
        if (total > K) return;               // backtrack: the sum can only grow
        sol[step] = 0;
        ss(step + 1, sol, total);
        sol[step] = 1;
        ss(step + 1, sol, total + D[step]);
    } else {
        if (total == K) {
            printf("YES!\n");
        }
    }
}
We don’t really need to compute the sum at the final step, because total already does that for us
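To see how much the backtracking test saves, the search can be instrumented to count the nodes it visits. The sketch below is an illustration only: the Stats struct, the prune flag and the name subset_sum are hypothetical, and the pruning is valid only because all elements of D are positive.

```c
typedef struct {
    int solutions;   /* subsets whose sum is exactly K */
    int calls;       /* number of search-tree nodes visited */
} Stats;

static void ss_count(const int *D, int n, int K, int step, int total,
                     int prune, Stats *st) {
    st->calls++;
    if (prune && total > K) return;          /* backtrack: positive items only add more */
    if (step == n) {
        if (total == K) st->solutions++;
        return;
    }
    ss_count(D, n, K, step + 1, total, prune, st);             /* skip D[step] */
    ss_count(D, n, K, step + 1, total + D[step], prune, st);   /* select D[step] */
}

/* prune = 0: plain DFS over all subsets; prune = 1: DFS with backtracking. */
Stats subset_sum(const int *D, int n, int K, int prune) {
    Stats st = {0, 0};
    ss_count(D, n, K, 0, 0, prune, &st);
    return st;
}
```

On the example D = {2,5,7,1,3,8}, K = 9, both variants should report the same three solutions, while the pruned variant visits fewer than the 127 nodes of the full tree.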
BRANCH AND BOUND
B&B is for optimization problems
Consider a “maximizing problem” for example
It needs a bounding heuristic: something that can tell us “what is the most our solution can still gain in the remaining steps”
If the value of the current partial solution + the value of the heuristic is less than that of the best candidate solution found so far: stop
It is a special version of backtracking
BRANCH & BOUND IN OPTIMIZATION PROBLEM
For many problems, it is possible to assess the goodness of a solution even when the solution is not complete
If we can predict the best value for the remaining steps, then we can use that value to “bound” our search
EXAMPLE
Assume that we have 10 steps
At step 7, the goodness of the partial solution is X
Assume that we know that the remaining steps cannot gain more than Y
If we have already found a solution with a value better than X + Y, we can simply “bound” the search (stop exploring this branch)
KEYS
We must know the so-called “upper bound” of the remaining step It should be computed easily
EXAMPLE
23 35 2
Let the value at this point be 10
If we know that this path never gets higher than 13 (which makes 10 + 13 < 35), we can neglect it
KNAPSACK PROBLEM
Given a sack, able to hold W kg
Given a list of objects, each with a weight and a value
Try to pack the objects in the sack so that the total value is maximized
THE PROBLEM
Input
    A number W, the capacity of the sack
    n pairs of weight and price ((w1,p1),(w2,p2),…,(wn,pn))
    wi = weight of the ith item, pi = price of the ith item
Output
    A subset S of {1,2,3,…,n} such that $\sum_{i \in S} p_i$ is maximum and $\sum_{i \in S} w_i \le W$
KNAPSACK BY SEARCH
void knapsack(int step, int *sol) {
    if (step < n) {
        sol[step] = 0;
        knapsack(step + 1, sol);
        sol[step] = 1;
        knapsack(step + 1, sol);
    } else {
        int sumP = 0;
        int sumW = 0;
        for (int i = 0; i < n; i++)
            if (sol[i] == 1) {
                sumP += p[i];
                sumW += w[i];
            }
        if (sumP > max && sumW <= W) {
            max = sumP;   // remember the best total price so far
        }
    }
}
KNAPSACK BY SEARCH REV 2.0
// v[i] is the value (price) of item i
void knapsack(int step, int *sol, int sumP, int sumW) {
    if (step < n) {
        sol[step] = 0;
        knapsack(step + 1, sol, sumP, sumW);
        sol[step] = 1;
        knapsack(step + 1, sol, sumP + v[step], sumW + w[step]);
    } else {
        if (sumP > max && sumW <= W) {
            max = sumP;
        }
    }
}
KNAPSACK WITH BACKTRACKING
Knapsack is similar to sum of subset
In SS, we need the sum of the elements to be equal to K
In KS, we need the sum of the weights to be at most W
So, obviously, we should backtrack when the weight sum of the selected items is more than W
KNAPSACK WITH BACKTRACKING
void knapsack(int step, int *sol, int sumP, int sumW) {
    if (sumW > W) return;   // stop when sumW is more than W
    if (step < n) {
        sol[step] = 0;
        knapsack(step + 1, sol, sumP, sumW);
        sol[step] = 1;
        knapsack(step + 1, sol, sumP + v[step], sumW + w[step]);
    } else {
        if (sumP > max && sumW <= W) {
            max = sumP;
        }
    }
}
BRANCH AND BOUND IN KNAPSACK
Assume that we have 10 items (N = 10) Right now, our max is 100 Right now, we are searching at step = 5 Right now, our sumP = 20
If v[5] + v[6] + … + v[9] is 75, should we proceed?
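Under the assumption that 75 is the total value of all remaining items (steps 5 to 9), the best this branch could possibly reach works out as:

```latex
\text{sumP} + \sum_{i=5}^{9} v[i] \;=\; 20 + 75 \;=\; 95 \;<\; 100 \;=\; \text{max}
```

Even if every remaining item fit in the sack, the branch could not beat the current max of 100, so there is no reason to proceed.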
KNAPSACK WITH BACKTRACKING + B&B
void knapsack(int step, int *sol, int sumP, int sumW) {
    if (sumW > W) return;                  // stop when sumW is more than W
    if (tail[step] + sumP < max) return;   // stop when remaining + sumP is less than the current max
    if (step < n) {
        sol[step] = 0;
        knapsack(step + 1, sol, sumP, sumW);
        sol[step] = 1;
        knapsack(step + 1, sol, sumP + v[step], sumW + w[step]);
    } else {
        if (sumP > max && sumW <= W) {
            max = sumP;
        }
    }
}
// tail[i] = v[i] + v[i+1] + … + v[n-1]; tail[n] = 0 so that tail[step] is valid when step == n
tail[n] = 0;
for (int i = n - 1; i >= 0; i--) tail[i] = tail[i + 1] + v[i];
KNAPSACK WITH BACKTRACKING + B&B REV 2.0
void knapsack(int step, int *sol, int sumP, int sumW) {
    if (sumW > W) return;
    if (tail[step] + sumP < max) return;
    if (step < n) {
        sol[step] = 1;                     // try selecting the item first
        knapsack(step + 1, sol, sumP + v[step], sumW + w[step]);
        sol[step] = 0;
        knapsack(step + 1, sol, sumP, sumW);
    } else {
        if (sumP > max && sumW <= W) {
            max = sumP;
        }
    }
}
// tail[i] = v[i] + v[i+1] + … + v[n-1]; tail[n] = 0 so that tail[step] is valid when step == n
tail[n] = 0;
for (int i = n - 1; i >= 0; i--) tail[i] = tail[i + 1] + v[i];
Why?
Does the search tree differ?
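One way to explore these questions is to run both orderings on the same input and count the visited nodes. The sketch below is a self-contained variant of the search under stated assumptions: the names Result, Search, visit and knapsack_bb, the take_first flag and the MAX_ITEMS cap are hypothetical, and values are assumed to be non-negative so that the tail sum is a valid upper bound.

```c
#define MAX_ITEMS 32   /* assumption: cap on n for this sketch */

typedef struct {
    int  best;    /* best total value found */
    long calls;   /* search-tree nodes visited */
} Result;

typedef struct {
    const int *v, *w;          /* values and weights */
    int n, W, take_first;      /* take_first != 0: try "select" before "skip" */
    int tail[MAX_ITEMS + 1];   /* tail[i] = v[i] + ... + v[n-1], tail[n] = 0 */
    Result res;
} Search;

static void visit(Search *s, int step, int sumP, int sumW) {
    s->res.calls++;
    if (sumW > s->W) return;                          /* backtracking */
    if (s->tail[step] + sumP < s->res.best) return;   /* branch and bound */
    if (step == s->n) {
        if (sumP > s->res.best) s->res.best = sumP;
        return;
    }
    for (int pass = 0; pass < 2; pass++) {
        int take = (pass == 0) ? s->take_first : !s->take_first;
        if (take) visit(s, step + 1, sumP + s->v[step], sumW + s->w[step]);
        else      visit(s, step + 1, sumP, sumW);
    }
}

Result knapsack_bb(const int *v, const int *w, int n, int W, int take_first) {
    Search s = { v, w, n, W, take_first, {0}, {0, 0} };
    if (n < 0 || n > MAX_ITEMS) return s.res;
    s.tail[n] = 0;
    for (int i = n - 1; i >= 0; i--) s.tail[i] = s.tail[i + 1] + v[i];
    visit(&s, 0, 0, 0);
    return s.res;
}
```

Both orderings must report the same best value; comparing the calls field of the two results on a given input shows whether reaching a good solution earlier let the bound cut more of the tree.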
VARIATION: Rational Knapsack
An object is like a gold bar: we can cut it into pieces that keep the same value/weight ratio
Can be solved by a greedy algorithm
    Sort the objects by value/weight ratio
    Pick objects in that order, highest ratio first
    If an object is larger than the remaining capacity, just divide it
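The greedy procedure can be sketched with the standard qsort. This is a minimal illustration; the Item struct and the name rational_knapsack are hypothetical, weights are assumed to be positive, and the function sorts the caller's array in place.

```c
#include <stdlib.h>

typedef struct {
    double weight;   /* assumption: weight > 0 */
    double value;
} Item;

/* Orders items by value/weight ratio, highest first.
 * Cross-multiplying avoids the division. */
static int by_ratio_desc(const void *a, const void *b) {
    const Item *x = a, *y = b;
    double lhs = x->value * y->weight;
    double rhs = y->value * x->weight;
    return (lhs < rhs) - (lhs > rhs);
}

/* Returns the best total value for the given capacity when items may be cut. */
double rational_knapsack(Item *items, size_t n, double capacity) {
    qsort(items, n, sizeof items[0], by_ratio_desc);
    double total = 0.0;
    for (size_t i = 0; i < n && capacity > 0.0; i++) {
        if (items[i].weight <= capacity) {          /* the whole object fits */
            total += items[i].value;
            capacity -= items[i].weight;
        } else {                                    /* cut it to fill the sack */
            total += items[i].value * (capacity / items[i].weight);
            capacity = 0.0;
        }
    }
    return total;
}
```

Because cutting is allowed, this value is never lower than the best 0-1 packing of the same objects, which is what makes it usable as an upper bound.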
0-1 KNAPSACK WITH B&B
0-1 knapsack is very suitable for B&B
We can calculate the goodness of the partial solution: just sum the values of the selected objects
We have several fast, good upper bounds
    The sum of the remaining unselected objects
    The sum of the remaining unselected objects that don’t exceed the capacity
    The solution of the “rational knapsack” on the remaining objects with the remaining capacity
BOUNDING HEURISTIC
Maximization problem (finding the highest-value solution)
    The bound must be higher than (or equal to) the real value (upper bound)
    Good bound: the lowest value that is >= the real value
Minimization problem (finding the lowest-value solution)
    The bound must be lower than (or equal to) the real value (lower bound)
    Good bound: the highest value that is <= the real value
Key (in maximization terms): never underestimate!!!
    We stop when current value + future value is a sure loser
    The future value can be overestimated (optimistic); for minimization the roles flip: never overestimate the future cost
LEAST COST SEARCH
a.k.a. Best First Search
LEAST COST SEARCH
For optimization problems, a good solution found early helps backtracking and B&B
    The better the value of the best solution so far, the higher the chance that backtracking and B&B can benefit
Why do DFS or BFS? We have an “incentive” to find a “good” solution earlier
Search toward “promising” solutions: guided search
LEAST COST SEARCH
Use a priority queue in the search framework
    The key value in the PQ is calculated from the partial solution
It’s called “Best First Search”
    Guided by the value of the partial solution
    Suffers from a similar problem as BFS (since we need to maintain the queue)
In practice, LC-Search always employs B&B and backtracking
LC-SEARCH WITH BRANCH AND BOUND
Use the bounding heuristic for the value in the PQ
    The value stored in the PQ is X + Y
    X = current value of the partial solution
    Y = value from the heuristic of the partial solution
I.e., use the bound of the total value to guide the search
    For a maximization problem, use the upper bound of the total value
    For a minimization problem, use the lower bound of the total value
Can guarantee minimality of the first solution
15 PUZZLE PROBLEM
Given a puzzle board
The objective is to move the pieces around so that the board looks like the figure on the right
15 PUZZLE PROBLEM
Input: a board (solvable)
Output: the movement of pieces that solves the puzzle using minimal moves, i.e., we try to minimize the number of moves
It’s a minimization problem
15 PUZZLE PROBLEM
Search space: every possible movement, described by a sequence of moves of the “empty” piece, e.g., (U,D,D,L,U,R,…)
    We don’t know the limit on the length of the sequence
Generating all solutions: Step: choose the direction of the empty piece
Checking each solution: simulate the moves and see whether they lead to the solution
BOUNDING HEURISTIC FOR 15-PUZZLE
Number of misplaced pieces: obviously a lower bound
[Figure: search tree from the board below; the position of the empty cell (_) is inferred from the moves shown]
Start board, Misplace = 12
    1 _ 3 4
    6 2 11 10
    5 8 7 9
    14 12 15 13
After one move of the empty cell
    left: Misplace = 13, Moved = 1
    right: Misplace = 13, Moved = 1
    down: Misplace = 11, Moved = 1
        1 2 3 4
        6 _ 11 10
        5 8 7 9
        14 12 15 13
After a second move from the “down” board, the values (Misplace + Moved) are 10 + 2, 11 + 2 and 11 + 2
ANOTHER BOUNDING HEURISTIC
L-1 distance of the misplaced pieces
For example, consider piece #12
    It is at row 4, col 2
    It should be at row 3, col 4
    Distance = 1 row + 2 cols = 1 + 2 = 3
Obviously a lower bound as well
A better lower bound (closer to the actual # of moves)
current
    1 _ 3 4
    6 2 11 10
    5 8 7 9
    14 12 15 13
goal
    1 2 3 4
    5 6 7 8
    9 10 11 12
    13 14 15 _
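Both heuristics are cheap to compute, which is what a bounding heuristic needs. The following sketch assumes a row-major array of 16 ints with 0 standing for the empty cell, and an empty cell at row 1, col 2 of the current board (inferred, since the blank is not visible in the transcript); the function names are hypothetical. Note that this sketch counts tiles only, whereas the slide's figure of 12 appears to count the empty cell too.

```c
#include <stdlib.h>

/* board[pos] holds the tile at row pos / 4, col pos % 4; 0 is the empty cell.
 * In the goal, tile t sits at position t - 1 and the empty cell is last. */

/* Number of tiles (not counting the empty cell) that are out of place. */
int misplaced_tiles(const int board[16]) {
    int count = 0;
    for (int pos = 0; pos < 16; pos++) {
        int tile = board[pos];
        if (tile != 0 && tile != pos + 1) count++;
    }
    return count;
}

/* Sum over all tiles of |row - goal row| + |col - goal col| (L-1 distance). */
int l1_distance(const int board[16]) {
    int total = 0;
    for (int pos = 0; pos < 16; pos++) {
        int tile = board[pos];
        if (tile == 0) continue;               /* the empty cell is not a piece */
        int goal = tile - 1;
        total += abs(pos / 4 - goal / 4) + abs(pos % 4 - goal % 4);
    }
    return total;
}
```

Each move slides exactly one tile by one cell, so it can fix at most one misplaced tile and reduce the L-1 sum by at most 1; that is why both counts never exceed the true number of remaining moves.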
CONCLUSION
Search space: solution space; state space tree, search tree
Solution: partial solution; candidate solution, solution state
Algorithm: DFS (depth first search), BFS (breadth first search), LCS (best first search), backtracking, branch and bound
The generating procedure gives the “structure” of the search tree
The algorithm gives the “order” of the search tree exploration (including pruning)