Informed search algorithms
Section 3.5 Russell & Norvig
2nd Term 2013 Informed Search
Outline
• Informed search
– Greedy search
– A* search
– IDA*
– Heuristics
Search strategies
• A search strategy is defined by the order in which nodes are expanded.
• Let g(n) be the cost of the path from the initial state to n’s state.
• Depth-first search: pick the node with the highest g-value (the deepest node, assuming unit step costs).
• Breadth-first search: pick the node with the lowest g-value (the shallowest node).
Best-first search strategy
• Given a set of nodes on the fringe of a search, which one is best to expand next?
– Based on what criteria?
• Criteria: expand best nodes first, i.e., those along an optimal solution path
– How do we do that?
• Use additional information to suggest such nodes.
Informed Search Strategies
• Informed search strategies use information beyond the problem description.
• We will only look at functions that “guess” the distance from a state to the nearest goal state.
• h(n) is the function that guesses how far n’s state is from its nearest goal state.
Romania with step costs in km
(map figure omitted)
Best-first search
• Idea: use an evaluation function f(n) for each node
– f(n) is an estimate of the "desirability” of a node
– Expand the most desirable unexpanded node first
• Implementation: order the nodes in the fringe by f-value (by convention, higher f is less desirable, so expand the lowest-f node first)
• Uninformed search in this scheme:
– Depth-first: f(n) = -g(n)
– Breadth-first: f(n) = g(n)
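As a sketch (not from the slides), the ordering above can be implemented with a priority queue; the `neighbors` callback and the scoring function `f` are illustrative placeholders:

```python
import heapq

def best_first_search(start, goal, neighbors, f):
    """Generic best-first search: always expand the fringe node with the
    lowest f-value. `neighbors(state)` yields (next_state, step_cost) pairs;
    `f(state, g)` scores a node given its path cost g so far."""
    fringe = [(f(start, 0), 0, start, [start])]  # (f, g, state, path)
    best_g = {}                                  # cheapest g seen per state
    while fringe:
        _, g, state, path = heapq.heappop(fringe)
        if state == goal:
            return path, g
        if state in best_g and best_g[state] <= g:
            continue                             # already reached more cheaply
        best_g[state] = g
        for nxt, cost in neighbors(state):
            g2 = g + cost
            heapq.heappush(fringe, (f(nxt, g2), g2, nxt, path + [nxt]))
    return None, float("inf")
```

With this skeleton, breadth-first-style search is `f = lambda s, g: g` and depth-first-style search is `f = lambda s, g: -g`, matching the slide's formulas.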
Best-first informed search strategies
• Greedy Search
• A* Search
• Iterative Deepening A* (IDA*)
• Weighted A* Search
Greedy search
• Evaluation function: f(n) = h(n)
• h(n) = estimate of cost from n to goal
– e.g., hSLD(n) = straight-line distance from n to Bucharest
• Greedy search expands the node that appears to be closest to the goal
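A minimal sketch of greedy best-first search on a fragment of the Romania map; the road costs and straight-line distances to Bucharest below are the standard textbook values, and the function itself is an illustration, not the slides' code:

```python
import heapq

h_sld = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
         "Fagaras": 176, "Rimnicu Vilcea": 193, "Pitesti": 100, "Bucharest": 0}

roads = {"Arad": [("Sibiu", 140), ("Timisoara", 118), ("Zerind", 75)],
         "Sibiu": [("Arad", 140), ("Fagaras", 99), ("Rimnicu Vilcea", 80)],
         "Fagaras": [("Sibiu", 99), ("Bucharest", 211)],
         "Rimnicu Vilcea": [("Sibiu", 80), ("Pitesti", 97)],
         "Pitesti": [("Rimnicu Vilcea", 97), ("Bucharest", 101)],
         "Timisoara": [("Arad", 118)], "Zerind": [("Arad", 75)],
         "Bucharest": []}

def greedy_search(start, goal):
    # f(n) = h(n): order the fringe by straight-line distance only
    fringe = [(h_sld[start], start, [start], 0)]
    visited = set()
    while fringe:
        _, city, path, cost = heapq.heappop(fringe)
        if city == goal:
            return path, cost
        if city in visited:
            continue
        visited.add(city)
        for nxt, step in roads[city]:
            if nxt not in visited:
                heapq.heappush(fringe, (h_sld[nxt], nxt, path + [nxt], cost + step))
    return None, None

path, cost = greedy_search("Arad", "Bucharest")
# Greedy takes Arad -> Sibiu -> Fagaras -> Bucharest (450 km), not the
# cheaper 418 km route through Rimnicu Vilcea and Pitesti.
```

This illustrates the non-optimality discussed below: Fagaras looks closer than Rimnicu Vilcea, so greedy commits to a more expensive path.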
Greedy best-first search example
(figure slides omitted: step-by-step greedy expansion on the Romania map)
Why greedy search is attractive
• With a decent heuristic, it goes almost directly to the goal.
• Best case: time and space are linear.
• So, why not always do greedy search?
Properties of greedy best-first search
• Complete? No, has same problem with infinite graphs as depth-first search
• Time? O(b^m), but a good heuristic can give dramatic improvement
• Space? O(b^m) -- keeps all nodes in memory
• Optimal? No
Summary
• Search strategy defines a traversal of the search space, e.g., pick lowest f(n).
• Informed search strategies use information outside of problem description.
• One such type of information is estimated distance to nearest goal: h(n).
• Greedy search: f(n) = h(n).
A* search
• Greedy’s problem is that it doesn’t care how expensive the current path already is
• Idea: avoid expanding paths that are already expensive
• Evaluation function: f(n) = g(n) + h(n)
– g(n) = cost so far to reach n
– h(n) = estimated cost from n to goal
– f(n) = estimated total cost of the path through n to the goal
• Note: assume Goal(n) -> h(n) = 0
• Note: h(n) ≥ 0
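The same Romania fragment used for greedy search above, but scored with f(n) = g(n) + h(n); the data values are the standard textbook distances, and the code is a sketch rather than the slides' own implementation:

```python
import heapq

h_sld = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
         "Fagaras": 176, "Rimnicu Vilcea": 193, "Pitesti": 100, "Bucharest": 0}

roads = {"Arad": [("Sibiu", 140), ("Timisoara", 118), ("Zerind", 75)],
         "Sibiu": [("Arad", 140), ("Fagaras", 99), ("Rimnicu Vilcea", 80)],
         "Fagaras": [("Sibiu", 99), ("Bucharest", 211)],
         "Rimnicu Vilcea": [("Sibiu", 80), ("Pitesti", 97)],
         "Pitesti": [("Rimnicu Vilcea", 97), ("Bucharest", 101)],
         "Timisoara": [("Arad", 118)], "Zerind": [("Arad", 75)],
         "Bucharest": []}

def a_star(start, goal):
    # f(n) = g(n) + h(n): path cost so far plus estimated cost to the goal
    fringe = [(h_sld[start], 0, start, [start])]
    best_g = {}
    while fringe:
        f, g, city, path = heapq.heappop(fringe)
        if city == goal:
            return path, g                # goal test on expansion, not generation
        if city in best_g and best_g[city] <= g:
            continue
        best_g[city] = g
        for nxt, step in roads[city]:
            heapq.heappush(fringe, (g + step + h_sld[nxt], g + step, nxt, path + [nxt]))
    return None, None

path, cost = a_star("Arad", "Bucharest")
# A* returns the optimal 418 km route:
# Arad -> Sibiu -> Rimnicu Vilcea -> Pitesti -> Bucharest
```

Unlike greedy search, A* backs off the Fagaras path once its accumulated cost makes the Pitesti route look cheaper.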
A* search example
(figure slides omitted: step-by-step A* expansion on the Romania map)
A* search
• Is A*, as given, necessarily optimal?
• Why not?
• What would we need in order for it to be optimal?
Admissible Heuristics
• A heuristic h(n) is admissible if for every node n, h(n) ≤ the true cost to reach a goal state from n.
• An admissible heuristic never overestimates the cost to reach the goal, i.e., it is never pessimistic.
• Example: hSLD(n) (never overestimates actual road distance)
• Theorem: If h(n) is admissible, then A* using TREE-SEARCH is optimal.
• Challenges: Why would using tree search for A* be odd? Give a heuristic that would always be admissible.
Consistent Heuristics
• A heuristic is consistent if for every node n and every successor n' of n generated by any action a:
h(n) ≤ c(n,a,n') + h(n')
• If h is consistent, we have
f(n') = g(n') + h(n') = g(n) + c(n,a,n') + h(n') ≥ g(n) + h(n) = f(n)
i.e., f(n) is non-decreasing along any path.
• Theorem: If h(n) is consistent, A* using GRAPH-SEARCH is optimal, and the first time you select a state, you have found the best path to that state!
• Why? Can you prove this?
Monotone heuristics
• If h is consistent then f is monotone (f-values of nodes along a path never decrease).
• Monotonicity and consistency are logically equivalent, i.e., two views of the same thing.
• Often, one view is more helpful than the other.
• Can you prove monotonicity -> consistency?
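Consistency can be verified edge by edge, since the definition is a local triangle inequality. A small sketch with hypothetical 2-D coordinates (the graph and names are invented for illustration), using Euclidean distance to the goal as h:

```python
import math

# Hypothetical coordinates; edge costs equal the geometric lengths.
coords = {"A": (0, 0), "B": (3, 0), "C": (3, 4), "G": (6, 4)}
edges = [("A", "B", 3.0), ("B", "C", 4.0), ("C", "G", 3.0), ("A", "C", 5.0)]

def h(n, goal="G"):
    # straight-line (Euclidean) distance to the goal
    (x1, y1), (x2, y2) = coords[n], coords[goal]
    return math.hypot(x1 - x2, y1 - y2)

def is_consistent():
    # h is consistent iff h(n) <= c(n, n') + h(n') on every edge, both directions
    for n, n2, c in edges:
        for a, b in ((n, n2), (n2, n)):
            if h(a) > c + h(b) + 1e-9:
                return False
    return True

print(is_consistent())  # straight-line distance obeys the triangle inequality
```

Because straight-line distance can never shortcut an actual edge, the check passes; an inflated or inconsistent h would fail on some edge.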
Optimality of A*
• A* expands nodes in order of increasing f-value
• Gradually adds "f-contours" of nodes
• Contour i contains all nodes with f = f_i, where f_i < f_(i+1)
Properties of A*
• Complete? Yes (unless there are infinitely many nodes with f ≤ cost of the optimal solution)
• Time? Exponential
• Space? Exponential
• Optimal? Yes
Can we reduce our exponential requirements?
• Exponential time is less of a limit than exponential memory:
– 17 years for checkers
• What did we do about the exponential memory requirements of breadth-first search?
• Can we do something similar for A*?
IDA*
• Iterative Deepening A* is to A* as iterative deepening is to breadth-first search.
• Instead of iterating on depth, IDA* iterates on an f-limit.
• The initial f-limit is h(initialState); the f-limit for iteration i is the least f-value that exceeded the (i-1)th f-limit.
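The iteration scheme above can be sketched as a recursive depth-first contour search; the graph and heuristic passed in are caller-supplied, and this is an illustrative implementation rather than the slides' own:

```python
def ida_star(start, goal, neighbors, h):
    """IDA*: depth-first search bounded by an f-limit; each iteration raises
    the limit to the smallest f-value that exceeded the previous limit."""
    def dfs(state, g, limit, path):
        f = g + h(state)
        if f > limit:
            return f, None              # report the overflowing f-value
        if state == goal:
            return f, path
        smallest = float("inf")
        for nxt, cost in neighbors(state):
            if nxt in path:             # avoid cycles along the current path only
                continue
            t, found = dfs(nxt, g + cost, limit, path + [nxt])
            if found is not None:
                return t, found
            smallest = min(smallest, t)
        return smallest, None

    limit = h(start)                    # initial f-limit is h(initialState)
    while True:
        t, found = dfs(start, 0, limit, [start])
        if found is not None:
            return found
        if t == float("inf"):
            return None                 # no nodes exceeded the limit: no solution
        limit = t                       # least f-value that exceeded the old limit
```

Note the memory used is just the current path (linear in depth), which is the whole point of the scheme.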
IDA* cont’d
• IDA* is optimal and uses only linear space.
• However, since IDA* does tree search, it usually cannot detect when it reaches a state a second time.
– Thus it can expand parts of the search space many times on the same iteration; this can result in an exponential blow-up in time.
• Additionally, IDA*, like ID, repeatedly re-expands earlier parts of the search tree on each iteration.
• If enough memory is available, A* is usually better than IDA*.
Challenge Questions
• Could IDA* ever expand fewer nodes than A*? How, or why not?
• If h were admissible but inconsistent, would IDA* still be optimal? Why or why not?
Heuristics
• Informed search is also known as heuristic search.
• The quality of the heuristic determines how much search is needed to find a solution.
• A perfect heuristic means no search at all.
Admissible heuristics
E.g., for the 8-puzzle (start state S figure omitted):
• h1(n) = number of misplaced tiles
• h2(n) = total Manhattan distance (i.e., the number of squares each tile is from its desired location)
• h1(S) = 8
• h2(S) = 3+1+2+2+2+3+3+2 = 18
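Both heuristics are a few lines of code. The start state below is the standard textbook example whose values are quoted above (h1 = 8, h2 = 18), with tiles stored row-major and 0 for the blank:

```python
START = (7, 2, 4, 5, 0, 6, 8, 3, 1)   # 0 is the blank
GOAL  = (0, 1, 2, 3, 4, 5, 6, 7, 8)   # tile t belongs at index t

def h1(state):
    # number of misplaced tiles (the blank is not counted)
    return sum(1 for i, t in enumerate(state) if t != 0 and t != GOAL[i])

def h2(state):
    # total Manhattan distance of each tile from its goal square
    total = 0
    for i, t in enumerate(state):
        if t == 0:
            continue
        gi = GOAL.index(t)             # goal index of tile t
        total += abs(i // 3 - gi // 3) + abs(i % 3 - gi % 3)
    return total

print(h1(START), h2(START))  # 8 18
```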
When is one heuristic guaranteed to be no worse than another?
• If h2(n) ≥ h1(n) for all non-goal n (both h’s admissible), then h2 dominates h1 and h2 is “better” for search (i.e., it cannot expand more nodes).
• Does h2 (Manhattan distance) dominate h1 (misplaced tiles)?
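The answer to the question above can be checked empirically (and the check suggests the proof: any misplaced tile is at least one move from its goal square, so it contributes at least 1 to Manhattan distance). A sketch sampling random tile arrangements:

```python
import random

GOAL = (0, 1, 2, 3, 4, 5, 6, 7, 8)    # tile t belongs at index t

def h1(state):
    # misplaced-tile count, blank excluded
    return sum(1 for i, t in enumerate(state) if t != 0 and t != GOAL[i])

def h2(state):
    # Manhattan distance; tile t's goal square is index t on a 3x3 board
    return sum(abs(i // 3 - t // 3) + abs(i % 3 - t % 3)
               for i, t in enumerate(state) if t != 0)

random.seed(1)
for _ in range(1000):
    s = list(GOAL)
    random.shuffle(s)
    assert h2(tuple(s)) >= h1(tuple(s))  # every misplaced tile adds >= 1 to h2
print("h2 >= h1 on all sampled states")
```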
When is one heuristic likely to be better than another?
• When its average h-value is higher.
• Is Manhattan distance likely to be better than misplaced tiles?
– Hint: can Manhattan ever be less than misplaced?
• Typical search costs (average number of nodes expanded):
– d=12: IDS = 3,644,035 nodes; A*(h1) = 227 nodes; A*(h2) = 73 nodes
– d=24: IDS = too many nodes; A*(h1) = 39,135 nodes; A*(h2) = 1,641 nodes
Why is higher average heuristic value better?
Analysis: How much better can a higher average heuristic value be?
• Uninformed search tree size formula:
– b^d, where b is the effective branching factor and d is the length of the optimal solution
• Informed search tree size formula:
– b^(d-r), where r is effectively the “average” heuristic value
• Given 2 heuristics along with their average heuristic values, how much better could one be than the other (on average)?
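Under this rough model (b^(d-r) is the slides' approximation, not an exact count), raising the average heuristic value by Δr shrinks the tree by a factor of b^Δr. A worked comparison with illustrative numbers:

```python
b, d = 3, 12            # illustrative: branching factor 3, solution depth 12
for r in (0, 4, 8):     # r = "average" heuristic value in the slides' model
    print(r, b ** (d - r))
# each increase of r by 4 shrinks the tree by a factor of b**4 = 81
```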
Where do Heuristics Come From?
• Make a “simpler” version of the problem; the cost of its optimal solution is less than or equal to the cost of the real solution.
• Can make simpler versions via: – Approximations – Reformulations – Abstractions – Decompositions
• Most current heuristics either abstract the operators or the states.
Abstraction Example
• A problem with fewer restrictions on the actions is called a relaxed problem.
• The cost of an optimal solution to a relaxed problem is an admissible heuristic for the original problem.
– If the rules of the 8-puzzle are relaxed so that a tile can move anywhere, then you have h1(n).
– If the rules are relaxed so that a tile can move to any adjacent square, then you have h2(n).
• Note that while optimal solutions to relaxed problems can never be longer than optimal solutions to the original problem, the effort to find a solution to a relaxed problem can be greater.
• Challenge: Why?
Dimensions of Heuristics
• Heuristics can be measured by how accurate they are and by how much it costs to compute them.
• Accurate heuristics reduce the number of states expanded and expensive heuristics increase the per-node costs.
• Usually there is a direct relation between these two dimensions: more accurate heuristics tend to be more expensive to compute.
Tradeoffs for heuristics
• Cost of search = #Nodes × time-per-node
• Can increase accuracy of heuristic to reduce #Nodes but that may increase time-per-node
• Can reduce heuristic’s time-per-node, but usually this increases #Nodes
• The quest is to find the right balance (tradeoff) between cost and accuracy.
Summary
• While for greedy search f(n) = h(n), for A* and IDA* f(n) = g(n) + h(n).
• h(n) is the estimated distance from n’s state to its nearest goal.
• If h never overestimates the distance then it is admissible.
• If h is admissible then A* and IDA* are guaranteed to be optimal.
Summary cont’d
• If h(n) obeys the “triangle” inequality then it is consistent.
• h(n) is consistent iff f(n) is monotonically non-decreasing along any path.
• If h(n) is admissible and consistent, then the first time A* picks a state to expand, it has found the shortest path to that state.
• IDA* is to A* as IDS is to breadth-first search.
Summary cont’d
• h1 dominates h2 iff for all non-goal nodes n: h1(n) ≥ h2(n).
• If h1 dominates h2, then except for ties h1 will never expand a node that h2 does not.
• Usually, A* expands the fewest nodes using whichever heuristic has the largest average h-value.
• Heuristics can speed A* up exponentially.
• The cost and accuracy of the heuristic must be balanced.