One advantage of Breadth-first and Depth-first Search is their
generality. One disadvantage is that they don't use any problem-specific
information to guide them.
We now move away from these uninformed search methods and begin to
investigate informed (or heuristic) search techniques.
The general ideas behind informed search techniques are the following:
- apply general search principles
- but, if you have information available, allow it to guide you in
selecting the next state to expand
- expand the most promising states first
Algorithms for informed search are still general.
The problem-specific information is encoded in an evaluation function.
Informed search techniques require an evaluation function. The evaluation
function is applied to states (nodes) in the search tree:
- the function returns a number describing the desirability of
expanding a node.
- in order for it to direct search toward the goal, the
evaluation function incorporates an estimate of the path cost from the state
to a goal.
In best-first search, the general idea is to always expand the most desirable
node on the frontier (fringe). The frontier of the search
space includes all nodes
that are currently available for expansion.
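The idea of always expanding the most desirable frontier node can be sketched with a priority queue. This is a minimal illustration, not a definitive implementation: the function name best_first_search, its parameters, and the toy problem (reach 10 from 1 using "+1" or "*2") are all assumptions made up for this example.

```python
import heapq
from itertools import count

def best_first_search(start, is_goal, successors, evaluate):
    """Repeatedly expand the frontier node with the lowest evaluation value.

    successors(state) returns (next_state, step_cost) pairs.
    evaluate(state, cost_so_far) returns a number; lower = more desirable.
    """
    tie = count()  # tie-breaker so the heap never has to compare states
    frontier = [(evaluate(start, 0), next(tie), start, 0, [start])]
    expanded = set()
    while frontier:
        _, _, state, cost, path = heapq.heappop(frontier)
        if is_goal(state):
            return path, cost
        if state in expanded:
            continue  # already expanded via another frontier entry
        expanded.add(state)
        for nxt, step_cost in successors(state):
            if nxt not in expanded:
                new_cost = cost + step_cost
                heapq.heappush(
                    frontier,
                    (evaluate(nxt, new_cost), next(tie), nxt, new_cost, path + [nxt]),
                )
    return None  # frontier exhausted without reaching a goal

# Hypothetical toy problem: reach 10 from 1 using "+1" or "*2", each costing 1.
path, cost = best_first_search(
    start=1,
    is_goal=lambda n: n == 10,
    successors=lambda n: [(n + 1, 1), (n * 2, 1)],
    evaluate=lambda n, cost_so_far: abs(10 - n),  # desirability = closeness to 10
)
print(path, cost)  # [1, 2, 4, 8, 9, 10] 5
```

Note that the path found takes 5 steps even though 1, 2, 4, 5, 10 takes only 4. An evaluation function that looks only at closeness to the goal does not guarantee the cheapest solution.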
Types of best-first search include:
- greedy search
- A* (recall that Shakey used A* for path planning.)
For a node n in the search tree,
Let h(n) = an estimate of the cost of the cheapest path from n to
a goal state. This is the (heuristic) evaluation function.
(Note that h(n) is 0 when n is a goal state.)
Greedy search selects the next node to expand based upon the function h.
Greedy search can work quite well, but it does have problems:
- There can be false starts.
That is, if h(n) greatly underestimates the distance of certain
nodes to the goal, the algorithm can begin to follow paths that do not
lead to the goal.
- It doesn't look at the overall picture, i.e., it doesn't take distance
traveled into account!
Evaluating Greedy search on the four criteria:
- not complete
- not optimal
- O(b^m) time and space complexity; b = branching factor, m = max depth of the
search space
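To see the "false start" problem concretely, here is a small sketch of Greedy search on a made-up graph. The graph, the step costs, and the h values are all hypothetical. The visited set is also an addition for this sketch: it keeps the search from revisiting states on a finite graph, whereas plain Greedy search without such a check can loop, which is one reason it is not complete.

```python
import heapq

# Hypothetical toy graph: state -> list of (successor, step cost).
GRAPH = {
    "S": [("A", 1), ("B", 2)],
    "A": [("G", 10)],
    "B": [("G", 2)],
    "G": [],
}
# Hypothetical heuristic values h(n); h is 0 at the goal.
H = {"S": 2, "A": 1, "B": 2, "G": 0}

def greedy_search(start, goal):
    """Always expand the frontier node with the smallest h(n).

    The cost accumulated so far is tracked only for reporting;
    it plays no part in choosing the next node.
    """
    frontier = [(H[start], start, [start], 0)]
    visited = set()
    while frontier:
        _, state, path, cost = heapq.heappop(frontier)
        if state == goal:
            return path, cost
        if state in visited:
            continue
        visited.add(state)
        for nxt, step in GRAPH[state]:
            if nxt not in visited:
                heapq.heappush(frontier, (H[nxt], nxt, path + [nxt], cost + step))
    return None

greedy_path, greedy_cost = greedy_search("S", "G")
print(greedy_path, greedy_cost)  # ['S', 'A', 'G'] 11
```

Here h(A) = 1 greatly underestimates the true remaining cost of 10, so Greedy search commits to A and returns a path of cost 11, while S, B, G costs only 4.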
A* search corrects for the problem of "greed". In evaluating a
given node/state, it takes into account both the distance traveled so far
and the estimate of distance to the goal.
For a node n in the search tree,
let f(n) = g(n) + h(n),
where g(n) = the cost of getting to n from the initial state,
and h(n) is as before.
We can think of f(n) as an estimate of the cost of the cheapest
solution from the initial state to a goal through n.
The wonderful thing about this search is that if h is
admissible, then the search will be optimal and complete! That is,
the best solution will be found. (For the
math majors among you, yes, there is a proof of this!)
Def. An admissible heuristic is one that never
overestimates the cost to reach the goal.
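A minimal sketch of A*, using f(n) = g(n) + h(n) to order the frontier. The graph, step costs, h values, and the table of true remaining costs are hypothetical and exist only to illustrate the definition. The best_g bookkeeping is one common way to discard outdated frontier entries; it is an implementation choice of this sketch, not something prescribed by these notes.

```python
import heapq

# Hypothetical toy graph: state -> list of (successor, step cost).
GRAPH = {
    "S": [("A", 1), ("B", 2)],
    "A": [("G", 10)],
    "B": [("G", 2)],
    "G": [],
}
# Hypothetical heuristic values h(n).
H = {"S": 2, "A": 1, "B": 2, "G": 0}
# True cheapest remaining cost from each state, worked out by hand for this graph.
TRUE_COST = {"S": 4, "A": 10, "B": 2, "G": 0}

# Admissible: h never overestimates the true cost to reach the goal.
is_admissible = all(H[n] <= TRUE_COST[n] for n in GRAPH)

def a_star(start, goal):
    """Expand the frontier node with the smallest f(n) = g(n) + h(n)."""
    frontier = [(H[start], start, 0, [start])]  # entries are (f, state, g, path)
    best_g = {start: 0}  # cheapest known cost of reaching each state
    while frontier:
        _, state, g, path = heapq.heappop(frontier)
        if state == goal:
            return path, g
        if g > best_g[state]:
            continue  # a cheaper route to this state was found later
        for nxt, step in GRAPH[state]:
            new_g = g + step
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + H[nxt], nxt, new_g, path + [nxt]))
    return None

astar_path, astar_cost = a_star("S", "G")
print(is_admissible, astar_path, astar_cost)  # True ['S', 'B', 'G'] 4
```

Because g(n) is part of the ordering, the expensive edge from A to G pushes f up to 11 for that route, and the cheaper route through B (f = 4) is expanded first.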
Some examples of admissible heuristics:
- Straight-line distance in any sort of distance-related problem.
- Manhattan distance in grid-like problems.
- Number of differences between current and goal states (in cases where
multiple differences cannot be resolved in a single step).
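The three heuristics above are each only a few lines of code. The function names and the sample inputs below are illustrative assumptions. Whether each one is admissible depends on the problem: for instance, Manhattan distance fits when moves are single horizontal or vertical steps, and counting differences fits when one step can fix at most one difference.

```python
import math

def straight_line_distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def manhattan_distance(p, q):
    """Distance when only horizontal and vertical unit moves are allowed."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

def count_differences(state, goal):
    """Number of positions at which state and goal disagree."""
    return sum(1 for a, b in zip(state, goal) if a != b)

print(straight_line_distance((0, 0), (3, 4)))  # 5.0
print(manhattan_distance((0, 0), (3, 4)))      # 7
# Hypothetical word puzzle where each step changes exactly one letter:
print(count_differences("cold", "warm"))       # 4
```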
An example of the two searches
Let's compare the behavior of Greedy search and A* search on the 5-puzzle.
We'll define g(n) and h(n) as follows:
Let g(n) be the number of steps taken from the initial state to n.
Let h(n) be the number of tiles out of place.
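The comparison can be run with a short sketch. These notes do not give a start state here, so the START and GOAL layouts below are hypothetical (a 2 x 3 board with 0 marking the blank), as are the function names. The same search routine serves both methods: Greedy orders the frontier by h(n) alone, A* by g(n) + h(n).

```python
import heapq
from itertools import count

ROWS, COLS = 2, 3
GOAL = (1, 2, 3, 4, 5, 0)   # assumed goal layout; 0 marks the blank
START = (0, 1, 3, 4, 2, 5)  # hypothetical start, three moves from GOAL

def h(state):
    """Number of tiles out of place (the blank is not counted)."""
    return sum(1 for s, g in zip(state, GOAL) if s != 0 and s != g)

def neighbors(state):
    """States reachable by sliding one tile into the blank."""
    blank = state.index(0)
    r, c = divmod(blank, COLS)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < ROWS and 0 <= nc < COLS:
            j = nr * COLS + nc
            s = list(state)
            s[blank], s[j] = s[j], s[blank]
            yield tuple(s)

def search(start, use_g):
    """Best-first search: priority is h(n), plus g(n) when use_g is True."""
    tie = count()  # tie-breaker for equal priorities
    frontier = [(h(start), next(tie), start, 0, [start])]
    expanded = set()
    while frontier:
        _, _, state, g, path = heapq.heappop(frontier)
        if state == GOAL:
            return path
        if state in expanded:
            continue
        expanded.add(state)
        for nxt in neighbors(state):
            if nxt not in expanded:
                priority = h(nxt) + (g + 1 if use_g else 0)
                heapq.heappush(frontier, (priority, next(tie), nxt, g + 1, path + [nxt]))
    return None

greedy_path = search(START, use_g=False)
astar_path = search(START, use_g=True)
print(len(greedy_path) - 1, len(astar_path) - 1)  # 3 3
```

On this easy start state both searches find the same 3-move solution. Only A* carries a guarantee that its solution is the shortest; on harder start states Greedy search may return a longer one.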
Problems with heuristic (informed) search
In spite of its optimality and completeness, A* still has problems:
For most problems, the number of nodes on the frontier of the search space
is still exponential in the length of the solution. That is, the search tree
can still grow to be as "bushy" as in Breadth-first Search.
So there can be problems with respect to the amount of memory needed to run A*.
General drawbacks of heuristic (informed) search include the following:
- need to keep everything on the frontier in memory
- need to be able to do cost comparisons quickly; if the heuristic
function is extremely complex, the search will not be fast.
- need to choose good heuristics; this is not easy for many problems.
When at a loss for a good heuristic function, consider a relaxed version
of the problem. An exact solution to a relaxed problem might be a good
heuristic for the real problem. For example, consider a sliding tile
puzzle. One relaxed version of the puzzle is one in which the tiles can
simply be picked up and put into place. Therefore, one possible heuristic
is to count the number of tiles out of position, since simply placing them
would solve the relaxed problem.
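The relaxed-problem idea for the sliding tile puzzle can be sketched as follows. The board layout, the sample state, and the function names are assumptions for illustration. Alongside the "pick up and place" relaxation, the sketch includes a second commonly used relaxation in which a tile may slide one square at a time even onto an occupied square; solving that relaxed puzzle exactly gives the Manhattan distance.

```python
COLS = 3
GOAL = (1, 2, 3, 4, 5, 0)  # assumed 2 x 3 goal layout; 0 marks the blank

def misplaced_tiles(state, goal=GOAL):
    """Relaxation 1: any tile can be picked up and put into place.

    Each misplaced tile then needs exactly one move.
    """
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan_total(state, goal=GOAL, cols=COLS):
    """Relaxation 2: tiles slide one square at a time, ignoring other tiles.

    Each tile then needs as many moves as its row distance plus
    its column distance from its goal square.
    """
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue  # the blank is not a tile
        r, c = divmod(idx, cols)
        goal_r, goal_c = divmod(goal.index(tile), cols)
        total += abs(r - goal_r) + abs(c - goal_c)
    return total

state = (4, 1, 2, 5, 3, 0)  # hypothetical scrambled board
print(misplaced_tiles(state), manhattan_total(state))  # 5 6
```

Both values are cheap to compute, which matters given the need for quick cost comparisons. The second relaxation is closer to the real puzzle, so its estimate is never smaller than the first.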