Informed Search

One advantage of Breadth-first and Depth-first Search is their generality. A disadvantage is that they don't use any problem-specific information to guide them.

We now move away from these uninformed search methods and begin to investigate informed (or heuristic) search techniques.

The general ideas behind informed search techniques are the following:

apply general search principles
but, if you have information available, allow it to guide you in selecting the next state to expand
expand the most promising states first

Algorithms for informed search are still general. The problem-specific information is encoded in an evaluation function.

Evaluation Function

Informed search techniques require an evaluation function. The evaluation function is applied to states (nodes) in the search tree.

The function returns a number describing the desirability of expanding a node.
In order for it to direct search toward the goal, the evaluation function incorporates an estimate of the path cost from the state to a goal.

Best-first search

In best-first search, the general idea is to always expand the most desirable node on the frontier (fringe). The frontier of the search space includes all nodes that are currently available for expansion.
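A frontier ordered by desirability is commonly kept in a priority queue. The following is a minimal sketch, assuming that a lower evaluation value means a more desirable node (natural when the value estimates cost). The class name Frontier and the sample values are made up for illustration.

```python
import heapq
from itertools import count

class Frontier:
    """Nodes awaiting expansion, ordered by evaluation value (lowest first)."""

    def __init__(self):
        self._heap = []
        self._tie = count()  # insertion counter breaks ties without comparing states

    def push(self, state, value):
        heapq.heappush(self._heap, (value, next(self._tie), state))

    def pop(self):
        """Remove and return the most desirable (lowest-valued) node."""
        value, _, state = heapq.heappop(self._heap)
        return state, value

    def __len__(self):
        return len(self._heap)

# Hypothetical evaluation values for three frontier nodes.
frontier = Frontier()
frontier.push("B", 7)
frontier.push("A", 3)
frontier.push("C", 5)
order = [frontier.pop()[0] for _ in range(len(frontier))]
print(order)  # nodes come out in order of increasing evaluation value
```

Each best-first variant differs only in the value passed to push.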

Types of best-first search include:

greedy search
A* (recall that Shakey used A* for path planning)

Greedy search

For a node n in the search tree,

Let h(n) = an estimate of the cost of the cheapest path from n to a goal state. This is the (heuristic) evaluation function. (Note that h(n) is 0 when n is a goal state.)

Greedy search always expands the frontier node with the smallest value of h, that is, the node that appears to be closest to a goal.

Greedy search can work quite well, but it does have problems:

There can be false starts.

That is, if h(n) greatly underestimates the distance from certain nodes to the goal, the algorithm can begin to follow paths that do not lead to the goal.

It doesn't look at the overall picture; that is, it doesn't take the distance already traveled into account!

Evaluating Greedy search on the four criteria:

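not complete: it can loop forever or follow an infinite path that never reaches a goal
not optimal
O(b^m) worst-case time and space complexity, where b = the branching factor and m = the maximum depth of the search space.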
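To see the lack of optimality concretely, here is a small sketch of greedy search on a made-up weighted graph. The graph, the h values, and the function name greedy_search are all assumptions chosen for illustration. The visited set is an addition that prevents loops on a finite graph; it is not part of the basic algorithm described above.

```python
import heapq

# Hypothetical weighted graph: node -> list of (neighbor, step cost).
GRAPH = {
    "S": [("A", 2), ("B", 3)],
    "A": [("G", 9)],
    "B": [("G", 3)],
    "G": [],
}
# Hypothetical heuristic estimates of the remaining cost; h is 0 at the goal.
H = {"S": 5, "A": 1, "B": 3, "G": 0}

def greedy_search(start, goal):
    """Always expand the frontier node with the smallest h, ignoring cost so far."""
    frontier = [(H[start], start, [start], 0)]  # (h, node, path, cost so far)
    visited = set()
    while frontier:
        _, node, path, cost = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        if node in visited:
            continue
        visited.add(node)
        for nxt, step in GRAPH[node]:
            if nxt not in visited:
                heapq.heappush(frontier, (H[nxt], nxt, path + [nxt], cost + step))
    return None

path, cost = greedy_search("S", "G")
print(path, cost)  # ['S', 'A', 'G'] 11
```

Greedy search is drawn to A because h(A) is small, and it returns a path of cost 11 even though the path through B costs only 6.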

A* search

A* search corrects for the problem of "greed". In evaluating a given node/state, it takes into account both the distance traveled so far and the estimate of distance to the goal.

For a node n in the search tree,

let f(n) = g(n) + h(n),

where g(n) = the cost of getting to n from the initial state.

and h(n) is as before.

We can think of f(n) as an estimate of the cost of the cheapest solution from the initial state to a goal through n.

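The wonderful thing about this search is that if h is admissible, then the search will be optimal and complete! That is, if a solution exists, the best one will be found. (Strictly, this holds for tree search with a finite branching factor and step costs bounded below by some positive constant; a graph search that never re-expands states needs the stronger condition that h is consistent.) (For the math majors among you, yes there is a proof of this!)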

Def. An admissible heuristic is one that never overestimates the cost to reach the goal.
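The following is a minimal sketch of A* on a small made-up graph, assuming the hypothetical GRAPH and H tables shown. The h values here never exceed the true remaining cost, so h is admissible. The function name a_star and the best_g bookkeeping are one possible way to write it, not a definitive implementation.

```python
import heapq

# Hypothetical weighted graph: node -> list of (neighbor, step cost).
GRAPH = {
    "S": [("A", 2), ("B", 3)],
    "A": [("G", 9)],
    "B": [("G", 3)],
    "G": [],
}
# Hypothetical admissible estimates: each h value is <= the true remaining cost.
H = {"S": 5, "A": 1, "B": 3, "G": 0}

def a_star(start, goal):
    """Expand the frontier node with the smallest f(n) = g(n) + h(n)."""
    frontier = [(H[start], 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}                         # cheapest known cost to each node
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if g > best_g[node]:
            continue  # stale entry: a cheaper route to this node was found later
        for nxt, step in GRAPH[node]:
            new_g = g + step
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + H[nxt], new_g, nxt, path + [nxt]))
    return None

path, cost = a_star("S", "G")
print(path, cost)  # ['S', 'B', 'G'] 6
```

Node A looks attractive at first (f = 3), but once the route through it reaches G with f = 11, the route through B (f = 6) is preferred and the cheaper solution is returned.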

Some examples of admissible heuristics:

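Straight-line distance in any sort of distance-related problem (no route can be shorter than the straight line).
Manhattan distance in grid-like problems where each move is a single horizontal or vertical step.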
Number of differences between current and goal states (in cases where multiple differences cannot be resolved in a single step).
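The three heuristics above can be sketched as short functions. This assumes states are (x, y) points for the first two and equal-length sequences for the third; the function names are chosen here for illustration.

```python
import math

def straight_line(a, b):
    """Euclidean distance between points a and b, each an (x, y) pair."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def manhattan(a, b):
    """Sum of horizontal and vertical offsets between grid cells a and b."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def differences(state, goal):
    """Number of positions at which state and goal disagree."""
    return sum(1 for s, g in zip(state, goal) if s != g)

print(straight_line((0, 0), (3, 4)))             # 5.0
print(manhattan((0, 0), (3, 4)))                 # 7
print(differences((1, 0, 1, 1), (1, 1, 1, 0)))   # 2
```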

An example of the two searches

Let’s compare the behavior of Greedy search and A* search on the 5-puzzle.

We’ll define g(n) and h(n) as follows:

Let g(n) be the number of steps taken from the initial state to n.

Let h(n) be the number of tiles out of place.
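A rough sketch of this comparison in code follows. It assumes the 5-puzzle is laid out as 2 rows by 3 columns with 0 marking the blank, and it assumes the goal arrangement and start state shown; none of these are given in the notes. A single function runs both searches, differing only in whether g(n) is added to h(n).

```python
import heapq
from itertools import count

ROWS, COLS = 2, 3                 # assumed layout for the 5-puzzle
GOAL = (1, 2, 3, 4, 5, 0)         # assumed goal arrangement; 0 marks the blank

def successors(state):
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    r, c = divmod(blank, COLS)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < ROWS and 0 <= nc < COLS:
            other = nr * COLS + nc
            nxt = list(state)
            nxt[blank], nxt[other] = nxt[other], nxt[blank]
            yield tuple(nxt)

def h(state):
    """Number of tiles out of place (the blank is not counted)."""
    return sum(1 for tile, want in zip(state, GOAL) if tile != 0 and tile != want)

def search(start, use_g):
    """Best-first search: use_g=False is greedy (h only), use_g=True is A* (g + h)."""
    tie = count()  # tie-breaker so the heap never compares states
    frontier = [(h(start), next(tie), 0, start, [start])]
    best_g = {start: 0}
    expanded = 0
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if state == GOAL:
            return path, expanded
        if g > best_g[state]:
            continue  # stale entry: a shorter route to this state is known
        expanded += 1
        for nxt in successors(state):
            new_g = g + 1  # every move costs one step
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                priority = (new_g if use_g else 0) + h(nxt)
                heapq.heappush(frontier, (priority, next(tie), new_g, nxt, path + [nxt]))
    return None

START = (4, 1, 2, 0, 5, 3)        # hypothetical start state, four moves from GOAL
greedy_path, greedy_expanded = search(START, use_g=False)
astar_path, astar_expanded = search(START, use_g=True)
print("greedy:", len(greedy_path) - 1, "moves,", greedy_expanded, "nodes expanded")
print("A*:    ", len(astar_path) - 1, "moves,", astar_expanded, "nodes expanded")
```

On an easy start state like this one the two searches may well agree. Only A* is guaranteed to return a shortest solution, so trying harder start states is the way to see them diverge.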

Problems with heuristic (informed) search

In spite of its optimality and completeness, A* still has problems:

For most problems, the number of nodes on the frontier of the search space is still exponential in the length of the solution. That is, the search tree can still grow to be as "bushy" as in Breadth-first Search. So there can be problems with respect to the amount of memory needed to run A* search.

General drawbacks of heuristic (informed) search include the following:

need to keep everything on the frontier in memory
need to be able to do cost comparisons quickly — if the heuristic function is extremely complex, the search will not be fast.
need to choose good heuristics — this is not easy for many problems!

When at a loss for a good heuristic function, consider a relaxed version of the problem. The cost of an exact solution to a relaxed problem might be a good heuristic for the real problem. Because any solution to the real problem also solves the relaxed problem, this cost never overestimates the real cost, so the resulting heuristic is admissible. For example, consider a sliding tile puzzle. One relaxed version of the puzzle is one in which the tiles can simply be picked up and put into place. Therefore, one possible heuristic is to count the number of tiles out of position, since placing each of them takes exactly one step in the relaxed problem.
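For a puzzle small enough to enumerate, the admissibility of a relaxed-problem heuristic can be checked directly. The sketch below assumes a 2-by-3 sliding puzzle with 0 as the blank and the goal arrangement shown (both chosen for illustration). It computes the true distance from every reachable state to the goal by breadth-first search and compares it with the tiles-out-of-position count.

```python
from collections import deque

ROWS, COLS = 2, 3                 # assumed puzzle layout
GOAL = (1, 2, 3, 4, 5, 0)         # assumed goal arrangement; 0 marks the blank

def successors(state):
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    r, c = divmod(blank, COLS)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < ROWS and 0 <= nc < COLS:
            other = nr * COLS + nc
            nxt = list(state)
            nxt[blank], nxt[other] = nxt[other], nxt[blank]
            yield tuple(nxt)

def misplaced(state):
    """Relaxed-problem cost: tiles out of position (the blank is not counted)."""
    return sum(1 for tile, want in zip(state, GOAL) if tile != 0 and tile != want)

def true_costs():
    """Breadth-first search outward from GOAL. Moves are reversible, so the
    depth at which a state is first reached is its true distance to GOAL."""
    dist = {GOAL: 0}
    queue = deque([GOAL])
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            if nxt not in dist:
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return dist

costs = true_costs()
admissible = all(misplaced(s) <= d for s, d in costs.items())
print(len(costs), "reachable states; heuristic admissible:", admissible)
```

The same check works for any candidate heuristic on this puzzle: replace misplaced with the candidate and see whether it ever exceeds the true cost.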