Search Techniques (COMP3411)

Search strategies are defined by which order the nodes are picked for expansion.

# Evaluation

They're evaluated in terms of:

• Completeness (if it exists, does it always find the solution)
• Time Complexity (Number of nodes generated/expanded)
• Space Complexity (Maximum number of nodes in memory at any one time)
• Optimality (Does it find the best (least-cost) solution).

## Time and Space Complexity

Time and Space Complexity are measured in terms of:

• b = Maximum Branching Factor (of the tree) - i.e. if each node can have 10 children, then each level is 100, 101, 102
• m = Maximum Depth of State Space (can be infinite, there may always be possible children states)
• d = Depth of the least-cost (best) solution

## Algorithm Comparison

We compare different search algorithms in two ways:

• Benchmarking: Run both algorithms (same language, same hardware, same test cases, etc) and compare the time differences
• Mathematical Analysis: Essentially complexity analysis.

# Uninformed Search Strategies

Uninformed search strategies are also known as "blind" because they only have the information given in the problem statement, meaning they can only differentiate goal vs non-goal, not degrees of closeness.

# Informed Search Strategies

Informed search strategies (also known as "heuristic") use information about the task to determine the value of a state. They're more efficient than uninformed strategies because they can work out a system of preferable nodes, rather than just sequential nodes.

We use a BestFirst() function (like the path length one in dijkstra) to figure out which order to visit nodes.

Informed Search Strategies include:

We also need to look at:

## Heuristic Functions

A heuristic function is a problem-specific function that determines an estimate of the cheapest path cost from a current state to the goal state.

### Straight Line Distance as Heuristic

If $h_{SLD}(n)$ = the straight-line distance between n and the goal location, and we assume that paths typically tend to approximate the direct connection between two states, then we need to know the physical co-ordinates that each state represents, to figure out the heuristic.

Hence it becomes:

(1)
\begin{align} h_{SLD}(n) = \sqrt{(n_x - g_x)^2 + (n_y - g_y)^2} \end{align}

Where n = current node and g= goal node.

Straight line distance is admissible because the straight line is the shortest distance between two points.

## Dominance

The most dominant heuristic is the admissible heuristic closest to the actual value it's estimating.

Hence if $h_1() >= h_2(n) \forall n$, and both are admissible, then $h_1$ is the dominant heuristic, and better for searching.ß

## Composite Heuristics

If $h_1, h_2, ... h_n$ are all admissible heuristics then the max of them will be the composite heuristic - it'll be admissible and dominant.

page revision: 16, last edited: 24 Mar 2012 04:42