COMP1927 - A*

When we talk about informed shortest path searches we normally mean A*. A* is a simple informed search that uses a heuristic to select which state to look at next, based on the probability of said state reaching the goal.

Dijkstra vs A*

Dijkstra's shortest path algorithm is essentially A* with a heuristic of 0. That is, it cares only about the distance covered thus far as to the shortest path (where distance is determined by edge weights).

Calculating Heuristics

Generally a heuristic (f*(x)) is an approximate (shown with an asterisk - *) of the sum of the path length so far (g(x)), and the estimated approximate distance to the goal (h*(x)).

(1)
\begin{equation} f^*(x) = g(x) + h^*(x) \end{equation}

Open and Closed Sets

One way of implementing A* (the * means approximate, as above) is to have two sets; an 'open' set of states that are being looked at, and a 'closed' state of sets that have already been dealt with.

Essentially if a state is in the open set it means we either haven't seen that configuration yet, or have found it with a shorter path. If it's in open it won't be in closed (so hence if a closed state is found with a better path, it is deleted from closed and the new one put into open)l.

The open set needs to be a priority queue (in order to support deletion of minimal elements) and the closed set can be anything that has quick retrieval times.

Algorithm

At it's core, A* is just an xfs, but with a priority queue rather than a stack or a queue.

Start with a pq comprising of the start state.

  • Whilst the pq isn't empty and we haven't found the goal
    • If the current state is the goal state, set found to true and exit.
    • Otherwise, examine the children states
      • If they're not in either set, add them to open.
      • If they're in the open set with a lower path, ignore them.
      • If they're in the open set with a higher path, change the pathlength in open to the new one.
      • If they're in closed with a lower path, ignore them.
      • If they're in closed with a higher path, swap the states (i.e. delete and add)
    • Add the current state to the closed set.
  • If we found the goal, print its path.

Bam done!

Code

// astar returns whether or not the goal was found
int astar (Data start, Data goal) {
 
   OpenSet open;
   ClosedSet closed;
 
   insert (open, start);
   int found = FALSE;
 
   int i;
 
   // While there are still states to look at
   // And we haven't found the goal
   while ((!isEmpty(open)) && (!found)) {
      Data current = delMin(open);
      if (same(current, goal)) {
         found = TRUE;
         insert (closed, goal);
      } else {
         // Look at each child and insert it into the correct set
         for (i = 0; i < NUM_CHILDREN; i++) {
            Data child = getChild(current, i);
            if (child != NULL) {
               if ((!contained(open, child)) && 
                   (!contained(closed, child))) {
                  insert(open, child);
               } else if ((contained(open)) && 
                            (isLower(open, child))) {
                  update(open, child);
               } else if ((contained(closed)) &&
                            (isLower(closed, child))) {
                  remove(closed, child);
                  insert(open, child);
               }
            }
         }
         // Insert the current Data into closed
         insert (closed, current);
      }
   }
 
   if (found) {
      printPath (closed, goal);
   }
 
   return found;
 
}

Admissibility

If a heuristic never over-estimates the path to the goal then we refer to it as admissible. Nymeyer says this makes A* optimal.

In order to create an admissible heuristic we calculate the exact cost of a relaxed version of the problem (i.e. a less complex version with less rules, e.g. using manhattan distance in sliding-puzzle).

If h is admissible then it guarantees that A* is complete; meaning that if a solution exists A* will find it.

Having a larger h which is still admissible is preferable, as it will be closer to reality (and more well-informed). This will allow fewer states to be expanded and create a lower EBF.

Effective Branching Factor

The Effective Branching Factor (EBF) of an expansion state is calculated as being the number of children any node of the expanded-states-tree has, assuming that with N nodes (num expanded states) and D levels (number of levels expanded down), the tree is uniform (i.e. every node has the same number of children).

(2)
\begin{align} N = \frac{e^{d+1} - 1}{e - 1} \end{align}

We can use the EBF to work out the quality of a heuristic. An ideal EBF to have is 1 (and hence we try to approach 1), because at that point there is a single path from start to goal and the heuristic is perfect.

Classification

Complete: Yes, unless there are infinitely many nodes with f <= cost of solution
Time Complexity: Exponential in [relative error in h× length of solution]
Space Complexity: O(numNodes) = keeps all nodes in memory.
Optimality: Yes, provided the heuristic is admissible (never overestimates the cost to the goal)

Iterative Deepening A*

Iterative Deepening Search in A* form is just an A* that is repeated with an increasing cut-off limit for how high f() (sum of path length + heuristic) can go.

It's a low-memory variant of A* (similarly to how dfs is low-memory compared to bfs).

IDA* is asymptotically as efficient as A* for domains where the number of states grows exponentially.