Tag Archives: neighbors

The Depth-First-Search Algorithm

I remember when I was younger I used to play the game of hide-and-seek a lot. This is a game where a group of people (at least two) separate into a group of hiders and a group of seekers. The most common version of this that I’ve seen is having one person as the seeker and everyone as hiders. Initially, the seeker(s) is given a number to count towards and close their eyes while counting. The hiders then search for places to hide from the seeker. Once the seeker is finished counting, their job is to find where everyone is hiding or admitting that they cannot find all the seekers. Any seekers not found are said to have won, and seekers that are found are said to have lost.

I played this game a number of times in my childhood, but I remember playing it with a friend named Dennis in particular. Dennis had a certain way he played as seeker. While many of us would simply go to places we deemed as “likely” hiding spots in a somewhat random order, Dennis would always begin by looking in one area of the room, making sure that he had searched through every area connected to that area before going to a new area. He continued this process until he either found everybody or concluded that he had searched every spot he could think of and gave up.

It wasn’t until years later that I was able to note the similarity between Dennis’s way of playing hide-and-seek and the Depth-First-Search algorithm. The Depth-First-Search Algorithm is a way of exploring all the nodes in a graph. Similar to hide-and-seek, one could choose to do this in a number of different ways. Depth-First-Search does this by beginning at some node, looking first at one of the neighbors of that node, then looking at one of the neighbors of this new node. If the new node does not have any new neighbors, then the algorithm goes to the previous node, looks at the next neighbor of this node and continues from there. Initially all nodes are “unmarked” and the algorithm proceeds by marking nodes as being in one of three states: visited nodes are marked as “visited”; nodes that we’ve marked to visit, but have not visited yet are marked “to-visit”; and unmarked nodes that have not been marked or visited are “unvisited”.

Consider a bedroom with the following possible hiding locations: (1) Under Bed, (2) Behind Cabinet, (3) In Closet, (4) Under Clothes, (5) Behind Curtains, (6) Behind Bookshelf, and (7) Under Desk. We can visualize how the bedroom is arranged as a graph and then use a Breadth First Search algorithm to show how Brent would search the room. Consider the following bedroom arrangement, where we have replaced the names of each item by the number corresponding to that item. Node (0) corresponds to the door, which is where Dennis stands and counts while others hide.

Bedroom Items as a Graph

Now consider how a Breadth First Search would be run on this graph.

Bedroom Items as a Graph Colored by DFS

The colors correspond to the order in which nodes are visited in Depth-First-Search.

The way we read this is that initially Dennis would start at node 0, which is colored in Blue.
While Dennis is at node 0, she notices that nodes 1, 5, and 6 (under bed, behind curtains, and behind bookshelf) are the nearby and have not been checked yet so she places them on the “to visit” list.
Next, Dennis will begin to visit each node on the “to visit” list, and when a node is visited, she labels it as visited. At each location, she also takes note of the other locations she can reach from this location. Below is the order of nodes Dennis visits and how he discovers new locations to visit.

Order Visited Node Queue Adding Distance From Node 0
1 0 6,5,1 0
2 6 5,1 7,3,2 1
3 7 3,2,5,1 2
4 3 2,5,1 2
5 2 5,1 4 2
6 4 5,1 3
7 5 1 1
8 1 1

Here is a link to my Examples page that implements the Depth-First-Search Algorithm on Arbitrary Graphs.

The Breadth-First-Search Algorithm

I remember when I was younger I used to play the game of hide-and-seek a lot. This is a game where a group of people (at least two) separate into a group of hiders and a group of seekers. The most common version of this that I’ve seen is having one person as the seeker and everyone as hiders. Initially, the seeker(s) is given a number to count towards and close their eyes while counting. The hiders then search for places to hide from the seeker. Once the seeker is finished counting, their job is to find where everyone is hiding or admitting that they cannot find all the seekers. Any seekers not found are said to have won, and seekers that are found are said to have lost.

I played this game a number of times in my childhood, but I remember playing it with a friend named Brenda in particular. Brenda had a certain way she played as seeker. While many of us would simply go to places we deemed as “likely” hiding spots in a somewhat random order, Brenda would always take a survey of the room, and no matter where she began searching, she would always make note of the locations close to her starting point and make sure she was able to give them all a look before she looked at locations that were close to the points she deemed close to the starting point. She continued this process until she either found everybody or concluded that she had searched every spot she could think of and gave up.

It wasn’t until years later that I was able to note the similarity between Brenda’s way of playing hide-and-seek and the Breadth-First-Search algorithm. The Breadth-First-Search algorithm is a way of exploring all the nodes in a graph. Similarly to hide-and-seek, one could choose to do this in a number of different ways. Breadth-First-Search does this by beginning at some node, looking first at each of the neighbors of the starting node, then looking at each of the neighbors of the neighbors of the starting node, continuing this process until there are no remaining nodes to visit. Initially all nodes are “unmarked” and the algorithm proceeds by marking nodes as being in one of three stages: visited nodes are marked as “visited”; nodes that we’ve marked to visit, but have not visited yet are marked “to-visit”; and unmakred nodes that have not been marked are “unvisited”.

Consider a bedroom with the following possible hiding locations: (1) Under Bed, (2) Behind Cabinet, (3) In Closet, (4) Under Clothes, (5) Behind Curtains, (6) Behind Bookshelf, and (7) Under Desk. We can visualize how the bedroom is arranged as a graph and then use a Breadth First Search algorithm to show how Brenda would search the room. Consider the following bedroom arrangement, where we have replaced the names of each item by the number corresponding to that item. Node (0) corresponds to the door, which is where Brenda stands and counts while others hide.

Bedroom Items as a Graph

Now consider how a Breadth First Search would be run on this graph.

Bedroom Items as a Graph

The colors correspond to the order in which nodes are visited in Breadth-First-Search.

The way we read this is that initially Brenda would start at node 0, which is colored in Blue.
While Brenda is at node 0, she notices that nodes 1, 5, and 6 (under bed, behind curtains, and behind bookshelf) are the nearby and have not been checked yet so she places them on the “to visit” list.
Next, Brenda will begin to visit each node on the “to visit” list, and when a node is visited, she labels it as visited. At each location, she also takes note of the other locations she can reach from this location. Below is the order of nodes Brenda visits and how she discovers new locations to visit.

Order Visited Node Queue Adding Distance From Node 0
1 0 1, 5, 6 0
2 1 5, 6 2, 4 1
3 5 6, 2, 4 1
4 6 2, 4 3, 7 1
5 2 4, 3, 7 2
6 4 3, 7 2
7 3 7 2
8 7 2

Here is a link to my Examples page that implements the Breadth-First-Search Algorithm on Arbitrary Graphs.

The A* Algorithm

As a child I remember traveling on road trips, sitting in the back of a car trying to do my best to keep myself busy for what would occupy the next six to ten hours of my life. One of the things I grew to like were the simple maze books that were sold on the magazine racks at some of the gas stations where we’d stop for food. There are two basic strategies I employed for solving these mazes: For simpler mazes, I could generally just take a look at the overall maze structure, decide upon a path through the maze, then write a path without any mistakes. For more complex mazes though, I would generally begin a route that looks the most promising. If that route reaches a point where I can see that it will be impossible to finish, then I’d go back to where the decision was made, exclude the route I had just tried, and select the “most promising” remaining route. I’d continue this process until I had completed the maze, or until it became simple enough for me to solve the maze using only my memory.

The A* Algorithm works in a similar manner to the second approach I just described. We begin at a starting point, and consider where to move next from that starting point. The set of possible options for this next move is determined by the neighbor function for a given cell. For each neighbor the algorithm estimates the length of the route through that cell by calculating the sum of two values, f(n) = g(n) + h(n), where

g(n) is the (known) cost to travel from the starting point to the cell n.
h(n) is the (approximate) cost to travel from the cell n to the final cell.

The sum g(n) + h(n) allows us to approximate the total cost of a route through the cell n.

The A* algorithm begins at the starting position of the maze. There are two sets we will be considering throughout the process of determining the optimal route, called the closedSet and the openSet. The elements of closedSet are the nodes whose total distance from the starting position has been calculated, whereas the elements of openSet represents nodes whose total distance is still under consideration. There is also a map called prev which is used to reconstruct the path. Below is how the algorithm operates:

A* Algorithm Pseudocode
closedSet is the empty set
openSet = {start}
prev is the empty set

While there are still elements in openSet,
     Find the element c* in openSet with the lowest value f(c).
     If c* is the target position
          Reconstruct the path.
     Else If c* is not the target position,
          Remove c* from openSet
          Add c* to closedSet
          For each neighbor n of c* that is not in closedSet,
               Calculate a temporary g value, temp_g(n) = g(c*) + dist(c*, n).
               If n is not in openSet, or if n is in openSet and temp_g(n) < g(n),                     Set prev(n) = c*                     Set g(n) = temp_g(n)                     Set f(n) = g(n) + h(n).                     if n is not in openSet                          add n to openSet.      End If End While End Algorithm An important question becomes what makes a good heuristic function to approximate the distance to the goal. This can lead to an in depth discussion based on the word "good", but the necessary condition for any heuristic is that it NEVER over-estimates the cost of the path from the cell to the goal. Some examples of possible heuristics for mazes are the Euclidean Distance (the square root of the sum of the squares of the horizontal and vertical differences in distances, i.e. dE(x, y) = (i(xi – yi)2) and TaxiCab Distance (the sum of the differences in the horizontal and vertical dimensions, i.e. dT(x, y) = i|xi – yi|). Both of these are feasible metrics for heuristics on a maze. Other herusitics, like h(n) = 0 for all n are possible, but doing this would make the algorithm treat all cells equally and ignore the heuristic part of the A* algorithm, turning it into Dijkstra’s algorithm.

I’ve written a script that generates random mazes and uses the A* algorithm to find the optimal path through this maze. For this script, I used the taxicab distance heuristic.

Check it out and let me know what you think.