Depth-first search is a surprisingly versatile linear-time procedure that reveals a wealth of
information about a graph. The most basic question it addresses is,
*What parts of the graph are reachable from a given vertex?*

To understand this task, try putting yourself in the position of a computer that has just been given a new graph, say in the form of an adjacency list. This representation offers just one basic operation: finding the neighbors of a vertex. With only this primitive, the reachability problem is rather like exploring a labyrinth.

You start walking from a fixed place and whenever you arrive at any junction (vertex) there are a variety of passages (edges) you can follow. A careless choice of passages might lead you around in circles or might cause you to overlook some accessible part of the maze. Clearly, you need to record some intermediate information during exploration.

This classic challenge has amused people for centuries. Everybody knows that all you need to explore a labyrinth is a ball of string and a piece of chalk. The chalk prevents looping, by marking the junctions you have already visited. The string always takes you back to the starting place, enabling you to return to passages that you previously saw but did not yet investigate.

How can we simulate these two primitives, chalk and string, on a computer? The chalk
marks are easy: for each vertex, maintain a Boolean variable indicating whether it has been
visited already. As for the ball of string, the correct cyberanalog is a stack. After all, the exact
role of the string is to offer two primitive operations—*unwind* to get to a new junction (the
stack equivalent is to *push* the new vertex) and *rewind* to return to the previous junction (*pop*
the stack).

Instead of explicitly maintaining a stack, we will do so implicitly via recursion (which is implemented using a stack of activation records). The resulting algorithm is shown in the following picture.

The

More immediately, we need to confirm that

So

Incidentally, this pattern of reasoning arises often in the study of graphs and is in essence
a streamlined induction. A more formal inductive proof would start by framing a hypothesis,
such as “for any

The following figure shows the result of running *tree edges*. The dotted edges were ignored because they led back to familiar
terrain, to vertices previously visited. They are called *back edges*.

The *depth-first search* (DFS),
does this repeatedly until the entire graph has been traversed.

The first step in analyzing the running time of DFS is to observe that each vertex is

The following shows the outcome of depth-first search on a *forest*.

Depth-first search can be easily used to check whether an undirected graph is connected and, more generally, to split a graph into connected components.

We have seen how depth-first search—a few unassuming lines of code—is able to uncover the
connectivity structure of an undirected graph in just linear time. But it is far more versatile
than this. In order to stretch it further, we will collect a little more information during the
exploration process: for each node, we will note down the times of two important events, the
moment of first discovery (corresponding to

These timings will soon take on larger significance. Meanwhile, you might have noticed from the graph above that:

**Property** For any nodes

Why? Because

Our depth-first search algorithm can be run verbatim on directed graphs, taking care to traverse edges only in their prescribed directions. The following figure shows an example and the search tree that results when vertices are considered in lexicographic order.

In further analyzing the directed case, it helps to have terminology for important relationships
between nodes of a tree. *root* of the search tree; everything else is its *descendant*.
Similarly, *descendants* *ancestor* of these three nodes.
The family analogy is carried further: *parent* of *child*.

For undirected graphs we distinguished between tree edges and nontree edges. In the directed case, there is a slightly more elaborate taxonomy:

*Tree edges*are actually part of the DFS forest.*Forward edges*lead from a node to a nonchild descendant in the DFS tree.*Back edges*lead to an ancestor in the DFS tree.*Cross edges*lead to neither descendant nor ancestor; they therefore lead to a node that has already been completely explored (that is, already postvisited).

The graph above has two forward edges, two back edges, and two cross edges. Can you spot them?

Ancestor and descendant relationships, as well as edge types, can be read off directly
from

The case of descendants is symmetric, since

You can confirm each of these characterizations by consulting the diagram of edge types. Do you see why no other orderings are possible?

Using the just introduced edge types one can check whether a given graph is acyclic or not in linear time. Depth-first search can also be used to find a linearization of a given dag as well as splitting any directed graph into strongly connected components.

Source: Algorithms by Dasgupta, Papadimitriou, Vazirani. McGraw-Hill. 2006.

DFS visualisation by David Galles: http://www.cs.usfca.edu/~galles/visualization/DFS.html.