DFS — Depth-First Search | DSA Guide

01

Section One · Foundation

What is DFS?

Depth-First Search picks one neighbor and follows it as far as possible before backing up and trying the next — it uses recursion (or an explicit stack) to remember where to resume after backtracking. While BFS explores in concentric rings, DFS plunges down a single path until it hits a dead end, then reverses course. This depth-first behavior gives DFS its three killer applications: cycle detection (a back edge to an ancestor means a cycle), topological sorting (reverse post-order gives a valid ordering of dependencies), and connected component discovery (each DFS call from an unvisited node is a new component). DFS runs in O(V + E) time, visiting every vertex and examining every edge exactly once. The recursive version mirrors the call stack naturally — each recursive call represents "enter this node," and each return represents "backtrack from this node." For graphs with thousands of nodes, an iterative version with an explicit Deque<Integer> avoids StackOverflowError. Unlike BFS, DFS does NOT guarantee shortest path — it finds a path, not necessarily the shortest one.

Analogy: Exploring a maze by always picking the leftmost corridor. You walk straight ahead, always turning left at junctions, until you hit a dead end. Then you backtrack to the last junction and try the next corridor you haven't explored yet. You never skip ahead to a junction two corridors back — you always exhaust the current path before retreating. This guarantees you eventually visit every reachable room, but the first path you find to the exit is not necessarily the shortest.

02

Section Two · Mental Model

How It Thinks

Recursion (or stack) follows LIFO — last neighbor pushed is explored first

▶

DFS goes deep before wide — it fully explores one branch before touching siblings, making it natural for exhaustive search and backtracking

Each recursive call "enters" a node; each return "backtracks" from it

▶

Two timestamps emerge — discovery time and finish time — enabling edge classification (tree, back, forward, cross edges) and topological sort

A back edge (edge to an ancestor still on the stack) proves a cycle

▶

DFS detects cycles in O(V+E) — use three colors (white/gray/black) to distinguish unvisited, in-progress, and fully-processed nodes

Nodes are finished in reverse dependency order

▶

Reverse post-order of DFS gives a valid topological sort on a DAG — dependencies always appear before dependents

Each DFS call from an unvisited node explores exactly one connected component

▶

Counting the number of DFS launches from the outer loop = counting connected components — the pattern behind "Number of Islands"

DFS space is O(depth) for the call stack — proportional to the longest path, not the widest level

▶

Efficient on deep, narrow graphs where BFS would hold an entire wide level in memory — but risks StackOverflow on very deep graphs without iterative conversion

03

Section Three · The Algorithm

DFS — Step by Step

Recursive DFS

Traverse all reachable vertices from a source by recursively exploring each unvisited neighbor — the call stack implicitly manages backtracking.

Pseudocode

 function dfs(graph, node, visited):
    visited.add(node)
process(node) // pre-order work for each neighbor in graph[node]:
if neighbor not in visited:
dfs(graph, neighbor, visited)
// post-order work goes here              // O(V + E) 

DFS execution — recursive call stack and visit order from node 0

Pre-order vs post-order work:

Code placed BEFORE the recursive calls runs when you first discover a node (pre-order).
Code placed AFTER runs when you've fully processed all descendants (post-order). Pre-order gives discovery order.
Post-order is the foundation of topological sort — a node finishes only after all its dependencies have finished.

Iterative DFS

Replace the call stack with an explicit stack to avoid StackOverflowError on deep graphs — essential for graphs with 10,000+ nodes.

Pseudocode

 function dfsIterative(graph, start):
    visited = {}
    stack = [start]
while stack is not empty:
        node = stack.pop()
if node in visited: continue // may be pushed twice
visited.add(node)
process(node)
for each neighbor in graph[node]:
if neighbor not in visited:
                stack.push(neighbor) // O(V + E) 

Iterative DFS checks visited AFTER popping, not before pushing:

Unlike BFS where you mark before enqueueing, iterative DFS may push the same node multiple times (from different neighbors).
The if (visited) continue after pop handles this.
Alternatively, mark before pushing — but then the visit order differs slightly from recursive DFS.

The Three-Color Scheme

Track each node's state as WHITE (unvisited), GRAY (discovered, in progress), or BLACK (finished) — the key to cycle detection in directed graphs.

Pseudocode

 // WHITE = 0 (unvisited), GRAY = 1 (in progress), BLACK = 2 (done) function dfsColored(graph, node, color):
    color[node] = GRAY // mark in-progress for each neighbor in graph[node]:
if color[neighbor] == WHITE:
dfsColored(graph, neighbor, color)
else if color[neighbor] == GRAY:
// BACK EDGE → cycle detected! return CYCLE_FOUND
    color[node] = BLACK // mark finished 

Three-color DFS — GRAY neighbor = cycle (back edge detected)

Common Mistake: Using a plain boolean visited set for cycle detection in directed graphs. A boolean set only tells you "have I seen this node" — but in directed graphs, a visited node might be fully processed (BLACK), not part of the current path. Only GRAY nodes (currently on the recursion stack) indicate a cycle. For undirected graphs, a simpler "visited + parent" check suffices.

04

Section Four · Cycle Detection

Cycle Detection — Directed vs Undirected

Cycle detection in undirected graphs and directed graphs require different techniques. In an undirected graph, a cycle exists if DFS encounters a visited neighbor that is NOT the parent it came from — since every edge is bidirectional, the node you arrived from always appears as "visited," so you must exclude it. In a directed graph, a cycle exists only when DFS encounters a GRAY node (still on the current DFS path) — a BLACK node was processed on a different path and does not form a cycle with the current path. The Course Schedule problem (LC 207) is the canonical directed cycle detection problem.

Undirected Graph

visited + parent check

 function hasCycle(node, parent, visited):
    visited.add(node)
for each nei in graph[node]:
if nei not in visited:
if hasCycle(nei, node, visited):
return true else if nei != parent:
return true // cycle! return false 

Directed Graph

three-color (WHITE/GRAY/BLACK)

 function hasCycle(node, color):
    color[node] = GRAY
for each nei in graph[node]:
if color[nei] == GRAY:
return true // cycle! if color[nei] == WHITE:
if hasCycle(nei, color):
return true
color[node] = BLACK
return false 

Undirected: "visited neighbor ≠ parent.":

Directed: "visited neighbor is GRAY." These are the two cycle detection formulas.
In undirected, every edge is bidirectional, so the parent always appears visited — exclude it.
In directed, an edge to a BLACK node is not a cycle — it's a cross/forward edge to a different, already-completed DFS subtree.

05

Section Five · Topological Sort

Topological Sort — Ordering Dependencies

A topological sort of a Directed Acyclic Graph (DAG) is a linear ordering of vertices such that for every directed edge u→v, u appears before v. It answers: "in what order should I process tasks so that all dependencies are satisfied before any task that depends on them?" Build systems (Maven, Gradle), course prerequisite planning, and package installation order all reduce to topological sort. DFS gives topological sort for free: run DFS, add each node to a list when it finishes (post-order), then reverse the list.

DFS-based Topological Sort

Run DFS on a DAG and push each node onto a result stack when it finishes — the stack top-to-bottom gives a valid topological order.

Pseudocode

 function topologicalSort(graph):
    visited = {}
    stack = []
for each node in graph:
if node not in visited:
dfs(graph, node, visited, stack)
return stack.reverse() // O(V + E) function dfs(graph, node, visited, stack):
    visited.add(node)
for each neighbor in graph[node]:
if neighbor not in visited:
dfs(graph, neighbor, visited, stack)
    stack.push(node) // post-order: add AFTER children 

Topological Sort — course prerequisites example

Kahn's algorithm (BFS-based) is the alternative:

Instead of DFS + reverse post-order, repeatedly remove nodes with in-degree 0 and add them to the result.
Both give valid topological orders in O(V+E).
Kahn's naturally detects cycles (if the result has fewer than V nodes, a cycle exists).
Use Kahn's when you need to detect cycles simultaneously or when the problem gives in-degree information directly.

06

Section Six · Connected Components

Connected Components

A connected component is a maximal set of vertices where every vertex is reachable from every other vertex. Finding connected components is a direct application of the outer DFS loop: iterate over all vertices, and for each unvisited vertex, launch a DFS that marks all reachable vertices — each launch is one component. This pattern powers "Number of Islands" (LC 200), "Number of Provinces" (LC 547), and "Accounts Merge" (LC 721). For dynamic connectivity (where edges are added/removed over time), Union-Find is more efficient, but for static graphs, DFS is the standard approach.

Pseudocode

 function countComponents(graph, n):
    visited = {}
    count = 0 for node = 0 to n - 1:
if node not in visited:
dfs(graph, node, visited) // explore entire component
count = count + 1 // one new component return count // O(V + E) 

Connected Components — three DFS launches = three components

"Number of Islands" is connected components on a grid:

Each '1' cell is a vertex, edges connect adjacent '1' cells. Each DFS from an unvisited '1' marks one island.
Count launches = count islands. The grid is the graph — no adjacency list needed.

07

Section Seven · DFS on Grids

DFS on 2D Grids — Flood Fill Pattern

Grid DFS is the flood fill algorithm: start at a cell, recursively visit all 4-directional neighbors that match some criterion, mark as visited. This handles "Number of Islands" (mark connected '1' cells), "Surrounded Regions" (mark 'O' cells reachable from the border), and "Pacific Atlantic Water Flow" (DFS from both oceans inward). The recursive version is cleaner for grids because the 4-direction expansion maps directly to 4 recursive calls.

Java — Grid DFS Template (flood fill)

 int[][] dirs = {{0,1}, {0,-1}, {1,0}, {-1,0}};
void dfsGrid(char[][] grid, int r, int c) {
int rows = grid.length, cols = grid[0].length;
if (r < 0 || r >= rows || c < 0 || c >= cols) return; // bounds if (grid[r][c] != '1') return; // not valid
grid[r][c] = '0'; // mark visited (in-place) for (int[] d : dirs)
dfsGrid(grid, r + d[0], c + d[1]); // recurse 4 directions
}
// Number of Islands — count DFS launches int numIslands(char[][] grid) {
int count = 0;
for (int r = 0; r < grid.length; r++)
for (int c = 0; c < grid[0].length; c++)
if (grid[r][c] == '1') {
dfsGrid(grid, r, c);
                count++; // one island found
}
return count; // O(rows × cols)
}

In-place marking vs visited array:

For grid DFS, modifying the grid directly (grid[r][c] = '0') is the common pattern — it avoids allocating a separate visited array.
If the grid must not be modified, use boolean[][] visited or restore the original value after DFS (backtracking).
In interviews, clarify whether mutation is allowed.

08

Section Eight · Implementation

Build It Once

Build this from scratch once — it makes the mechanics concrete. DFS is boilerplate: visited set + recursive neighbor loop. Memorise the template and the three-color variant.

Java — DFS core mechanics

 import java.util.*;
class DFS {
// Recursive DFS — O(V+E) static void dfs(Map<Integer, List<Integer>> graph,
int node, Set<Integer> visited, List<Integer> order) {
        visited.add(node);
        order.add(node); // pre-order for (int nei : graph.getOrDefault(node, List.of()))
if (!visited.contains(nei))
dfs(graph, nei, visited, order);
// post-order work here
}
// Cycle detection — directed (three-color) — O(V+E) static boolean hasCycleDirected(Map<Integer, List<Integer>> graph,
int node, int[] color) {
        color[node] = 1; // GRAY for (int nei : graph.getOrDefault(node, List.of())) {
if (color[nei] == 1) return true; // back edge! if (color[nei] == 0 && hasCycleDirected(graph, nei, color))
return true;
        }
        color[node] = 2; // BLACK return false;
    }
// Topological sort (DFS-based) — O(V+E) static List<Integer> topoSort(Map<Integer, List<Integer>> graph, int n) {
Set<Integer> visited = new HashSet<>();
Deque<Integer> stack = new ArrayDeque<>();
for (int i = 0; i < n; i++)
if (!visited.contains(i))
topoDFS(graph, i, visited, stack);
return new ArrayList<>(stack);
    }
private static void topoDFS(Map<Integer, List<Integer>> graph,
int node, Set<Integer> visited, Deque<Integer> stack) {
        visited.add(node);
for (int nei : graph.getOrDefault(node, List.of()))
if (!visited.contains(nei))
topoDFS(graph, nei, visited, stack);
        stack.push(node); // post-order → stack
}
}

09

Section Nine · Java Reference

Use It In Java

IN JAVA — DFS Building Blocks

Recursion Direct method calls — Java's default stack size is ~512KB (~5000–10000 frames)

Stack Deque<Integer> stack = new ArrayDeque<>() — use push() / pop()

Import import java.util.*;

Pattern	Complexity	Note
`visited.add(node)`	O(1)	HashSet — returns false if already present
`stack.push(node) / stack.pop()`	O(1)	ArrayDeque — iterative DFS stack
`int[] color = new int[n]`	O(1) access	Three-color scheme: 0=W, 1=G, 2=B
`boolean[] visited`	O(1) access	Faster than HashSet for 0-indexed graphs
`map.getOrDefault(k, List.of())`	O(1)	Safe neighbor lookup — avoids null
`grid[r][c] = '0'`	O(1)	In-place visited marking for grid DFS
`-Xss4m` JVM flag	—	Increase stack size for deep recursive DFS

⚠ Gotcha: Java's default thread stack size is ~512KB, allowing roughly 5000–10000 recursive calls before StackOverflowError. LeetCode's judge has a larger stack, but local testing will crash on deep graphs. For graphs with V > 10,000, convert recursive DFS to iterative with an explicit ArrayDeque stack — same O(V+E) time, O(V) heap space instead of stack space.

Choose Use recursive DFS for clarity in interviews (fewer lines, easier to debug). Use iterative DFS when the graph may be deep (>5000 nodes in a single path). Use three-color DFS for directed cycle detection. Use boolean[] over HashSet when vertices are 0-indexed integers.

10

Section Ten · Practice

Problems To Solve

DFS interview problems cluster into four patterns: connected components counting (launch DFS from each unvisited node), cycle detection (undirected = parent check, directed = three-color), topological ordering (reverse post-order on DAGs), and backtracking (enumerate all valid configurations using DFS + undo). Recognise the DFS signal: "detect cycle," "connected components," "dependency order," "all paths," or "flood fill" — these are almost always DFS, never BFS.

Difficulty	Pattern	Problem	Key Insight
Medium	dfs	Number of Islands — LC 200	Grid DFS flood fill: each launch from an unvisited '1' marks one island. Count launches = count islands. Mark visited by setting cell to '0'.
Medium	dfs	Course Schedule — LC 207	Model courses as nodes, prerequisites as directed edges. Run three-color DFS: if any neighbor is GRAY during DFS, a cycle exists and not all courses can be finished.
Medium	dfs	Course Schedule II — LC 210	Same as LC 207, but instead of detecting a cycle, collect the topological order. DFS with reverse post-order. If a cycle exists, return empty.
Medium	dfs	Clone Graph — LC 133	DFS with a HashMap mapping original→clone. When visiting a neighbor, check the map: if cloned, reuse; if not, create and recurse. The map prevents infinite loops on cycles.
Medium	dfs	Pacific Atlantic Water Flow — LC 417	Two grid DFS passes: one from Pacific border cells, one from Atlantic border cells. Cells reachable from BOTH passes are the answer. Reverse the flow direction — DFS "uphill" from oceans.
Medium	backtrack	All Paths From Source to Target — LC 797	DFS with path tracking: add current node to path, recurse into neighbors, remove node after returning (backtrack). No visited set needed — the graph is a DAG, so no cycles.

Interview Pattern:

DFS is the right choice when the problem requires exploring all possibilities, detecting structural properties (cycles, components), or ordering dependencies.
If you see "cycle," think three-color DFS.
If you see "topological order" or "prerequisites," think DFS reverse post-order (or Kahn's BFS).
If you see "count regions" or "flood fill," think DFS + component counting.
DFS is recursive by nature — start with the recursive version, convert to iterative only if depth is a concern.

→ See all DFS problems with full solutions

LearningTree

DFS — Depth-First Search | DSA Guide

DFS Depth-First Search

What is DFS?

How It Thinks

DFS — Step by Step

Cycle Detection — Directed vs Undirected

visited + parent check

three-color (WHITE/GRAY/BLACK)

Topological Sort — Ordering Dependencies

Connected Components

DFS on 2D Grids — Flood Fill Pattern

Build It Once

Use It In Java

Problems To Solve