Graph Fundamentals
Vertices connected by edges โ the most general data structure for modeling relationships. Graphs power social networks, GPS navigation, dependency resolution, and every "shortest path" or "connected components" interview problem.
What is a Graph?
A graph is a collection of vertices (nodes) connected by edges โ unlike trees, there's no root, no parent-child hierarchy, and edges can form cycles. A graph can be directed (edges have a direction, like Twitter follows) or undirected (edges are bidirectional, like Facebook friendships). It can be weighted (each edge carries a cost, like road distances) or unweighted. The two dominant storage formats are an adjacency list (each vertex stores a list of neighbors โ space-efficient for sparse graphs) and an adjacency matrix (a 2D boolean/weight array โ fast edge lookup but O(Vยฒ) space). Graphs are everywhere: social networks, web page links, dependency trees in build systems, flight routes, and the Internet itself. In Java, there's no built-in Graph class โ you build one from Map<Integer, List<Integer>> or a 2D array.
How It Thinks
Core Operations
- For undirected graphs, every
addEdge(u, v)adds v to u's list AND u to v's list. - Forgetting one direction is the #1 graph construction bug โ BFS/DFS will miss paths because the edge only works one way.
- Add a node to
visitedwhen you enqueue it, not when you dequeue it. - If you wait until dequeue, multiple neighbors can enqueue the same node simultaneously, causing duplicate processing and incorrect shortest-path distances.
- For deep graphs, recursion can cause
StackOverflowError. - Replace the call stack with an explicit
Deque<Integer>โ push neighbors, pop next node. - The iterative version handles graphs with thousands of nodes that recursive DFS cannot.
visited set in graphs with cycles. In a tree, DFS/BFS naturally terminates because there are no cycles. In a graph, revisiting a node means infinite recursion (DFS) or infinite loop (BFS). Always initialize a visited set before traversal โ this is the single most common graph bug.
Build It Once
Build this from scratch once โ it makes the mechanics concrete. In any real project or interview, reach for the built-in (Section 05).
Use It In Java
Map<Integer, List<Integer>> (adjacency list) or int[][] (adjacency matrix) import java.util.*; | Pattern | Complexity | Note |
|---|---|---|
map.computeIfAbsent(u, k -> new ArrayList<>()).add(v) | O(1) | Add edge โ creates list if vertex is new |
map.getOrDefault(u, List.of()) | O(1) | Get neighbors โ avoids null check |
new LinkedList<>() + poll() | O(1) | Queue for BFS โ use ArrayDeque for better perf |
new HashSet<>() + add() | O(1) | Visited set โ add() returns false if duplicate |
boolean[][] matrix | O(1) lookup | Adjacency matrix โ when V โค ~1000 |
int[][] edges | O(E) | Edge list input โ convert to adj list first |
int[] parent (Union-Find) | O(ฮฑ(n)) | Connected components without BFS/DFS |
int[][] edges (edge list), not an adjacency list. Always convert to Map<Integer, List<Integer>> first โ iterating an edge list per query is O(E) per call, while an adjacency list gives O(degree) per call. The conversion loop is 3 lines; skipping it makes BFS/DFS O(VยทE) instead of O(V+E).
Map<Integer, List<Integer>> for sparse graphs (most interview problems). Use boolean[][] adjacency matrix only when V is small (< 1000) and you need O(1) edge existence checks โ like Floyd-Warshall or dense graph problems. Pick The Right Structure
| Representation | Edge Lookup | Add Edge | Space | Best For |
|---|---|---|---|---|
| Adjacency List โ this | O(degree) | O(1) | O(V + E) | Sparse graphs, most interview problems |
| Adjacency Matrix | O(1) | O(1) | O(Vยฒ) | Dense graphs, Floyd-Warshall |
| Edge List | O(E) | O(1) | O(E) | Kruskal's MST, input format only |
| Union-Find | O(ฮฑ(n)) | O(ฮฑ(n)) | O(V) | Dynamic connectivity, MST |
Problems To Solve
Graph interview problems cluster around three patterns: BFS for shortest-path and level-order problems, DFS for connectivity and cycle detection, and topological sort for dependency ordering. Most medium-difficulty graph problems reduce to "build an adjacency list, then BFS or DFS" โ the hard part is recognizing that the problem is a graph problem and correctly modeling the vertices and edges.
| Difficulty | Pattern | Problem | Key Insight |
|---|---|---|---|
| Medium | bfs | Number of Islands โ LC 200 | Treat each '1' cell as a vertex with edges to adjacent '1' cells. BFS/DFS from each unvisited '1' marks an entire island โ count how many times you start a new traversal. |
| Medium | bfs | Rotting Oranges โ LC 994 | Multi-source BFS: enqueue all rotten oranges at once (level 0), then BFS spreads rot level by level. The answer is the number of BFS levels โ each level is one minute. |
| Medium | dfs | Clone Graph โ LC 133 | DFS with a HashMap mapping old node โ new node. When you visit a neighbor, check the map: if it exists, reuse it; if not, create it and recurse. The map prevents infinite loops on cycles. |
| Medium | dfs | Course Schedule โ LC 207 | Model courses as vertices, prerequisites as directed edges. Cycle in a directed graph = impossible to finish all courses. Use DFS with three states (unvisited, in-progress, done) to detect back edges. |
| Medium | bfs | 01 Matrix โ LC 542 | Multi-source BFS from all '0' cells simultaneously. Each level increments distance by 1. Initialize the queue with all zeros, mark all ones as unvisited โ BFS fills in the correct distances layer by layer. |
| Hard | bfs | Word Ladder โ LC 127 | Each word is a vertex; edges connect words differing by one character. BFS from beginWord finds shortest transformation sequence. The trick: generate all possible one-character mutations instead of comparing every pair of words. |
- BFS and DFS each solve 40% of graph problems.
- If the problem asks for shortest path, minimum steps, or nearest distance โ use BFS.
- If it asks for "can you reach", cycle detection, connected components, or topological order โ use DFS.
- The remaining 20% require Dijkstra (weighted shortest path) or Union-Find (dynamic connectivity).
- Build the adjacency list first; the traversal is boilerplate.