Graphs

Graph Fundamentals

A graph is a collection of vertices (nodes) connected by edges, used to model pairwise relationships between objects. Graphs power social networks, GPS navigation, dependency resolution, network routing, and a huge share of interview problems. BFS and DFS are the two fundamental traversal algorithms; from them flow cycle detection, shortest paths, topological sort, and connected-component analysis.

01
Section One · Foundation

What is a Graph?

A B C D E V=5, E=7 undirected

A graph G = (V, E) consists of a set of vertices V and a set of edges E that connect pairs of vertices. Unlike trees, graphs can have cycles, disconnected components, and edges with weights or directions. This flexibility makes graphs the go-to structure whenever you need to model relationships — friendships, road maps, task dependencies, web links, circuit connections, or state machines.

Graphs are stored in two main representations:

Adjacency List

Map<V, List<V>>

Each vertex stores a list of its neighbours. Space-efficient for sparse graphs (E ≪ V²) and supports fast iteration over neighbours.

  • Space: O(V + E)
  • Add edge: O(1)
  • Check edge: O(degree)
  • Best for most interview problems
Adjacency Matrix

boolean[V][V]

A V×V matrix where matrix[u][v] = true means an edge from u to v. O(1) edge lookup but wastes space for sparse graphs.

  • Space: O(V²)
  • Add edge: O(1)
  • Check edge: O(1)
  • Best for dense graphs or Floyd-Warshall
Analogy: Think of a city road map. Intersections are vertices, roads are edges. Some roads are one-way (directed), some have tolls (weighted). Finding the fastest route from home to work is a shortest-path problem. Detecting if you can loop back to your starting point is cycle detection. A graph captures all these relationships naturally.
Key Characteristics
Directed vs Undirected
Directed edges have a source → destination; undirected edges go both ways
Weighted vs Unweighted
Weighted edges carry a cost/distance; unweighted edges are all equal
Cyclic vs Acyclic
Cyclic graphs contain loops; DAGs (Directed Acyclic Graphs) enable topological sort
Connected vs Disconnected
In a connected graph every vertex is reachable from every other; disconnected has multiple components
Sparse vs Dense
Sparse: E ≈ V — use adjacency list. Dense: E ≈ V² — adjacency matrix may be better
Graph Types at a Glance
TypeEdgesCyclesExample
UndirectedBidirectionalPossibleSocial network friendships
Directed (Digraph)One-wayPossibleWeb page links, Twitter follows
DAGDirectedNoneBuild systems, task scheduling
WeightedHave costsPossibleRoad networks, airline routes
BipartiteTwo groupsEven cycles onlyJob matching, student-course
02
Section Two · Operations

Core Operations

BFS — Breadth-First Search
Explores neighbours level by level using a queue. Finds the shortest path in unweighted graphs and detects connected components.
Pseudocode
function bfs(graph, start): visited = Set() queue = [start] visited.add(start) while queue is not empty: node = queue.poll() // dequeue front process(node) for neighbor in graph[node]: if neighbor not in visited: visited.add(neighbor) queue.add(neighbor) // enqueue // Time: O(V + E) Space: O(V)
BFS guarantees shortest path in unweighted graphs. Because it processes all nodes at distance d before any node at distance d+1, the first time you reach a node is via the shortest path. This is why BFS is the go-to algorithm for “minimum steps” problems like word ladder or grid shortest path.
DFS — Depth-First Search
Explores as deep as possible before backtracking, using recursion or an explicit stack. Powers cycle detection, topological sort, and path finding.
Pseudocode
function dfs(graph, node, visited): visited.add(node) process(node) for neighbor in graph[node]: if neighbor not in visited: dfs(graph, neighbor, visited) // Time: O(V + E) Space: O(V) recursion stack
DFS is the backbone of cycle detection and topological sort. By tracking three states (unvisited, in-progress, done), you can detect back edges (cycles) during DFS. Appending nodes to a result list in post-order gives you reverse topological order — the foundation for Course Schedule problems.
Topological Sort (Kahn’s Algorithm)
Produces a linear ordering of vertices in a DAG such that for every directed edge u → v, u comes before v. Uses BFS with in-degree tracking.
Pseudocode
function topologicalSort(graph): inDegree = computeInDegrees(graph) queue = [v for v in graph if inDegree[v] == 0] order = [] while queue is not empty: node = queue.poll() order.add(node) for neighbor in graph[node]: inDegree[neighbor] -= 1 if inDegree[neighbor] == 0: queue.add(neighbor) if order.size != graph.size: throw "Cycle detected!" // not a DAG return order // Time: O(V + E) Space: O(V)
If the result has fewer nodes than the graph, there’s a cycle. Kahn’s algorithm doubles as a cycle detector for directed graphs. This is the cleaner approach for Course Schedule problems compared to DFS-based topological sort.
Dijkstra’s Shortest Path
Finds the shortest path from a source to all other vertices in a weighted graph with non-negative edge weights, using a priority queue (min-heap).
Pseudocode
function dijkstra(graph, source): dist = {v: ∞ for v in graph} dist[source] = 0 pq = MinHeap() pq.add((0, source)) while pq is not empty: (d, u) = pq.poll() if d > dist[u]: continue // stale entry for (v, weight) in graph[u]: if dist[u] + weight < dist[v]: dist[v] = dist[u] + weight pq.add((dist[v], v)) return dist // Time: O((V + E) log V) Space: O(V)
Dijkstra fails with negative weights — use Bellman-Ford instead. Dijkstra greedily locks in the shortest distance. A negative edge could make a “locked” distance wrong. For graphs with negative weights (but no negative cycles), Bellman-Ford runs in O(V · E).
Union-Find (Disjoint Set)
Tracks connected components efficiently. Supports near-O(1) union and find operations with path compression and union by rank. Essential for Kruskal’s MST and dynamic connectivity.
Pseudocode
class UnionFind: parent[i] = i // each node is its own root rank[i] = 0 function find(x): if parent[x] != x: parent[x] = find(parent[x]) // path compression return parent[x] function union(x, y): px, py = find(x), find(y) if px == py: return false // already connected if rank[px] < rank[py]: swap(px, py) parent[py] = px if rank[px] == rank[py]: rank[px]++ return true // Near O(1) amortised per operation (inverse Ackermann)
Union-Find is the fastest way to check connectivity dynamically. If a problem asks “are these two nodes connected?” or “how many connected components?” with edges added over time, Union-Find beats BFS/DFS because each operation is nearly O(1) amortised.
Common Mistake: Forgetting to mark nodes as visited during BFS/DFS. Without a visited set, you’ll revisit the same nodes endlessly in cyclic graphs, causing infinite loops or TLE. Always add a node to the visited set when you enqueue/push it, not when you dequeue/pop — this prevents duplicate entries in the queue.
03
Section Three · Visuals

Diagrams & Structure

Adjacency List vs Adjacency Matrix
Graph: A—B, A—C, B—C, B—D (undirected)
GRAPH A B C D ADJACENCY LIST A → [B, C] B → [A, C, D] C → [A, B] D → [B] ✓ Space: O(V+E) • Preferred for sparse graphs ADJACENCY MATRIX A B C D A [0 1 1 0] B [1 0 1 1] C [1 1 0 0] D [0 1 0 0]
BFS Traversal Walkthrough
BFS from A on graph: A—B, A—C, B—D, C—D, D—E
GRAPH A B C D E BFS LEVEL-ORDER TRAVERSAL Lv 0 A Queue: [A] → visit A → enqueue B, C Lv 1 B C Queue: [B,C] → visit B,C → enqueue D Lv 2 D Queue: [D] → visit D → enqueue E Lv 3 E Queue: [E] → visit E → DONE ✓
DFS Traversal Walkthrough
DFS from A (recursive) on same graph
DFS PATH (bold = traversal edges) A visit 1 B visit 2 D visit 3 C visit 4 E visit 5 CALL STACK TRACE dfs(A) → dfs(B) → dfs(D) → dfs(C) ◀ backtrack(D already visited by C) → dfs(E) → ◀ backtrack all Order: A → B → D → C → E Stack pops: E, C, D, B, A ✓
Topological Sort — Course Prerequisites
Courses: 0→1, 0→2, 1→3, 2→3 (must take prereqs first)
DIRECTED ACYCLIC GRAPH (DAG) 0 in: 0 1 in: 1 2 in: 1 3 in: 2 KAHN'S ALGORITHM (BFS-based topological sort) Step 1: Enqueue node 0 (in-degree=0) → output: [0] Step 2: Decrement neighbours → node 1 (in:0), node 2 (in:0) → output: [0,1,2] Step 3: Decrement neighbours → node 3 (in:0) → output: [0,1,2,3] ✓ Topological order: [0, 1, 2, 3] (also valid: [0, 2, 1, 3])
04
Section Four · Code

Java Quick Reference

Adjacency-list graph with BFS, DFS, and topological sort — the three operations that cover 90% of interview graph problems.

Java — Graph (adjacency list)
class Graph { private Map<Integer, List<Integer>> adj = new HashMap<>(); void addEdge(int u, int v) { // undirected adj.computeIfAbsent(u, k -> new ArrayList<>()).add(v); adj.computeIfAbsent(v, k -> new ArrayList<>()).add(u); } List<Integer> bfs(int start) { // O(V+E) List<Integer> order = new ArrayList<>(); Set<Integer> visited = new HashSet<>(); Queue<Integer> q = new LinkedList<>(); q.add(start); visited.add(start); while (!q.isEmpty()) { int node = q.poll(); order.add(node); for (int nb : adj.getOrDefault(node, List.of())) { if (visited.add(nb)) q.add(nb); } } return order; } void dfs(int node, Set<Integer> visited) { visited.add(node); for (int nb : adj.getOrDefault(node, List.of())) { if (!visited.contains(nb)) dfs(nb, visited); } } }
05
Section Five · Complexity

Time & Space Complexity

OperationTimeSpace
Add Edge (adj list)O(1)O(1)
Check Edge (adj list)O(degree)O(1)
Check Edge (adj matrix)O(1)O(1)
BFS / DFSO(V + E)O(V)
Topological SortO(V + E)O(V)
Dijkstra (min-heap)O((V + E) log V)O(V)
Bellman-FordO(V · E)O(V)
Floyd-WarshallO(V³)O(V²)
Union-Find (amortised)O(α(N)) ≈ O(1)O(V)
Why these complexities? BFS and DFS visit every vertex once and every edge once, giving O(V + E). Dijkstra adds a log V factor because each vertex extraction from the priority queue costs O(log V). Floyd-Warshall’s three nested loops over V give O(V³). Union-Find’s inverse Ackermann function α(N) grows so slowly it’s effectively constant for any practical input.
Space Complexity Note

Adjacency list: O(V + E) — each vertex has a list entry, each edge appears once (directed) or twice (undirected). Adjacency matrix: O(V²) regardless of edges. For sparse graphs (E ≪ V²), adjacency lists can be orders of magnitude more space-efficient. BFS/DFS additionally use O(V) for the visited set and queue/stack.

06
Section Six · Decision Guide

When to Use Graph

Use This When

  • Shortest path / minimum steps — BFS for unweighted, Dijkstra for weighted, Bellman-Ford for negative weights.
  • Dependency ordering — topological sort for course schedules, build systems, task pipelines.
  • Connected components — BFS/DFS or Union-Find to count islands, friend circles, or network clusters.
  • Cycle detection — DFS with state tracking for directed graphs; Union-Find for undirected.
  • Grid problems — treat each cell as a vertex with 4 edges — Number of Islands, Shortest Path in Grid, etc.

Avoid When

  • Hierarchical data only — if there are no cycles and one root, a tree is simpler and more efficient.
  • Sequential access — if you just need ordered traversal, arrays or linked lists are better.
  • Key-value lookup — HashMap gives O(1) access without graph overhead.
Algorithm Decision Guide
Problem TypeAlgorithmTimeKey Constraint
Shortest path (unweighted)BFSO(V + E)All edges weight 1
Shortest path (weighted, no neg)DijkstraO((V+E) log V)Non-negative weights
Shortest path (negative weights)Bellman-FordO(V · E)No negative cycles
All-pairs shortest pathFloyd-WarshallO(V³)Small V (≤ 400)
Dependency orderingTopological SortO(V + E)Must be a DAG
Minimum spanning treeKruskal / PrimO(E log E)Connected, undirected
Connected componentsUnion-Find / BFSO(V + E)Dynamic: prefer Union-Find
07
Section Seven · Interview Practice

LeetCode Problems

Graph problems in interviews test your ability to model a situation as vertices and edges, then pick the right traversal. The three core patterns are: BFS for shortest distance, DFS for exhaustive exploration, and topological sort for dependency ordering. If the problem mentions “connected”, “reachable”, “shortest”, “schedule”, or “prerequisites”, think graph.

  • MediumNumber of Islands — BFS/DFS flood-fill on a 2D grid — the classic “count connected components” problem.
  • MediumClone Graph — BFS/DFS with a HashMap to map original nodes to clones — tests deep copy of graph structure.
  • MediumCourse Schedule — Detect cycles in a directed graph using topological sort or DFS with 3-color marking.
  • MediumCourse Schedule II — Return a valid topological ordering; Kahn’s algorithm is the cleanest approach.
  • MediumPacific Atlantic Water Flow — Multi-source BFS/DFS from ocean borders; find cells reachable from both oceans.
  • MediumNetwork Delay Time — Dijkstra’s algorithm on a weighted directed graph — find the time for all nodes to receive a signal.
  • MediumRedundant Connection — Union-Find to find the edge that creates a cycle in an undirected graph.
  • HardWord Ladder — BFS on an implicit graph where words differing by one letter are connected — shortest transformation sequence.
  • HardAlien Dictionary — Build a directed graph from sorted alien words, then topological sort to find character order.
  • HardCheapest Flights Within K Stops — Modified Dijkstra / BFS with level constraint — tests graph + DP thinking.
Interview Pattern: When you see a problem about relationships or connections between entities, model it as a graph. For “minimum steps” use BFS. For “can I reach X?” or “explore all paths” use DFS. For “ordering with dependencies” use topological sort. For “shortest weighted path” use Dijkstra. The hardest part is recognising the graph — once you see it, the algorithm choice is usually straightforward.