Graph | DSA Guide | LearningTree

01

Section One · Foundation

What is a Graph?

A graph G = (V, E) consists of a set of vertices V and a set of edges E that connect pairs of vertices. Unlike trees, graphs can have cycles, disconnected components, and edges with weights or directions. This flexibility makes graphs the go-to structure whenever you need to model relationships — friendships, road maps, task dependencies, web links, circuit connections, or state machines.

Graphs are stored in two main representations:

Adjacency List

Map<V, List<V>>

Each vertex stores a list of its neighbours. Space-efficient for sparse graphs (E ≪ V²) and supports fast iteration over neighbours.

Space: O(V + E)
Add edge: O(1)
Check edge: O(degree)
Best for most interview problems

Adjacency Matrix

boolean[V][V]

A V×V matrix where matrix[u][v] = true means an edge from u to v. O(1) edge lookup but wastes space for sparse graphs.

Space: O(V²)
Add edge: O(1)
Check edge: O(1)
Best for dense graphs or Floyd-Warshall

Analogy: Think of a city road map. Intersections are vertices, roads are edges. Some roads are one-way (directed), some have tolls (weighted). Finding the fastest route from home to work is a shortest-path problem. Detecting if you can loop back to your starting point is cycle detection. A graph captures all these relationships naturally.

Key Characteristics

Directed vs Undirected

▶

Directed edges have a source → destination; undirected edges go both ways

Weighted vs Unweighted

▶

Weighted edges carry a cost/distance; unweighted edges are all equal

Cyclic vs Acyclic

▶

Cyclic graphs contain loops; DAGs (Directed Acyclic Graphs) enable topological sort

Connected vs Disconnected

▶

In a connected graph every vertex is reachable from every other; disconnected has multiple components

Sparse vs Dense

▶

Sparse: E ≈ V — use adjacency list. Dense: E ≈ V² — adjacency matrix may be better

Graph Types at a Glance

Type	Edges	Cycles	Example
Undirected	Bidirectional	Possible	Social network friendships
Directed (Digraph)	One-way	Possible	Web page links, Twitter follows
DAG	Directed	None	Build systems, task scheduling
Weighted	Have costs	Possible	Road networks, airline routes
Bipartite	Two groups	Even cycles only	Job matching, student-course

02

Section Two · Operations

Core Operations

BFS — Breadth-First Search

Explores neighbours level by level using a queue. Finds the shortest path in unweighted graphs and detects connected components.

Pseudocode

function bfs(graph, start):
    visited = Set()
    queue = [start]
    visited.add(start)
while queue is not empty:
        node = queue.poll() // dequeue front process(node)
for neighbor in graph[node]:
if neighbor not in visited:
                visited.add(neighbor)
                queue.add(neighbor) // enqueue // Time: O(V + E)   Space: O(V)

BFS guarantees shortest path in unweighted graphs. Because it processes all nodes at distance d before any node at distance d+1, the first time you reach a node is via the shortest path. This is why BFS is the go-to algorithm for “minimum steps” problems like word ladder or grid shortest path.

DFS — Depth-First Search

Explores as deep as possible before backtracking, using recursion or an explicit stack. Powers cycle detection, topological sort, and path finding.

Pseudocode

function dfs(graph, node, visited):
    visited.add(node)
process(node)
for neighbor in graph[node]:
if neighbor not in visited:
dfs(graph, neighbor, visited)
// Time: O(V + E)   Space: O(V) recursion stack

DFS is the backbone of cycle detection and topological sort. By tracking three states (unvisited, in-progress, done), you can detect back edges (cycles) during DFS. Appending nodes to a result list in post-order gives you reverse topological order — the foundation for Course Schedule problems.

Topological Sort (Kahn’s Algorithm)

Produces a linear ordering of vertices in a DAG such that for every directed edge u → v, u comes before v. Uses BFS with in-degree tracking.

Pseudocode

function topologicalSort(graph):
    inDegree = computeInDegrees(graph)
    queue = [v for v in graph if inDegree[v] == 0]
    order = []
while queue is not empty:
        node = queue.poll()
        order.add(node)
for neighbor in graph[node]:
            inDegree[neighbor] -= 1 if inDegree[neighbor] == 0:
                queue.add(neighbor)
if order.size != graph.size:
throw "Cycle detected!" // not a DAG return order
// Time: O(V + E)   Space: O(V)

If the result has fewer nodes than the graph, there’s a cycle. Kahn’s algorithm doubles as a cycle detector for directed graphs. This is the cleaner approach for Course Schedule problems compared to DFS-based topological sort.

Dijkstra’s Shortest Path

Finds the shortest path from a source to all other vertices in a weighted graph with non-negative edge weights, using a priority queue (min-heap).

Pseudocode

function dijkstra(graph, source):
    dist = {v: ∞ for v in graph}
    dist[source] = 0
pq = MinHeap()
    pq.add((0, source))
while pq is not empty:
        (d, u) = pq.poll()
if d > dist[u]: continue // stale entry for (v, weight) in graph[u]:
if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
                pq.add((dist[v], v))
return dist
// Time: O((V + E) log V)   Space: O(V)

Dijkstra fails with negative weights — use Bellman-Ford instead. Dijkstra greedily locks in the shortest distance. A negative edge could make a “locked” distance wrong. For graphs with negative weights (but no negative cycles), Bellman-Ford runs in O(V · E).

Union-Find (Disjoint Set)

Tracks connected components efficiently. Supports near-O(1) union and find operations with path compression and union by rank. Essential for Kruskal’s MST and dynamic connectivity.

Pseudocode

class UnionFind:
    parent[i] = i // each node is its own root
rank[i] = 0
function find(x):
if parent[x] != x:
            parent[x] = find(parent[x]) // path compression return parent[x]
function union(x, y):
        px, py = find(x), find(y)
if px == py: return false // already connected if rank[px] < rank[py]: swap(px, py)
        parent[py] = px
if rank[px] == rank[py]: rank[px]++
return true // Near O(1) amortised per operation (inverse Ackermann)

Union-Find is the fastest way to check connectivity dynamically. If a problem asks “are these two nodes connected?” or “how many connected components?” with edges added over time, Union-Find beats BFS/DFS because each operation is nearly O(1) amortised.

Common Mistake: Forgetting to mark nodes as visited during BFS/DFS. Without a visited set, you’ll revisit the same nodes endlessly in cyclic graphs, causing infinite loops or TLE. Always add a node to the visited set when you enqueue/push it, not when you dequeue/pop — this prevents duplicate entries in the queue.

03

Section Three · Visuals

Diagrams & Structure

Adjacency List vs Adjacency Matrix

Graph: A—B, A—C, B—C, B—D (undirected)

BFS Traversal Walkthrough

BFS from A on graph: A—B, A—C, B—D, C—D, D—E

DFS Traversal Walkthrough

DFS from A (recursive) on same graph

Topological Sort — Course Prerequisites

Courses: 0→1, 0→2, 1→3, 2→3 (must take prereqs first)

04

Section Four · Code

Java Quick Reference

Adjacency-list graph with BFS, DFS, and topological sort — the three operations that cover 90% of interview graph problems.

Java — Graph (adjacency list)

class Graph {
private Map<Integer, List<Integer>> adj = new HashMap<>();
void addEdge(int u, int v) { // undirected
adj.computeIfAbsent(u, k -> new ArrayList<>()).add(v);
        adj.computeIfAbsent(v, k -> new ArrayList<>()).add(u);
    }
List<Integer> bfs(int start) { // O(V+E) List<Integer> order = new ArrayList<>();
Set<Integer> visited = new HashSet<>();
Queue<Integer> q = new LinkedList<>();
        q.add(start); visited.add(start);
while (!q.isEmpty()) {
int node = q.poll();
            order.add(node);
for (int nb : adj.getOrDefault(node, List.of())) {
if (visited.add(nb)) q.add(nb);
            }
        }
return order;
    }
void dfs(int node, Set<Integer> visited) {
        visited.add(node);
for (int nb : adj.getOrDefault(node, List.of())) {
if (!visited.contains(nb)) dfs(nb, visited);
        }
    }
}

05

Section Five · Complexity

Time & Space Complexity

Operation	Time	Space
Add Edge (adj list)	O(1)	O(1)
Check Edge (adj list)	O(degree)	O(1)
Check Edge (adj matrix)	O(1)	O(1)
BFS / DFS	O(V + E)	O(V)
Topological Sort	O(V + E)	O(V)
Dijkstra (min-heap)	O((V + E) log V)	O(V)
Bellman-Ford	O(V · E)	O(V)
Floyd-Warshall	O(V³)	O(V²)
Union-Find (amortised)	O(α(N)) ≈ O(1)	O(V)

Why these complexities? BFS and DFS visit every vertex once and every edge once, giving O(V + E). Dijkstra adds a log V factor because each vertex extraction from the priority queue costs O(log V). Floyd-Warshall’s three nested loops over V give O(V³). Union-Find’s inverse Ackermann function α(N) grows so slowly it’s effectively constant for any practical input.

Space Complexity Note

Adjacency list: O(V + E) — each vertex has a list entry, each edge appears once (directed) or twice (undirected). Adjacency matrix: O(V²) regardless of edges. For sparse graphs (E ≪ V²), adjacency lists can be orders of magnitude more space-efficient. BFS/DFS additionally use O(V) for the visited set and queue/stack.

06

Section Six · Decision Guide

When to Use Graph

✓

Use This When

Shortest path / minimum steps — BFS for unweighted, Dijkstra for weighted, Bellman-Ford for negative weights.
Dependency ordering — topological sort for course schedules, build systems, task pipelines.
Connected components — BFS/DFS or Union-Find to count islands, friend circles, or network clusters.
Cycle detection — DFS with state tracking for directed graphs; Union-Find for undirected.
Grid problems — treat each cell as a vertex with 4 edges — Number of Islands, Shortest Path in Grid, etc.

✗

Avoid When

Hierarchical data only — if there are no cycles and one root, a tree is simpler and more efficient.
Sequential access — if you just need ordered traversal, arrays or linked lists are better.
Key-value lookup — HashMap gives O(1) access without graph overhead.

Algorithm Decision Guide

Problem Type	Algorithm	Time	Key Constraint
Shortest path (unweighted)	BFS	O(V + E)	All edges weight 1
Shortest path (weighted, no neg)	Dijkstra	O((V+E) log V)	Non-negative weights
Shortest path (negative weights)	Bellman-Ford	O(V · E)	No negative cycles
All-pairs shortest path	Floyd-Warshall	O(V³)	Small V (≤ 400)
Dependency ordering	Topological Sort	O(V + E)	Must be a DAG
Minimum spanning tree	Kruskal / Prim	O(E log E)	Connected, undirected
Connected components	Union-Find / BFS	O(V + E)	Dynamic: prefer Union-Find

07

Section Seven · Interview Practice

LeetCode Problems

Graph problems in interviews test your ability to model a situation as vertices and edges, then pick the right traversal. The three core patterns are: BFS for shortest distance, DFS for exhaustive exploration, and topological sort for dependency ordering. If the problem mentions “connected”, “reachable”, “shortest”, “schedule”, or “prerequisites”, think graph.

MediumNumber of Islands — BFS/DFS flood-fill on a 2D grid — the classic “count connected components” problem.
MediumClone Graph — BFS/DFS with a HashMap to map original nodes to clones — tests deep copy of graph structure.
MediumCourse Schedule — Detect cycles in a directed graph using topological sort or DFS with 3-color marking.
MediumCourse Schedule II — Return a valid topological ordering; Kahn’s algorithm is the cleanest approach.
MediumPacific Atlantic Water Flow — Multi-source BFS/DFS from ocean borders; find cells reachable from both oceans.
MediumNetwork Delay Time — Dijkstra’s algorithm on a weighted directed graph — find the time for all nodes to receive a signal.
MediumRedundant Connection — Union-Find to find the edge that creates a cycle in an undirected graph.
HardWord Ladder — BFS on an implicit graph where words differing by one letter are connected — shortest transformation sequence.
HardAlien Dictionary — Build a directed graph from sorted alien words, then topological sort to find character order.
HardCheapest Flights Within K Stops — Modified Dijkstra / BFS with level constraint — tests graph + DP thinking.

Interview Pattern: When you see a problem about relationships or connections between entities, model it as a graph. For “minimum steps” use BFS. For “can I reach X?” or “explore all paths” use DFS. For “ordering with dependencies” use topological sort. For “shortest weighted path” use Dijkstra. The hardest part is recognising the graph — once you see it, the algorithm choice is usually straightforward.