Trie | DSA Guide | LearningTree

01

Section One · Foundation

What is a Trie?

A trie (pronounced "try") is a tree-like data structure where each node represents a single character and each path from the root to a terminal node spells a complete word; the name comes from "retrieval." The defining insight is that words sharing a common prefix share the same path in the tree: inserting "cat", "car", and "cup" creates only one 'c' node because all three words start with 'c', and only one 'a' node for "cat" and "car" because both share the prefix "ca." This shared-prefix property means a trie holding 100,000 English words needs far fewer nodes than the total number of characters in those words, because most words share common beginnings (each node still carries its own child pointers, so fewer nodes does not automatically mean less memory). Every operation (insert, search, startsWith) runs in O(L) time where L is the word length, independent of the number of words stored, because you simply walk one node per character. Unlike a hash map, which gives O(L) exact lookup but requires scanning every key for prefix queries, a trie navigates directly to the prefix node, and the subtree below it contains all matching words. Java has no built-in Trie class, so in interviews and practice you typically implement it from scratch using a TrieNode with a children[26] array and a boolean isEnd flag.
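To make the shared-prefix claim concrete, here is a minimal sketch that counts how many nodes a few inserts actually allocate. It assumes lowercase a-z input only; the names `CountingNode`, `CountingTrie`, and `nodeCount` are illustrative and do not come from the guide.

```java
// Illustrative sketch: names are assumptions; input must be lowercase a-z.
class CountingNode {
    CountingNode[] children = new CountingNode[26];
    boolean isEnd;
}

class CountingTrie {
    private final CountingNode root = new CountingNode();
    private int created = 0; // nodes allocated so far, not counting the root

    void insert(String word) {
        CountingNode node = root;
        for (char c : word.toCharArray()) {
            int idx = c - 'a';
            if (node.children[idx] == null) {
                node.children[idx] = new CountingNode();
                created++; // only characters that are new on this path allocate a node
            }
            node = node.children[idx];
        }
        node.isEnd = true;
    }

    int nodeCount() {
        return created;
    }
}
```

Inserting "cat", "car", and "cup" into this sketch allocates 6 nodes (c, a, t, r, u, p) for 9 characters of input, because the 'c' and 'a' nodes are reused.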

Analogy: Think of the autocomplete dropdown on your phone's keyboard. As you type each letter, the system instantly narrows down suggestions — typing 's' shows all words starting with 's', typing 'se' narrows to 'se*' words, typing 'sea' narrows further to 'sea', 'seal', 'search', 'seat'. The phone isn't scanning its entire dictionary each time; it's walking down a trie one node per keystroke, and at every node the subtree below already contains exactly the matching words — grouped by prefix, ready to display. The trie is the data structure that makes "show me everything starting with X" a single downward walk instead of a full dictionary scan.

02

Section Two · Mental Model

How It Thinks

Each character occupies one level of the trie


Word length L determines depth, not vocabulary size — a trie holding 1 million words is still only 20 levels deep if the longest word has 20 characters

Nodes are shared across words with a common prefix


Inserting "apple" after "app" costs only 2 new nodes ('l' and 'e'), not 5 — the first three nodes already exist from "app"

A terminal flag (isEnd) marks word boundaries, not the absence of children


"app" and "apple" can coexist — "app"'s final node has isEnd=true AND has child 'l', so it is simultaneously a complete word and a prefix of another word

Search follows the path character by character


If any character's child pointer is null, the word is absent — no backtracking, no hash collisions, no comparisons beyond a single array index lookup per character

Prefix search stops at the prefix's last node


All words in the subtree rooted there share that prefix — a single DFS/BFS from that node collects every autocomplete result without touching the rest of the trie

Deletion must check whether a node is shared (has other children or marks another word's end)


Blindly removing nodes breaks other words — deletion requires careful bottom-up cleanup: only remove a node if it has no children and is not a terminal for another word

03

Section Three · Trie Structure

Trie Structure — The isEnd Flag

Every trie node holds two things: a children array (or map) that points to the next characters, and a boolean isEnd flag that marks whether a valid word ends at this node. For lowercase English letters, children is a fixed array of 26 slots — children[0] for 'a', children[1] for 'b', through children[25] for 'z' — where a null slot means that character doesn't follow the current prefix in any stored word. The isEnd flag is essential because a node's mere existence doesn't mean a word ends there — the node for 'p' in "app" exists because "apple" passes through it, but only when isEnd=true at that 'p' node does "app" count as a valid word. The classic mistake is confusing "node exists" with "word exists": without checking isEnd, searching "app" in a trie containing only "apple" would incorrectly return true.

TrieNode structure — children array + isEnd flag

Inserting "app" then "apple" then "apply" — isEnd enables coexistence

TrieNode Structure

Java

 class TrieNode {
     TrieNode[] children = new TrieNode[26]; // a=0, b=1, ... z=25
     boolean isEnd = false;                  // true if a valid word ends here
 }

 // Index mapping: char c → children[c - 'a']
 // 'a' - 'a' = 0  →  children[0]
 // 'c' - 'a' = 2  →  children[2]
 // 'z' - 'a' = 25 →  children[25]

 // For non-lowercase-only use cases:
 // HashMap<Character, TrieNode> children = new HashMap<>();
 // Trades O(1) array access for memory efficiency on sparse alphabets

Array vs HashMap for children:

The children[26] array gives guaranteed O(1) lookup with zero overhead — just index by c - 'a'.
The downside is 26 pointers per node even if only 2 are used.
A HashMap<Character, TrieNode> uses only as much memory as the actual children, but with HashMap overhead per entry.
For interviews, almost always use children[26] for lowercase English — it's simpler, faster, and interviewers expect it.
Switch to HashMap only when the alphabet is large (Unicode) or mixed (letters + digits + symbols).
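For comparison, a map-backed node might look like the sketch below. It is only an illustration under stated assumptions: `MapTrieNode` and `MapTrie` are hypothetical names, and the code iterates Java `char` values, so a character outside the Basic Multilingual Plane occupies two levels (one per surrogate).

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: class names are assumptions, not part of the guide.
class MapTrieNode {
    Map<Character, MapTrieNode> children = new HashMap<>(); // holds only the children that exist
    boolean isEnd;
}

class MapTrie {
    private final MapTrieNode root = new MapTrieNode();

    void insert(String word) {
        MapTrieNode node = root;
        for (char c : word.toCharArray()) {
            // Create the child on first use, then step into it.
            node = node.children.computeIfAbsent(c, k -> new MapTrieNode());
        }
        node.isEnd = true;
    }

    boolean search(String word) {
        MapTrieNode node = root;
        for (char c : word.toCharArray()) {
            node = node.children.get(c);
            if (node == null) return false; // character not present on this path
        }
        return node.isEnd;
    }
}
```

Because keys are arbitrary `char` values, this version accepts mixed input such as "Route-66" without any index arithmetic, at the cost of boxing and per-entry map overhead.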

04

Section Four · Operations

Core Operations

Insert

Walk character by character from the root, creating new nodes wherever the child pointer is null, then mark the final node as a word boundary by setting isEnd = true. Always O(L) where L is the word length.

Pseudocode

 function insert(word):
     node = root
     for each char c in word:
         idx = c - 'a'
         if node.children[idx] == null:
             node.children[idx] = new TrieNode() // create missing node
         node = node.children[idx] // advance
     node.isEnd = true // mark word boundary
 // O(L): one node per character

Insert — adding "search" to a trie containing "see"

Insert is always O(L):

Never O(n) where n is dictionary size.
Each character in the word requires exactly one array index lookup and possibly one node allocation.
A word of length 6 always costs exactly 6 steps regardless of whether the trie holds 10 words or 10 million.

Search (exact word)

Walk character by character from the root. If any child pointer is null, the word doesn't exist. If you reach the last character, return node.isEnd — not just "node exists."

Pseudocode

 function search(word):
     node = root
     for each char c in word:
         idx = c - 'a'
         if node.children[idx] == null:
             return false // character not in trie
         node = node.children[idx] // advance
     return node.isEnd // true only if a word ends here
 // O(L)

Search — "see" found (isEnd=true) vs "sea" not found (no 'a' child)

Two failure modes:

(1) A child pointer is null — the character doesn't follow this prefix in any word, so the word is absent entirely. (2) You reach the last character but isEnd is false — the path exists because it's a prefix of another word, but the word itself was never inserted.
Both cases return false. Overlooking the second case, that is, treating "path exists" as "word exists", is the most common trie interview mistake.

startsWith (Prefix Search)

Identical to search, but the return condition changes: return true if the prefix path exists, regardless of isEnd. This is the operation that gives tries their structural advantage over hash maps.

Pseudocode

 function startsWith(prefix):
     node = root
     for each char c in prefix:
         idx = c - 'a'
         if node.children[idx] == null:
             return false // prefix not in trie
         node = node.children[idx] // advance
     return true // path exists, ignore isEnd
 // O(L), where L = prefix length

startsWith vs search differ by exactly one line:

search returns node.isEnd; startsWith returns true.
This is the most common interview follow-up after implementing search.
The autocomplete use case takes it further: startsWith finds the prefix node, then a DFS/BFS on the subtree collects all complete words below it — O(L) to navigate + O(K) to collect K matching characters.

Complexity Reference

Operation	Time	Extra space	Notes
Insert	O(L)	O(L × 26) worst	L = word length; new nodes only for new chars
Search	O(L)	O(1)	No allocation; follow existing pointers
startsWith	O(L)	O(1)	Same as search; skip isEnd check
Delete	O(L)	O(L) recursion stack	See Section 7; requires bottom-up check; frees nodes rather than allocating
Autocomplete	O(L + K)	O(K)	L to reach prefix node; K = output characters

05

Section Five · Prefix Power

Prefix Patterns — Why Tries Beat Hash Maps

A hash map gives O(L) average exact lookup (hashing the key touches every character), but answering "find all words starting with 'sea'" requires scanning every key: O(n·L) where n is the number of keys and L is the average key length. A trie navigates to the prefix node in O(L), and the entire subtree below already contains every matching word, physically grouped by shared prefix. This structural advantage means prefix queries, autocomplete, spell-checking, and longest-prefix-match (IP routing) are all natural trie operations that hash maps cannot perform efficiently. Real-world applications include search engine autocomplete, phone contact search, DNS resolution, and word games like Boggle and Scrabble where you must verify "does any word start with this prefix?" thousands of times per second. When prefix queries are not needed and memory is tight, a simple HashSet<String> is the better choice, since 26 pointers per trie node is expensive for pure exact-match lookups.

Prefix query "sea*" — HashMap O(n) scan vs Trie O(L) navigation

Autocomplete — collect all words under a prefix

Navigate to the prefix's last node in O(L), then DFS the subtree to collect all complete words — O(K) where K is the total characters in matching words.

Pseudocode

 function autocomplete(root, prefix):
     node = navigateTo(root, prefix) // O(L): walk prefix path
     if node == null: return []
     results = []
     dfs(node, prefix, results) // O(K): collect all words in subtree
     return results

 function dfs(node, current, results):
     if node.isEnd: results.add(current)
     for each (char c, child) in node.children:
         if child != null:
             dfs(child, current + c, results)

Autocomplete is O(L + K):

where K is the total characters across all matching words — not just the count of words.
DFS and BFS both work; DFS is simpler to implement recursively.
In production systems, a limit parameter stops collection after the first N results to avoid collecting the entire subtree when the prefix is short (like "a" matching tens of thousands of words).
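The pseudocode and the limit idea above can be sketched in Java roughly as follows. This is one possible shape rather than a reference implementation: `AcNode`, `AutocompleteTrie`, and the `limit` parameter are illustrative names, and input is assumed to be lowercase a-z.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: names are assumptions; input must be lowercase a-z.
class AcNode {
    AcNode[] children = new AcNode[26];
    boolean isEnd;
}

class AutocompleteTrie {
    private final AcNode root = new AcNode();

    void insert(String word) {
        AcNode node = root;
        for (char c : word.toCharArray()) {
            int idx = c - 'a';
            if (node.children[idx] == null) node.children[idx] = new AcNode();
            node = node.children[idx];
        }
        node.isEnd = true;
    }

    // Returns up to `limit` stored words that start with `prefix`, in alphabetical order.
    List<String> autocomplete(String prefix, int limit) {
        List<String> results = new ArrayList<>();
        AcNode node = root;
        for (char c : prefix.toCharArray()) { // O(L): walk the prefix path
            node = node.children[c - 'a'];
            if (node == null) return results; // no stored word has this prefix
        }
        collect(node, new StringBuilder(prefix), results, limit);
        return results;
    }

    // Depth-first walk of the subtree; stops adding once `limit` words are collected.
    private void collect(AcNode node, StringBuilder path, List<String> results, int limit) {
        if (results.size() >= limit) return;
        if (node.isEnd) results.add(path.toString());
        for (int i = 0; i < 26; i++) {
            if (node.children[i] == null) continue;
            path.append((char) ('a' + i));
            collect(node.children[i], path, results, limit);
            path.deleteCharAt(path.length() - 1); // undo before trying the next child
        }
    }
}
```

Reusing one `StringBuilder` and undoing each append avoids building a new string per visited node, which is the main cost of the `current + c` form in the pseudocode.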

Trie vs Alternatives

Structure	Exact Lookup	Prefix Query	Space	Best For
HashMap	O(L) avg	O(n·L) — full scan	Low	Exact lookup only
Sorted Array	O(L log n)	O(L log n + K) range	Medium	Static dictionary
Trie	O(L)	O(L + K)	High (26×nodes)	Prefix queries, autocomplete
Ternary Search Tree	O(L)	O(L + K)	Medium	Memory-efficient trie alternative
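The sorted-structure row can be illustrated with the JDK's `TreeSet`, used here as a stand-in for a sorted array (it is a balanced tree, so constants differ, but the range-query idea is the same). `SortedPrefixIndex` and `withPrefix` are hypothetical names, and the sketch assumes no stored word contains the character U+FFFF.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative sketch: class and method names are assumptions.
class SortedPrefixIndex {
    private final TreeSet<String> words = new TreeSet<>();

    void add(String word) {
        words.add(word);
    }

    // Every word starting with `prefix` sorts at or after `prefix`
    // and before `prefix` followed by the largest char value.
    List<String> withPrefix(String prefix) {
        String upperBound = prefix + Character.MAX_VALUE;
        return new ArrayList<>(words.subSet(prefix, true, upperBound, false));
    }
}
```

This needs no custom node class, which is why a sorted structure is a reasonable choice for a static dictionary; the trie still wins when the walk has to proceed one keystroke at a time.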

06

Section Six · Variants

Trie Variants — Beyond the Standard Trie

The standard trie with children[26] is the interview baseline; know it cold before exploring variants. A compressed trie (Radix/Patricia tree) merges chains of single-child nodes into one edge with a string label, reducing node count dramatically for sparse dictionaries while preserving O(L) operations. A suffix trie stores all suffixes of a string, so a pattern of length L can be matched as a substring in O(L); the cost is up to O(n²) nodes for a text of length n, which is why practical systems (DNA sequence matching, for example) usually use more compact suffix structures. A bitwise trie uses children[2] (binary digits) instead of children[26] and stores integers bit by bit from MSB to LSB; it is the key structure for maximum-XOR problems on integer arrays. For interviews, compressed tries and bitwise tries appear as follow-ups; suffix structures appear in hard string problems.

Standard Trie vs Compressed Trie (Radix Tree) — same words, fewer nodes

Bitwise Trie — storing integers bit by bit for XOR problems

Trie Variants Reference

Variant	Children Per Node	Key Use Case	Interview Frequency
Standard Trie	children[26]	Autocomplete, spell-check, word search	Very High
Compressed (Radix)	children[26]	Memory-efficient prefix storage	Low (conceptual)
Suffix Trie	children[26+]	Substring search in O(L)	Medium (hard problems)
Bitwise Trie	children[2]	Max XOR, XOR range queries	Medium
Ternary Search Tree	3 children	Memory-efficient, cache-friendly	Low

Bitwise Trie signal:

If a problem mentions XOR + array + "maximum" or "queries", think bitwise trie immediately.
The pattern: insert each number bit by bit (MSB first, 31 bits for non-negative ints), then query by greedily choosing the opposite bit at each level to maximize XOR.
With a fixed bit width this is a constant number of steps per query (31 here), effectively O(1). LC 421 (Maximum XOR of Two Numbers in an Array) is the canonical example.
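A minimal sketch of that greedy pattern, assuming all inputs are non-negative ints so that bits 30 down to 0 are enough. The names `BitNode`, `MaxXorTrie`, `maxXorWith`, and `findMaximumXor` are illustrative.

```java
// Illustrative sketch: names are assumptions; inputs must be non-negative ints.
class BitNode {
    BitNode[] children = new BitNode[2]; // children[0] = bit 0, children[1] = bit 1
}

class MaxXorTrie {
    private static final int TOP_BIT = 30; // bits 30..0 cover every non-negative int
    private final BitNode root = new BitNode();

    void insert(int num) {
        BitNode node = root;
        for (int b = TOP_BIT; b >= 0; b--) {
            int bit = (num >> b) & 1;
            if (node.children[bit] == null) node.children[bit] = new BitNode();
            node = node.children[bit];
        }
    }

    // Best XOR of `num` with any inserted number. Requires at least one prior insert.
    int maxXorWith(int num) {
        BitNode node = root;
        int result = 0;
        for (int b = TOP_BIT; b >= 0; b--) {
            int bit = (num >> b) & 1;
            int want = bit ^ 1; // the opposite bit would set this bit in the XOR
            if (node.children[want] != null) {
                result |= (1 << b);
                node = node.children[want];
            } else {
                node = node.children[bit]; // forced to take the same bit
            }
        }
        return result;
    }

    static int findMaximumXor(int[] nums) {
        MaxXorTrie trie = new MaxXorTrie();
        int best = 0;
        for (int n : nums) {
            trie.insert(n); // insert first so the query never sees an empty trie
            best = Math.max(best, trie.maxXorWith(n));
        }
        return best;
    }
}
```

Every inserted number occupies a full 31-level path, so whenever the preferred child is missing the other child is guaranteed to exist.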

07

Section Seven · Delete

The Delete Problem — Careful Cleanup

Deletion in a trie is trickier than insert because nodes may be shared between multiple words: blindly removing nodes along a word's path can destroy other words that pass through the same nodes. Three cases arise: (1) the word doesn't exist, so do nothing; (2) the word is a prefix of another word, so just clear isEnd at the terminal node but keep the node alive because it has children; (3) the word's path has nodes not shared by other words, so remove nodes bottom-up, stopping when you hit a node that is either a terminal for another word or has other children. The key check at each node during backtracking is "can we delete this node?", and the answer is yes only if isEnd is false AND all children are null. In practice, most interview problems don't require deletion, but understanding when nodes are shared is essential for LC 211 (Design Add and Search Words Data Structure) and LC 1268 (Search Suggestions System).

Delete — three cases requiring different handling

Delete Implementation

Java

 // Returns true if the PARENT should delete its reference to this node
 boolean delete(TrieNode node, String word, int depth) {
     if (node == null) return false; // word not in trie
     if (depth == word.length()) {
         if (!node.isEnd) return false; // word not actually stored
         node.isEnd = false; // unmark word boundary
         return isEmpty(node); // safe to remove if no children
     }
     int idx = word.charAt(depth) - 'a';
     if (delete(node.children[idx], word, depth + 1)) {
         node.children[idx] = null; // child was deleted, so null out the reference
         return !node.isEnd && isEmpty(node); // can we delete this node too?
     }
     return false; // child was kept, so keep this node
 }

 boolean isEmpty(TrieNode node) {
     for (TrieNode child : node.children)
         if (child != null) return false;
     return true; // all 26 children are null
 }
 // O(L) time, O(L) stack space

The boolean return value is the key insight: true means "the child was safely deleted — parent should null out its reference to it." false means "the child must stay — either the word wasn't found, or the child node has other children or is a terminal for another word." This bottom-up signal propagation is the same pattern as BST delete's return-node trick — the decision to keep or remove happens on the way back up the recursion stack.

Common Mistake: Just setting isEnd = false without bottom-up cleanup. This is technically correct for "search" behavior — the word will no longer be found — but it leaks nodes. Over thousands of inserts and deletes, the trie accumulates dead branches that waste memory. In interview settings, clarify whether deletion must reclaim memory or just logically remove the word. If the interviewer says "just mark it", isEnd = false is sufficient. If they want full deletion, you need the recursive approach above.
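To see the three cases in action, the sketch below wraps the same recursive idea in a self-contained class and adds a node counter so the cleanup is observable. `DelNode`, `DeletableTrie`, `remove`, and `nodeCount` are illustrative names, and input is assumed to be lowercase a-z.

```java
// Illustrative sketch: names are assumptions; input must be lowercase a-z.
class DelNode {
    DelNode[] children = new DelNode[26];
    boolean isEnd;
}

class DeletableTrie {
    private final DelNode root = new DelNode();

    void insert(String word) {
        DelNode node = root;
        for (char c : word.toCharArray()) {
            int idx = c - 'a';
            if (node.children[idx] == null) node.children[idx] = new DelNode();
            node = node.children[idx];
        }
        node.isEnd = true;
    }

    boolean search(String word) {
        DelNode node = root;
        for (char c : word.toCharArray()) {
            node = node.children[c - 'a'];
            if (node == null) return false;
        }
        return node.isEnd;
    }

    // The root is never removed, so its return value is ignored.
    void delete(String word) {
        remove(root, word, 0);
    }

    // Returns true if the caller should drop its reference to `node`.
    private boolean remove(DelNode node, String word, int depth) {
        if (node == null) return false;            // case 1: path missing
        if (depth == word.length()) {
            if (!node.isEnd) return false;         // case 1: path exists, word not stored
            node.isEnd = false;                    // case 2 stops here if children remain
            return isEmpty(node);
        }
        int idx = word.charAt(depth) - 'a';
        if (remove(node.children[idx], word, depth + 1)) {
            node.children[idx] = null;             // case 3: prune on the way back up
            return !node.isEnd && isEmpty(node);
        }
        return false;
    }

    private boolean isEmpty(DelNode node) {
        for (DelNode child : node.children)
            if (child != null) return false;
        return true;
    }

    // Number of nodes below the root, useful for checking that cleanup happened.
    int nodeCount() {
        return count(root) - 1;
    }

    private int count(DelNode node) {
        int total = 1;
        for (DelNode child : node.children)
            if (child != null) total += count(child);
        return total;
    }
}
```

With "app", "apple", and "cat" stored (8 nodes), deleting "dog" changes nothing, deleting "app" only clears a flag and keeps all 8 nodes, and deleting "apple" afterwards prunes the whole a-p-p-l-e branch, leaving the 3 nodes of "cat".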

08

Section Eight · Implementation

Build It Once

Build this from scratch once — it makes the mechanics concrete. In any real project or interview, start from this skeleton.

Java — Trie core mechanics

 class TrieNode {
     TrieNode[] children = new TrieNode[26];
     boolean isEnd;
 }

 class Trie {
     TrieNode root = new TrieNode();

     // INSERT: O(L)
     void insert(String word) {
         TrieNode node = root;
         for (char c : word.toCharArray()) {
             int i = c - 'a';
             if (node.children[i] == null)
                 node.children[i] = new TrieNode();
             node = node.children[i];
         }
         node.isEnd = true;
     }

     // SEARCH: O(L)
     boolean search(String word) {
         TrieNode node = find(word);
         return node != null && node.isEnd;
     }

     // STARTS WITH: O(L)
     boolean startsWith(String prefix) {
         return find(prefix) != null;
     }

     // Shared walk: returns the node at the end of s, or null if the path breaks
     private TrieNode find(String s) {
         TrieNode node = root;
         for (char c : s.toCharArray()) {
             node = node.children[c - 'a'];
             if (node == null) return null;
         }
         return node;
     }
 }

09

Section Nine · Java Reference

Use It In Java

IN JAVA — No built-in Trie; build from TrieNode

Standard Interview Trie Pattern

 Trie trie = new Trie();
 trie.insert("apple");
 trie.insert("app");

 trie.search("apple");     // → true   (exact word exists)
 trie.search("app");       // → true   ("app" was inserted)
 trie.search("ap");        // → false  ("ap" was never inserted)
 trie.startsWith("app");   // → true   (prefix exists: "app", "apple")
 trie.startsWith("xyz");   // → false  (no word starts with "xyz")

 // Key distinction: search("ap") is false, startsWith("ap") is true
 // search checks isEnd; startsWith checks only path existence

Adapting Trie for Different Inputs

 // Lowercase English only (most interview problems)
 TrieNode[] children = new TrieNode[26]; // index = ch - 'a'

 // Case-insensitive: toLowerCase() before insert/search
 word = word.toLowerCase(); // normalize first

 // Full ASCII (128 chars): letters + digits + symbols
 TrieNode[] children = new TrieNode[128]; // index = (int) ch

 // Unicode / arbitrary chars: HashMap-based
 HashMap<Character, TrieNode> children = new HashMap<>();

 // Digits only: phone numbers, IP addresses
 TrieNode[] children = new TrieNode[10]; // index = ch - '0'

When to reach for a Trie in interviews

Problem Signal	Structure	Why
"autocomplete / all words with prefix X"	Trie	O(L+K) vs O(n·L) hash scan
"word exists in dictionary" (static)	HashSet<String>	O(L) with less code; trie is overkill
"maximum XOR of two numbers"	Bitwise Trie	Greedy opposite-bit traversal
"word search on a 2D board"	DFS + Trie	Trie prunes when checking many words
"design search autocomplete system"	Trie + frequency	Classic design question

⚠ Gotcha: Always validate ch - 'a' is in [0, 25] before indexing children[]. Unchecked non-lowercase input throws ArrayIndexOutOfBoundsException at runtime (for example, 'A' - 'a' is -32). Add a guard: if (ch < 'a' || ch > 'z') throw new IllegalArgumentException(), or use a HashMap-based node for mixed input.

⚠ Gotcha: Some interview variants ask insert to return a boolean (was the word already present). The check: read node.isEnd before setting it to true, and return the old value. If it was already true, the word was a duplicate.
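Both gotchas can be handled in one insert method, sketched below. The names `CheckedNode` and `CheckedTrie` are illustrative, and validating the whole word before touching the trie is a design choice of this sketch (it avoids leaving half a path behind when a bad character appears late in the word).

```java
// Illustrative sketch: names are assumptions, not part of the guide.
class CheckedNode {
    CheckedNode[] children = new CheckedNode[26];
    boolean isEnd;
}

class CheckedTrie {
    private final CheckedNode root = new CheckedNode();

    // Returns true if the word was already present, false if this call added it.
    boolean insert(String word) {
        // Validate first so a rejected word never creates partial paths.
        for (char c : word.toCharArray()) {
            if (c < 'a' || c > 'z') {
                throw new IllegalArgumentException("unsupported character: " + c);
            }
        }
        CheckedNode node = root;
        for (char c : word.toCharArray()) {
            int idx = c - 'a';
            if (node.children[idx] == null) node.children[idx] = new CheckedNode();
            node = node.children[idx];
        }
        boolean wasPresent = node.isEnd; // read the old value before overwriting it
        node.isEnd = true;
        return wasPresent;
    }
}
```

If the interviewer wants the opposite convention (true when newly added), return `!wasPresent` instead.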

10

Section Ten · Practice

Problems To Solve

Trie problems cluster into three types: implement the trie itself (LC 208), use the trie to solve a string/prefix problem (LC 212, 421), or design a system using a trie (LC 642, 1268). The hardest trie problems combine trie with DFS: Word Search II builds a trie from the dictionary and DFS-es the board simultaneously, pruning branches the moment a prefix has no match. Recognise the trie signal: "prefix", "autocomplete", "words starting with", "maximum XOR." For design problems, the trie is built incrementally on insert rather than rebuilt from scratch. Interview strategy: implement TrieNode + insert + search + startsWith first, since these three methods underlie most trie problems.

Difficulty	Pattern	Problem	Key Insight
Medium	trie	Implement Trie (Prefix Tree) — LC 208	Implement TrieNode with `children[26]` and `isEnd`. search returns `node.isEnd`; startsWith returns `node != null`. The only difference between the two is the final check.
Hard	dfs + trie	Word Search II — LC 212	Build a trie from the word list; DFS the board while walking the trie simultaneously. When `node.isEnd` is true, add the word. Store the word at the terminal node instead of reconstructing from the path.
Medium	trie + dfs	Design Add and Search Words — LC 211	Insert is standard. Search handles '.' wildcards by trying all 26 children at that position — recursive DFS on the trie, not the input string.
Medium	bitwise trie	Maximum XOR of Two Numbers — LC 421	Insert each number bit by bit (MSB first, 31 bits for int). For each number, query the trie greedily: at each bit, try the opposite bit first to maximize XOR.
Medium	trie	Replace Words — LC 648	Insert all roots into the trie. For each word in the sentence, walk the trie character by character; return the root the moment `isEnd` is true.
Hard	trie + ranking	Design Search Autocomplete System — LC 642	Build a trie on insert; store (sentence, frequency) at terminal nodes. On each character input, navigate to the current prefix node, then DFS to collect all sentences. Sort by frequency descending.
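As one worked instance from the table, the wildcard search in LC 211 can be sketched as below. This is a minimal version under the assumptions that words are lowercase a-z and '.' is the only wildcard; `WildcardNode` and `matches` are illustrative names, while `WordDictionary`, `addWord`, and `search` follow the problem's naming.

```java
// Illustrative sketch: helper names are assumptions; words must be lowercase a-z.
class WildcardNode {
    WildcardNode[] children = new WildcardNode[26];
    boolean isEnd;
}

class WordDictionary {
    private final WildcardNode root = new WildcardNode();

    void addWord(String word) {
        WildcardNode node = root;
        for (char c : word.toCharArray()) {
            int idx = c - 'a';
            if (node.children[idx] == null) node.children[idx] = new WildcardNode();
            node = node.children[idx];
        }
        node.isEnd = true;
    }

    boolean search(String pattern) {
        return matches(root, pattern, 0);
    }

    // Recursive DFS over the trie; position i in the pattern corresponds to depth i.
    private boolean matches(WildcardNode node, String pattern, int i) {
        if (i == pattern.length()) return node.isEnd;
        char c = pattern.charAt(i);
        if (c == '.') {
            // Wildcard: any existing child may continue the match.
            for (WildcardNode child : node.children) {
                if (child != null && matches(child, pattern, i + 1)) return true;
            }
            return false;
        }
        WildcardNode next = node.children[c - 'a'];
        return next != null && matches(next, pattern, i + 1);
    }
}
```

A pattern with no wildcards follows a single path and costs O(L); each '.' can fan out to as many as 26 children, so a pattern made entirely of dots may visit every node down to that depth.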

Interview Pattern:

When you see "prefix", reach for a trie. When you see "XOR maximum", reach for a bitwise trie.
When you see "word search on a grid with a dictionary", reach for trie + DFS.
The trie's power is spatial grouping — words sharing a prefix share nodes, which means the data structure physically navigates you to the right answer rather than scanning all options.
In every case, the trie acts as a pruning device: it tells you "no word starts with this prefix" in O(L), letting you abandon a search path early.
