Hashing

Hash Map Fundamentals

A hash map (hash table) stores key-value pairs and uses a hash function to compute an index into an array of buckets, achieving O(1) average-case put, get, and delete. It is the single most frequently tested data structure in coding interviews — Two Sum, frequency counting, grouping, and prefix-sum problems all rely on the constant-time lookup that only a hash map provides.

01
Section One · Foundation

What is a Hash Map?

[Diagram: hashing “Bob” through h() into a five-bucket array [0]–[4] holding “Alice”:95, “Bob”:87, “Carol”:91]

A hash map maps keys to values using a hash function that converts each key into an integer index. That index determines which bucket in an internal array stores the key-value pair. Because computing the hash and jumping to the bucket are both O(1), put/get/delete are all O(1) on average. The catch: two different keys can hash to the same bucket — a collision — which must be resolved.
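The index computation itself is one line. A minimal sketch in Java (the capacity of 8 is an arbitrary choice for illustration; `Math.floorMod` is used instead of `%` because `hashCode()` can be negative):

```java
public class BucketIndex {
    // Map a key's hash code onto a bucket index. floorMod avoids a
    // negative result when hashCode() returns a negative number.
    static int bucketIndex(Object key, int capacity) {
        return Math.floorMod(key.hashCode(), capacity);
    }

    public static void main(String[] args) {
        int capacity = 8;  // arbitrary table size for the sketch
        System.out.println("\"Bob\" -> bucket " + bucketIndex("Bob", capacity));
    }
}
```

The result is always in the range [0, capacity), which is what makes the jump to a bucket a constant-time array access.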

Two main strategies for handling collisions:

Chaining

Linked Lists

Each bucket holds a linked list (or tree) of entries that hashed to the same index. On collision, append to the list. Java’s HashMap uses this — a chain converts to a red-black tree once it reaches 8 entries (and the table has at least 64 buckets).

  • Simple to implement
  • Graceful degradation under load
  • Extra pointer overhead per entry
Open Addressing

Probing

All entries live inside the array itself. On collision, probe for the next open slot (linear, quadratic, or double hashing). Python’s dict uses this approach.

  • Cache-friendly — no pointer chasing
  • Needs a good probe sequence
  • Degrades badly above ~70% load
Analogy: A coat check at a theatre. You hand your coat (value) with a ticket number (key). The attendant uses a formula to decide which rack (bucket) to hang it on. When you return with your ticket, the formula instantly tells them which rack to check — no scanning every coat. If two tickets map to the same rack, both coats hang there and you check the name tag.
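The chaining strategy above can be sketched in a few lines of Java. This is an illustrative toy, not a production structure: fixed capacity, no resizing, and `ChainedMap` is a made-up name, not a real library class.

```java
import java.util.LinkedList;

// Toy chaining hash map: every bucket is a linked list of (key, value)
// pairs. Fixed capacity, no resizing — for illustration only.
public class ChainedMap<K, V> {
    private static final int CAPACITY = 16;
    private final LinkedList<Object[]>[] table;

    @SuppressWarnings("unchecked")
    public ChainedMap() {
        table = new LinkedList[CAPACITY];
    }

    private int index(K key) {
        return Math.floorMod(key.hashCode(), CAPACITY);
    }

    public void put(K key, V value) {
        int i = index(key);
        if (table[i] == null) table[i] = new LinkedList<>();
        for (Object[] entry : table[i]) {
            if (entry[0].equals(key)) { entry[1] = value; return; } // update existing
        }
        table[i].add(new Object[]{key, value});                     // append on collision
    }

    @SuppressWarnings("unchecked")
    public V get(K key) {
        int i = index(key);
        if (table[i] == null) return null;
        for (Object[] entry : table[i]) {
            if (entry[0].equals(key)) return (V) entry[1];
        }
        return null;
    }
}
```

Colliding keys simply share a bucket’s list, exactly like two coats sharing a rack.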
Key Characteristics

  • Hash function maps key → index — O(1) average lookup, no scanning
  • Collisions are inevitable — chaining or probing handles them
  • Load factor triggers resize — rehash all entries when the table grows: O(n) but rare
  • Keys must be immutable & hashable — mutable keys break the hash → lost entries
  • No ordering of entries — use TreeMap / sorted dict if order matters
02
Section Two · Operations

Core Operations

Put (Insert / Update)
Computes the bucket index from the key’s hash, then inserts the key-value pair or updates the value if the key already exists.
Pseudocode
function put(map, key, value):
    index = hash(key) % map.capacity
    bucket = map.table[index]
    for entry in bucket:
        if entry.key == key:
            entry.value = value          // update existing
            return
    bucket.append(key, value)            // new entry
    map.size += 1
    if map.size / map.capacity > LOAD_FACTOR:
        resize(map)                      // rehash all entries
The load factor controls the space-time trade-off. Java’s default is 0.75 — when the number of entries reaches 75% of the table’s capacity, the table doubles and all entries are rehashed. A lower load factor wastes memory but reduces collisions; a higher one saves memory but slows lookups. For interviews, citing 0.75 (Java’s default) is a safe answer.
Get (Lookup)
Computes the bucket index from the key’s hash, then searches that single bucket for a matching key and returns its value.
Pseudocode
function get(map, key):
    index = hash(key) % map.capacity
    bucket = map.table[index]
    for entry in bucket:
        if entry.key == key:
            return entry.value           // found — O(1) average
    return null                          // key not present
Use getOrDefault() to avoid null checks. In Java, map.getOrDefault(key, 0) returns 0 if the key is absent — perfect for frequency counting. In Python, dict.get(key, 0) or collections.defaultdict(int) does the same. This eliminates the entire “check then get” pattern.
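As a concrete sketch of that pattern in Java, counting character occurrences with getOrDefault:

```java
import java.util.HashMap;
import java.util.Map;

public class FreqCount {
    // Count character frequencies with getOrDefault — no null checks,
    // no "check then get" dance.
    static Map<Character, Integer> count(String s) {
        Map<Character, Integer> freq = new HashMap<>();
        for (char c : s.toCharArray())
            freq.put(c, freq.getOrDefault(c, 0) + 1);
        return freq;
    }
}
```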
Remove (Delete)
Locates the entry by hashing the key, then removes it from the bucket and decrements the size.
Pseudocode
function remove(map, key):
    index = hash(key) % map.capacity
    bucket = map.table[index]
    for entry in bucket:
        if entry.key == key:
            bucket.remove(entry)         // O(1) for linked list
            map.size -= 1
            return entry.value
    return null                          // key not found
Open addressing requires tombstones. In chaining, removing a node from the linked list is straightforward. In open addressing, you cannot simply empty the slot — that would break probe chains for other keys. Instead, mark the slot as “deleted” (a tombstone) so probing continues past it. This is why chaining is simpler to implement correctly.
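A sketch of tombstone deletion with linear probing in Java. The fixed capacity, the absence of resizing, and the policy of never reusing tombstone slots on insert are all simplifications for illustration:

```java
// Linear-probing table sketch with tombstones for deletion.
public class ProbingMap {
    private static final Object TOMBSTONE = new Object();
    private final Object[] keys = new Object[8];
    private final Object[] vals = new Object[8];

    private int index(Object key) {
        return Math.floorMod(key.hashCode(), keys.length);
    }

    public void put(Object key, Object value) {
        int i = index(key);
        // Probe past occupied slots and tombstones (tombstones are
        // never reused here, to keep the sketch simple).
        while (keys[i] != null && !keys[i].equals(key))
            i = (i + 1) % keys.length;
        keys[i] = key;
        vals[i] = value;
    }

    public Object get(Object key) {
        int i = index(key);
        while (keys[i] != null) {            // stop only at a truly empty slot
            if (keys[i] != TOMBSTONE && keys[i].equals(key)) return vals[i];
            i = (i + 1) % keys.length;       // keep probing past tombstones
        }
        return null;
    }

    public void remove(Object key) {
        int i = index(key);
        while (keys[i] != null) {
            if (keys[i] != TOMBSTONE && keys[i].equals(key)) {
                keys[i] = TOMBSTONE;         // mark, don't empty —
                vals[i] = null;              // emptying would break probe chains
                return;
            }
            i = (i + 1) % keys.length;
        }
    }
}
```

If remove simply emptied the slot, a later get for a key further along the same probe chain would stop early at the hole and wrongly report “not found.”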
Frequency Count Pattern
Uses a hash map to count occurrences of each element in O(n) — the single most common hash map usage in interviews.
Pseudocode — Count Frequencies
function countFrequencies(arr):
    freq = {}                            // empty hash map
    for element in arr:
        freq[element] = freq.getOrDefault(element, 0) + 1
    return freq                          // O(n) time, O(k) space

// Two Sum — the canonical hash map problem
function twoSum(nums, target):
    seen = {}                            // value → index
    for i, num in nums:
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i                    // O(n) total — one pass
“Can I reduce O(n²) to O(n)?” — the answer is almost always a hash map. Whenever you have a nested loop checking “does X exist?” or “have I seen Y before?”, replace the inner loop with a hash map lookup. This converts O(n²) brute force into O(n) single-pass — the core insight behind Two Sum, anagram grouping, and subarray sum problems.
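A sketch of that refactoring on the simplest version of the question, “have I seen this value before?”:

```java
import java.util.HashSet;
import java.util.Set;

public class SeenBefore {
    // O(n^2) brute force: a nested loop asking "does nums[j] equal nums[i]?"
    static boolean hasDuplicateSlow(int[] nums) {
        for (int i = 0; i < nums.length; i++)
            for (int j = i + 1; j < nums.length; j++)
                if (nums[i] == nums[j]) return true;
        return false;
    }

    // O(n): the inner loop becomes a single hash lookup.
    static boolean hasDuplicate(int[] nums) {
        Set<Integer> seen = new HashSet<>();
        for (int num : nums)
            if (!seen.add(num)) return true;  // add() returns false if already present
        return false;
    }
}
```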
Common Mistake: Using mutable objects as keys. In Java, if you put a List as a key and then modify the list, its hash code changes but the map still has it in the old bucket — the entry is effectively lost. Always use immutable keys: String, Integer, or unmodifiable collections. In Python, use tuples (hashable) instead of lists (not hashable).
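The pitfall can be demonstrated directly (a sketch; the exact lookup behavior after mutation depends on where the new hash lands, but the entry typically becomes unreachable):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MutableKeyPitfall {
    // Put a mutable List as a key, mutate it, then try to find it again.
    static boolean lookupAfterMutation() {
        Map<List<Integer>, String> map = new HashMap<>();
        List<Integer> key = new ArrayList<>(List.of(1));
        map.put(key, "value");
        key.add(2);                   // hashCode changes; entry stays in the old bucket
        return map.containsKey(key);  // typically false — the entry is lost
    }

    // Fix: use an immutable value as the key.
    static String immutableKeyLookup() {
        Map<List<Integer>, String> safe = new HashMap<>();
        safe.put(List.of(1, 2), "value");  // List.of(...) is immutable
        return safe.get(List.of(1, 2));
    }
}
```

Note the map still reports size 1 after the mutation: the entry exists, it just can no longer be reached by its own key.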
03
Section Three · Visuals

Diagrams & Memory Layout

Hash Map Structure — Chaining
Bucket array with linked-list chains for collision resolution
[Diagram: four buckets [0]–[3] — “apple”:5 → “date”:2 share a bucket (collision chain), “banana”:3 → “grape”:7 share another, “cherry”:1 sits alone, one bucket empty. hash(key) % capacity → bucket index — O(1) average lookup]
Put Operation — Hash, Locate, Insert
Inserting “fig”:4 — hash to bucket, append to chain
[Diagram: ① hash(“fig”) = 7 → ② 7 % 4 = 3 → bucket[3] → ③ scan chain: “cherry” ≠ “fig” → ④ append (“fig”, 4) to the chain. O(1) average — a bucket holds ~1 entry when load factor ≤ 0.75]
Rehashing — Table Resize
When load exceeds threshold, double capacity and rehash all entries
[Diagram: BEFORE — capacity 4, load 3/4 = 0.75, entries A:1, B:2, C:3 → rehash → AFTER — capacity 8, load 3/8 = 0.375. Rehash is O(n) but happens rarely — amortized O(1) per put]
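The resize step can be sketched for a chaining table (simplified; real implementations such as Java’s HashMap optimize which bits of the hash they re-examine). Every entry must be re-bucketed because its index depends on the capacity:

```java
import java.util.ArrayList;
import java.util.List;

public class Rehash {
    // Redistribute every entry into a table of double the capacity.
    // Entries move because index = floorMod(hash, capacity) changes
    // when capacity changes.
    static List<List<Object[]>> resize(List<List<Object[]>> oldTable) {
        int newCap = oldTable.size() * 2;
        List<List<Object[]>> newTable = new ArrayList<>();
        for (int i = 0; i < newCap; i++) newTable.add(new ArrayList<>());
        for (List<Object[]> bucket : oldTable)
            for (Object[] entry : bucket) {
                int idx = Math.floorMod(entry[0].hashCode(), newCap);
                newTable.get(idx).add(entry);  // O(n) total over all entries
            }
        return newTable;
    }

    // Demo mirroring the diagram: capacity 4 with three entries,
    // resized to capacity 8. Returns {newCapacity, totalEntries}.
    static int[] demo() {
        List<List<Object[]>> table = new ArrayList<>();
        for (int i = 0; i < 4; i++) table.add(new ArrayList<>());
        for (String k : new String[]{"A", "B", "C"}) {
            int idx = Math.floorMod(k.hashCode(), 4);
            table.get(idx).add(new Object[]{k, k});
        }
        List<List<Object[]>> bigger = resize(table);
        int total = 0;
        for (List<Object[]> b : bigger) total += b.size();
        return new int[]{bigger.size(), total};
    }
}
```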
04
Section Four · Code

Java Quick Reference

The most common hash map operations using Java’s HashMap — plus essential interview utilities like getOrDefault and merge.

Java — HashMap Core Operations
import java.util.*;

Map<String, Integer> map = new HashMap<>();

// O(1) avg — put
map.put("apple", 5);
map.put("banana", 3);
map.put("cherry", 7);

// O(1) avg — get
int val = map.get("banana");             // 3
int def = map.getOrDefault("fig", 0);    // 0 (not found)

// O(1) avg — containsKey
boolean has = map.containsKey("apple");  // true

// O(1) avg — remove
map.remove("cherry");

// Frequency counting — the #1 interview pattern
map.merge("apple", 1, Integer::sum);     // increment by 1

// Iterate entries
for (Map.Entry<String, Integer> e : map.entrySet())
    System.out.println(e.getKey() + " = " + e.getValue());
05
Section Five · Complexity

Time & Space Complexity

Operation     Best Case   Average Case   Worst Case                Space
Put           O(1)        O(1)           O(n) (all keys collide)   O(1)
Get           O(1)        O(1)           O(n) (all keys collide)   O(1)
Remove        O(1)        O(1)           O(n) (all keys collide)   O(1)
ContainsKey   O(1)        O(1)           O(n) (all keys collide)   O(1)
Why these complexities? The hash function distributes keys uniformly across buckets. With a good hash and a load factor ≤ 0.75, each bucket holds ~1 entry on average — so every operation is a constant-time index calculation plus a constant-time chain scan. The O(n) worst case only happens if all keys hash to the same bucket (adversarial input). Java 8 mitigates this by converting long chains to red-black trees, capping the worst case at O(log n).
Space Complexity Note

A hash map of n entries uses O(n) space for the entries plus O(capacity) for the bucket array. Because the load factor keeps capacity within a small constant multiple of n, total space is O(n). Each entry in a chaining implementation carries key + value + pointer overhead. Rehashing temporarily uses O(n) extra space during the copy. For interviews, state “O(n) space” and you’re correct.

06
Section Six · Decision Guide

When to Use Hash Map

Use This When

  • You need O(1) lookup by key — frequency counting, “have I seen this before?”, Two Sum complement check, and anagram grouping all hinge on constant-time contains/get.
  • Reducing O(n²) to O(n) — any time a brute-force solution uses nested loops to “find a match”, a hash map can replace the inner loop with a single lookup.
  • Prefix-sum with target — storing prefix sums as keys lets you check “does prefix[j] − target exist?” in O(1) — the pattern behind subarray sum problems.
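The prefix-sum bullet above, as a worked sketch (this is the standard technique behind Subarray Sum Equals K):

```java
import java.util.HashMap;
import java.util.Map;

public class SubarraySum {
    // Count subarrays summing to k: at each index, ask how many earlier
    // prefix sums equal (prefix - k) — a single O(1) map lookup.
    static int subarraySum(int[] nums, int k) {
        Map<Integer, Integer> seen = new HashMap<>();
        seen.put(0, 1);                              // the empty prefix
        int prefix = 0, count = 0;
        for (int num : nums) {
            prefix += num;
            count += seen.getOrDefault(prefix - k, 0);
            seen.merge(prefix, 1, Integer::sum);     // record this prefix
        }
        return count;
    }
}
```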

Avoid When

  • You need sorted order — hash maps have no ordering guarantee. Use a TreeMap (Java) or sortedcontainers.SortedDict (Python) for ordered iteration.
  • Keys are small integers — a plain array with key-as-index is faster and uses less memory (e.g., character frequency: int[26] beats a HashMap).
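The small-integer-key case as a sketch (assumes lowercase ASCII input only):

```java
public class CharFreq {
    // int[26] as a "map" from 'a'..'z' to counts — the key is the index,
    // so there is no hashing and no boxing at all.
    static int[] count(String s) {
        int[] freq = new int[26];
        for (char c : s.toCharArray())
            freq[c - 'a']++;            // assumes lowercase a-z only
        return freq;
    }

    // Classic use: two strings are anagrams iff their counts match.
    static boolean isAnagram(String a, String b) {
        return java.util.Arrays.equals(count(a), count(b));
    }
}
```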
Comparison vs Alternatives
Structure            Lookup     Insert     Ordered?       Best For
HashMap ← this one   O(1) avg   O(1) avg   No             Frequency count, Two Sum, grouping
TreeMap              O(log n)   O(log n)   Yes (sorted)   Range queries, ordered iteration
HashSet              O(1) avg   O(1) avg   No             Membership check (keys only, no values)
07
Section Seven · Interview Practice

LeetCode Problems

Hash map interview problems fall into four patterns: frequency counting (count occurrences, check anagrams), complement lookup (Two Sum and its variants), prefix-sum caching (subarray sum equals K), and grouping by key (group anagrams, bucket sort). If you can spot which pattern applies, the solution writes itself.

  • Easy Two Sum — Store each number’s index; for each element, check if target − num exists in the map — O(n) single pass.
  • Medium Group Anagrams — Sort each word (or count its characters) to form a key; group words with the same key in a map — O(n·k log k).
  • Medium Top K Frequent Elements — Count frequencies with a map, then use a min-heap of size K or bucket sort to extract the top K — O(n).
  • Medium Longest Consecutive Sequence — Put all numbers in a HashSet; for each number num such that num − 1 is not in the set (the start of a streak), count the streak forward — O(n).
  • Hard Subarray Sum Equals K — Cache prefix sums in a map; at each index check if prefix − K exists — classic prefix-sum + hash map in O(n).
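One pattern from the list worked out: grouping by a canonical key, as in Group Anagrams (a sketch using the sorted word as the key):

```java
import java.util.*;

public class GroupAnagrams {
    // Group words that are anagrams of each other: sorting a word's
    // characters gives a canonical key, and a map buckets words by it.
    // O(n * k log k) for n words of length k.
    static Collection<List<String>> group(String[] words) {
        Map<String, List<String>> groups = new HashMap<>();
        for (String w : words) {
            char[] chars = w.toCharArray();
            Arrays.sort(chars);
            String key = new String(chars);   // e.g. "eat", "tea" -> "aet"
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(w);
        }
        return groups.values();
    }
}
```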
Interview Pattern: When a brute-force approach uses a nested loop asking “does this value exist?” or “have I seen this complement?”, replace the inner loop with a hash map lookup. This single refactoring turns O(n²) into O(n) — and it applies to a large share of LeetCode mediums.