Hash Map Fundamentals
A hash map (hash table) stores key-value pairs and uses a hash function to compute an index into an array of buckets, achieving O(1) average-case put, get, and delete. It is the single most frequently tested data structure in coding interviews — Two Sum, frequency counting, grouping, and prefix-sum problems all rely on the constant-time lookup that only a hash map provides.
What Is a Hash Map?
A hash map maps keys to values using a hash function that converts each key into an integer index. That index determines which bucket in an internal array stores the key-value pair. Because computing the hash and jumping to the bucket are both O(1), put/get/delete are all O(1) on average. The catch: two different keys can hash to the same bucket — a collision — which must be resolved.
Two main strategies for handling collisions:
Separate Chaining (Linked Lists)
Each bucket holds a linked list (or tree) of entries that hashed to the same index. On collision, append to the list. Java’s HashMap uses this — a bucket’s list upgrades to a red-black tree once it holds 8 entries (and the table has at least 64 buckets).
- Simple to implement
- Graceful degradation under load
- Extra pointer overhead per entry
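A minimal chaining table can be sketched in Java. `ChainedMap` and its members are illustrative names, not any library’s API, and resizing/treeification are deliberately omitted:

```java
import java.util.LinkedList;

// Minimal separate-chaining hash map sketch (String keys, int values).
// Illustrative only: fixed capacity, no resizing, no tree upgrade.
class ChainedMap {
    private static class Entry {
        final String key;
        int value;
        Entry(String k, int v) { key = k; value = v; }
    }

    private final LinkedList<Entry>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedMap(int capacity) {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) buckets[i] = new LinkedList<>();
    }

    // Map the key's hash into [0, capacity)
    private int index(String key) {
        return (key.hashCode() & 0x7fffffff) % buckets.length;
    }

    void put(String key, int value) {
        for (Entry e : buckets[index(key)]) {
            if (e.key.equals(key)) { e.value = value; return; } // key exists: update
        }
        buckets[index(key)].add(new Entry(key, value));          // collision: append
    }

    Integer get(String key) {
        for (Entry e : buckets[index(key)]) {
            if (e.key.equals(key)) return e.value;
        }
        return null; // absent
    }
}
```

With a tiny capacity, collisions are guaranteed, yet lookups still work — the per-bucket list absorbs them.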
Open Addressing (Probing)
All entries live inside the array itself. On collision, probe for the next open slot (linear, quadratic, or double hashing). Python’s dict uses this approach.
- Cache-friendly — no pointer chasing
- Needs a good probe sequence
- Degrades badly above ~70% load
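Linear probing can likewise be sketched. `ProbingMap` is an illustrative name; this version has a fixed capacity, no deletion (so no tombstones), and assumes the table never fills:

```java
// Minimal open-addressing hash map sketch with linear probing (int keys, int values).
// Illustrative only: fixed capacity, no deletion, assumes the table never fills.
class ProbingMap {
    private final int[] keys;
    private final int[] values;
    private final boolean[] used;

    ProbingMap(int capacity) {
        keys = new int[capacity];
        values = new int[capacity];
        used = new boolean[capacity];
    }

    void put(int key, int value) {
        int i = Math.floorMod(Integer.hashCode(key), keys.length);
        while (used[i] && keys[i] != key) {
            i = (i + 1) % keys.length;   // collision: linear probe to the next slot
        }
        used[i] = true;
        keys[i] = key;
        values[i] = value;
    }

    Integer get(int key) {
        int i = Math.floorMod(Integer.hashCode(key), keys.length);
        while (used[i]) {
            if (keys[i] == key) return values[i];
            i = (i + 1) % keys.length;   // keep probing until an empty slot
        }
        return null; // hit an empty slot: key is absent
    }
}
```

Note how every entry lives inside the flat arrays — this contiguity is exactly where the cache-friendliness comes from.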
Core Operations
Use getOrDefault() to avoid null checks. In Java, map.getOrDefault(key, 0) returns 0 if the key is absent — perfect for frequency counting. In Python, dict.get(key, 0) or collections.defaultdict(int) does the same. This eliminates the entire “check then get” pattern.
Beware mutable keys. If you use a mutable object such as a List as a key and then modify the list, its hash code changes but the map still files the entry under the old bucket — the entry is effectively lost. Always use immutable keys: String, Integer, or unmodifiable collections. In Python, use tuples (hashable) instead of lists (not hashable).
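The frequency-counting idiom above can be sketched in Java (`Freq` and `charCounts` are illustrative names):

```java
import java.util.HashMap;
import java.util.Map;

class Freq {
    // Count character frequencies; getOrDefault removes the "check then get" dance.
    static Map<Character, Integer> charCounts(String s) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : s.toCharArray()) {
            counts.put(c, counts.getOrDefault(c, 0) + 1);
        }
        return counts;
    }
}
```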
Java Quick Reference
The most common hash map operations using Java’s HashMap — plus essential interview utilities like getOrDefault and merge.
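A sketch of those operations — every call here is standard java.util.HashMap API, though the wrapper class and sample keys are our own:

```java
import java.util.HashMap;
import java.util.Map;

class QuickReference {
    static Map<String, Integer> demo() {
        Map<String, Integer> map = new HashMap<>();

        map.put("apple", 3);                        // insert or overwrite
        int n = map.get("apple");                   // 3 (returns null if absent -- beware unboxing)
        boolean present = map.containsKey("apple"); // true
        map.remove("apple");                        // delete the entry

        // Interview utilities:
        int zero = map.getOrDefault("pear", 0);     // 0: key absent, no null check needed
        map.merge("pear", 1, Integer::sum);         // insert 1, or add 1 to the existing value
        map.putIfAbsent("plum", 7);                 // insert only if no mapping exists yet

        return map;                                 // {pear=1, plum=7}
    }
}
```

`merge` is the one-liner behind most frequency counters: `map.merge(key, 1, Integer::sum)` increments in place.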
Time & Space Complexity
| Operation | Best Case | Average Case | Worst Case | Space |
|---|---|---|---|---|
| Put | O(1) | O(1) | O(n) all keys collide | O(1) |
| Get | O(1) | O(1) | O(n) all keys collide | O(1) |
| Remove | O(1) | O(1) | O(n) all keys collide | O(1) |
| ContainsKey | O(1) | O(1) | O(n) all keys collide | O(1) |
A hash map of n entries uses O(n) space for the entries plus O(capacity) for the bucket array. Because the load factor keeps capacity within a small constant multiple of n, total space is O(n). Each entry in a chaining implementation carries key + value + pointer overhead. Rehashing temporarily uses O(n) extra space during the copy. For interviews, state “O(n) space” and you’re correct.
When to Use Hash Map
Use This When
- You need O(1) lookup by key — frequency counting, “have I seen this before?”, Two Sum complement check, and anagram grouping all hinge on constant-time contains/get.
- Reducing O(n²) to O(n) — any time a brute-force solution uses nested loops to “find a match”, a hash map can replace the inner loop with a single lookup.
- Prefix-sum with target — storing prefix sums as keys lets you check “does prefix[j] − target exist?” in O(1) — the pattern behind subarray sum problems.
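The complement-lookup idea behind the bullets above is the classic Two Sum; one possible Java sketch (class name ours):

```java
import java.util.HashMap;
import java.util.Map;

class TwoSum {
    // Return indices i < j with nums[i] + nums[j] == target, or null if none.
    // The map replaces the brute-force inner loop with a single O(1) lookup.
    static int[] twoSum(int[] nums, int target) {
        Map<Integer, Integer> seen = new HashMap<>(); // value -> index
        for (int j = 0; j < nums.length; j++) {
            Integer i = seen.get(target - nums[j]);   // complement check
            if (i != null) return new int[] { i, j };
            seen.put(nums[j], j);
        }
        return null; // no pair found
    }
}
```

This is the O(n²)-to-O(n) reduction in miniature: the nested “find a match” loop becomes one hash lookup per element.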
Avoid When
- You need sorted order — hash maps have no ordering guarantee. Use a TreeMap (Java) or SortedDict (Python) for ordered iteration.
- Keys are small integers — a plain array with key-as-index is faster and uses less memory (e.g., character frequency: int[26] beats a HashMap).
| Structure | Lookup | Insert | Ordered? | Best For |
|---|---|---|---|---|
| HashMap ← this one | O(1) avg | O(1) avg | No | Frequency count, Two Sum, grouping |
| TreeMap | O(log n) | O(log n) | Yes (sorted) | Range queries, ordered iteration |
| HashSet | O(1) avg | O(1) avg | No | Membership check (keys only, no values) |
LeetCode Problems
Hash map interview problems fall into four patterns: frequency counting (count occurrences, check anagrams), complement lookup (Two Sum and its variants), prefix-sum caching (subarray sum equals K), and grouping by key (group anagrams, bucket sort). If you can spot which pattern applies, the solution writes itself.
- Easy: Two Sum — store each number’s index; for each element, check if target − num exists in the map — O(n) single pass.
- Medium: Group Anagrams — sort each word (or count its characters) to form a key; group words with the same key in a map — O(n·k log k).
- Medium: Top K Frequent Elements — count frequencies with a map, then use a min-heap of size K or bucket sort to extract the top K — O(n).
- Medium: Longest Consecutive Sequence — put all numbers in a HashSet; for each number num where num − 1 isn’t in the set, count the streak forward — O(n).
- Hard: Subarray Sum Equals K — cache prefix sums in a map; at each index check if prefix − K exists — classic prefix-sum + hash map in O(n).