Hashing

HashMap Fundamentals

O(1) average-case get, put, and remove by hashing keys to bucket indices. The single most versatile data structure in the Java standard library, underpinning frequency counting, memoization, grouping, and the majority of interview solutions.

01 · Section One · Foundation

What is a HashMap?

[Diagram: hash(key) maps each key to a bucket index in a small array: A→1 lands in bucket [1], B→7 and C→4 share bucket [3], D→9 lands in bucket [5]. O(1) average.]

Every key inserted into a HashMap is transformed by a hash function into an array index (a bucket), allowing direct access to its value without scanning any other entries. O(1) average access is possible because the hash function computes the bucket index in constant time, and reading or writing one array cell is O(1) regardless of how many total entries exist. The word “average” matters: in the worst case, every key hashes to the same bucket and the map degenerates to a linked list with O(n) lookup; a good hash function distributes keys uniformly to make this astronomically unlikely in practice. A HashMap stores pairs, not values alone: any object can be a key as long as it correctly implements hashCode() and equals(), the two methods the map uses to locate and identify entries. Frequency counting, memoization, grouping, deduplication, and graph adjacency representation all build on HashMap; it is the single most versatile data structure in the Java standard library.

Analogy: A library with a card catalogue. Instead of searching every shelf for a book, you look up the title in the catalogue, which tells you the exact shelf number in O(1). The catalogue is the hash function: it maps the book title (key) to a shelf location (bucket index). If two books map to the same shelf, the librarian keeps a short list on that shelf to distinguish them; that is collision chaining. The library never gets slower to look up a single book as more books are added, as long as the catalogue stays accurate.
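The catalogue lookup above maps directly onto code. A minimal sketch of the core calls (the class name `BasicUsage` and the sample data are just for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class BasicUsage {
    public static void main(String[] args) {
        Map<String, Integer> shelf = new HashMap<>();
        shelf.put("cat", 3);                 // hash("cat") picks the bucket - O(1) avg
        shelf.put("dog", 7);
        shelf.put("cat", 4);                 // same key: value overwritten, no duplicate

        System.out.println(shelf.get("cat"));              // 4
        System.out.println(shelf.get("bird"));             // null - absent key
        System.out.println(shelf.getOrDefault("bird", 0)); // 0 - safe default
        System.out.println(shelf.size());                  // 2 - "cat" stored once
    }
}
```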
02 · Section Two · Mental Model

How It Thinks

Hash function maps key to bucket index in O(1)
→ get() and put() skip every other entry: direct access
Two keys can hash to the same bucket (collision)
→ Bucket holds a chain or probe sequence: O(1) avg, O(n) worst
Load factor threshold reached (default 0.75)
→ Internal array doubles and all entries rehash: O(n) resize
Keys must implement hashCode() and equals() correctly
→ Wrong hashCode() sends entries to the wrong bucket: get() returns null for keys that exist
No ordering of keys guaranteed
→ Iteration order is undefined and changes after resize: never rely on HashMap iteration order
Java 8+: chain longer than 8 entries treeifies to red-black tree
→ Worst-case bucket lookup is capped at O(log n) instead of O(n), even under catastrophic hash collisions
03 · Section Three · Internals

How Hashing Works

Storing a key-value pair requires answering one question in O(1): given this key, which array slot does it belong in? The hash function answers that question by converting any key (a string, an integer, an object) into an integer, then compressing that integer into the range [0, capacity-1]. The quality of the hash function determines how evenly entries are distributed across buckets, which determines whether the map runs in O(1) or degrades toward O(n).

Hashing pipeline - key to bucket index (every step O(1)):
① hashCode(): key "cat" → 98262 (any int)
② spread: h ^ (h >>> 16) (reduces clustering)
③ compress: hash % capacity (Java uses hash & (capacity - 1)) → bucket index
④ entry stored in that bucket
hashCode() contract

What Java requires

  • Equal objects MUST have equal hash codes
  • Unequal objects SHOULD have different hash codes (collisions allowed but hurt performance)
  • hashCode() must be consistent: same result every call for the same object state
  • If you override equals(), you MUST override hashCode()
What breaks silently

Violating the contract

If two objects are equal() but return different hashCode() values, the map stores them in different buckets. get() will never find a key that was put() because it looks in the wrong bucket. The map returns null for keys that are provably present.

  • Using mutable fields in hashCode(): value changes, bucket changes
  • Forgetting to override hashCode() when overriding equals()
  • Custom objects as keys without hashCode() implementation
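A minimal sketch of a contract-respecting key class (the `Point` class here is invented for illustration):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class KeyContract {
    // A safe key type: immutable fields, with equals() and hashCode()
    // overridden together and derived from the same fields.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return p.x == x && p.y == y;
        }
        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    public static void main(String[] args) {
        Map<Point, String> map = new HashMap<>();
        map.put(new Point(1, 2), "found");
        // A different-but-equal Point locates the entry: equal objects hash
        // to the same bucket, and equals() confirms the match in the chain.
        System.out.println(map.get(new Point(1, 2)));   // found
        // Delete the hashCode() override and this same lookup would search
        // the wrong bucket and return null.
    }
}
```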
String.hashCode(): How Java Computes It
Java computes String's hash code as a polynomial rolling hash over the character values, producing a consistent integer for any string value.
Java: JDK String.hashCode() source
// Java's String.hashCode() - from the JDK source
// s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
int hash = 0;
for (int i = 0; i < value.length; i++) {
    hash = 31 * hash + value[i];   // 31 chosen: prime, fast (32x - x)
}
// "cat" → 99×31² + 97×31 + 116 = 98262

// Compression into bucket index (Java's actual formula):
int index = (capacity - 1) & hash;  // bitwise AND - faster than modulo
                                    // only works when capacity is a power of 2
Why capacity is always a power of 2:
  • Java’s HashMap always sizes its bucket array at a power of 2: 16, 32, 64, etc.
  • This lets the modulo operation be replaced by a bitwise AND: hash % capacity = hash & (capacity - 1), which is significantly faster.
Load Factor: When Does HashMap Resize?
The load factor is the ratio of entries to buckets that triggers a resize: at 0.75 (default), the map resizes once entries exceed 75% of capacity.
Java: Load factor configuration
// Default load factor = 0.75
// Resize triggers when: size > capacity × loadFactor
// Default capacity = 16 → resize at 12 entries

// Custom load factor - lower = fewer collisions, more memory
Map<String, Integer> map = new HashMap<>(32, 0.5f);
//                                       ↑    ↑
//                        initial capacity    load factor (resizes at 16 entries)

// Higher load factor - more memory-efficient, slower
Map<String, Integer> dense = new HashMap<>(16, 0.9f);
// Resizes at 14 entries - more collisions before resize
0.75 is the empirically optimal trade-off:
  • At load factor 0.75, a HashMap strikes a good balance between memory usage and lookup performance: most buckets are empty or hold one entry, keeping the average chain length near 1.
  • A lower load factor wastes memory; a higher one causes more collisions.
Common Mistake: Using a mutable object as a HashMap key and then modifying it after insertion. The hash code is computed from the object's state at put() time. If the state changes, hashCode() returns a different value and the entry lands in a different bucket on lookup; get() returns null even though the key was put(). Always use immutable objects (String, Integer, enums) as HashMap keys.
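The failure is easy to reproduce with a mutable List as a key; a sketch (class name invented):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MutableKeyPitfall {
    public static void main(String[] args) {
        Map<List<Integer>, String> map = new HashMap<>();
        List<Integer> key = new ArrayList<>(List.of(1, 2));
        map.put(key, "entry");   // bucket chosen from hashCode of [1, 2]

        key.add(3);              // key state mutates - its hashCode changes

        // Lookup now hashes [1, 2, 3]: stored hash no longer matches,
        // so the entry is unreachable even though it is the same object.
        System.out.println(map.get(key));         // null
        System.out.println(map.containsKey(key)); // false
        System.out.println(map.size());           // 1 - the lost entry still occupies memory
    }
}
```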
04 · Section Four · Collision

Collision Handling

A collision is not an error; it is an expected and inevitable consequence of mapping a large key space into a small bucket array. With 16 buckets and any reasonable load, the birthday paradox guarantees collisions well before the map is full. What matters is the strategy for resolving them: how entries in the same bucket are stored, found, and removed.

Separate Chaining: Java's Approach

Each bucket holds a linked list (chain) of all entries that hash to that bucket. put() appends to the chain; get() traverses the chain comparing keys with equals() until it finds a match or exhausts the list. Java's HashMap uses separate chaining with an optimization: chains longer than 8 entries are converted to red-black trees (treeification), capping worst-case lookup at O(log n).

[Diagram: separate chaining, collision in bucket [3]: "city"→NY, "zip"→10001, and "code"→212 all hash to bucket [3] and form a chain. get("zip"): ① hash("zip") → bucket 3 ② traverse the chain ③ equals("city")? no ④ equals("zip")? yes → 10001. Average O(1) when chains are short.]
equals() traversal is the bottleneck:
  • In a collision chain, each entry is compared using equals(), not hashCode().
  • By the time you reach the chain, you already know the hash matches; equals() determines which exact entry you want.
  • An expensive equals() degrades even a well-distributed HashMap.
Open Addressing: Linear Probing

Open addressing stores all entries in the bucket array itself, with no linked lists. When a collision occurs, the algorithm probes consecutive slots (linear probing) or uses a second hash (double hashing) until an empty slot is found.

[Diagram: linear probing, collision at index 2: inserting key K with hash(K) = 2 finds slot [2] occupied, probes +1, and lands in the first empty slot at [3]. Open addressing: no pointer overhead, better cache locality. Separate chaining: simpler deletion, handles high load better.]
Java uses separate chaining, not open addressing:
  • Java’s HashMap uses separate chaining exclusively.
  • Open addressing is used in some high-performance scenarios (Python’s dict, Google’s SwissTable) because all data stays in one contiguous array, giving better cache behaviour.
Treeification: Java 8's Safety Net

Java 8 introduced treeification: when a single bucket’s chain exceeds 8 entries, that chain is converted from a linked list to a red-black tree. This caps the worst-case lookup in a single bucket at O(log n) instead of O(n), a crucial defense against hash collision attacks where an adversary deliberately sends keys that all hash to the same bucket. When the chain shrinks below 6 entries (via removal), the tree reverts to a linked list.

[Diagram: treeification when chain length ≥ 8: a bucket whose chain has grown to 9 nodes (O(n) lookup) is rebuilt as a red-black tree of height ≈ 3 (O(log n) lookup).]
Treeification is a safety net, not the fast path:
  • If your HashMap is treeifying buckets in production, your hash function is broken: either producing too many collisions or under attack.
  • Fix the hash function first. Treeification prevents catastrophic O(n) but adds memory overhead.
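To see the safety net holding up, the sketch below uses an invented `BadKey` class whose hashCode() is deliberately constant, so every entry collides into one bucket; lookups stay correct, only slower:

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionStress {
    // A legal but deliberately terrible key: equal objects do have equal
    // hashes, but EVERY instance collides into the same bucket.
    static final class BadKey {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
        @Override public int hashCode() { return 42; }  // constant - all collide
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 100; i++) map.put(new BadKey(i), i);
        // All 100 entries share one bucket (eventually treeified),
        // yet every lookup still returns the right value.
        System.out.println(map.get(new BadKey(77))); // 77
        System.out.println(map.size());              // 100
    }
}
```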
05 · Section Five · Operations

Core Operations

put(key, value)
Inserts a new key-value pair or updates the value of an existing key, computing the bucket index from the key's hash.
Pseudocode
function put(map, key, value):
    index = hash(key) & (capacity - 1)   // compute bucket - O(1)
    bucket = table[index]
    for each entry in bucket:
        if entry.key.equals(key):
            entry.value = value          // update existing - O(1) avg
            return
    bucket.add(new Entry(key, value))    // insert new - O(1) avg
    size++
    if size > capacity * loadFactor:
        resize()                         // O(n) - rare, amortized O(1)
put() is an upsert, not an insert:
  • If the key already exists, put() overwrites the value and returns the old value; it does not throw or create a duplicate.
  • Use putIfAbsent() when you want to insert only if the key is missing.
get(key)
Retrieves the value associated with a key by computing its bucket index and searching the chain for an equals() match.
Pseudocode
function get(map, key):
    index = hash(key) & (capacity - 1)   // compute bucket - O(1)
    bucket = table[index]
    for each entry in bucket:
        if entry.key.equals(key):
            return entry.value           // found - O(1) avg
    return null                          // not found - O(1) avg
get() returning null has two meanings:
  • map.get(key) returns null either when the key does not exist or when the key maps to a null value; the two cases are indistinguishable.
  • Use containsKey(key) to distinguish, or getOrDefault(key, fallback) to avoid null entirely.
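A short sketch of the ambiguity and both workarounds (class name invented):

```java
import java.util.HashMap;
import java.util.Map;

public class NullAmbiguity {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", null);                       // key present, value null

        System.out.println(map.get("a"));         // null - value is null
        System.out.println(map.get("b"));         // null - key absent: indistinguishable

        System.out.println(map.containsKey("a")); // true  - disambiguates
        System.out.println(map.containsKey("b")); // false

        System.out.println(map.getOrDefault("b", 0)); // 0 - absent key gets the default
    }
}
```

Note that getOrDefault returns the mapped value when the key is present, so getOrDefault("a", 0) here would still yield null; the clean fix is to never store null values at all.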
remove(key)
Deletes the entry with the specified key from its bucket chain and returns the associated value, or null if absent.
Pseudocode
function remove(map, key):
    index = hash(key) & (capacity - 1)
    bucket = table[index]
    prev = null
    for each entry in bucket:
        if entry.key.equals(key):
            if prev is null: bucket.head = entry.next
            else:            prev.next = entry.next   // bypass - O(1) avg
            size--
            return entry.value
        prev = entry
    return null
remove(key, value) is a conditional delete: map.remove(key, value) deletes the entry only if the key currently maps to that exact value, an atomic check-and-remove useful for concurrent scenarios or stale cache cleanup.
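A quick sketch of the conditional form (class name invented):

```java
import java.util.HashMap;
import java.util.Map;

public class ConditionalRemove {
    public static void main(String[] args) {
        Map<String, Integer> cache = new HashMap<>();
        cache.put("token", 1);

        // Value does not match - nothing is removed.
        System.out.println(cache.remove("token", 2));   // false
        System.out.println(cache.containsKey("token")); // true

        // Exact key-value match - entry removed.
        System.out.println(cache.remove("token", 1));   // true
        System.out.println(cache.containsKey("token")); // false
    }
}
```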
containsKey(key)
Returns true if the map contains a mapping for the specified key; equivalent to get() but explicit about intent.
Pseudocode
function containsKey(map, key):
    index = hash(key) & (capacity - 1)
    bucket = table[index]
    for each entry in bucket:
        if entry.key.equals(key):
            return true                  // O(1) avg
    return false
containsValue() is O(n) - avoid it:
  • containsKey() is O(1) because it uses the hash. containsValue() is O(n) because values are not hashed: the map must scan every entry in every bucket.
  • If you need O(1) value lookup, maintain a second HashMap with keys and values swapped.
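One way to sketch the swapped-map idea, assuming values are unique (class and variable names invented):

```java
import java.util.HashMap;
import java.util.Map;

public class InverseIndex {
    public static void main(String[] args) {
        Map<String, Integer> idByName = new HashMap<>();
        idByName.put("alice", 1);
        idByName.put("bob", 2);

        // Maintain the inverse alongside the original for O(1) value lookup.
        Map<Integer, String> nameById = new HashMap<>();
        idByName.forEach((name, id) -> nameById.put(id, name));

        System.out.println(nameById.get(2));  // bob - O(1), vs an O(n) containsValue scan
    }
}
```

In real code both maps must be updated together on every put and remove, and duplicate values need a policy (here a later put would silently win).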
Resize / Rehash
When the entry count exceeds capacity × loadFactor, the internal array doubles in size and every existing entry is rehashed into the new array.
Pseudocode
function resize(map):
    oldTable = table
    capacity = capacity * 2                        // double - always a power of 2
    table = new Array[capacity]
    for each bucket in oldTable:
        for each entry in bucket:
            newIndex = entry.hash & (capacity - 1) // recompute
            table[newIndex].add(entry)             // re-insert into new table
    // O(n) - visits every entry once
Rehashing changes all bucket positions:
  • After a resize, entries move to new bucket indices because the capacity, and therefore the modulo, changed.
  • Never cache bucket indices or rely on HashMap’s internal structure. Iteration order changes after resize.
Common Mistake: Calling map.get(key) without checking containsKey() first, then treating null as “not found”, when the code put null values into the map earlier. Use getOrDefault(key, 0) for counters, or merge(key, 1, Integer::sum) to increment without null checks.
06 · Section Six · Variants

HashMap Variants

Java provides four Map implementations covering every common use case: plain HashMap for O(1) lookups, LinkedHashMap for ordered iteration, TreeMap for sorted keys, and WeakHashMap for caches that should not prevent garbage collection. Choosing the right implementation is a one-decision question: do you need ordering, and if so, what kind?

LinkedHashMap: Predictable Iteration Order

LinkedHashMap extends HashMap and maintains a doubly linked list running through all entries in insertion order: put() adds to the tail of the list, so iteration always visits entries in the order they were inserted. Optionally, accessOrder=true makes it access-order instead: every get() or put() moves the accessed entry to the tail, making the head always the least-recently-used entry. The LRU cache pattern (evict the head when capacity is exceeded) is built directly on this access-order LinkedHashMap behaviour.

Java: LinkedHashMap usage
// Insertion-order (default) - iterates in put() order
Map<String, Integer> ordered = new LinkedHashMap<>();
ordered.put("c", 3); ordered.put("a", 1); ordered.put("b", 2);
// Iteration: c=3, a=1, b=2 (insertion order)

// Access-order - LRU cache base
Map<String, Integer> lru = new LinkedHashMap<>(16, 0.75f, true) {
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, Integer> e) {
        return size() > 3;   // evict when capacity exceeded
    }
};
lru.put("a", 1); lru.put("b", 2); lru.put("c", 3);
lru.get("a");     // moves "a" to tail (most recently used)
lru.put("d", 4);  // triggers eviction - removes "b" (LRU)
LinkedHashMap is the LRU Cache building block:
  • LeetCode 146 (LRU Cache) can be solved in 10 lines with access-order LinkedHashMap and overriding removeEldestEntry().
  • The doubly linked list adds O(1) overhead per operation; all operations remain O(1) average.
TreeMap: Sorted Keys, Range Queries

TreeMap stores entries in a red-black tree keyed in natural order (or by a provided Comparator), giving O(log n) put(), get(), and remove(): slower than HashMap, but providing sorted iteration and powerful range operations. The unique TreeMap operations (floorKey(), ceilingKey(), firstKey(), lastKey(), subMap()) are the reason to choose TreeMap: they solve range query problems that would require sorting a HashMap’s keys first. In interviews, whenever a problem asks for “nearest value below X” or “all keys between A and B”, TreeMap is likely the intended structure.

Java: TreeMap range operations
TreeMap<Integer, String> tree = new TreeMap<>();
tree.put(5, "five"); tree.put(3, "three");
tree.put(8, "eight"); tree.put(1, "one");

// Sorted iteration - always ascending
tree.forEach((k, v) -> System.out.println(k + "=" + v));
// 1=one, 3=three, 5=five, 8=eight

// Range queries - the TreeMap superpower, each O(log n)
tree.floorKey(4);    // 3 - largest key ≤ 4
tree.ceilingKey(4);  // 5 - smallest key ≥ 4
tree.firstKey();     // 1 - minimum key
tree.subMap(3, true, 8, false);  // {3=three, 5=five} - range [3, 8)
TreeMap vs sorted HashMap:
  • There is no such thing as a sorted HashMap; HashMap’s iteration order is undefined.
  • If you find yourself calling Collections.sort(new ArrayList<>(map.keySet())) repeatedly, the correct structure is a TreeMap from the start.
WeakHashMap: GC-Eligible Keys

WeakHashMap holds weak references to its keys: when a key has no other strong references elsewhere in the program, the garbage collector may collect it, and the corresponding entry is automatically removed from the map. This makes WeakHashMap suitable for caches and metadata stores where the map should not be the reason an object stays alive.

WeakHashMap is not a general-purpose cache:
  • GC timing is non-deterministic: entries may be evicted at any point, even if the JVM has plenty of heap.
  • For a real cache, use a proper caching library (Caffeine, Guava Cache) which gives you control over eviction policy, size limits, and expiry.
ConcurrentHashMap: Thread-Safe Without Full Locking

ConcurrentHashMap provides thread safety without locking the whole table: since Java 8 writes use CAS and synchronize only on the head of the affected bucket (earlier versions used lock striping across segments), so multiple threads can read and write different buckets simultaneously without blocking each other. Use ConcurrentHashMap any time a HashMap is shared between threads; never use a plain HashMap in a multi-threaded context without external synchronization, and prefer ConcurrentHashMap over Collections.synchronizedMap(), which serializes every operation on a single lock.

computeIfAbsent() is atomic in ConcurrentHashMap:
  • computeIfAbsent(key, mappingFunction) checks for the key and inserts the computed value atomically, with no race condition.
  • This is the correct pattern for building a concurrent frequency counter or a shared memoization cache.
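A sketch of a thread-safe counter; merge() on ConcurrentHashMap is atomic, so the four threads below lose no increments (class name and counts invented):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ConcurrentCounter {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();

        // Four threads hammer the same key; each merge() call is atomic,
        // so no increment is lost to a race.
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) hits.merge("page", 1, Integer::sum);
            });
            threads[t].start();
        }
        for (Thread t : threads) t.join();

        System.out.println(hits.get("page")); // 4000 - every increment counted
    }
}
```

The same loop on a plain HashMap would intermittently undercount or corrupt the table.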
Implementation | Order | get/put | Null keys | Best For
HashMap | None (undefined) | O(1) avg | Yes (1 null) | General purpose
LinkedHashMap | Insertion or LRU | O(1) avg | Yes (1 null) | LRU cache, ordered output
TreeMap | Sorted (natural) | O(log n) | No | Range queries, sorted iteration
WeakHashMap | None | O(1) avg | Yes (1 null) | Metadata, weak caches
ConcurrentHashMap | None | O(1) avg | No | Multi-threaded maps
Hashtable | None | O(1) avg | No | Legacy - never use

⚠ Hashtable is fully synchronized legacy code; use ConcurrentHashMap for thread safety

07 · Section Seven · Implementation

Build It Once

Build this from scratch once; it makes the internals concrete. In any real project or interview, reach for the built-in (Section 08).

Java: HashMap core mechanics
// Build once to understand - use HashMap<K,V> in all real code
static class Entry<K, V> {
    K key; V value; int hash; Entry<K, V> next;
    Entry(K key, V value, int hash, Entry<K, V> next) {
        this.key = key; this.value = value;
        this.hash = hash; this.next = next;
    }
}

Entry<K, V>[] table;
int size;
int capacity = 16;
float loadFactor = 0.75f;

// O(1) avg - insert or update key
V put(K key, V value) {
    int h = (key == null) ? 0 : key.hashCode() ^ (key.hashCode() >>> 16);
    // bitwise AND - works because capacity is 2^n
    int i = h & (capacity - 1);
    // search for existing key (null-guard before equals, as the JDK does)
    for (Entry<K, V> e = table[i]; e != null; e = e.next) {
        if (e.hash == h && (e.key == key || (key != null && key.equals(e.key)))) {
            V old = e.value;
            e.value = value;
            return old;
        }
    }
    // key not found - insert at head of chain
    table[i] = new Entry<>(key, value, h, table[i]);
    if (++size > capacity * loadFactor) {
        resize();
    }
    return null;
}

// O(1) avg - retrieve value by key
V get(K key) {
    int h = (key == null) ? 0 : key.hashCode() ^ (key.hashCode() >>> 16);
    for (Entry<K, V> e = table[h & (capacity - 1)]; e != null; e = e.next) {
        if (e.hash == h && (e.key == key || (key != null && key.equals(e.key)))) {
            return e.value;
        }
    }
    return null;
}

// O(n) - double capacity and rehash all entries
void resize() {
    Entry<K, V>[] old = table;
    capacity *= 2;
    table = new Entry[capacity];
    for (Entry<K, V> head : old) {
        for (Entry<K, V> e = head; e != null; ) {
            Entry<K, V> next = e.next;
            int i = e.hash & (capacity - 1);  // recompute index for new capacity
            e.next = table[i];                // prepend to new bucket
            table[i] = e;
            e = next;
        }
    }
}
08 · Section Eight · Java Built-in

Use It In Java

IN JAVA
USE HashMap<K, V> map = new HashMap<>()
IMPORT import java.util.HashMap;
import java.util.Map;
METHODS
Method | Complexity | Note
put(key, value) | O(1) avg | Returns old value or null
get(key) | O(1) avg | Returns null if absent
remove(key) | O(1) avg | Returns removed value or null
containsKey(key) | O(1) avg | Always prefer over get() != null
containsValue(value) | O(n) | Scans all entries - avoid in loops
getOrDefault(key, def) | O(1) avg | Avoids null checks for counters
putIfAbsent(key, value) | O(1) avg | Insert only if key missing
merge(key, value, fn) | O(1) avg | Ideal for frequency counting
computeIfAbsent(key, fn) | O(1) avg | Lazy value creation (grouping)
forEach((k,v) -> ...) | O(n) | Preferred iteration
GOTCHA HashMap is not thread-safe: concurrent reads and writes from multiple threads corrupt the internal structure silently and can cause infinite loops during resize. Never share a plain HashMap between threads. Use ConcurrentHashMap, or wrap with Collections.synchronizedMap() only if you cannot use ConcurrentHashMap.
CHOOSE Use HashMap when you need O(1) key lookup and iteration order does not matter; it is the default for 90% of use cases. Switch to LinkedHashMap when you need predictable iteration order or LRU eviction. Switch to TreeMap when you need sorted keys or range queries. Never use Hashtable.
09 · Section Nine · Decision Guide

Pick The Right Structure

Decision flowchart - choosing the right map:
Need key → value mapping?
  Sorted keys?       → TreeMap<K,V> (floorKey, ceilingKey, O(log n))
  Insertion order?   → LinkedHashMap (LRU cache, ordered output)
  Multi-threaded?    → ConcurrentHashMap
  None of the above  → HashMap<K,V> (O(1) avg, the default)
Only membership, no values? → HashSet<V> (same internals, no values)
Structure | get/put | Ordered | Null key | Thread-safe | Best For
HashMap ← this | O(1) avg | No | Yes | No | General purpose lookup
LinkedHashMap | O(1) avg | Insertion | Yes | No | LRU cache, ordered output
TreeMap | O(log n) | Sorted | No | No | Range queries, sorted keys
HashSet | O(1) avg | No | Yes | No | Membership, deduplication
ConcurrentHashMap | O(1) avg | No | No | Yes | Shared maps across threads
ArrayList | O(1) idx | By index | Yes | No | Ordered list, random access
HashMap solves more interview problems than any other structure:
  • Frequency counting (map.merge(key, 1, Integer::sum)), grouping (map.computeIfAbsent(key, k -> new ArrayList<>()).add(val)), and existence checking (map.containsKey(key)) are the three patterns that unlock the majority of medium interview problems.
  • When a brute-force O(n²) solution uses a nested loop to find matches, the O(n) solution almost always involves replacing the inner loop with a HashMap lookup.
10 · Section Ten · Patterns

Common HashMap Patterns

Five patterns account for the vast majority of HashMap usage in interviews and production code. Recognising which pattern a problem requires is more important than the implementation itself; the code is always short once the pattern is clear.

Pattern 1 - Frequency Counting
Count how many times each element appears, then query the map to answer questions about duplicates, majorities, or top-K.
Java: Frequency counting idioms
// Classic loop - works everywhere
Map<String, Integer> freq = new HashMap<>();
for (String s : words) {
    freq.put(s, freq.getOrDefault(s, 0) + 1);
}

// Modern - merge() is cleaner
for (String s : words) {
    freq.merge(s, 1, Integer::sum);   // if absent → 1, else += 1
}

// One-liner with streams
Map<String, Long> freq2 = Arrays.stream(words)
    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
merge() vs getOrDefault():
  • merge(key, 1, Integer::sum) does the same thing as getOrDefault + put in one call.
  • It reads better, avoids the intermediate variable, and is atomic in ConcurrentHashMap.
Pattern 2 - Complement Lookup
Store what you have seen so far, and check whether the complement of the current element exists, turning an O(n²) nested loop into O(n).
Java: Two Sum (the canonical example)
// Two Sum - find indices of two numbers that add to target
Map<Integer, Integer> seen = new HashMap<>();   // value → index
for (int i = 0; i < nums.length; i++) {
    int complement = target - nums[i];
    if (seen.containsKey(complement))
        return new int[] { seen.get(complement), i };
    seen.put(nums[i], i);   // store after checking
}
// O(n) time, O(n) space - one pass
Store after checking, not before:
  • Putting the current element into the map before checking the complement can cause false positives (an element matching with itself).
  • Always check first, then store.
Pattern 3 - Grouping / Bucketing
Use computeIfAbsent() to collect items sharing a canonical key into lists: anagram grouping, character frequency grouping, graph adjacency.
Java: Group Anagrams
// Group Anagrams - canonical key = sorted characters
Map<String, List<String>> groups = new HashMap<>();
for (String s : strs) {
    char[] chars = s.toCharArray();
    Arrays.sort(chars);
    String key = new String(chars);   // canonical form
    groups.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
}
return new ArrayList<>(groups.values());
// O(n · k log k) where k = max string length
computeIfAbsent() vs putIfAbsent():
  • computeIfAbsent(key, fn) creates the value lazily: the function only runs when the key is absent. putIfAbsent(key, value) requires the value to already exist.
  • For lists, computeIfAbsent is always the right choice.
Pattern 4 - Prefix Sum + HashMap
Store cumulative sums as keys: if (currentSum - target) exists as a key, the subarray between those positions sums to target.
Java: Subarray Sum Equals K
// Subarray Sum Equals K - count subarrays summing to k
Map<Integer, Integer> prefixCount = new HashMap<>();
prefixCount.put(0, 1);   // empty prefix
int sum = 0, count = 0;
for (int num : nums) {
    sum += num;
    count += prefixCount.getOrDefault(sum - k, 0);
    prefixCount.merge(sum, 1, Integer::sum);
}
return count;
// O(n) time, O(n) space - one pass
Seed the map with (0, 1): the prefix sum map must be initialized with prefixCount.put(0, 1); this handles subarrays that start at index 0 and sum to exactly k.
Pattern 5 - Sliding Window + HashMap
Track character or element frequencies inside a sliding window using a HashMap: expand right to include, shrink left to exclude, update the map at each step.
Java: Longest Substring Without Repeating Characters
// Longest substring without repeating characters
Map<Character, Integer> lastSeen = new HashMap<>();
int left = 0, maxLen = 0;
for (int right = 0; right < s.length(); right++) {
    char c = s.charAt(right);
    if (lastSeen.containsKey(c) && lastSeen.get(c) >= left)
        left = lastSeen.get(c) + 1;   // shrink window past the duplicate
    lastSeen.put(c, right);
    maxLen = Math.max(maxLen, right - left + 1);
}
return maxLen;
// O(n) time, O(min(n, charset)) space
HashMap vs int[128] for character windows:
  • For ASCII characters, an int[128] array is faster than a HashMap<Character, Integer>: direct indexing, no boxing, no hashing.
  • Use HashMap when the element type is unbounded (strings, integers, objects).
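A sketch of the array-based variant of the window above, assuming ASCII input (class name invented):

```java
import java.util.Arrays;

public class AsciiWindow {
    // Longest substring without repeating characters, ASCII-only:
    // int[128] replaces HashMap<Character, Integer> - no boxing, no hashing.
    static int lengthOfLongestSubstring(String s) {
        int[] lastSeen = new int[128];
        Arrays.fill(lastSeen, -1);   // -1 = character not seen yet
        int left = 0, maxLen = 0;
        for (int right = 0; right < s.length(); right++) {
            char c = s.charAt(right);
            if (lastSeen[c] >= left) left = lastSeen[c] + 1; // jump past duplicate
            lastSeen[c] = right;
            maxLen = Math.max(maxLen, right - left + 1);
        }
        return maxLen;
    }

    public static void main(String[] args) {
        System.out.println(lengthOfLongestSubstring("abcabcbb")); // 3 ("abc")
        System.out.println(lengthOfLongestSubstring("bbbbb"));    // 1 ("b")
    }
}
```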
Common Mistake: Modifying a HashMap while iterating over it with a for-each loop throws ConcurrentModificationException. Use Iterator.remove(), or collect the keys to remove into a separate list first. Map.replaceAll() and removeIf() on the view collections (keySet(), values(), entrySet()) are iteration-safe alternatives.
11 · Section Eleven · Complexity

Complexity Reference

HashMap operations are O(1) average-case because the hash function computes the bucket index in constant time and the chain in each bucket is expected to have length about 1 under a good hash function and the 0.75 load factor. The worst case, where all keys collide into one bucket, is O(n) for a linked-list chain, but Java 8’s treeification caps it at O(log n). Resize is O(n) but happens infrequently enough to amortize to O(1) per insertion.

Time Complexity
Operation | Average | Worst (pre-Java 8) | Worst (Java 8+) | Notes
put(key, value) | O(1) | O(n) | O(log n) | Amortized O(1) including resize
get(key) | O(1) | O(n) | O(log n) | Hash + chain/tree traversal
remove(key) | O(1) | O(n) | O(log n) | Hash + chain/tree traversal
containsKey(key) | O(1) | O(n) | O(log n) | Same as get()
containsValue(value) | O(n) | O(n) | O(n) | Scans all entries - avoid
resize / rehash | O(n) | O(n) | O(n) | Visits every entry once
keySet() iteration | O(n + cap) | O(n + cap) | O(n + cap) | Visits every bucket, even empty
Space Complexity
Component | Space | Notes
Bucket array | O(capacity) | Always a power of 2, starts at 16
Entries (chain nodes) | O(n) | One Entry object per key-value pair
Total | O(n + capacity) | Capacity stays within a small constant factor of n at load factor 0.75
Per-entry overhead | ~48 bytes | Key ref + value ref + hash int + next pointer + object header
Capacity vs size:
  • A new HashMap<>() allocates a 16-slot bucket array but stores 0 entries.
  • After 12 insertions (16 × 0.75), it resizes to 32 slots.
  • If you know the expected size, construct with new HashMap<>((int) (expectedSize / 0.75f) + 1) to avoid resizing.
  • Starting with a good initial capacity can eliminate all resize operations.
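The presizing advice as a sketch (class name and sizes invented):

```java
import java.util.HashMap;
import java.util.Map;

public class Presizing {
    public static void main(String[] args) {
        int expected = 10_000;
        // Capacity chosen so the expected entries stay under the 0.75
        // threshold: no O(n) resize/rehash fires during the fill.
        Map<Integer, Integer> map = new HashMap<>((int) (expected / 0.75f) + 1);
        for (int i = 0; i < expected; i++) map.put(i, i * i);

        System.out.println(map.size());      // 10000
        System.out.println(map.get(9_999));  // 99980001
    }
}
```

HashMap rounds the requested capacity up to the next power of 2 internally, so the exact constructor argument only needs to clear the threshold.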
Amortized Analysis: Why Resize Is Still O(1) Per Put

Each resize doubles the capacity and moves all n entries: O(n) work. But resize only triggers after n insertions (where n = capacity × loadFactor). Distributing the O(n) resize cost across the n insertions that caused it gives O(1) amortized cost per put(). This is the same accounting trick that makes ArrayList.add() amortized O(1) despite occasional O(n) array copies.

When HashMap is fast

O(1) average achieved when

  • Hash function distributes keys uniformly
  • Load factor stays near default (0.75)
  • Keys are immutable (String, Integer, enum)
  • equals() and hashCode() are cheap and consistent
When HashMap degrades

Performance killers

  • All keys hash to the same bucket → O(n) per operation
  • Expensive equals() → high constant factor per comparison
  • Mutable keys → lost entries after state change
  • Very low load factor → wasted memory, frequent iteration over empty buckets
12 · Section Twelve · Practice

Problems To Solve

HashMap interview problems cluster around three patterns: frequency counting (scan once, store counts, query in O(1)), prefix sums with a map (store cumulative totals as keys to find subarrays in O(1)), and grouping (use computeIfAbsent to collect items sharing a canonical form). When a problem asks “does X exist?” or “how many times does X appear?” and a brute-force solution would scan repeatedly, the O(n) solution is almost always a HashMap.

Difficulty | Pattern | Problem | Key Insight
Easy | hash-map | Two Sum (LC 1) | Store each number’s complement in a HashMap as you scan; if the current number exists as a key, you found the pair in O(1) rather than scanning backwards.
Medium | hash-map | Group Anagrams (LC 49) | Sort each string to get its canonical form; strings that are anagrams produce the same sorted key, so group them in a HashMap<String, List<String>>.
Medium | hash-map | Top K Frequent Elements (LC 347) | Count frequencies with a HashMap, then use a min-heap of size k; the heap keeps the k largest frequencies without sorting the full frequency map.
Medium | prefix-sum | Subarray Sum Equals K (LC 560) | Store prefix sums in a HashMap; if prefixSum - k exists as a key, the subarray between those indices sums to k. One pass, O(n) time, O(n) space.
Medium | hash-set | Longest Consecutive Sequence (LC 128) | Put all numbers in a HashSet, then for each number that has no left neighbour (n-1 not in set), count how far right the sequence extends; O(n) total.
Hard | design | LRU Cache (LC 146) | HashMap for O(1) lookup by key, doubly linked list for O(1) move-to-front and eviction from tail; both operations must be O(1) simultaneously.
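The Longest Consecutive Sequence insight from the table, sketched (class name invented):

```java
import java.util.HashSet;
import java.util.Set;

public class LongestConsecutive {
    static int longestConsecutive(int[] nums) {
        Set<Integer> set = new HashSet<>();
        for (int n : nums) set.add(n);
        int best = 0;
        for (int n : set) {
            if (set.contains(n - 1)) continue;    // not a sequence start - skip
            int len = 1;
            while (set.contains(n + len)) len++;  // extend right from the start
            best = Math.max(best, len);
        }
        return best;
    }

    public static void main(String[] args) {
        // Sequence 1,2,3,4 has length 4.
        System.out.println(longestConsecutive(new int[]{100, 4, 200, 1, 3, 2})); // 4
    }
}
```

Each number is visited a constant number of times (once in the outer loop, once inside some extension), so the whole scan is O(n) despite the nested while.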
Interview Pattern:
  • The single most powerful HashMap pattern is the complement lookup: store what you have seen, check whether what you need exists.
  • Two Sum does this for pairs. Subarray Sum Equals K does this for prefix sums.
  • Group Anagrams does this for canonical forms.
  • Once you see the pattern, roughly 40% of medium array/string problems reduce to a single HashMap pass.

→ See the full HashMap practice set