A HashMap stores key-value pairs using a hash function to map each key to a bucket index – giving O(1) average-case get, put, and remove. The most versatile structure in the Java standard library.
Hashing
HashMap Fundamentals
O(1) average-case get, put, and remove by hashing keys to bucket indices. The single most versatile data structure in the Java standard library – underpinning frequency counting, memoization, grouping, and the majority of interview solutions.
Every key inserted into a HashMap is transformed by a hash function into an array index – a bucket – allowing direct access to its value without scanning any other entries. O(1) average access is possible because the hash function computes the bucket index in constant time, and reading or writing one array cell is O(1) regardless of how many total entries exist. The word “average” matters: in the worst case, every key hashes to the same bucket and the map degenerates to a linked list with O(n) lookup; a good hash function distributes keys uniformly to make this astronomically unlikely in practice. A HashMap stores pairs, not values alone – any object can be a key as long as it correctly implements hashCode() and equals(), the two methods the map uses to locate and identify entries. Frequency counting, memoization, grouping, deduplication, and graph adjacency representation are all built on HashMap; it is the single most versatile data structure in the Java standard library.
Analogy: A library with a card catalogue. Instead of searching every shelf for a book, you look up the title in the catalogue, which tells you the exact shelf number in O(1). The catalogue is the hash function – it maps the book title (key) to a shelf location (bucket index). If two books map to the same shelf, the librarian keeps a short list on that shelf to distinguish them – that is collision chaining. The library never gets slower to look up a single book as more books are added, as long as the catalogue stays accurate.
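The catalogue analogy maps directly onto the Map API; a minimal sketch (book titles and shelf numbers are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class Basics {
    public static void main(String[] args) {
        Map<String, Integer> shelf = new HashMap<>();
        shelf.put("Dune", 42);     // hash("Dune") -> bucket index -> O(1) write
        shelf.put("Hamlet", 7);

        System.out.println(shelf.get("Dune"));         // 42 - direct bucket access
        System.out.println(shelf.containsKey("Emma")); // false
        shelf.remove("Hamlet");                        // O(1) avg delete
        System.out.println(shelf.size());              // 1
    }
}
```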
02
Section Two · Mental Model
How It Thinks
Hash function maps key to bucket index in O(1)
→ get() and put() skip every other entry – direct access

Two keys can hash to the same bucket (collision)
→ Bucket holds a chain or probe sequence – O(1) avg, O(n) worst

Load factor threshold reached (default 0.75)
→ Internal array doubles and all entries rehash – O(n) resize

Keys must implement hashCode() and equals() correctly
→ Wrong hashCode() causes entries to land in wrong buckets – get() returns null for keys that exist

No ordering of keys guaranteed
→ Iteration order is undefined and changes after resize – never rely on HashMap iteration order

Java 8+: chain longer than 8 entries treeifies to red-black tree
→ Worst-case bucket lookup is capped at O(log n) instead of O(n), even under catastrophic hash collisions
03
Section Three · Internals
How Hashing Works
Storing a key-value pair requires answering one question in O(1): given this key, which array slot does it belong in? The hash function answers that question by converting any key – a string, an integer, an object – into an integer, then compressing that integer into the range [0, capacity-1] using the modulo operation. The quality of the hash function determines how evenly entries are distributed across buckets, which determines whether the map runs in O(1) or degrades toward O(n).
Hashing pipeline – key to bucket index
hashCode() contract
What Java requires
Equal objects MUST have equal hash codes
Unequal objects SHOULD have different hash codes (collisions allowed but hurt performance)
hashCode() must be consistent – same result every call for the same object state
If you override equals(), you MUST override hashCode()
What breaks silently
Violating the contract
If two objects are equals() but return different hashCode() values, the map stores them in different buckets. get() will never find a key that was put() because it looks in the wrong bucket. The map returns null for keys that are provably present.
Using mutable fields in hashCode() – value changes, bucket changes
Forgetting to override hashCode() when overriding equals()
Custom objects as keys without hashCode() implementation
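A minimal sketch of the silent failure mode – a hypothetical Point key that overrides equals() but not hashCode():

```java
import java.util.HashMap;
import java.util.Map;

public class BrokenKey {
    static class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
        }
        // hashCode() not overridden - falls back to identity hash,
        // so two equal Points almost never share a hash code
    }

    public static void main(String[] args) {
        Map<Point, String> map = new HashMap<>();
        map.put(new Point(1, 2), "stored");

        // An equal key hashes to a different (identity-based) bucket:
        System.out.println(map.get(new Point(1, 2))); // null - entry unreachable
        System.out.println(map.size());               // 1 - but it still exists
    }
}
```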
String.hashCode() – How Java Computes It
Java computes String's hash code as a polynomial rolling hash over the character values, producing a consistent integer for any string value.
Java – JDK String.hashCode() source
// Java's String.hashCode() – from the JDK source
// s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
int hash = 0;
for (int i = 0; i < value.length; i++) {
hash = 31 * hash + value[i]; // 31 chosen: prime, fast (32x - x)
}
// "cat" → (99×31² + 97×31 + 116) = 98262

// Compression into bucket index (Java's actual formula):
int index = (capacity - 1) & hash; // bitwise AND – faster than modulo
// only works when capacity is a power of 2
Why capacity is always a power of 2:
Java’s HashMap always sizes its bucket array at a power of 2 – 16, 32, 64, etc.
This lets the modulo operation be replaced by a bitwise AND: hash % capacity = hash & (capacity - 1), which is significantly faster.
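A small check of that identity for power-of-2 capacities (the sample hash values are illustrative, except 98262, the "cat" hash computed above):

```java
public class PowerOfTwoMask {
    public static void main(String[] args) {
        int capacity = 16; // power of 2 -> capacity - 1 = 0b1111
        int[] hashes = {98262, 96354, 12345}; // example hash codes

        for (int h : hashes) {
            // For non-negative h, masking keeps exactly h % capacity
            assert (h & (capacity - 1)) == (h % capacity);
            System.out.println(h + " -> bucket " + (h & (capacity - 1)));
        }

        // Unlike %, the mask also yields a valid index for negative hashes:
        int neg = -42;
        System.out.println(neg % capacity);       // -10 (not a valid index)
        System.out.println(neg & (capacity - 1)); // 6 (valid index)
    }
}
```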
Load Factor – When Does HashMap Resize?
The load factor is the ratio of entries to capacity that triggers a resize – with the default of 0.75, the map resizes once the number of entries exceeds 75% of the capacity.
Java โ Load factor configuration
// Default load factor = 0.75
// Resize triggers when: size > capacity × loadFactor
// Default capacity = 16 → resize after 12 entries
// Custom load factor – lower = fewer collisions, more memory
Map<String, Integer> map = new HashMap<>(32, 0.5f);
//                                       ↑    ↑
//                        initial capacity    load factor (resizes at 16 entries)

// Higher load factor → more memory-efficient, slower
Map<String, Integer> dense = new HashMap<>(16, 0.9f);
// Resizes after 14 entries – more collisions before resize
0.75 is the empirically optimal trade-off:
At load factor 0.75, a HashMap has a good balance between memory usage and lookup performance โ most buckets are empty or hold one entry, keeping the average chain length near 1.
A lower load factor wastes memory; a higher one causes more collisions.
Common Mistake: Using a mutable object as a HashMap key and then modifying it after insertion. The hash code is computed from the object’s state at put() time. If the state changes, hashCode() returns a different value and the entry lands in a different bucket on lookup – get() returns null even though the key was put(). Always use immutable objects (String, Integer, enums) as HashMap keys.
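The trap can be reproduced with a List key, whose hashCode() depends on its contents – a minimal sketch:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MutableKeyTrap {
    public static void main(String[] args) {
        Map<List<String>, String> map = new HashMap<>();
        List<String> key = new ArrayList<>(List.of("a"));
        map.put(key, "value");            // bucket chosen from hashCode() now

        key.add("b");                     // mutating the key changes its hash

        System.out.println(map.get(key)); // null - lookup probes the wrong bucket
        System.out.println(map.size());   // 1 - the entry still exists, unreachable
    }
}
```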
04
Section Four · Collision
Collision Handling
A collision is not an error – it is an expected and inevitable consequence of mapping a large key space into a small bucket array. With 16 buckets and any reasonable load, the birthday paradox guarantees collisions well before the map is full. What matters is the strategy for resolving them: how entries in the same bucket are stored, found, and removed.
Separate Chaining – Java's Approach
Each bucket holds a linked list (chain) of all entries that hash to that bucket. put() appends to the chain; get() traverses the chain comparing keys with equals() until it finds a match or exhausts the list. Java’s HashMap uses separate chaining with an optimization: chains longer than 8 entries are converted to red-black trees (treeification), capping worst-case lookup at O(log n).
Separate chaining – collision in bucket [3]
equals() traversal is the bottleneck:
In a collision chain, each entry is compared using equals() – not hashCode().
By the time you reach the chain, you already know the hash matches; equals() determines which exact entry you want.
An expensive equals() degrades even a well-distributed HashMap.
Open Addressing – Linear Probing
Open addressing stores all entries in the bucket array itself – no linked lists. When a collision occurs, the algorithm probes consecutive slots (linear probing) or uses a second hash (double hashing) until an empty slot is found.
Linear probing – collision at index 2, probe to index 3
Java uses separate chaining – not open addressing:
Java’s HashMap uses separate chaining exclusively.
Open addressing is used in some high-performance scenarios (Python’s dict, Google’s SwissTable) because all data stays in one contiguous array – better cache behaviour.
Treeification – Java 8's Safety Net
Java 8 introduced treeification: when a single bucket’s chain exceeds 8 entries (and the table has at least 64 buckets – below that, the map resizes instead), that chain is converted from a linked list to a red-black tree. This caps the worst-case lookup in a single bucket at O(log n) instead of O(n) – a crucial defense against hash collision attacks where an adversary deliberately sends keys that all hash to the same bucket. When the chain shrinks to 6 or fewer entries, the tree reverts to a linked list.
Treeification – chain to red-black tree when length ≥ 8
Treeification is a safety net, not the fast path:
If your HashMap is treeifying buckets in production, your hash function is broken – either producing too many collisions or under attack.
Fix the hash function first. Treeification prevents catastrophic O(n) but adds memory overhead.
05
Section Five · Operations
Core Operations
put(key, value)
Inserts a new key-value pair or updates the value of an existing key, computing the bucket index from the key's hash.
Pseudocode
function put(map, key, value):
    index = hash(key) & (capacity - 1)   // compute bucket – O(1)
    bucket = table[index]
    for each entry in bucket:
        if entry.key.equals(key):
            entry.value = value          // update existing – O(1) avg
            return
    bucket.add(new Entry(key, value))    // insert new – O(1) avg
    size++
    if size > capacity * loadFactor:
        resize()                         // O(n) – rare, amortized O(1)
put() is an upsert – not an insert:
If the key already exists, put() overwrites the value and returns the old value – it does not throw or create a duplicate.
Use putIfAbsent() when you want to insert only if the key is missing.
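A short sketch contrasting the two calls (keys and values are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class UpsertDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();

        System.out.println(map.put("a", 1));          // null - no previous value
        System.out.println(map.put("a", 2));          // 1 - old value returned, now overwritten

        System.out.println(map.putIfAbsent("a", 99)); // 2 - key present, no change
        System.out.println(map.get("a"));             // 2
    }
}
```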
get(key)
Retrieves the value associated with a key by computing its bucket index and searching the chain for an equals() match.
Pseudocode
function get(map, key):
    index = hash(key) & (capacity - 1)   // compute bucket – O(1)
    bucket = table[index]
    for each entry in bucket:
        if entry.key.equals(key):
            return entry.value           // found – O(1) avg
    return null                          // not found – O(1) avg
get() returning null has two meanings:
map.get(key) returns null either when the key does not exist or when the key maps to a null value – the two cases are indistinguishable.
Use containsKey(key) to distinguish, or getOrDefault(key, fallback) to avoid null entirely.
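A minimal sketch of the ambiguity and the two ways around it:

```java
import java.util.HashMap;
import java.util.Map;

public class NullAmbiguity {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("present-but-null", null);

        System.out.println(map.get("present-but-null")); // null
        System.out.println(map.get("missing"));          // null - indistinguishable

        // containsKey() tells the two cases apart:
        System.out.println(map.containsKey("present-but-null")); // true
        System.out.println(map.containsKey("missing"));          // false

        // getOrDefault() sidesteps null for absent keys:
        System.out.println(map.getOrDefault("missing", 0));      // 0
    }
}
```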
remove(key)
Deletes the entry with the specified key from its bucket chain and returns the associated value, or null if absent.
Pseudocode
function remove(map, key):
    index = hash(key) & (capacity - 1)
    bucket = table[index]
    prev = null
    for each entry in bucket:
        if entry.key.equals(key):
            if prev is null: bucket.head = entry.next
            else: prev.next = entry.next // bypass – O(1) avg
            size--
            return entry.value
        prev = entry
    return null
remove(key, value) – conditional delete: map.remove(key, value) deletes the entry only if the key currently maps to that exact value – a check-and-remove that is atomic in ConcurrentHashMap, useful for concurrent scenarios or stale cache cleanup.
containsKey(key)
Returns true if the map contains a mapping for the specified key โ equivalent to get() but explicit about intent.
Pseudocode
function containsKey(map, key):
    index = hash(key) & (capacity - 1)
    bucket = table[index]
    for each entry in bucket:
        if entry.key.equals(key):
            return true                  // O(1) avg
    return false
containsValue() is O(n) – avoid it:
containsKey() is O(1) because it uses the hash. containsValue() is O(n) because values are not hashed – the map must scan every entry in every bucket.
If you need O(1) value lookup, maintain a second HashMap with keys and values swapped.
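A sketch of the swapped-map idea (names and ids are illustrative; it assumes values are unique):

```java
import java.util.HashMap;
import java.util.Map;

public class InverseMap {
    public static void main(String[] args) {
        Map<String, Integer> idByName = new HashMap<>();
        Map<Integer, String> nameById = new HashMap<>();

        // Every write goes to both maps to keep them in sync
        idByName.put("alice", 1);
        nameById.put(1, "alice");
        idByName.put("bob", 2);
        nameById.put(2, "bob");

        System.out.println(idByName.get("bob")); // 2 - O(1)
        System.out.println(nameById.get(2));     // bob - O(1), no O(n) containsValue scan
    }
}
```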
Resize / Rehash
When the entry count exceeds capacity × loadFactor, the internal array doubles in size and every existing entry is rehashed into the new array.
Pseudocode
function resize(map):
    oldTable = table
    capacity = capacity * 2              // double – always a power of 2
    table = new Array[capacity]
    for each bucket in oldTable:
        for each entry in bucket:
            newIndex = entry.hash & (capacity - 1)  // recompute
            table[newIndex].add(entry)              // re-insert into new table
    // O(n) – visits every entry once
Rehashing changes all bucket positions:
After a resize, entries move to new bucket indices because the capacity – and therefore the modulo – changed.
Never cache bucket indices or rely on HashMap’s internal structure. Iteration order changes after resize.
Common Mistake: Calling map.get(key) without checking containsKey() first, then treating null as “not found” – but the code put null values into the map earlier. Use getOrDefault(key, 0) for counters, or merge(key, 1, Integer::sum) to increment without null checks.
06
Section Six · Variants
HashMap Variants
Java provides four Map implementations covering every common use case – plain HashMap for O(1) lookups, LinkedHashMap for ordered iteration, TreeMap for sorted keys, and WeakHashMap for caches that should not prevent garbage collection. Choosing the right implementation is a one-decision question: do you need ordering, and if so, what kind?
LinkedHashMap – Predictable Iteration Order
LinkedHashMap extends HashMap and maintains a doubly linked list running through all entries in insertion order – put() adds to the tail of the list, so iteration always visits entries in the order they were inserted. Optionally, accessOrder=true makes it access-order instead: every get() or put() moves the accessed entry to the tail, making the head always the least-recently-used entry. The LRU cache pattern – evict the head when capacity is exceeded – is built directly on this access-order LinkedHashMap behaviour.
LeetCode 146 (LRU Cache) can be solved in 10 lines with access-order LinkedHashMap and overriding removeEldestEntry().
The doubly linked list adds a constant amount of bookkeeping per operation – all operations remain O(1) average.
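A minimal LRU cache along these lines (capacity 2 for demonstration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: get() moves entry to tail
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict head (least recently used) past capacity
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a" - now "b" is least recently used
        cache.put("c", 3); // evicts "b"

        System.out.println(cache.containsKey("b")); // false
        System.out.println(cache.keySet());         // [a, c]
    }
}
```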
TreeMap – Sorted Keys, Range Queries
TreeMap stores entries in a red-black tree keyed in natural order (or by a provided Comparator), giving O(log n) put(), get(), and remove() – slower than HashMap but providing sorted iteration and powerful range operations. The unique TreeMap operations – floorKey(), ceilingKey(), firstKey(), lastKey(), subMap() – are the reason to choose TreeMap: they solve range query problems that would require sorting a HashMap’s keys first. In interviews, whenever a problem asks for “nearest value below X” or “all keys between A and B”, TreeMap is likely the intended structure.
There is no such thing as a sorted HashMap – HashMap’s iteration order is undefined.
If you find yourself calling Collections.sort(new ArrayList<>(map.keySet())) repeatedly, the correct structure is a TreeMap from the start.
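A short sketch of the navigation methods (keys and labels are illustrative):

```java
import java.util.TreeMap;

public class TreeMapRange {
    public static void main(String[] args) {
        TreeMap<Integer, String> tiers = new TreeMap<>();
        tiers.put(10, "bronze");
        tiers.put(50, "silver");
        tiers.put(100, "gold");

        System.out.println(tiers.floorKey(75));    // 50 - nearest key <= 75
        System.out.println(tiers.ceilingKey(75));  // 100 - nearest key >= 75
        System.out.println(tiers.firstKey());      // 10
        System.out.println(tiers.subMap(10, 100)); // {10=bronze, 50=silver} - range [10, 100)
    }
}
```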
WeakHashMap – GC-Eligible Keys
WeakHashMap holds weak references to its keys – when a key has no other strong references elsewhere in the program, the garbage collector may collect it, and the corresponding entry is automatically removed from the map. This makes WeakHashMap suitable for caches and metadata stores where the map should not be the reason an object stays alive.
WeakHashMap is not a general-purpose cache:
GC timing is non-deterministic – entries may be evicted at any point, even if the JVM has plenty of heap.
For a real cache, use a proper caching library (Caffeine, Guava Cache) which gives you control over eviction policy, size limits, and expiry.
ConcurrentHashMap – Thread-Safe Without Full Locking
ConcurrentHashMap locks at fine granularity – Java 8+ synchronizes on individual bucket heads (earlier versions used segment-level lock striping) – so multiple threads can read and write different buckets simultaneously without blocking each other. Use ConcurrentHashMap any time a HashMap is shared between threads; never use HashMap in a multi-threaded context without external synchronization, and prefer it over Collections.synchronizedMap(), which serializes every operation on a single lock.
computeIfAbsent() is atomic in ConcurrentHashMap:
computeIfAbsent(key, mappingFunction) checks for the key and inserts the computed value atomically – no race condition.
This is the correct pattern for building a concurrent frequency counter or a shared memoization cache.
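A minimal concurrent frequency counter along these lines (the word list is illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentCounter {
    public static void main(String[] args) throws InterruptedException {
        Map<String, Integer> freq = new ConcurrentHashMap<>();
        String[] words = {"a", "b", "a", "c", "a"};

        // merge() is atomic in ConcurrentHashMap - safe from many threads
        Runnable count = () -> { for (String w : words) freq.merge(w, 1, Integer::sum); };
        Thread t1 = new Thread(count);
        Thread t2 = new Thread(count);
        t1.start(); t2.start();
        t1.join(); t2.join();

        System.out.println(freq.get("a")); // 6 - no lost updates
        System.out.println(freq.get("b")); // 2
    }
}
```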
Implementation      | Order            | get/put  | Null keys    | Best For
HashMap             | None (undefined) | O(1) avg | Yes (1 null) | General purpose
LinkedHashMap       | Insertion or LRU | O(1) avg | Yes (1 null) | LRU cache, ordered output
TreeMap             | Sorted (natural) | O(log n) | No           | Range queries, sorted iteration
WeakHashMap         | None             | O(1) avg | Yes (1 null) | Metadata, weak caches
ConcurrentHashMap   | None             | O(1) avg | No           | Multi-threaded maps
Hashtable           | None             | O(1) avg | No           | Legacy – never use

Hashtable is fully synchronized legacy code – use ConcurrentHashMap for thread safety.
07
Section Seven · Implementation
Build It Once
Build this from scratch once โ it makes the internals concrete.
In any real project or interview, reach for the built-in (Section 08).
Java – HashMap core mechanics
// Build once to understand – use HashMap<K,V> in all real code
static class Entry<K, V> {
K key;
V value;
int hash;
Entry<K,V> next;
Entry(K key, V value, int hash, Entry<K,V> next) {
this.key = key;
this.value = value;
this.hash = hash;
this.next = next;
}
}
Entry<K,V>[] table;
int size;
int capacity = 16;
float loadFactor = 0.75f;
// O(1) avg – insert or update key
V put(K key, V value) {
int h = (key == null)
? 0
: key.hashCode() ^ (key.hashCode() >>> 16);
// bitwise AND – works because capacity is 2^n
int i = h & (capacity - 1);
// search for existing key
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
if (e.hash == h && (e.key == key || key.equals(e.key))) {
V old = e.value;
e.value = value;
return old;
}
}
// key not found – insert at head of chain
table[i] = new Entry<>(key, value, h, table[i]);
if (++size > capacity * loadFactor) {
resize();
}
return null;
}
// O(1) avg – retrieve value by key
V get(K key) {
int h = (key == null)
? 0
: key.hashCode() ^ (key.hashCode() >>> 16);
for (Entry<K,V> e = table[h & (capacity - 1)]; e != null; e = e.next) {
if (e.hash == h && (e.key == key || key.equals(e.key))) {
return e.value;
}
}
return null;
}
// O(n) – double capacity and rehash all entries
void resize() {
Entry<K,V>[] old = table;
capacity *= 2;
table = new Entry[capacity];
for (Entry<K,V> head : old) {
for (Entry<K,V> e = head; e != null; ) {
Entry<K,V> next = e.next;
// recompute index for new capacity
int i = e.hash & (capacity - 1);
// prepend to new bucket
e.next = table[i];
table[i] = e;
e = next;
}
}
}
HashMap – Full Implementation
Java – Complete SimpleHashMap<K,V>
import java.util.*;
public class SimpleHashMap<K, V> {
static class Entry<K, V> {
K key;
V value;
int hash;
Entry<K,V> next;
Entry(K k, V v, int h, Entry<K,V> n) {
key = k;
value = v;
hash = h;
next = n;
}
}
private Entry<K,V>[] table;
private int size;
private int capacity;
private float loadFactor;
@SuppressWarnings("unchecked")
public SimpleHashMap(int cap, float lf) {
capacity = cap;
loadFactor = lf;
table = new Entry[cap];
}
public SimpleHashMap() {
this(16, 0.75f);
}
private int hash(Object key) {
if (key == null) {
return 0;
}
}
int h = key.hashCode();
return h ^ (h >>> 16); // spread high bits
}
// O(1) avg – insert or update key
public V put(K key, V value) {
int h = hash(key);
int i = h & (capacity - 1);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
if (e.hash == h && Objects.equals(e.key, key)) {
V old = e.value;
e.value = value;
return old;
}
}
table[i] = new Entry<>(key, value, h, table[i]);
if (++size > capacity * loadFactor) {
resize();
}
return null;
}
// O(1) avg – retrieve value by key
public V get(K key) {
int h = hash(key);
for (Entry<K,V> e = table[h & (capacity - 1)]; e != null; e = e.next) {
if (e.hash == h && Objects.equals(e.key, key)) {
return e.value;
}
}
return null;
}
// O(1) avg – delete entry by key
public V remove(K key) {
int h = hash(key);
int i = h & (capacity - 1);
Entry<K,V> prev = null;
for (Entry<K,V> e = table[i]; e != null; prev = e, e = e.next) {
if (e.hash == h && Objects.equals(e.key, key)) {
if (prev == null) {
table[i] = e.next;
} else {
prev.next = e.next;
}
size--;
return e.value;
}
}
return null;
}
// O(1) avg – true even when the key maps to null
public boolean containsKey(K key) {
int h = hash(key);
for (Entry<K,V> e = table[h & (capacity - 1)]; e != null; e = e.next) {
if (e.hash == h && Objects.equals(e.key, key)) {
return true;
}
}
return false;
}
public int size() {
return size;
}
public boolean isEmpty() {
return size == 0;
}
// O(n) – double capacity and rehash all entries
@SuppressWarnings("unchecked")
private void resize() {
Entry<K,V>[] old = table;
capacity *= 2;
table = new Entry[capacity];
for (Entry<K,V> head : old) {
for (Entry<K,V> e = head; e != null; ) {
Entry<K,V> next = e.next;
int i = e.hash & (capacity - 1);
e.next = table[i];
table[i] = e;
e = next;
}
}
}
public static void main(String[] args) {
SimpleHashMap<String, Integer> m = new SimpleHashMap<>();
m.put("a", 1);
m.put("b", 2);
m.put("c", 3);
System.out.println(m.get("b")); // 2
m.put("b", 99); // update
System.out.println(m.get("b")); // 99
m.put(null, 0); // null key at index 0
System.out.println(m.get(null)); // 0

// triggers resize
for (int i = 0; i < 20; i++) {
m.put("k" + i, i);
}
System.out.println(m.size()); // 24
}
}
Python – dict + manual HashMap
# ── Part 1: Python dict (built-in hash map) ──
d = {}
d["a"] = 1        # put – O(1) avg
val = d.get("a")  # get – O(1) avg, returns None if absent
d.pop("a")        # remove – O(1) avg
"a" in d          # containsKey – O(1) avg

# Frequency counting
from collections import Counter
freq = Counter(["a", "b", "a", "c", "a"])  # {'a': 3, 'b': 1, 'c': 1}

# ── Part 2: Manual HashMap ──
class HashMap:
    def __init__(self, capacity=16, load_factor=0.75):
        self.capacity = capacity
        self.load_factor = load_factor
        self.size = 0
        self.table = [[] for _ in range(capacity)]

    def _hash(self, key):
        h = hash(key)
        return h & (self.capacity - 1)

    def put(self, key, value):  # O(1) avg
        i = self._hash(key)
        for j, (k, v) in enumerate(self.table[i]):
            if k == key:
                self.table[i][j] = (key, value)
                return
        self.table[i].append((key, value))
        self.size += 1
        if self.size > self.capacity * self.load_factor:
            self._resize()

    def get(self, key, default=None):  # O(1) avg
        for k, v in self.table[self._hash(key)]:
            if k == key:
                return v
        return default

    def remove(self, key):  # O(1) avg
        bucket = self.table[self._hash(key)]
        for j, (k, v) in enumerate(bucket):
            if k == key:
                bucket.pop(j)
                self.size -= 1
                return v
        return None

    def _resize(self):  # O(n)
        old = self.table
        self.capacity *= 2
        self.table = [[] for _ in range(self.capacity)]
        self.size = 0
        for bucket in old:
            for k, v in bucket:
                self.put(k, v)
GOTCHA
HashMap is not thread-safe โ concurrent reads and writes from multiple threads corrupt the internal structure silently and can cause infinite loops during resize. Never share a plain HashMap between threads. Use ConcurrentHashMap, or wrap with Collections.synchronizedMap() only if you cannot use ConcurrentHashMap.
CHOOSE
Use HashMap when you need O(1) key lookup and iteration order does not matter โ it is the default for 90% of use cases. Switch to LinkedHashMap when you need predictable iteration order or LRU eviction. Switch to TreeMap when you need sorted keys or range queries. Never use Hashtable.
09
Section Nine · Decision Guide
Pick The Right Structure
Decision flowchart – choosing the right map
Structure           | get/put  | Ordered   | Null key | Thread-safe | Best For
HashMap (this)      | O(1) avg | No        | Yes      | No          | General purpose lookup
LinkedHashMap       | O(1) avg | Insertion | Yes      | No          | LRU cache, ordered output
TreeMap             | O(log n) | Sorted    | No       | No          | Range queries, sorted keys
HashSet             | O(1) avg | No        | Yes      | No          | Membership, deduplication
ConcurrentHashMap   | O(1) avg | No        | No       | Yes         | Shared maps across threads
ArrayList           | O(1) idx | By index  | Yes      | No          | Ordered list, random access
HashMap solves more interview problems than any other structure:
Frequency counting (map.merge(key, 1, Integer::sum)), grouping (map.computeIfAbsent(key, k -> new ArrayList<>()).add(val)), and existence checking (map.containsKey(key)) are the three patterns that unlock the majority of medium interview problems.
When a brute-force O(n²) uses a nested loop to find matches, the O(n) solution almost always involves replacing the inner loop with a HashMap lookup.
10
Section Ten · Patterns
Common HashMap Patterns
Five patterns account for the vast majority of HashMap usage in interviews and production code. Recognising which pattern a problem requires is more important than the implementation itself – the code is always short once the pattern is clear.
Pattern 1 – Frequency Counting
Count how many times each element appears, then query the map to answer questions about duplicates, majorities, or top-K.
Java – Frequency counting idioms
// Classic loop – works everywhere
Map<String, Integer> freq = new HashMap<>();
for (String s : words) {
freq.put(s, freq.getOrDefault(s, 0) + 1);
}

// Modern – merge() is cleaner
for (String s : words) {
freq.merge(s, 1, Integer::sum); // if absent → 1, else += 1
}

// One-liner with streams
Map<String, Long> freq2 = Arrays.stream(words)
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
merge() vs getOrDefault():
merge(key, 1, Integer::sum) does the same thing as getOrDefault + put in one call.
It reads better, avoids the intermediate variable, and is atomic in ConcurrentHashMap.
Pattern 2 – Complement Lookup
Store what you have seen so far, and check whether the complement of the current element exists – turning an O(n²) nested loop into O(n).
Java – Two Sum (the canonical example)
// Two Sum – find indices of two numbers that add to target
Map<Integer, Integer> seen = new HashMap<>(); // value → index
for (int i = 0; i < nums.length; i++) {
int complement = target - nums[i];
if (seen.containsKey(complement))
return new int[] { seen.get(complement), i };
seen.put(nums[i], i); // store after checking
}
// O(n) time, O(n) space – one pass
Store after checking – not before:
Putting the current element into the map before checking the complement can cause false positives (an element matching with itself).
Always check first, then store.
Pattern 3 – Grouping / Bucketing
Use computeIfAbsent() to collect items sharing a canonical key into lists – anagram grouping, character frequency grouping, graph adjacency.
Java – Group Anagrams
// Group Anagrams – canonical key = sorted characters
Map<String, List<String>> groups = new HashMap<>();
for (String s : strs) {
char[] chars = s.toCharArray();
Arrays.sort(chars);
String key = new String(chars); // canonical form
groups.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
}
return new ArrayList<>(groups.values());
// O(n · k log k) where k = max string length
computeIfAbsent() vs putIfAbsent():
computeIfAbsent(key, fn) creates the value lazily – the function only runs when the key is absent. putIfAbsent(key, value) evaluates its value argument eagerly, even when the key is already present.
For lists, computeIfAbsent is always the right choice.
Pattern 4 – Prefix Sum + HashMap
Store cumulative sums as keys – if (currentSum - target) exists as a key, the subarray between those positions sums to target.
Java – Subarray Sum Equals K
// Subarray Sum Equals K – count subarrays summing to k
Map<Integer, Integer> prefixCount = new HashMap<>();
prefixCount.put(0, 1); // empty prefix
int sum = 0, count = 0;
for (int num : nums) {
sum += num;
count += prefixCount.getOrDefault(sum - k, 0);
prefixCount.merge(sum, 1, Integer::sum);
}
return count;
// O(n) time, O(n) space – one pass
Seed the map with (0, 1): The prefix sum map must be initialized with prefixCount.put(0, 1) – this handles the case where the subarray starting from index 0 sums to exactly k.
Pattern 5 – Sliding Window + HashMap
Track character or element frequencies inside a sliding window using a HashMap – expand right to include, shrink left to exclude, update the map at each step.
Java – Longest Substring Without Repeating Characters
// Longest substring without repeating characters
Map<Character, Integer> lastSeen = new HashMap<>();
int left = 0, maxLen = 0;
for (int right = 0; right < s.length(); right++) {
char c = s.charAt(right);
if (lastSeen.containsKey(c) && lastSeen.get(c) >= left)
left = lastSeen.get(c) + 1; // shrink window past duplicate
lastSeen.put(c, right);
maxLen = Math.max(maxLen, right - left + 1);
}
return maxLen;
// O(n) time, O(min(n, charset)) space
HashMap vs int[128] for character windows:
For ASCII characters, an int[128] array is faster than a HashMap<Character, Integer> – direct indexing, no boxing, no hashing.
Use HashMap when the element type is unbounded (strings, integers, objects).
Common Mistake: Modifying a HashMap while iterating over it with a for-each loop throws ConcurrentModificationException. Use Iterator.remove(), or collect keys to remove into a separate list first. The replaceAll() and removeIf() methods on the entry set are iteration-safe alternatives.
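A sketch of the safe alternatives (map contents are illustrative):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class SafeRemoval {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);

        // Safe: Iterator.remove() on the entry set
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 0) it.remove(); // drops b=2
        }
        System.out.println(map.size()); // 2

        // Also safe: removeIf() on a view collection
        map.entrySet().removeIf(e -> e.getValue() > 2);     // drops c=3
        System.out.println(map);        // {a=1}
    }
}
```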
11
Section Eleven · Complexity
Complexity Reference
HashMap operations are O(1) average-case because the hash function computes the bucket index in constant time and the chain in each bucket is expected to be length 1 under a good hash function and 0.75 load factor. The worst case – all keys collide into one bucket – is O(n) for a linked-list chain, but Java 8’s treeification caps it at O(log n). Resize is O(n) but happens infrequently enough to amortize to O(1) per insertion.
Time Complexity
Operation            | Average    | Worst (pre-Java 8) | Worst (Java 8+) | Notes
put(key, value)      | O(1)       | O(n)               | O(log n)        | Amortized O(1) including resize
get(key)             | O(1)       | O(n)               | O(log n)        | Hash + chain/tree traversal
remove(key)          | O(1)       | O(n)               | O(log n)        | Hash + chain/tree traversal
containsKey(key)     | O(1)       | O(n)               | O(log n)        | Same as get()
containsValue(value) | O(n)       | O(n)               | O(n)            | Scans all entries – avoid
resize / rehash      | O(n)       | O(n)               | O(n)            | Visits every entry once
keySet() iteration   | O(n + cap) | O(n + cap)         | O(n + cap)      | Visits every bucket, even empty
Space Complexity
Component             | Space           | Notes
Bucket array          | O(capacity)     | Always a power of 2, starts at 16
Entries (chain nodes) | O(n)            | One Entry object per key-value pair
Total                 | O(n + capacity) | Capacity stays within a small constant factor of n at load factor 0.75
Per-entry overhead    | ~48 bytes       | Key ref + value ref + hash int + next pointer + object header
Capacity vs size:
A new HashMap<>() allocates a 16-slot bucket array but stores 0 entries.
After 12 insertions (16 × 0.75), it resizes to 32 slots.
If you know the expected size, use new HashMap<>((int) (expectedSize / 0.75f) + 1) to avoid resizing.
Starting with a good initial capacity can eliminate all resize operations.
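A sketch of presizing (the element count is illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class Presize {
    public static void main(String[] args) {
        int expected = 1000;

        // Default: starts at 16 slots, resizes 7 times on the way to 1000 entries
        Map<Integer, Integer> grown = new HashMap<>();

        // Presized: capacity rounds up to a power of 2 >= 1334, so it never resizes
        Map<Integer, Integer> presized = new HashMap<>((int) (expected / 0.75f) + 1);

        for (int i = 0; i < expected; i++) {
            grown.put(i, i);
            presized.put(i, i);
        }
        System.out.println(grown.size());    // 1000 - same contents either way
        System.out.println(presized.size()); // 1000 - only the resize work differs
    }
}
```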
Amortized Analysis – Why Resize Is Still O(1) Per Put
Each resize doubles the capacity and moves all n entries – O(n) work. But resize only triggers after n insertions (where n = capacity × loadFactor). Distributing the O(n) resize cost across the n insertions that caused it gives O(1) amortized cost per put(). This is the same accounting trick that makes ArrayList.add() amortized O(1) despite occasional O(n) array copies.
When HashMap is fast
O(1) average achieved when
Hash function distributes keys uniformly
Load factor stays near default (0.75)
Keys are immutable (String, Integer, enum)
equals() and hashCode() are cheap and consistent
When HashMap degrades
Performance killers
All keys hash to same bucket – O(n) per operation
Expensive equals() – high constant factor per comparison
Mutable keys – lost entries after state change
Very low load factor – wasted memory, frequent iteration over empty buckets
12
Section Twelve · Practice
Problems To Solve
HashMap interview problems cluster around three patterns: frequency counting (scan once, store counts, query in O(1)), prefix sums with a map (store cumulative totals as keys to find subarrays in O(1)), and grouping (use computeIfAbsent to collect items sharing a canonical form). When a problem asks “does X exist?” or “how many times does X appear?” and a brute-force solution would scan repeatedly, the O(n) solution is almost always a HashMap.
Store each number’s complement in a HashMap as you scan – if the current number exists as a key, you found the pair in O(1) rather than scanning backwards.
Sort each string to get its canonical form – strings that are anagrams produce the same sorted key, so group them in a HashMap<String, List<String>>.
Put all numbers in a HashSet, then for each number that has no left neighbour (n-1 not in set), count how far right the sequence extends – O(n) total.