Hashing

HashMap Fundamentals

O(1) average-case get, put, and remove by hashing keys to bucket indices. The single most versatile data structure in the Java standard library, underpinning frequency counting, memoization, grouping, and the majority of interview solutions.

01 · Section One · Foundation

What is a HashMap?

[Diagram: hash(key) maps each key to a bucket index in a small array: A→1 lands in bucket [1], B→7 and C→4 share bucket [3], D→9 lands in bucket [5]. O(1) average.]

Every key inserted into a HashMap is transformed by a hash function into an array index (a bucket), allowing direct access to its value without scanning any other entries. O(1) average access is possible because the hash function computes the bucket index in constant time, and reading or writing one array cell is O(1) regardless of how many total entries exist. The word “average” matters: in the worst case, every key hashes to the same bucket and the map degenerates to a linked list with O(n) lookup; a good hash function distributes keys uniformly to make this astronomically unlikely in practice. A HashMap stores pairs, not values alone: any object can be a key as long as it correctly implements hashCode() and equals(), the two methods the map uses to locate and identify entries. Frequency counting, memoization, grouping, deduplication, and graph adjacency representation all build on HashMap; it is the single most versatile data structure in the Java standard library.

Analogy: A library with a card catalogue. Instead of searching every shelf for a book, you look up the title in the catalogue, which tells you the exact shelf number in O(1). The catalogue is the hash function: it maps the book title (key) to a shelf location (bucket index). If two books map to the same shelf, the librarian keeps a short list on that shelf to distinguish them; that is collision chaining. The library never gets slower to look up a single book as more books are added, as long as the catalogue stays accurate.
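The catalogue lookup above maps directly onto code. A minimal sketch of the core calls (the class name `BasicUsage` and the sample data are just for illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class BasicUsage {
    public static void main(String[] args) {
        Map<String, Integer> shelf = new HashMap<>();
        shelf.put("cat", 3);                 // hash("cat") picks the bucket - O(1) avg
        shelf.put("dog", 7);
        shelf.put("cat", 4);                 // same key: value overwritten, no duplicate

        System.out.println(shelf.get("cat"));              // 4
        System.out.println(shelf.get("bird"));             // null - absent key
        System.out.println(shelf.getOrDefault("bird", 0)); // 0 - safe default
        System.out.println(shelf.size());                  // 2 - "cat" stored once
    }
}
```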
02 · Section Two · Mental Model

How It Thinks

Hash function maps key to bucket index in O(1)
→ get() and put() skip every other entry: direct access
Two keys can hash to the same bucket (collision)
→ Bucket holds a chain or probe sequence: O(1) avg, O(n) worst
Load factor threshold reached (default 0.75)
→ Internal array doubles and all entries rehash: O(n) resize
Keys must implement hashCode() and equals() correctly
→ Wrong hashCode() sends entries to the wrong bucket: get() returns null for keys that exist
No ordering of keys guaranteed
→ Iteration order is undefined and changes after resize: never rely on HashMap iteration order
Java 8+: chain longer than 8 entries treeifies to red-black tree
→ Worst-case bucket lookup is capped at O(log n) instead of O(n), even under catastrophic hash collisions
03 · Section Three · Internals

How Hashing Works

Storing a key-value pair requires answering one question in O(1): given this key, which array slot does it belong in? The hash function answers that question by converting any key (a string, an integer, an object) into an integer, then compressing that integer into the range [0, capacity-1]. The quality of the hash function determines how evenly entries are distributed across buckets, which determines whether the map runs in O(1) or degrades toward O(n).

Hashing pipeline - key to bucket index (every step O(1)):
① hashCode(): key "cat" → 98262 (any int)
② spread: h ^ (h >>> 16) (reduces clustering)
③ compress: hash % capacity (Java uses hash & (capacity - 1)) → bucket index
④ entry stored in that bucket
hashCode() contract

What Java requires

  • Equal objects MUST have equal hash codes
  • Unequal objects SHOULD have different hash codes (collisions allowed but hurt performance)
  • hashCode() must be consistent: same result every call for the same object state
  • If you override equals(), you MUST override hashCode()
What breaks silently

Violating the contract

If two objects are equal() but return different hashCode() values, the map stores them in different buckets. get() will never find a key that was put() because it looks in the wrong bucket. The map returns null for keys that are provably present.

  • Using mutable fields in hashCode(): value changes, bucket changes
  • Forgetting to override hashCode() when overriding equals()
  • Custom objects as keys without hashCode() implementation
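A minimal sketch of a contract-respecting key class (the `Point` class here is invented for illustration):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class KeyContract {
    // A safe key type: immutable fields, with equals() and hashCode()
    // overridden together and derived from the same fields.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return p.x == x && p.y == y;
        }
        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    public static void main(String[] args) {
        Map<Point, String> map = new HashMap<>();
        map.put(new Point(1, 2), "found");
        // A different-but-equal Point locates the entry: equal objects hash
        // to the same bucket, and equals() confirms the match in the chain.
        System.out.println(map.get(new Point(1, 2)));   // found
        // Delete the hashCode() override and this same lookup would search
        // the wrong bucket and return null.
    }
}
```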
String.hashCode(): How Java Computes It
Java computes String's hash code as a polynomial rolling hash over the character values, producing a consistent integer for any string value.
Java: JDK String.hashCode() source
// Java's String.hashCode() - from the JDK source
// s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
int hash = 0;
for (int i = 0; i < value.length; i++) {
    hash = 31 * hash + value[i];   // 31 chosen: prime, fast (32x - x)
}
// "cat" → 99×31² + 97×31 + 116 = 98262

// Compression into bucket index (Java's actual formula):
int index = (capacity - 1) & hash;  // bitwise AND - faster than modulo
                                    // only works when capacity is a power of 2
Why capacity is always a power of 2:
  • Java’s HashMap always sizes its bucket array at a power of 2: 16, 32, 64, etc.
  • This lets the modulo operation be replaced by a bitwise AND: hash % capacity = hash & (capacity - 1), which is significantly faster.
Load Factor: When Does HashMap Resize?
The load factor is the ratio of entries to buckets that triggers a resize: at 0.75 (default), the map resizes once entries exceed 75% of capacity.
Java: Load factor configuration
// Default load factor = 0.75
// Resize triggers when: size > capacity × loadFactor
// Default capacity = 16 → resize at 12 entries

// Custom load factor - lower = fewer collisions, more memory
Map<String, Integer> map = new HashMap<>(32, 0.5f);
//                                       ↑    ↑
//                        initial capacity    load factor (resizes at 16 entries)

// Higher load factor - more memory-efficient, slower
Map<String, Integer> dense = new HashMap<>(16, 0.9f);
// Resizes at 14 entries - more collisions before resize
0.75 is the empirically optimal trade-off:
  • At load factor 0.75, a HashMap strikes a good balance between memory usage and lookup performance: most buckets are empty or hold one entry, keeping the average chain length near 1.
  • A lower load factor wastes memory; a higher one causes more collisions.
Common Mistake: Using a mutable object as a HashMap key and then modifying it after insertion. The hash code is computed from the object's state at put() time. If the state changes, hashCode() returns a different value and the entry lands in a different bucket on lookup; get() returns null even though the key was put(). Always use immutable objects (String, Integer, enums) as HashMap keys.
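The failure is easy to reproduce with a mutable List as a key; a sketch (class name invented):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MutableKeyPitfall {
    public static void main(String[] args) {
        Map<List<Integer>, String> map = new HashMap<>();
        List<Integer> key = new ArrayList<>(List.of(1, 2));
        map.put(key, "entry");   // bucket chosen from hashCode of [1, 2]

        key.add(3);              // key state mutates - its hashCode changes

        // Lookup now hashes [1, 2, 3]: stored hash no longer matches,
        // so the entry is unreachable even though it is the same object.
        System.out.println(map.get(key));         // null
        System.out.println(map.containsKey(key)); // false
        System.out.println(map.size());           // 1 - the lost entry still occupies memory
    }
}
```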
04 · Section Four · Collision

Collision Handling

A collision is not an error; it is an expected and inevitable consequence of mapping a large key space into a small bucket array. With 16 buckets and any reasonable load, the birthday paradox guarantees collisions well before the map is full. What matters is the strategy for resolving them: how entries in the same bucket are stored, found, and removed.

Separate Chaining: Java's Approach

Each bucket holds a linked list (chain) of all entries that hash to that bucket. put() appends to the chain; get() traverses the chain comparing keys with equals() until it finds a match or exhausts the list. Java's HashMap uses separate chaining with an optimization: chains longer than 8 entries are converted to red-black trees (treeification), capping worst-case lookup at O(log n).

[Diagram: separate chaining, collision in bucket [3]: "city"→NY, "zip"→10001, and "code"→212 all hash to bucket [3] and form a chain. get("zip"): ① hash("zip") → bucket 3 ② traverse the chain ③ equals("city")? no ④ equals("zip")? yes → 10001. Average O(1) when chains are short.]
equals() traversal is the bottleneck:
  • In a collision chain, each entry is compared using equals(), not hashCode().
  • By the time you reach the chain, you already know the hash matches; equals() determines which exact entry you want.
  • An expensive equals() degrades even a well-distributed HashMap.
Open Addressing: Linear Probing

Open addressing stores all entries in the bucket array itself, with no linked lists. When a collision occurs, the algorithm probes consecutive slots (linear probing) or uses a second hash (double hashing) until an empty slot is found.

[Diagram: linear probing, collision at index 2: inserting key K with hash(K) = 2 finds slot [2] occupied, probes +1, and lands in the first empty slot at [3]. Open addressing: no pointer overhead, better cache locality. Separate chaining: simpler deletion, handles high load better.]
Java uses separate chaining, not open addressing:
  • Java’s HashMap uses separate chaining exclusively.
  • Open addressing is used in some high-performance scenarios (Python’s dict, Google’s SwissTable) because all data stays in one contiguous array, giving better cache behaviour.
Treeification: Java 8's Safety Net

Java 8 introduced treeification: when a single bucket’s chain exceeds 8 entries, that chain is converted from a linked list to a red-black tree. This caps the worst-case lookup in a single bucket at O(log n) instead of O(n), a crucial defense against hash collision attacks where an adversary deliberately sends keys that all hash to the same bucket. When the chain shrinks below 6 entries (via removal), the tree reverts to a linked list.

[Diagram: treeification when chain length ≥ 8: a bucket whose chain has grown to 9 nodes (O(n) lookup) is rebuilt as a red-black tree of height ≈ 3 (O(log n) lookup).]
Treeification is a safety net, not the fast path:
  • If your HashMap is treeifying buckets in production, your hash function is broken: either producing too many collisions or under attack.
  • Fix the hash function first. Treeification prevents catastrophic O(n) but adds memory overhead.
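To see the safety net holding up, the sketch below uses an invented `BadKey` class whose hashCode() is deliberately constant, so every entry collides into one bucket; lookups stay correct, only slower:

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionStress {
    // A legal but deliberately terrible key: equal objects do have equal
    // hashes, but EVERY instance collides into the same bucket.
    static final class BadKey {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
        @Override public int hashCode() { return 42; }  // constant - all collide
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 100; i++) map.put(new BadKey(i), i);
        // All 100 entries share one bucket (eventually treeified),
        // yet every lookup still returns the right value.
        System.out.println(map.get(new BadKey(77))); // 77
        System.out.println(map.size());              // 100
    }
}
```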
05 · Section Five · Operations

Core Operations

put(key, value)
Inserts a new key-value pair or updates the value of an existing key, computing the bucket index from the key's hash.
Pseudocode
function put(map, key, value):
    index = hash(key) & (capacity - 1)   // compute bucket - O(1)
    bucket = table[index]
    for each entry in bucket:
        if entry.key.equals(key):
            entry.value = value          // update existing - O(1) avg
            return
    bucket.add(new Entry(key, value))    // insert new - O(1) avg
    size++
    if size > capacity * loadFactor:
        resize()                         // O(n) - rare, amortized O(1)
put() is an upsert, not an insert:
  • If the key already exists, put() overwrites the value and returns the old value; it does not throw or create a duplicate.
  • Use putIfAbsent() when you want to insert only if the key is missing.
get(key)
Retrieves the value associated with a key by computing its bucket index and searching the chain for an equals() match.
Pseudocode
function get(map, key):
    index = hash(key) & (capacity - 1)   // compute bucket - O(1)
    bucket = table[index]
    for each entry in bucket:
        if entry.key.equals(key):
            return entry.value           // found - O(1) avg
    return null                          // not found - O(1) avg
get() returning null has two meanings:
  • map.get(key) returns null either when the key does not exist or when the key maps to a null value; the two cases are indistinguishable.
  • Use containsKey(key) to distinguish, or getOrDefault(key, fallback) to avoid null entirely.
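A short sketch of the ambiguity and both workarounds (class name invented):

```java
import java.util.HashMap;
import java.util.Map;

public class NullAmbiguity {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", null);                       // key present, value null

        System.out.println(map.get("a"));         // null - value is null
        System.out.println(map.get("b"));         // null - key absent: indistinguishable

        System.out.println(map.containsKey("a")); // true  - disambiguates
        System.out.println(map.containsKey("b")); // false

        System.out.println(map.getOrDefault("b", 0)); // 0 - absent key gets the default
    }
}
```

Note that getOrDefault returns the mapped value when the key is present, so getOrDefault("a", 0) here would still yield null; the clean fix is to never store null values at all.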
remove(key)
Deletes the entry with the specified key from its bucket chain and returns the associated value, or null if absent.
Pseudocode
function remove(map, key):
    index = hash(key) & (capacity - 1)
    bucket = table[index]
    prev = null
    for each entry in bucket:
        if entry.key.equals(key):
            if prev is null: bucket.head = entry.next
            else:            prev.next = entry.next   // bypass - O(1) avg
            size--
            return entry.value
        prev = entry
    return null
remove(key, value) is a conditional delete: map.remove(key, value) deletes the entry only if the key currently maps to that exact value, an atomic check-and-remove useful for concurrent scenarios or stale cache cleanup.
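A quick sketch of the conditional form (class name invented):

```java
import java.util.HashMap;
import java.util.Map;

public class ConditionalRemove {
    public static void main(String[] args) {
        Map<String, Integer> cache = new HashMap<>();
        cache.put("token", 1);

        // Value does not match - nothing is removed.
        System.out.println(cache.remove("token", 2));   // false
        System.out.println(cache.containsKey("token")); // true

        // Exact key-value match - entry removed.
        System.out.println(cache.remove("token", 1));   // true
        System.out.println(cache.containsKey("token")); // false
    }
}
```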
containsKey(key)
Returns true if the map contains a mapping for the specified key; equivalent to get() but explicit about intent.
Pseudocode
function containsKey(map, key):
    index = hash(key) & (capacity - 1)
    bucket = table[index]
    for each entry in bucket:
        if entry.key.equals(key):
            return true                  // O(1) avg
    return false
containsValue() is O(n) - avoid it:
  • containsKey() is O(1) because it uses the hash. containsValue() is O(n) because values are not hashed: the map must scan every entry in every bucket.
  • If you need O(1) value lookup, maintain a second HashMap with keys and values swapped.
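One way to sketch the swapped-map idea, assuming values are unique (class and variable names invented):

```java
import java.util.HashMap;
import java.util.Map;

public class InverseIndex {
    public static void main(String[] args) {
        Map<String, Integer> idByName = new HashMap<>();
        idByName.put("alice", 1);
        idByName.put("bob", 2);

        // Maintain the inverse alongside the original for O(1) value lookup.
        Map<Integer, String> nameById = new HashMap<>();
        idByName.forEach((name, id) -> nameById.put(id, name));

        System.out.println(nameById.get(2));  // bob - O(1), vs an O(n) containsValue scan
    }
}
```

In real code both maps must be updated together on every put and remove, and duplicate values need a policy (here a later put would silently win).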
Resize / Rehash
When the entry count exceeds capacity × loadFactor, the internal array doubles in size and every existing entry is rehashed into the new array.
Pseudocode
function resize(map):
    oldTable = table
    capacity = capacity * 2                        // double - always a power of 2
    table = new Array[capacity]
    for each bucket in oldTable:
        for each entry in bucket:
            newIndex = entry.hash & (capacity - 1) // recompute
            table[newIndex].add(entry)             // re-insert into new table
    // O(n) - visits every entry once
Rehashing changes all bucket positions:
  • After a resize, entries move to new bucket indices because the capacity, and therefore the modulo, changed.
  • Never cache bucket indices or rely on HashMap’s internal structure. Iteration order changes after resize.
Common Mistake: Calling map.get(key) without checking containsKey() first, then treating null as “not found”, when the code put null values into the map earlier. Use getOrDefault(key, 0) for counters, or merge(key, 1, Integer::sum) to increment without null checks.
06 · Section Six · Variants

HashMap Variants

Java provides four Map implementations covering every common use case: plain HashMap for O(1) lookups, LinkedHashMap for ordered iteration, TreeMap for sorted keys, and WeakHashMap for caches that should not prevent garbage collection. Choosing the right implementation is a one-decision question: do you need ordering, and if so, what kind?

LinkedHashMap: Predictable Iteration Order

LinkedHashMap extends HashMap and maintains a doubly linked list running through all entries in insertion order: put() adds to the tail of the list, so iteration always visits entries in the order they were inserted. Optionally, accessOrder=true makes it access-order instead: every get() or put() moves the accessed entry to the tail, making the head always the least-recently-used entry. The LRU cache pattern (evict the head when capacity is exceeded) is built directly on this access-order LinkedHashMap behaviour.

Java: LinkedHashMap usage
// Insertion-order (default) - iterates in put() order
Map<String, Integer> ordered = new LinkedHashMap<>();
ordered.put("c", 3); ordered.put("a", 1); ordered.put("b", 2);
// Iteration: c=3, a=1, b=2 (insertion order)

// Access-order - LRU cache base
Map<String, Integer> lru = new LinkedHashMap<>(16, 0.75f, true) {
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, Integer> e) {
        return size() > 3;   // evict when capacity exceeded
    }
};
lru.put("a", 1); lru.put("b", 2); lru.put("c", 3);
lru.get("a");     // moves "a" to tail (most recently used)
lru.put("d", 4);  // triggers eviction - removes "b" (LRU)
LinkedHashMap is the LRU Cache building block:
  • LeetCode 146 (LRU Cache) can be solved in 10 lines with access-order LinkedHashMap and overriding removeEldestEntry().
  • The doubly linked list adds O(1) overhead per operation; all operations remain O(1) average.
TreeMap: Sorted Keys, Range Queries

TreeMap stores entries in a red-black tree keyed in natural order (or by a provided Comparator), giving O(log n) put(), get(), and remove(): slower than HashMap, but providing sorted iteration and powerful range operations. The unique TreeMap operations (floorKey(), ceilingKey(), firstKey(), lastKey(), subMap()) are the reason to choose TreeMap: they solve range query problems that would require sorting a HashMap’s keys first. In interviews, whenever a problem asks for “nearest value below X” or “all keys between A and B”, TreeMap is likely the intended structure.

Java: TreeMap range operations
TreeMap<Integer, String> tree = new TreeMap<>();
tree.put(5, "five"); tree.put(3, "three");
tree.put(8, "eight"); tree.put(1, "one");

// Sorted iteration - always ascending
tree.forEach((k, v) -> System.out.println(k + "=" + v));
// 1=one, 3=three, 5=five, 8=eight

// Range queries - the TreeMap superpower, each O(log n)
tree.floorKey(4);    // 3 - largest key ≤ 4
tree.ceilingKey(4);  // 5 - smallest key ≥ 4
tree.firstKey();     // 1 - minimum key
tree.subMap(3, true, 8, false);  // {3=three, 5=five} - range [3, 8)
TreeMap vs sorted HashMap:
  • There is no such thing as a sorted HashMap; HashMap’s iteration order is undefined.
  • If you find yourself calling Collections.sort(new ArrayList<>(map.keySet())) repeatedly, the correct structure is a TreeMap from the start.
WeakHashMap: GC-Eligible Keys

WeakHashMap holds weak references to its keys: when a key has no other strong references elsewhere in the program, the garbage collector may collect it, and the corresponding entry is automatically removed from the map. This makes WeakHashMap suitable for caches and metadata stores where the map should not be the reason an object stays alive.

WeakHashMap is not a general-purpose cache:
  • GC timing is non-deterministic: entries may be evicted at any point, even if the JVM has plenty of heap.
  • For a real cache, use a proper caching library (Caffeine, Guava Cache) which gives you control over eviction policy, size limits, and expiry.
ConcurrentHashMap: Thread-Safe Without Full Locking

ConcurrentHashMap provides thread safety without locking the whole table: since Java 8 writes use CAS and synchronize only on the head of the affected bucket (earlier versions used lock striping across segments), so multiple threads can read and write different buckets simultaneously without blocking each other. Use ConcurrentHashMap any time a HashMap is shared between threads; never use a plain HashMap in a multi-threaded context without external synchronization, and prefer ConcurrentHashMap over Collections.synchronizedMap(), which serializes every operation on a single lock.

computeIfAbsent() is atomic in ConcurrentHashMap:
  • computeIfAbsent(key, mappingFunction) checks for the key and inserts the computed value atomically, with no race condition.
  • This is the correct pattern for building a concurrent frequency counter or a shared memoization cache.
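A sketch of a thread-safe counter; merge() on ConcurrentHashMap is atomic, so the four threads below lose no increments (class name and counts invented):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ConcurrentCounter {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();

        // Four threads hammer the same key; each merge() call is atomic,
        // so no increment is lost to a race.
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) hits.merge("page", 1, Integer::sum);
            });
            threads[t].start();
        }
        for (Thread t : threads) t.join();

        System.out.println(hits.get("page")); // 4000 - every increment counted
    }
}
```

The same loop on a plain HashMap would intermittently undercount or corrupt the table.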
Implementation | Order | get/put | Null keys | Best For
HashMap | None (undefined) | O(1) avg | Yes (1 null) | General purpose
LinkedHashMap | Insertion or LRU | O(1) avg | Yes (1 null) | LRU cache, ordered output
TreeMap | Sorted (natural) | O(log n) | No | Range queries, sorted iteration
WeakHashMap | None | O(1) avg | Yes (1 null) | Metadata, weak caches
ConcurrentHashMap | None | O(1) avg | No | Multi-threaded maps
Hashtable | None | O(1) avg | No | Legacy - never use

⚠ Hashtable is fully synchronized legacy code; use ConcurrentHashMap for thread safety

07 · Section Seven · Implementation

Build It Once

Build this from scratch once; it makes the internals concrete. In any real project or interview, reach for the built-in (Section 08).

Java: HashMap core mechanics
// Build once to understand - use HashMap<K,V> in all real code
static class Entry<K, V> {
    K key; V value; int hash; Entry<K, V> next;
    Entry(K key, V value, int hash, Entry<K, V> next) {
        this.key = key; this.value = value;
        this.hash = hash; this.next = next;
    }
}

Entry<K, V>[] table;
int size;
int capacity = 16;
float loadFactor = 0.75f;

// O(1) avg - insert or update key
V put(K key, V value) {
    int h = (key == null) ? 0 : key.hashCode() ^ (key.hashCode() >>> 16);
    // bitwise AND - works because capacity is 2^n
    int i = h & (capacity - 1);
    // search for existing key (null-guard before equals, as the JDK does)
    for (Entry<K, V> e = table[i]; e != null; e = e.next) {
        if (e.hash == h && (e.key == key || (key != null && key.equals(e.key)))) {
            V old = e.value;
            e.value = value;
            return old;
        }
    }
    // key not found - insert at head of chain
    table[i] = new Entry<>(key, value, h, table[i]);
    if (++size > capacity * loadFactor) {
        resize();
    }
    return null;
}

// O(1) avg - retrieve value by key
V get(K key) {
    int h = (key == null) ? 0 : key.hashCode() ^ (key.hashCode() >>> 16);
    for (Entry<K, V> e = table[h & (capacity - 1)]; e != null; e = e.next) {
        if (e.hash == h && (e.key == key || (key != null && key.equals(e.key)))) {
            return e.value;
        }
    }
    return null;
}

// O(n) - double capacity and rehash all entries
void resize() {
    Entry<K, V>[] old = table;
    capacity *= 2;
    table = new Entry[capacity];
    for (Entry<K, V> head : old) {
        for (Entry<K, V> e = head; e != null; ) {
            Entry<K, V> next = e.next;
            int i = e.hash & (capacity - 1);  // recompute index for new capacity
            e.next = table[i];                // prepend to new bucket
            table[i] = e;
            e = next;
        }
    }
}
08 · Section Eight · Java Built-in

Use It In Java

IN JAVA
USE HashMap<K, V> map = new HashMap<>()
IMPORT import java.util.HashMap;
import java.util.Map;
METHODS
Method | Complexity | Note
put(key, value) | O(1) avg | Returns old value or null
get(key) | O(1) avg | Returns null if absent
remove(key) | O(1) avg | Returns removed value or null
containsKey(key) | O(1) avg | Always prefer over get() != null
containsValue(value) | O(n) | Scans all entries - avoid in loops
getOrDefault(key, def) | O(1) avg | Avoids null checks for counters
putIfAbsent(key, value) | O(1) avg | Insert only if key missing
merge(key, value, fn) | O(1) avg | Ideal for frequency counting
computeIfAbsent(key, fn) | O(1) avg | Lazy value creation (grouping)
forEach((k,v) -> ...) | O(n) | Preferred iteration
GOTCHA HashMap is not thread-safe: concurrent reads and writes from multiple threads corrupt the internal structure silently and can cause infinite loops during resize. Never share a plain HashMap between threads. Use ConcurrentHashMap, or wrap with Collections.synchronizedMap() only if you cannot use ConcurrentHashMap.
CHOOSE Use HashMap when you need O(1) key lookup and iteration order does not matter; it is the default for 90% of use cases. Switch to LinkedHashMap when you need predictable iteration order or LRU eviction. Switch to TreeMap when you need sorted keys or range queries. Never use Hashtable.
09 · Section Nine · Decision Guide

Pick The Right Structure

Decision flowchart - choosing the right map:
Need key → value mapping?
  Sorted keys?       → TreeMap<K,V> (floorKey, ceilingKey, O(log n))
  Insertion order?   → LinkedHashMap (LRU cache, ordered output)
  Multi-threaded?    → ConcurrentHashMap
  None of the above  → HashMap<K,V> (O(1) avg, the default)
Only membership, no values? → HashSet<V> (same internals, no values)
Structure | get/put | Ordered | Null key | Thread-safe | Best For
HashMap ← this | O(1) avg | No | Yes | No | General purpose lookup
LinkedHashMap | O(1) avg | Insertion | Yes | No | LRU cache, ordered output
TreeMap | O(log n) | Sorted | No | No | Range queries, sorted keys
HashSet | O(1) avg | No | Yes | No | Membership, deduplication
ConcurrentHashMap | O(1) avg | No | No | Yes | Shared maps across threads
ArrayList | O(1) idx | By index | Yes | No | Ordered list, random access
HashMap solves more interview problems than any other structure:
  • Frequency counting (map.merge(key, 1, Integer::sum)), grouping (map.computeIfAbsent(key, k -> new ArrayList<>()).add(val)), and existence checking (map.containsKey(key)) are the three patterns that unlock the majority of medium interview problems.
  • When a brute-force O(n²) solution uses a nested loop to find matches, the O(n) solution almost always involves replacing the inner loop with a HashMap lookup.
10 · Section Ten · Patterns

Common HashMap Patterns

Five patterns account for the vast majority of HashMap usage in interviews and production code. Recognising which pattern a problem requires is more important than the implementation itself; the code is always short once the pattern is clear.

Pattern 1 - Frequency Counting
Count how many times each element appears, then query the map to answer questions about duplicates, majorities, or top-K.
Java: Frequency counting idioms
// Classic loop - works everywhere
Map<String, Integer> freq = new HashMap<>();
for (String s : words) {
    freq.put(s, freq.getOrDefault(s, 0) + 1);
}

// Modern - merge() is cleaner
for (String s : words) {
    freq.merge(s, 1, Integer::sum);   // if absent → 1, else += 1
}

// One-liner with streams
Map<String, Long> freq2 = Arrays.stream(words)
    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
merge() vs getOrDefault():
  • merge(key, 1, Integer::sum) does the same thing as getOrDefault + put in one call.
  • It reads better, avoids the intermediate variable, and is atomic in ConcurrentHashMap.
Pattern 2 - Complement Lookup
Store what you have seen so far, and check whether the complement of the current element exists, turning an O(n²) nested loop into O(n).
Java: Two Sum (the canonical example)
// Two Sum - find indices of two numbers that add to target
Map<Integer, Integer> seen = new HashMap<>();   // value → index
for (int i = 0; i < nums.length; i++) {
    int complement = target - nums[i];
    if (seen.containsKey(complement))
        return new int[] { seen.get(complement), i };
    seen.put(nums[i], i);   // store after checking
}
// O(n) time, O(n) space - one pass
Store after checking, not before:
  • Putting the current element into the map before checking the complement can cause false positives (an element matching with itself).
  • Always check first, then store.
Pattern 3 - Grouping / Bucketing
Use computeIfAbsent() to collect items sharing a canonical key into lists: anagram grouping, character frequency grouping, graph adjacency.
Java: Group Anagrams
// Group Anagrams - canonical key = sorted characters
Map<String, List<String>> groups = new HashMap<>();
for (String s : strs) {
    char[] chars = s.toCharArray();
    Arrays.sort(chars);
    String key = new String(chars);   // canonical form
    groups.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
}
return new ArrayList<>(groups.values());
// O(n · k log k) where k = max string length
computeIfAbsent() vs putIfAbsent():
  • computeIfAbsent(key, fn) creates the value lazily: the function only runs when the key is absent. putIfAbsent(key, value) requires the value to already exist.
  • For lists, computeIfAbsent is always the right choice.
Pattern 4 - Prefix Sum + HashMap
Store cumulative sums as keys: if (currentSum - target) exists as a key, the subarray between those positions sums to target.
Java: Subarray Sum Equals K
// Subarray Sum Equals K - count subarrays summing to k
Map<Integer, Integer> prefixCount = new HashMap<>();
prefixCount.put(0, 1);   // empty prefix
int sum = 0, count = 0;
for (int num : nums) {
    sum += num;
    count += prefixCount.getOrDefault(sum - k, 0);
    prefixCount.merge(sum, 1, Integer::sum);
}
return count;
// O(n) time, O(n) space - one pass
Seed the map with (0, 1): the prefix sum map must be initialized with prefixCount.put(0, 1); this handles subarrays that start at index 0 and sum to exactly k.
Pattern 5 - Sliding Window + HashMap
Track character or element frequencies inside a sliding window using a HashMap: expand right to include, shrink left to exclude, update the map at each step.
Java: Longest Substring Without Repeating Characters
// Longest substring without repeating characters
Map<Character, Integer> lastSeen = new HashMap<>();
int left = 0, maxLen = 0;
for (int right = 0; right < s.length(); right++) {
    char c = s.charAt(right);
    if (lastSeen.containsKey(c) && lastSeen.get(c) >= left)
        left = lastSeen.get(c) + 1;   // shrink window past the duplicate
    lastSeen.put(c, right);
    maxLen = Math.max(maxLen, right - left + 1);
}
return maxLen;
// O(n) time, O(min(n, charset)) space
HashMap vs int[128] for character windows:
  • For ASCII characters, an int[128] array is faster than a HashMap<Character, Integer>: direct indexing, no boxing, no hashing.
  • Use HashMap when the element type is unbounded (strings, integers, objects).
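A sketch of the array-based variant of the window above, assuming ASCII input (class name invented):

```java
import java.util.Arrays;

public class AsciiWindow {
    // Longest substring without repeating characters, ASCII-only:
    // int[128] replaces HashMap<Character, Integer> - no boxing, no hashing.
    static int lengthOfLongestSubstring(String s) {
        int[] lastSeen = new int[128];
        Arrays.fill(lastSeen, -1);   // -1 = character not seen yet
        int left = 0, maxLen = 0;
        for (int right = 0; right < s.length(); right++) {
            char c = s.charAt(right);
            if (lastSeen[c] >= left) left = lastSeen[c] + 1; // jump past duplicate
            lastSeen[c] = right;
            maxLen = Math.max(maxLen, right - left + 1);
        }
        return maxLen;
    }

    public static void main(String[] args) {
        System.out.println(lengthOfLongestSubstring("abcabcbb")); // 3 ("abc")
        System.out.println(lengthOfLongestSubstring("bbbbb"));    // 1 ("b")
    }
}
```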
Common Mistake: Modifying a HashMap while iterating over it with a for-each loop throws ConcurrentModificationException. Use Iterator.remove(), or collect the keys to remove into a separate list first. Map.replaceAll() and removeIf() on the view collections (keySet(), values(), entrySet()) are iteration-safe alternatives.
11 · Section Eleven · Complexity

Complexity Reference

HashMap operations are O(1) average-case because the hash function computes the bucket index in constant time and the chain in each bucket is expected to have length about 1 under a good hash function and the 0.75 load factor. The worst case, where all keys collide into one bucket, is O(n) for a linked-list chain, but Java 8’s treeification caps it at O(log n). Resize is O(n) but happens infrequently enough to amortize to O(1) per insertion.

Time Complexity
Operation | Average | Worst (pre-Java 8) | Worst (Java 8+) | Notes
put(key, value) | O(1) | O(n) | O(log n) | Amortized O(1) including resize
get(key) | O(1) | O(n) | O(log n) | Hash + chain/tree traversal
remove(key) | O(1) | O(n) | O(log n) | Hash + chain/tree traversal
containsKey(key) | O(1) | O(n) | O(log n) | Same as get()
containsValue(value) | O(n) | O(n) | O(n) | Scans all entries - avoid
resize / rehash | O(n) | O(n) | O(n) | Visits every entry once
keySet() iteration | O(n + cap) | O(n + cap) | O(n + cap) | Visits every bucket, even empty
Space Complexity
Component | Space | Notes
Bucket array | O(capacity) | Always a power of 2, starts at 16
Entries (chain nodes) | O(n) | One Entry object per key-value pair
Total | O(n + capacity) | Capacity stays within a small constant factor of n at load factor 0.75
Per-entry overhead | ~48 bytes | Key ref + value ref + hash int + next pointer + object header
Capacity vs size:
  • A new HashMap<>() allocates a 16-slot bucket array but stores 0 entries.
  • After 12 insertions (16 × 0.75), it resizes to 32 slots.
  • If you know the expected size, construct with new HashMap<>((int) (expectedSize / 0.75f) + 1) to avoid resizing.
  • Starting with a good initial capacity can eliminate all resize operations.
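The presizing advice as a sketch (class name and sizes invented):

```java
import java.util.HashMap;
import java.util.Map;

public class Presizing {
    public static void main(String[] args) {
        int expected = 10_000;
        // Capacity chosen so the expected entries stay under the 0.75
        // threshold: no O(n) resize/rehash fires during the fill.
        Map<Integer, Integer> map = new HashMap<>((int) (expected / 0.75f) + 1);
        for (int i = 0; i < expected; i++) map.put(i, i * i);

        System.out.println(map.size());      // 10000
        System.out.println(map.get(9_999));  // 99980001
    }
}
```

HashMap rounds the requested capacity up to the next power of 2 internally, so the exact constructor argument only needs to clear the threshold.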
Amortized Analysis: Why Resize Is Still O(1) Per Put

Each resize doubles the capacity and moves all n entries: O(n) work. But resize only triggers after n insertions (where n = capacity × loadFactor). Distributing the O(n) resize cost across the n insertions that caused it gives O(1) amortized cost per put(). This is the same accounting trick that makes ArrayList.add() amortized O(1) despite occasional O(n) array copies.

When HashMap is fast

O(1) average achieved when

  • Hash function distributes keys uniformly
  • Load factor stays near default (0.75)
  • Keys are immutable (String, Integer, enum)
  • equals() and hashCode() are cheap and consistent
When HashMap degrades

Performance killers

  • All keys hash to the same bucket → O(n) per operation
  • Expensive equals() → high constant factor per comparison
  • Mutable keys → lost entries after state change
  • Very low load factor → wasted memory, frequent iteration over empty buckets
12 · Section Twelve · Practice

Problems To Solve

HashMap interview problems cluster around three patterns: frequency counting (scan once, store counts, query in O(1)), prefix sums with a map (store cumulative totals as keys to find subarrays in O(1)), and grouping (use computeIfAbsent to collect items sharing a canonical form). When a problem asks “does X exist?” or “how many times does X appear?” and a brute-force solution would scan repeatedly, the O(n) solution is almost always a HashMap.

Difficulty | Pattern | Problem | Key Insight
Easy | hash-map | Two Sum (LC 1) | Store each number’s complement in a HashMap as you scan; if the current number exists as a key, you found the pair in O(1) rather than scanning backwards.
Medium | hash-map | Group Anagrams (LC 49) | Sort each string to get its canonical form; strings that are anagrams produce the same sorted key, so group them in a HashMap<String, List<String>>.
Medium | hash-map | Top K Frequent Elements (LC 347) | Count frequencies with a HashMap, then use a min-heap of size k; the heap keeps the k largest frequencies without sorting the full frequency map.
Medium | prefix-sum | Subarray Sum Equals K (LC 560) | Store prefix sums in a HashMap; if prefixSum - k exists as a key, the subarray between those indices sums to k. One pass, O(n) time, O(n) space.
Medium | hash-set | Longest Consecutive Sequence (LC 128) | Put all numbers in a HashSet, then for each number that has no left neighbour (n-1 not in set), count how far right the sequence extends; O(n) total.
Hard | design | LRU Cache (LC 146) | HashMap for O(1) lookup by key, doubly linked list for O(1) move-to-front and eviction from tail; both operations must be O(1) simultaneously.
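The Longest Consecutive Sequence insight from the table, sketched (class name invented):

```java
import java.util.HashSet;
import java.util.Set;

public class LongestConsecutive {
    static int longestConsecutive(int[] nums) {
        Set<Integer> set = new HashSet<>();
        for (int n : nums) set.add(n);
        int best = 0;
        for (int n : set) {
            if (set.contains(n - 1)) continue;    // not a sequence start - skip
            int len = 1;
            while (set.contains(n + len)) len++;  // extend right from the start
            best = Math.max(best, len);
        }
        return best;
    }

    public static void main(String[] args) {
        // Sequence 1,2,3,4 has length 4.
        System.out.println(longestConsecutive(new int[]{100, 4, 200, 1, 3, 2})); // 4
    }
}
```

Each number is visited a constant number of times (once in the outer loop, once inside some extension), so the whole scan is O(n) despite the nested while.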
Interview Pattern:
  • The single most powerful HashMap pattern is the complement lookup: store what you have seen, check whether what you need exists.
  • Two Sum does this for pairs. Subarray Sum Equals K does this for prefix sums.
  • Group Anagrams does this for canonical forms.
  • Once you see the pattern, roughly 40% of medium array/string problems reduce to a single HashMap pass.

→ See the full HashMap practice set