Arrays & Hashing
When to Use
- Need O(1) lookup (does X exist? how many times?)
- Reducing O(N²) brute force to O(N) by trading space for time
- Grouping elements by a shared property (anagrams, frequencies)
- Finding relationships between elements (complement, difference, pair)
Signal phrases: "two sum", "duplicate", "anagram", "group by", "frequency", "count", "consecutive", "seen before", "unique"
Pattern 1 — Frequency Map
Count occurrences, then query.
from collections import Counter, defaultdict
# Counter (cleanest)
freq = Counter(nums) # Counter({'a': 3, 'b': 1})
freq['z'] # 0 if not present — no KeyError
# defaultdict (explicit control)
freq = defaultdict(int)
for n in nums:
freq[n] += 1
# Manual dict (most explicit)
freq = {}
for n in nums:
freq[n] = freq.get(n, 0) + 1
Problems: Valid Anagram (Counter(s) == Counter(t)), Group Anagrams, Top K Frequent Elements
Pattern 2 — Complement Lookup (Two Sum)
Store seen elements. For each new element, check if its complement already exists.
# LC 1 — Two Sum — O(N) time, O(N) space
def twoSum(nums, target):
seen = {} # value -> index
for i, n in enumerate(nums):
complement = target - n
if complement in seen:
return [seen[complement], i]
seen[n] = i
return []
Key insight: Process left-to-right. By the time you check complement in seen, you've never added the current element yet — so you can't accidentally pair an element with itself.
Pattern 3 — Grouping by Canonical Key
Map elements that share a property to the same key. Most common keys: sorted tuple, character-count tuple.
# LC 49 — Group Anagrams
from collections import defaultdict
def groupAnagrams(strs):
groups = defaultdict(list)
for word in strs:
key = tuple(sorted(word)) # "eat" → ('a','e','t')
groups[key].append(word)
return list(groups.values())
# Optimal key — count array avoids O(M log M) sort per word
def groupAnagramsOptimal(strs):
groups = defaultdict(list)
for word in strs:
count = [0] * 26
for c in word:
count[ord(c) - ord('a')] += 1
groups[tuple(count)].append(word) # hashable
return list(groups.values())
Pattern 4 — Set Membership + Sequence Head Trick
Build a set for O(1) membership. Only start processing from "heads" of sequences to avoid O(N²).
# LC 128 — Longest Consecutive Sequence — O(N)
def longestConsecutive(nums):
num_set = set(nums)
longest = 0
for n in num_set:
if n - 1 not in num_set: # only process sequence heads
length = 1
while n + length in num_set:
length += 1
longest = max(longest, length)
return longest
Key insight: n - 1 not in num_set skips elements that are in the middle of a sequence. Without this guard, every element triggers a full count → O(N²) worst case.
Key Problems
| Problem | Pattern | LC | One-liner / Key |
|---|---|---|---|
| Contains Duplicate | Set | #217 | len(set(nums)) != len(nums) |
| Valid Anagram | Frequency map | #242 | Counter(s) == Counter(t) |
| Two Sum | Complement lookup | #1 | Store index in hashmap |
| Group Anagrams | Canonical key | #49 | tuple(sorted(word)) |
| Top K Frequent Elements | Freq + min-heap | #347 | Counter + heapq size K |
| Product of Array Except Self | Prefix + suffix | #238 | No division — see [[Prefix Sum]] |
| Longest Consecutive Sequence | Set + head trick | #128 | Skip non-heads |
| Valid Sudoku | 3 sets per constraint | #36 | Row/col/box tracking |
| Encode and Decode Strings | Length-prefix | #271 | "4#cats3#dog" |
Complexity
| Operation | Average | Worst |
|---|---|---|
dict / set lookup | O(1) | O(N) hash collision |
Counter(s) | O(M) | O(M) |
| Sorted-tuple key | O(M log M) per word | — |
| Count-tuple key | O(M) per word | — |
HashSet vs HashMap: Use
setfor existence check only. Usedictwhen you need to store a value alongside (index, count, group).
Related
- [[DSA/DSA Techniques for Interviews/02. Two Pointers]] — pairs with hashing for O(N) pair problems
- [[DSA/DSA Techniques for Interviews/17. Prefix Sum]] — alternative for range-sum problems
- [[DSA/Data Structures/Hash Map]] — data structure internals