Arrays & Hashing

When to Use

Need O(1) lookup (does X exist? how many times?)
Reducing O(N²) brute force to O(N) by trading space for time
Grouping elements by a shared property (anagrams, frequencies)
Finding relationships between elements (complement, difference, pair)

Signal phrases: "two sum", "duplicate", "anagram", "group by", "frequency", "count", "consecutive", "seen before", "unique"

Pattern 1 — Frequency Map

Count occurrences, then query.

from collections import Counter, defaultdict

# Counter (cleanest)
freq = Counter(nums)          # Counter({'a': 3, 'b': 1})
freq['z']                     # 0 if not present — no KeyError

# defaultdict (explicit control)
freq = defaultdict(int)
for n in nums:
    freq[n] += 1

# Manual dict (most explicit)
freq = {}
for n in nums:
    freq[n] = freq.get(n, 0) + 1

Problems: Valid Anagram (Counter(s) == Counter(t)), Group Anagrams, Top K Frequent Elements

Pattern 2 — Complement Lookup (Two Sum)

Store seen elements. For each new element, check if its complement already exists.

# LC 1 — Two Sum — O(N) time, O(N) space
def twoSum(nums, target):
    seen = {}  # value -> index
    for i, n in enumerate(nums):
        complement = target - n
        if complement in seen:
            return [seen[complement], i]
        seen[n] = i
    return []

Key insight: Process left-to-right. By the time you check complement in seen, you've never added the current element yet — so you can't accidentally pair an element with itself.

Pattern 3 — Grouping by Canonical Key

Map elements that share a property to the same key. Most common keys: sorted tuple, character-count tuple.

# LC 49 — Group Anagrams
from collections import defaultdict

def groupAnagrams(strs):
    groups = defaultdict(list)
    for word in strs:
        key = tuple(sorted(word))     # "eat" → ('a','e','t')
        groups[key].append(word)
    return list(groups.values())

# Optimal key — count array avoids O(M log M) sort per word
def groupAnagramsOptimal(strs):
    groups = defaultdict(list)
    for word in strs:
        count = [0] * 26
        for c in word:
            count[ord(c) - ord('a')] += 1
        groups[tuple(count)].append(word)  # hashable
    return list(groups.values())

Pattern 4 — Set Membership + Sequence Head Trick

Build a set for O(1) membership. Only start processing from "heads" of sequences to avoid O(N²).

# LC 128 — Longest Consecutive Sequence — O(N)
def longestConsecutive(nums):
    num_set = set(nums)
    longest = 0

    for n in num_set:
        if n - 1 not in num_set:       # only process sequence heads
            length = 1
            while n + length in num_set:
                length += 1
            longest = max(longest, length)

    return longest

Key insight: n - 1 not in num_set skips elements that are in the middle of a sequence. Without this guard, every element triggers a full count → O(N²) worst case.

Key Problems

Problem	Pattern	LC	One-liner / Key
Contains Duplicate	Set	#217	`len(set(nums)) != len(nums)`
Valid Anagram	Frequency map	#242	`Counter(s) == Counter(t)`
Two Sum	Complement lookup	#1	Store index in hashmap
Group Anagrams	Canonical key	#49	`tuple(sorted(word))`
Top K Frequent Elements	Freq + min-heap	#347	Counter + heapq size K
Product of Array Except Self	Prefix + suffix	#238	No division — see [[Prefix Sum]]
Longest Consecutive Sequence	Set + head trick	#128	Skip non-heads
Valid Sudoku	3 sets per constraint	#36	Row/col/box tracking
Encode and Decode Strings	Length-prefix	#271	`"4#cats3#dog"`

Complexity

Operation	Average	Worst
`dict` / `set` lookup	O(1)	O(N) hash collision
`Counter(s)`	O(M)	O(M)
Sorted-tuple key	O(M log M) per word	—
Count-tuple key	O(M) per word	—

HashSet vs HashMap: Use set for existence check only. Use dict when you need to store a value alongside (index, count, group).

[[DSA/DSA Techniques for Interviews/02. Two Pointers]] — pairs with hashing for O(N) pair problems
[[DSA/DSA Techniques for Interviews/17. Prefix Sum]] — alternative for range-sum problems
[[DSA/Data Structures/Hash Map]] — data structure internals