Consistent Hashing

The standard algorithm for distributing keys across servers when servers can be added/removed. Essential for distributed caches, sharded DBs, load balancers.


The Problem with Naive Hashing

Naive: server = hash(key) % N (N = number of servers)

If you add or remove a server, N changes → almost every key maps to a different server → massive cache miss storm / data migration.

3 servers: key "user:42" → hash(key) % 3 = 1 → Server 1
4 servers: key "user:42" → hash(key) % 4 = 2 → Server 2  ← moved!

For a cache this is a near-total flush: going from N to N+1 servers remaps roughly N/(N+1) of all keys (~75% when going from 3 to 4).
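A minimal sketch to check this empirically (MD5 is an arbitrary choice here; any uniform hash behaves the same):

import hashlib

def server(key: str, n: int) -> int:
    # naive placement: hash the key, take it modulo the server count
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n

keys = [f"user:{i}" for i in range(100_000)]
moved = sum(server(k, 3) != server(k, 4) for k in keys)
print(f"{moved / len(keys):.0%} of keys remapped")  # prints ~75%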


Consistent Hashing Solution

Map both servers and keys onto a ring (a circle of hash values from 0 to 2^32 − 1, wrapping back to 0).

Hash ring: 0 ──────────────────── 2^32

Servers placed at hash(server_id) positions:
         Server A (hash=90)
       /
  0 ──────── Server B (hash=180) ──────── Server C (hash=270) ──────── 2^32

Key lookup: Hash the key → find the next clockwise server on the ring.

key "user:1" → hash=50  → Server A (next clockwise at 90)
key "user:2" → hash=150 → Server B (next clockwise at 180)
key "user:3" → hash=250 → Server C (next clockwise at 270)
key "user:4" → hash=350 → Server A (wraps around, next clockwise)

Adding / Removing a Server

Add Server D at hash=130:

key "user:2" (hash=150) → was Server B → now Server B (130 < 150 → D, but 130 is behind 150...)
Wait: "user:2" (hash=150) → next clockwise → was Server B (180) → still Server B
key at hash=110 → was Server B (180) → now Server D (130) ← only this range remaps!

Only the keys between the new server and its predecessor remap (here, hashes in (90, 130], which move from B to D). That's ~1/N of keys on average; see the usage sketch after the implementation below.

Remove Server B: Only keys that were on B move to the next server (C). All other keys unaffected.


Virtual Nodes (V-nodes)

Problem: With few physical servers, key distribution is uneven. One server might get 40%, another 5%.

Solution: Each physical server gets K virtual nodes placed at (pseudo-random) hashed positions on the ring.

Server A → virtual nodes at hash=10, 87, 203, 301, ...
Server B → virtual nodes at hash=45, 130, 250, 389, ...
Server C → virtual nodes at hash=25, 167, 333, 410, ...

Each virtual node owns a smaller arc → more uniform distribution. When a server is removed, its virtual nodes' keys spread across multiple servers (not all to one).
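
A quick standalone sketch (server names and key counts made up for illustration) showing how the spread tightens as the vnode count grows:

import bisect
import hashlib
from collections import Counter

def h(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def balance(servers, vnodes, n_keys=30_000):
    # build a ring with `vnodes` virtual nodes per server, then count
    # how many of n_keys land on each physical server
    ring = sorted((h(f"{s}:{i}"), s) for s in servers for i in range(vnodes))
    positions = [p for p, _ in ring]
    counts = Counter()
    for k in range(n_keys):
        idx = bisect.bisect(positions, h(f"user:{k}")) % len(ring)
        counts[ring[idx][1]] += 1
    return dict(counts)

for v in (1, 10, 150):
    print(f"{v:>3} vnodes/server:", balance(["A", "B", "C"], v))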


Implementation (simplified)

import hashlib
import bisect

class ConsistentHash:
    def __init__(self, nodes=None, replicas=150):
        self.replicas = replicas   # virtual nodes per physical node
        self.ring = {}             # vnode hash → physical node
        self.sorted_keys = []      # sorted vnode hashes (the ring)
        if nodes:
            for node in nodes:
                self.add_node(node)

    def _hash(self, key: str) -> int:
        # MD5 for a stable, uniform hash (not for security)
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str):
        for i in range(self.replicas):
            h = self._hash(f"{node}:{i}")
            self.ring[h] = node
            bisect.insort(self.sorted_keys, h)

    def remove_node(self, node: str):
        for i in range(self.replicas):
            h = self._hash(f"{node}:{i}")
            del self.ring[h]
            self.sorted_keys.remove(h)   # O(n) removal; fine for a sketch

    def get_node(self, key: str) -> str | None:
        if not self.ring:
            return None
        h = self._hash(key)
        # first vnode clockwise of the key's hash; the modulo wraps past
        # the end of the sorted list back to the start of the ring
        idx = bisect.bisect(self.sorted_keys, h) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]
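
Usage, showing the ~1/N remap property when a fourth node joins (the exact fraction varies a little with the hash):

ch = ConsistentHash(["A", "B", "C"])
keys = [f"user:{i}" for i in range(10_000)]
before = {k: ch.get_node(k) for k in keys}
ch.add_node("D")
moved = sum(before[k] != ch.get_node(k) for k in keys)
print(f"{moved / len(keys):.0%} of keys moved")  # ~25%, i.e. ~1/4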

Where It's Used

| System | How |
| --- | --- |
| Distributed cache (Redis Cluster) | Keys distributed across cluster nodes, minimal remap on scale |
| CDN routing | Route user to nearest/least-loaded edge server |
| Load balancer (session affinity) | Same user → same backend server (sticky sessions) |
| Cassandra / DynamoDB | Data partitioning across nodes |
| Chord DHT | Peer-to-peer distributed hash tables |

Comparison: Consistent Hashing vs Range Partitioning

| | Consistent Hashing | Range Partitioning |
| --- | --- | --- |
| Key distribution | Uniform (with v-nodes) | Uneven (hot ranges possible) |
| Add/remove node | 1/N keys remap | 1/N keys remap (similar) |
| Range queries | Impossible — keys scattered | Easy — contiguous keys together |
| Hot spots | Random (mitigated) | Sequential keys → single shard |
| Used in | Cassandra, Redis, Memcached | HBase, DynamoDB (sort key), Spanner |

Interview Talking Points

"How do you handle adding a new cache server without invalidating everything?" Consistent hashing. New server takes over only the keys between itself and its predecessor (~1/N). All other cache entries remain valid.

"What are virtual nodes and why do you need them?" With few servers, ring arcs are uneven — one server gets too many keys. Virtual nodes (K replicas per physical node) create K smaller arcs → uniform distribution. Also: when a server dies, its load spreads across multiple servers rather than one successor.

"What's the difference between consistent hashing and modulo hashing?" Modulo: key % N — changing N remaps ~(N-1)/N of keys. Consistent: changing N remaps ~1/N of keys. This is the core difference.


Related

  • [[Caching & Redis]] — Redis Cluster actually uses fixed hash slots (16384, CRC16), a close cousin of consistent hashing
  • [[Distributed Systems Concepts]] — partitioning, replication
  • [[Message Queues & Kafka]] — Kafka partition assignment
  • [[System Design/Problem Designs/Design a URL shortener]] — base62 encoding (different sharding)