Senior Storage Systems 10 min read

Design a Distributed Key-Value Store (System Design Interview)

The Interview Question

Design a distributed key-value store that supports get(key) and put(key, value) with high availability and partition tolerance.

Expert Answer

This question is essentially asking you to design a simplified DynamoDB or Cassandra. Start with the CAP theorem tradeoff — you need to choose between consistency and availability during network partitions. Most modern systems choose AP (available + partition-tolerant) with eventual consistency. The core components: data partitioning uses consistent hashing with virtual nodes to distribute keys across servers evenly and minimize redistribution when nodes join or leave. Replication copies each key to N nodes (typically 3) walking clockwise on the hash ring. For writes, use a quorum-based approach: W replicas must acknowledge a write before returning success. For reads, R replicas are queried and the latest version wins (using vector clocks or timestamps for conflict resolution). The formula: if W + R > N, you get strong consistency; if W + R <= N, you get better availability with eventual consistency. Handle node failures with hinted handoff (temporarily store writes for a downed node on its neighbor) and anti-entropy using Merkle trees (detect and repair inconsistencies between replicas in the background).

Key Points to Hit in Your Answer

  • CAP theorem: choose AP (eventual consistency) or CP (strong consistency) during partitions
  • Consistent hashing with virtual nodes for even distribution
  • Replication factor N, write quorum W, read quorum R — W+R>N means strong consistency
  • Vector clocks or timestamps for conflict detection/resolution
  • Hinted handoff for temporary node failures
  • Merkle trees for anti-entropy and replica synchronization
  • Gossip protocol for cluster membership and failure detection

Code Example

// Consistent hashing pseudocode
class ConsistentHash {
  constructor(nodes, virtualNodesPerNode = 150) {
    this.ring = new SortedMap();
    for (const node of nodes) {
      for (let i = 0; i < virtualNodesPerNode; i++) {
        const hash = md5(`${node}:${i}`);
        this.ring.set(hash, node);
      }
    }
  }
  
  getNode(key) {
    const hash = md5(key);
    // Find first node clockwise from key's position
    const entry = this.ring.ceiling(hash) || this.ring.first();
    return entry.value;
  }
  
  getReplicaNodes(key, n = 3) {
    // Walk clockwise, collect N distinct physical nodes
    // Skip virtual nodes that map to same physical node
  }
}

What Interviewers Are Really Looking For

This is an advanced question that tests distributed systems knowledge. The interviewer wants you to discuss tradeoffs explicitly: consistency vs. availability, latency vs. durability. Mention the quorum formula (W+R>N). Explaining vector clocks or hinted handoff demonstrates deep knowledge. Structure matters — start with single-node, then distribute.

Practice This Question with AI Grading

Reading about interview questions is a start — but practicing with real-time AI feedback is how you actually get better. Goliath Prep grades your answers instantly and tells you exactly what you're missing.

Start Practicing Free →