Senior | Communication Systems | 9 min read

Design a Real-Time Chat System (System Design Interview)

The Interview Question

Design a real-time chat system that supports one-on-one messaging, group chats, online presence, and message delivery receipts.

Expert Answer

A chat system needs persistent connections for real-time delivery and a durable backend for reliability.

For the real-time layer, use WebSocket connections between clients and chat servers. Each chat server maintains a registry of the users connected to it. When User A sends a message to User B, the flow is: the message hits User A's chat server → it is stored in the message database → if User B is online, it is routed to User B's chat server via a message broker (Redis Pub/Sub or Kafka) → it is delivered through User B's WebSocket. If User B is offline, the message stays in storage and is delivered when they reconnect: the client pulls unread messages on connection. Persisting before routing matters because a broker such as Redis Pub/Sub does not retain messages for subscribers that are not connected.

For group chats, the sender's server publishes to a group channel, and each chat server that has subscribed group members delivers to them locally.

For presence (online/offline status), use heartbeat pings over the WebSocket: if no heartbeat arrives for 30 seconds, mark the user offline. Store presence in Redis with a TTL so that entries expire on their own when heartbeats stop.

For message ordering, use server-assigned timestamps, not client timestamps, because client clocks can be skewed or changed by the user. Store messages with a composite key of (conversation_id, server_timestamp) for efficient pagination. If two messages in a conversation can share a timestamp, the message ID can serve as a tie-breaker.
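
The heartbeat-with-TTL presence scheme can be sketched as follows. This is a minimal in-memory stand-in for Redis, not a Redis client: the `PresenceRegistry` class, its method names, and the injectable `now` clock are illustrative assumptions. In a real deployment the same effect would typically come from a Redis key whose expiry is refreshed by each heartbeat.

```javascript
// In-memory sketch of presence with a TTL (hypothetical stand-in for Redis keys with expiry).
const HEARTBEAT_TTL_MS = 30 * 1000; // 30 seconds, as in the answer above

class PresenceRegistry {
  constructor(now = () => Date.now()) {
    this.now = now;           // injectable clock so the expiry logic can be tested
    this.entries = new Map(); // userId -> { serverNode, expiresAt }
  }

  // Called on connect and on every heartbeat ping; refreshes the TTL.
  heartbeat(userId, serverNode) {
    this.entries.set(userId, {
      serverNode,
      expiresAt: this.now() + HEARTBEAT_TTL_MS,
    });
  }

  // Returns the chat server holding the user's WebSocket, or null if offline.
  getServer(userId) {
    const entry = this.entries.get(userId);
    if (!entry) return null;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(userId); // expire lazily, mimicking a TTL
      return null;
    }
    return entry.serverNode;
  }

  // Explicit disconnect: mark the user offline immediately.
  disconnect(userId) {
    this.entries.delete(userId);
  }
}
```

A user who stops sending heartbeats simply ages out after 30 seconds, so no separate "mark offline" job is needed. The trade-off is that a crashed client can still appear online for up to one TTL window.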

Key Points to Hit in Your Answer

  • WebSocket for real-time bidirectional communication
  • Message broker (Redis Pub/Sub or Kafka) for cross-server routing
  • Offline message queue — pull unread on reconnect
  • Presence via heartbeat with Redis TTL
  • Message ordering with server timestamps, not client timestamps
  • Delivery receipts: sent → delivered → read (three states)
  • Group fanout: publish to channel, each server delivers to local members
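
The three receipt states in the list above behave like a forward-only state machine. The sketch below shows one way to apply receipt updates so that late or duplicate acknowledgements never move a message backwards. The names `RECEIPT_STATES` and `advanceStatus` are assumptions for illustration.

```javascript
// Forward-only delivery receipt states: sent -> delivered -> read.
const RECEIPT_STATES = ['sent', 'delivered', 'read'];

// Returns the status a message should have after an `incoming` receipt arrives.
// Duplicate or out-of-order receipts (e.g. "delivered" after "read") are ignored.
// A jump from "sent" straight to "read" is allowed, since read implies delivered.
function advanceStatus(current, incoming) {
  const currentRank = RECEIPT_STATES.indexOf(current);
  const incomingRank = RECEIPT_STATES.indexOf(incoming);
  if (currentRank === -1 || incomingRank === -1) {
    throw new Error(`Unknown receipt status: ${current} -> ${incoming}`);
  }
  return incomingRank > currentRank ? incoming : current;
}
```

Making the update idempotent in this way matters because receipts travel over the same unreliable paths as messages and can be retried or reordered.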

Code Example

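// Message flow pseudocode. generateSnowflakeId, messageStore, getConversationMembers,
// presenceService, messageBroker, and unreadQueue are assumed services, not defined here.
async function sendMessage(senderId, conversationId, content) {
  const message = {
    id: generateSnowflakeId(),  // time-sortable unique ID
    conversationId,
    senderId,
    content,
    timestamp: Date.now(),      // server clock: this function runs on the chat server
    status: 'sent'
  };

  // Persist first (durability): the message survives even if routing fails
  await messageStore.save(message);

  // Get all participants
  const members = await getConversationMembers(conversationId);

  for (const memberId of members) {
    if (memberId === senderId) continue; // don't echo to the sender

    // Which chat server holds this member's WebSocket (falsy if offline)
    const serverNode = await presenceService.getServer(memberId);
    if (serverNode) {
      // User online: route through their chat server
      await messageBroker.publish(serverNode, message);
    } else {
      // User offline: record the ID so they pull it on reconnect
      await unreadQueue.push(memberId, message.id);
    }
  }

  return message; // lets the caller acknowledge 'sent' to the sender
}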
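
For group conversations, the per-member loop above publishes the same message once per recipient. One possible refinement, in line with "each server delivers to local members", is to group online recipients by chat server and publish once per server. The helper `planFanout` and its input shape (a plain object mapping member ID to a server node, or `null` when offline) are assumptions for illustration.

```javascript
// Groups recipients by chat server so each server gets one publish per message.
// `locations` maps memberId -> serverNode (string), or null when the member is offline.
function planFanout(senderId, locations) {
  const byServer = new Map(); // serverNode -> [memberId, ...]
  const offline = [];

  for (const [memberId, serverNode] of Object.entries(locations)) {
    if (memberId === senderId) continue; // the sender already has the message
    if (serverNode) {
      if (!byServer.has(serverNode)) byServer.set(serverNode, []);
      byServer.get(serverNode).push(memberId);
    } else {
      offline.push(memberId); // these members pull unread messages on reconnect
    }
  }
  return { byServer, offline };
}

const plan = planFanout('alice', {
  alice: 'chat-1',
  bob: 'chat-2',
  carol: 'chat-2',
  dave: null,
});
// plan.byServer holds one entry: 'chat-2' -> ['bob', 'carol']
// plan.offline is ['dave']
```

With this shape, the number of broker publishes scales with the number of chat servers that host members rather than with the group size.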

What Interviewers Are Really Looking For

The interviewer wants to see you handle the online/offline split explicitly. How do messages get delivered when the recipient is offline? How do you handle the user switching devices? The presence heartbeat design shows practical knowledge. At staff level, discuss how to shard conversations across database nodes and handle the fan-out problem for large group chats (1000+ members).
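
As one way to approach the sharding discussion, the sketch below maps each conversation to a database shard with a stable hash, so that all messages for a conversation live on one node and pagination by (conversation_id, server_timestamp) touches a single shard. The names `fnv1a` and `shardFor` and the shard count of 16 are assumptions for illustration; the hash is an FNV-1a-style function applied to UTF-16 code units.

```javascript
// Sketch: pick a database shard for a conversation using a stable 32-bit hash.
const SHARD_COUNT = 16; // arbitrary example value

function fnv1a(text) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime with 32-bit wraparound
  }
  return hash >>> 0; // coerce to an unsigned 32-bit integer
}

// Same conversation ID always maps to the same shard index in [0, shardCount).
function shardFor(conversationId, shardCount = SHARD_COUNT) {
  return fnv1a(String(conversationId)) % shardCount;
}
```

Plain modulo sharding remaps most conversations when the shard count changes, so it is worth mentioning consistent hashing or a lookup table as alternatives if the interviewer asks about resharding.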
