Design a Notification System (System Design Interview)
The Interview Question
Design a notification system that supports push notifications, email, SMS, and in-app notifications for millions of users.
Expert Answer
A notification system is fundamentally a message routing problem. Events come in (user signed up, order shipped, friend request), and the system must decide which users to notify, through which channels, with what content, and at what priority.
The architecture has four layers. The ingestion layer receives notification requests from other services via an API or message queue. The routing layer checks user preferences (did they opt into email? do they have push enabled?), applies rate limiting (don't send a user 50 emails in an hour), and handles deduplication. The delivery layer has a separate queue and worker pool per channel: email workers call SendGrid or SES, push workers call FCM or APNs, and SMS workers call Twilio. The tracking layer records delivery status, opens, clicks, and failures.
Put a message queue (Kafka or RabbitMQ) between layers for decoupling and retry handling. This is critical because third-party delivery services fail. Retrying failed sends gives you at-least-once delivery, which can deliver the same message twice, so attach an idempotency key to each message and have workers skip keys they have already processed.
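To make the idempotency point concrete, here is a minimal sketch of a delivery worker that tolerates redelivered messages. The `ChannelWorker` class, the `sendViaProvider` callback, and the in-memory `sentKeys` set are all assumptions made for illustration; a real system would keep the keys in a shared store with an expiry and would need to guard against two workers handling the same key at once.

```javascript
// Sketch: at-least-once delivery made safe by an idempotency key.
// `sendViaProvider` is a hypothetical async wrapper around the real provider call.
// `sentKeys` stands in for a shared store such as a cache or database table.
class ChannelWorker {
  constructor(sendViaProvider) {
    this.sendViaProvider = sendViaProvider;
    this.sentKeys = new Set();
  }

  // Returns "sent" or "duplicate". Throws if the provider call fails,
  // so the queue can redeliver the message later.
  async handle(message) {
    if (this.sentKeys.has(message.idempotencyKey)) {
      return "duplicate";
    }
    await this.sendViaProvider(message);
    // Record the key only after a successful send, so a failed attempt is retried.
    this.sentKeys.add(message.idempotencyKey);
    return "sent";
  }
}
```

Recording the key after the send means a crash between the two steps can still produce one duplicate; recording it before the send trades that for a possible lost message. Which side to err on depends on the channel.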
Key Points to Hit in Your Answer
- Separate queues per channel (email, push, SMS) for independent scaling and failure isolation
- User preference service: opt-in/opt-out per channel, quiet hours, frequency caps
- At-least-once delivery for reliability, with idempotency keys to prevent the duplicate sends that retries would otherwise cause
- Priority levels: critical notifications (security alerts) bypass frequency caps; promotional notifications do not
- Template service for consistent branding across channels
- Tracking: delivery receipts, open rates, click tracking, failure logging
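The preference and priority rules in the list above can be sketched as a single decision function. The shape of the `prefs` object, the result strings, and the `decide`, `inQuietHours`, and `hourInTimeZone` names are assumptions for illustration. Letting critical notifications ignore quiet hours as well as frequency caps is also an assumption here; the list above only states the frequency-cap bypass.

```javascript
// Local hour (0-23) for `date` in an IANA time zone such as "America/New_York".
function hourInTimeZone(date, timeZone) {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour: "numeric",
    hourCycle: "h23",
  }).formatToParts(date);
  return Number(parts.find((p) => p.type === "hour").value);
}

// Quiet hours may wrap past midnight, e.g. start=22, end=7.
function inQuietHours(hour, start, end) {
  if (start <= end) return hour >= start && hour < end;
  return hour >= start || hour < end;
}

// Decide what to do with one notification on one channel.
// `sentInLastHour` is how many messages this user already received on the channel.
function decide(prefs, channel, priority, sentInLastHour, now) {
  const channelPrefs = prefs.channels[channel];
  if (!channelPrefs || !channelPrefs.enabled) return "skip:opted_out";
  // Assumption: critical messages ignore quiet hours and caps, but not opt-outs.
  if (priority === "critical") return "send";
  const hour = hourInTimeZone(now, prefs.timeZone);
  if (inQuietHours(hour, prefs.quietHours.start, prefs.quietHours.end)) {
    return "defer:quiet_hours";
  }
  if (sentInLastHour >= channelPrefs.maxPerHour) return "skip:frequency_cap";
  return "send";
}
```

Returning a "defer" result rather than a plain skip lets the caller reschedule the message for the end of the quiet window instead of dropping it.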
Code Example
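// Notification request payload ("channels" is optional; omit it to let routing decide)
{
  "event": "order_shipped",
  "user_id": "user_123",
  "data": { "order_id": "ord_456", "tracking": "1Z999..." },
  "channels": ["push", "email"],
  "priority": "high"
}

// Routing logic (pseudocode: getUser, getUserPreferences, getTemplate,
// rateLimiter, and channelQueues are assumed to be provided elsewhere)
async function routeNotification(event) {
  const user = await getUser(event.user_id);
  const prefs = await getUserPreferences(user.id);
  const template = await getTemplate(event.event, event.data);
  // Fall back to the user's enabled channels when the request names none.
  // enabledChannels() is a hypothetical helper on the preferences object.
  const channels = event.channels ?? prefs.enabledChannels();
  for (const channel of channels) {
    if (!prefs.isEnabled(channel)) continue;
    // Critical notifications (e.g. security alerts) bypass frequency caps.
    const capped = event.priority !== "critical" &&
      (await rateLimiter.isLimited(user.id, channel));
    if (capped) continue;
    await channelQueues[channel].publish({
      userId: user.id,
      content: template.render(channel),
      // Assumes an order-scoped event; other event types need their own unique ID here.
      idempotencyKey: `${event.event}:${event.data.order_id}:${channel}`,
      priority: event.priority
    });
  }
}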
What Interviewers Are Really Looking For
The interviewer wants to see you think about failure modes. What happens when the email service is down? (Queue retries with exponential backoff.) What about duplicate sends? (Idempotency keys.) How do you handle users in different time zones? (Quiet hours in user preferences.) At the senior level, discuss how to handle millions of simultaneous push notifications (the fan-out problem) and which database to use for notification history.
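If asked to go deeper on retries, a minimal sketch of exponential backoff might look like the following. The function names, the 500 ms base delay, the 60 s cap, and the five-attempt limit are illustrative assumptions, not recommendations. Production systems usually also add random jitter to the delay so that many workers do not retry in lockstep; it is left out here to keep the example deterministic.

```javascript
// Delay before the next retry after `attempt` failures (0-based):
// base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 60000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `send` until it succeeds or `maxAttempts` is reached.
// `sleep` is injectable so tests can run without real waiting.
async function sendWithRetry(send, { maxAttempts = 5, sleep = defaultSleep } = {}) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await send();
    } catch (err) {
      lastError = err;
      // No point sleeping after the final failed attempt.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  // Out of attempts: the caller can move the message to a dead-letter queue.
  throw lastError;
}
```

Because a retried send may have partly succeeded the first time, this only stays safe when the message also carries an idempotency key.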