SQL Query Optimization — Indexes, EXPLAIN, and Performance
The Interview Question
How do you optimize a slow SQL query? Walk me through your approach.
Expert Answer
Query optimization follows a systematic approach. First, use EXPLAIN (or EXPLAIN ANALYZE) to see the execution plan — this tells you whether the database is using indexes, doing full table scans, or choosing suboptimal join strategies. Look for sequential scans on large tables, high row estimates vs. actual rows (bad statistics), and missing index usage. Second, check your indexes — add indexes on columns used in WHERE, JOIN, and ORDER BY clauses. Composite indexes should match the query's column order (leftmost prefix rule). But don't over-index: every index slows down writes and consumes storage. Third, rewrite the query — common optimizations include replacing correlated subqueries with JOINs, using EXISTS instead of IN for subqueries with large result sets, avoiding SELECT * (fetch only needed columns), and moving conditions from HAVING to WHERE when possible (filters earlier = fewer rows to process). Fourth, check for N+1 query problems at the application level — a loop executing one query per row instead of a single batch query.
Key Points to Hit in Your Answer
- Always start with EXPLAIN ANALYZE — don't guess, measure
- Index columns in WHERE, JOIN ON, and ORDER BY clauses
- Composite index order matters: (a, b, c) supports WHERE a=1 AND b=2, but not WHERE b=2 alone
- Avoid functions on indexed columns in WHERE: WHERE YEAR(date) = 2026 can't use index
- EXISTS is typically faster than IN for subqueries (short-circuits)
- SELECT * prevents covering index optimization — specify columns
- N+1 problem: 1 query for list + N queries for details → rewrite as JOIN or IN
- Update table statistics periodically: ANALYZE TABLE
Code Example
-- EXPLAIN shows the query plan
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.created_at > '2026-01-01'
GROUP BY u.name;
-- BAD: function on indexed column prevents index use
WHERE YEAR(created_at) = 2026
-- GOOD: range comparison uses index
WHERE created_at >= '2026-01-01' AND created_at < '2027-01-01'
-- BAD: correlated subquery (runs once per row)
SELECT name FROM users u
WHERE (SELECT COUNT(*) FROM orders WHERE user_id = u.id) > 5;
-- GOOD: JOIN with GROUP BY (single pass)
SELECT u.name FROM users u
JOIN orders o ON u.id = o.user_id
GROUP BY u.name
HAVING COUNT(o.id) > 5;
-- Create composite index for common query pattern
CREATE INDEX idx_orders_user_date
ON orders (user_id, created_at DESC);
What Interviewers Are Really Looking For
Senior question — they want a systematic approach, not random tips. Start with EXPLAIN, look at the plan, then optimize. The 'function on indexed column' gotcha (WHERE YEAR(date)) is extremely common in practice and interviews. Knowing about covering indexes (an index that contains all columns the query needs) shows deep knowledge. Mention that premature indexing is a problem too — every index has write overhead.
Practice This Question with AI Grading
Reading about interview questions is a start — but practicing with real-time AI feedback is how you actually get better. Goliath Prep grades your answers instantly and tells you exactly what you're missing.
Start Practicing Free →