GROUP BY and HAVING in SQL — Aggregation Interview Questions
The Interview Question
Explain GROUP BY and HAVING. What's the difference between WHERE and HAVING?
Expert Answer
GROUP BY collapses rows that share the same values in specified columns into a single row per group. Aggregate functions (COUNT, SUM, AVG, MIN, MAX) operate on each group independently. WHERE filters rows before grouping — it operates on individual rows and cannot reference aggregate functions. HAVING filters groups after aggregation — it operates on the grouped results and CAN reference aggregate functions. The SQL execution order is: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY. This order is why WHERE can't use aggregates (grouping hasn't happened yet) and why column aliases from SELECT can't be used in WHERE or HAVING (SELECT runs after both). Common interview pattern: 'Find customers who have placed more than 5 orders' — this requires GROUP BY customer_id, then HAVING COUNT(*) > 5. You can't use WHERE here because the count isn't known until after grouping.
Key Points to Hit in Your Answer
- WHERE filters rows before grouping; HAVING filters groups after aggregation
- HAVING can use aggregate functions; WHERE cannot
- SQL execution order: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY
- Every non-aggregated column in SELECT must be in GROUP BY
- COUNT(*) counts all rows including NULLs; COUNT(column) excludes NULLs
- GROUP BY with ROLLUP adds subtotals and grand totals automatically
Code Example
-- Find departments with more than 10 employees earning above 50k
SELECT
department,
COUNT(*) AS headcount,
AVG(salary) AS avg_salary
FROM employees
WHERE salary > 50000 -- filter ROWS first (before grouping)
GROUP BY department
HAVING COUNT(*) > 10 -- filter GROUPS (after aggregation)
ORDER BY avg_salary DESC;
-- Common pattern: find duplicates
SELECT email, COUNT(*) AS occurrences
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
-- COUNT(*) vs COUNT(column)
SELECT
department,
COUNT(*) AS total, -- all rows
COUNT(manager_id) AS has_mgr, -- excludes NULLs
COUNT(DISTINCT manager_id) AS unique_mgrs
FROM employees
GROUP BY department;
What Interviewers Are Really Looking For
The WHERE vs HAVING distinction is interview 101 but many candidates stumble. If you can state the SQL execution order from memory, that's a strong signal. The 'find duplicates' pattern (GROUP BY + HAVING COUNT > 1) appears in almost every SQL interview. Know the difference between COUNT(*) and COUNT(column) — NULLs are the gotcha.
Practice This Question with AI Grading
Reading about interview questions is a start — but practicing with real-time AI feedback is how you actually get better. Goliath Prep grades your answers instantly and tells you exactly what you're missing.
Start Practicing Free →