SqlDatabaseIntermediate

GROUP BY

Q: What parameters does GROUP BY use?

group_col: Column(s) that define each group. AGG(col): Aggregate calculation per group.

Groups rows so aggregate functions can be computed per group.

Review the syntaxStudy the examplesOpen the coding app

SELECT group_col, AGG(col) FROM table_name GROUP BY group_col;

Open Coding Reference

This static page keeps the syntax and examples indexed for search, while the coding app handles interactive exploration and saved references.

What it does

Overview

Groups rows so aggregate functions can be computed per group.

The GROUP BY clause is a fundamental SQL operation used to arrange identical data into groups. It typically follows the WHERE clause and precedes ORDER BY in the query execution order. When executed, the database engine partitions the result set into discrete buckets based on the grouping keys. Performance is heavily influenced by indexing; grouping on indexed columns allows the engine to perform a 'Group Aggregate' or 'Stream Aggregate' (sequential read) rather than a more expensive 'Hash Aggregate' which requires building an in-memory hash table. A critical edge case involves NULL values: standard SQL treats all NULL values as a single group. Non-aggregated columns in the SELECT list must be included in the GROUP BY clause to avoid indeterminate results, a requirement strictly enforced by most modern RDBMS like PostgreSQL and SQL Server, although some configurations of MySQL allow the omission of non-aggregated columns if functional dependency is maintained.

Quick reference

Syntax

SELECT group_col, AGG(col) FROM table_name GROUP BY group_col;

Inputs

Parameters

group_colColumn(s) that define each group.

AGG(col)Aggregate calculation per group.

See it in practice

Examples

Basic Count Aggregation

SELECT department, COUNT(*) AS employee_count FROM employees GROUP BY department;

Output:
department | employee_count Sales | 15 Tech | 42 HR | 8

Groups employees by their department and counts the number of records in each bucket.

Grouping by Multiple Columns

SELECT region, product_category, SUM(revenue) FROM sales_data GROUP BY region, product_category;

Calculates revenue for every unique combination of region and category.

Filtering Groups with HAVING

SELECT supplier_id, AVG(lead_time) FROM orders GROUP BY supplier_id HAVING AVG(lead_time) > 5;

Output:
supplier_id | avg 101 | 7.2 304 | 6.5

Groups orders by supplier and filters out groups where the average lead time is 5 days or less.

Debug faster

Common Errors

IndeterminateColumnError

Cause: Attempting to select a column that is neither part of the GROUP BY clause nor wrapped in an aggregate function.

Fix: Add the missing column to the GROUP BY clause or use an aggregate like MIN() or MAX().

SELECT name, department, SUM(salary) FROM employees GROUP BY department;

LogicError

Cause: Attempting to filter aggregated results inside the WHERE clause instead of the HAVING clause.

Fix: Move conditions involving aggregate functions (like SUM, COUNT) to a HAVING clause after the GROUP BY.

SELECT category FROM items WHERE COUNT(*) > 10 GROUP BY category;

Runtime support

Compatibility

All SQL databasesSQL-92 (Standard)

Date extraction functions vary by DB

Source: MDN Web Docs

Common questions

Frequently Asked Questions

Groups rows so aggregate functions can be computed per group.

$g r o u p_{c}$ ol: Column(s) that define each group. AGG(col): Aggregate calculation per group.

IndeterminateColumnError: Add the missing column to the GROUP BY clause or use an aggregate like MIN() or MAX(). LogicError: Move conditions involving aggregate functions (like SUM, COUNT) to a HAVING clause after the GROUP BY.

GROUP BY

Overview

Syntax

Parameters

Examples

Basic Count Aggregation

Grouping by Multiple Columns

Filtering Groups with HAVING

Common Errors

IndeterminateColumnError

LogicError

Compatibility

Related References

HAVING

COUNT()

SUM()

ORDER BY

Frequently Asked Questions