Collection Contents Index The GROUP BY clause: organizing query results into groups The HAVING clause: selecting groups of data pdf/chap5.pdf

User's Guide
   PART 1. Working with Databases
     CHAPTER 5. Summarizing, Grouping, and Sorting Query Results       

Thinking about GROUP BY


Understanding which queries are valid and which are not can be difficult when the query involves a GROUP BY clause. This section describes a way to think about queries with GROUP BY so that you may understand the results and the validity of queries better.

Top of page  How queries with GROUP BY are executed

Consider a single-table query of the following form:

SELECT select-list

FROM table

WHERE search-condition

GROUP BY group-by-expression

HAVING search-condition

This query can be thought of as being executed in the following manner:

  1. Apply the WHERE clause     This generates an intermediate result that contains only some of the rows of the table.

  2. Partition the result into groups     This action generates an intermediate result with one row for each group as dictated by the GROUP BY clause. Each row contains the group-by-expression for each group and the operands of any aggregate functions in the select-list.

  3. Apply the HAVING clause     Any rows that do not meet the criteria of the HAVING clause are removed at this point.

  4. Project out the results to display     This action takes from the second intermediate result only those columns that need to be displayed in the result set of the query.

This process makes requirements on queries with a GROUP BY clause:

Top of page  GROUP BY with multiple columns

You can list more than one column in the GROUP BY clause in order to nest groups—that is, you can group a table by any combination of columns.

  To list the average price of products, grouped first by name and then by size:
  1. Enter the following command:

    SELECT name, size, AVG(unit_price)
    FROM product
    GROUP BY name, size

    name

    size

    AVG(unit_price)

    Tee Shirt

    Small

    9.000

    Tee Shirt

    Medium

    14.000

    Tee Shirt

    One size fits all

    14.000

    Baseball Cap

    One size fits all

    9.500

    Visor

    One size fits all

    7.000

    Sweatshirt

    Large

    24.000

    Shorts

    Medium

    15.000

Columns in GROUP BY that are not in the select list 

A Sybase extension to the SQL/92 standard that is supported by both Adaptive Server Enterprise and Adaptive Server Anywhere is to add columns to the GROUP BY clause that are not in the select list. For example, the following query lists the number of contacts in each city:

SELECT state, count(id)
FROM contact
GROUP BY state, city

Top of page  WHERE clause and GROUP BY

You can use a WHERE clause in a statement with GROUP BY. The WHERE clause is evaluated before the GROUP BY clause. Rows that do not satisfy the conditions in the WHERE clause are eliminated before any grouping is done. Here is an example:

SELECT  name, AVG(unit_price)
FROM product
WHERE id > 400
GROUP BY name

Only the rows with id values of more than 400 are included in the groups that are used to produce the query results.

Top of page  An example

The following query illustrates the use of WHERE, GROUP BY, and HAVING clauses in one query:

SELECT name, SUM(quantity)
FROM product
WHERE name LIKE '%shirt%'
GROUP BY name
HAVING SUM(quantity) > 100

name

SUM(quantity)

Tee Shirt

157

In this example:

  1. The WHERE clause includes only rows that have a name including the word shirt (Tee Shirt, Sweatshirt).

  2. The GROUP BY clause collects the rows with a common name.

  3. The SUM aggregate calculates the total number of products available for each group.

  4. The HAVING clause excludes from the final results the groups whose inventory totals do not exceed 100.

Top of page  

Collection Contents Index The GROUP BY clause: organizing query results into groups The HAVING clause: selecting groups of data pdf/chap5.pdf