Collection Contents Index Nested subqueries CHAPTER 8.  Adding, Changing, and Deleting Data pdf/chap7.pdf

User's Guide
   PART 1. Working with Databases
     CHAPTER 7. Using Subqueries       

How subqueries work


Understanding which queries are valid and which ones aren't can be complicated when a query contains a subquery. Similarly, figuring out what a multi-level query does can also be very involved, and it helps to understand how SQL processes subqueries. For general information about how queries are processed, see .

Top of page  Correlated subqueries

In a simple query, SQL evaluates and processes the query's WHERE clause once for each row of the query. Sometimes, though, the subquery returns only one result, making it unnecessary for SQL to evaluate it more than once for the entire result set.

Uncorrelated subqueries 

Consider this query:

SELECT name, description 
FROM product 
WHERE quantity <  2 * (
   SELECT avg (quantity) 
   FROM sales_order_items)

In this example, the subquery calculates exactly one value: the average quantity from the sales_order_items table. In evaluating the query, SQL computes this value once, and compares each value in the quantity field of the product table to it to determine whether the corresponding row is selected.

Correlated subqueries 

When a subquery contains an outer reference, this shortcut can no longer be used. For instance, the subquery in the query

SELECT name, description 
FROM product 
WHERE quantity < 2 * (
   SELECT avg (quantity) 
   FROM sales_order_items 
   WHERE product.id=sales_order_items.prod_id)

returns a value that is dependent upon the active row in the product table. Such subqueries are called correlated subqueries. In these cases, there the subquery might return a different value for each row of the outer query, making it necessary for SQL to perform more than one evaluation.

Top of page  Converting subqueries in the WHERE clause to joins

In general, a query that uses joins is executed faster than is a multi-level query. For this reason, whenever possible, Adaptive Server Anywhere's query optimizer will convert a multi-level query to a query that uses joins. The conversion is carried out without any user action. This section describes which subqueries can be converted to joins so that you can understand the performace of queries in your database.

Example 

The question "When did Mrs. Clarke and Suresh place their orders, and by which sales representatives?" can be written as a two-level query:

SELECT order_date, sales_rep 
FROM sales_order 
WHERE cust_id IN (
   SELECT id 
   FROM customer 
   WHERE lname = 'Clarke' OR fname = 'Suresh')

An alternate, and equally correct way to write the query uses joins:

SELECT fname, lname, order_date, sales_rep 
FROM sales_order, customer 
WHERE cust_id=customer.id AND (lname = 'Clarke' OR fname = 'Suresh')

The critieria that must be satisfied in order for a multi-level query to be able to be rewritten with joins differ for the various types of operators. Recall that when a subquery appears in the query's WHERE clause, it is of the form

SELECT select-list

FROM table

WHERE [NOT] expression comparison-operator (subquery) |

[NOT] expression comparison-operator ANY / SOME (subquery) |

[NOT] expression comparison-operator ALL (subquery) |

[NOT] expression IN (subquery) |

[NOT] EXISTS (subquery)

GROUP BY group-by-expression

HAVING search-condition

Whether a subquery can be converted to a join depends on a number of factors, such as the type of operator and the structure of the query.

Top of page  Comparison operators

A subquery that follows a comparison operator (=, <>, <, <=, >, >=) must satisfy certain conditions if it is to be converted into a join. Subqueries that follow comparison operators in general are valid only if they return exactly one value for each row of the main query. In addition to this criterion, a subquery is converted to a join only if

Example 

Suppose the request "When were Suresh's products ordered, and by which sales representative?" were phrased as the subquery

SELECT order_date, sales_rep 
FROM sales_order 
WHERE cust_id = (
   SELECT id 
   FROM customer 
   WHERE fname = 'Suresh')

This query satisfies the criteria, and therefore, it would be converted to a query that uses a join:

SELECT order_date, sales_rep 
FROM sales_order 
WHERE cust_id=customer.id AND (lname = 'Clarke' OR fname = 'Suresh')

However, the request, "Find the products whose in-stock quantities are less than double the average ordered quantity" cannot be converted to a join, as the subquery contains the aggregate function AVG:

SELECT name, description 
FROM product 
WHERE quantity <  2 * (
   SELECT avg (quantity) 
   FROM sales_order_items)

Top of page  Quantified comparison test

A subquery that follows one of the keywords ALL, ANY and SOME is converted into a join only if it satisfies certain criteria.

The first four of these conditions are relatively straightforward.

Example 

The request "When did Mrs. Clarke and Suresh place their orders, and by which sales representatives?" can be handled in subquery form:

SELECT order_date, sales_rep 
FROM sales_order 
WHERE cust_id = ANY (
   SELECT id 
   FROM customer 
   WHERE lname = 'Clarke' OR fname = 'Suresh')

Alternately, it can be phrased in join form

SELECT fname, lname, order_date, sales_rep 
FROM sales_order, customer 
WHERE cust_id=customer.id AND (lname = 'Clarke' OR fname = 'Suresh')

However, request, "When did Mrs. Clarke, Suresh, and any employee who is also a customer, place their orders?" would be phrased as a union query, and thus cannot be converted to a join:

SELECT order_date, sales_rep 
FROM sales_order 
WHERE cust_id = ANY (
   SELECT id 
   FROM customer 
   WHERE lname = 'Clarke' OR fname = 'Suresh'
   UNION 
   SELECT id
   FROM employee)

Similarly, the request "Find the customer and product ID's of those products that were not ordered after all orders of product #2001 were shipped," is naturally expressed with a subquery

A subquery with the ALL operator that can be converted to a join 

SELECT id, cust_id 
FROM sales_order 
WHERE NOT order_date > ALL (
   SELECT ship_date 
   FROM sales_order_items 
   WHERE id=2001)

It would be converted to the join:

SELECT sales_order.id, cust_id
FROM sales_order, sales_order_items
WHERE (sales_order_items.id=2001) and (order_date <= ship_date)

However, the request "Find the customer and product ID's of those products that were not shipped after the first shipping dates of all the products" would be phrased as the aggregate query

SELECT id, cust_id 
FROM sales_order 
WHERE NOT order_date > ALL (
   SELECT first (ship_date)
   FROM sales_order_items )

Therefore, it would not be converted to a join.

Negating subqueries with the ANY and ALL operators 

The fifth criterion is a little more puzzling: queries of the form

SELECT select-list

FROM table

WHERE [NOT] expression comparison-operator ALL (subquery)

are converted to joins, as are queries of the form

SELECT select-list

FROM table
WHERE expression comparison-operator ANY (subquery)

but the queries

SELECT select-list

FROM table
WHERE expression comparison-operator ALL (subquery)

and

SELECT select-list

FROM table

WHERE NOT expression comparison-operator ANY (subquery)

are not.

Logical equivalence of ANY and ALL expressions 

This is because the first two queries are in fact equivalent, as are the last two. Recall that the any operator analogous to the or operator, but with a variable number of arguments; and that the ALL operator is similarly analogous to the AND operator. Just as the expression

NOT ((> A) AND (> B))

is equivalent to the expression

(<= A) OR (<= B)

the expression

NOT order_date > ALL (
   SELECT first (ship_date) 
   FROM sales_order_items )

is equivalent to the expression

order_date <= ANY (
   SELECT first (ship_date)
   FROM sales_order_items )

Negating the ANY and ALL expressions 

In general, the expression

NOT column-name operator ANY (subquery)

is equivalent to the expression

column-name inverse-operator ALL (subquery)

and the expression

NOT column-name operator ALL (subquery)

is equivalent to the expression

column-name inverse-operator ANY (subquery)

where inverse-operator is obtained by negating operator, as shown in the table:

Table of operators and their inverses 

Operator

inverse-operator

=

<>

<

=>

>

=<

=<

>

=>

<

<>

=

Top of page  Set membership test

A query containing a subquery that follows the keyword IN is converted into a join only if:

Example 

So, the request "Find the names of the employees who are also department heads", which is expressed by the query

SELECT emp_fname, emp_lname
FROM employee 
WHERE emp_id IN (
   SELECT dept_head_id 
   FROM department 
   WHERE (dept_name='Finance' or dept_name = 'Shipping'))

would be converted to a joined query, as it satisfies the conditions; however, the request, "Find the names of the employees who are also either department heads or customers" would not be converted to a join if it were expressed by the UNION query

A UNION query following the IN operator can't be converted 

SELECT emp_fname, emp_lname
FROM employee 
WHERE emp_id IN (
   SELECT dept_head_id 
   FROM department 
   WHERE (dept_name='Finance' or dept_name = 'Shipping')
   UNION
   SELECT cust_id
   FROM sales_order)

Similarly, the request "Find the names of employees who are not department heads" is formualted as the negated subquery

SELECT emp_fname, emp_lname
FROM employee 
WHERE NOT emp_id IN (
   SELECT dept_head_id 
   FROM department 
   WHERE (dept_name='Finance' OR dept_name = 'Shipping'))

and would not be converted.

The conditions that must be fulfilled in order for a subquery that follows the must be fulfilled in order for a subquery that follows the must be fulfilled in order for a subquery that follows the IN keyword to be converted to a join are identical to those that must be fulfilled in order for a subquery that follows the ANY keyword to be converted. This is not a coincidence, and the reason for this is that the expression

A query with an IN operator can be converted to one with an ANY operator 

WHERE column-name IN (subquery)

is logically equivalent to the expression

WHERE column-name = ANY (subquery)

So the query

SELECT emp_fname, emp_lname
FROM employee 
WHERE emp_id IN (
   SELECT dept_head_id 
   FROM department 
   WHERE (dept_name='Finance' or dept_name = 'Shipping'))

is equivalent to the query

SELECT emp_fname, emp_lname
FROM employee 
WHERE emp_id = ANY (
   SELECT dept_head_id 
   FROM department 
   WHERE (dept_name='Finance' or dept_name = 'Shipping'))

Conceptually, SQL converts a query with the IN operator to one with an ANY operator, and decides accordingly whether to convert the subquery to a join.

Top of page  Existence test

A subquery that follows the keyword EXISTS is converted to a join only if it satisfies the following two conditions:

Example 

Therefore, the request, "Which customers placed orders after July 13, 1994?", which can be formulated by this query whose non-negated subquery subquery contains the outer reference customer.id = sales_order.cust_id, could be converted to a join.

SELECT fname, lname
FROM customer
WHERE EXISTS (
   SELECT * 
   FROM sales_order 
   WHERE (order_date > '1994-07-13') AND (customer.id = sales_order.cust_id))

The EXISTS keyword essentially tells SQL to check for NULLs. When inner joins are used, SQL automatically only displays the rows in which there is data from all of the tables in the FROM clause; that is, SQL only returns the rows that do not contain NULLs. So this query returns the same rows as does the one with the subquery:

SELECT fname, lname
FROM customer, sales_order
WHERE (sales_order.order_date > '1994-07-13') AND (customer.id = sales_order.cust_id).
Top of page  

Collection Contents Index Nested subqueries CHAPTER 8.  Adding, Changing, and Deleting Data pdf/chap7.pdf