User's Guide
PART 4. Database Administration and Advanced Use
CHAPTER 24. Monitoring and Improving Performance
Adaptive Server Anywhere has an optimizer that attempts to pick the best strategy for executing each query. The best strategy is the one that gets the results in the shortest period of time. The optimizer determines the cost of each strategy by estimating the number of disk reads and writes required. The strategy with the lowest cost is chosen.
The optimizer must decide which order to access the tables in a query, and whether or not to use an index for each table. If a query joins N tables, there are N factorial possible ways to access the tables. The optimizer will estimate the cost of executing the query in the different ways and use the ordering with the lowest cost estimate. The query execution plan in the Interactive SQL statistics window shows the table ordering for the current query and indicates in parentheses the index that was used for each table.
This section provides an introduction to the optimizer. For more information, see Query Optimization.
The optimizer uses heuristics (educated guesses) to help decide the best strategy.
For each table in a potential execution plan, the optimizer must estimate the number of rows that will be part of the results. The number of rows will depend on the size of the table and the restrictions in the WHERE clause or the ON clause of the query.
In many cases, the optimizer uses more sophisticated heuristics. For example, a default estimate for equality is only used in cases where no better statistics are available.
The optimizer makes use of indexes and keys to improve its guess of the number of rows. Here are a few single-column examples:
Equating a column to a value: estimate one row when the column has a unique index or is the primary key.
A comparison of an indexed column to a constant: use the index to estimate the percentage of rows that will satisfy the comparison.
Equating a foreign key to a primary key (key join): use relative table sizes in determining an estimate. For example, if a 5000 row table has a foreign key to a 1000 row table, the optimizer guesses that there are five foreign rows for each primary row.
One of the most common constraints in a query is equality with a column value. For example,
SELECT * FROM employee WHERE sex = 'f'
tests for equality of the sex column. For this type of constraint, the Adaptive Server Anywhere optimizer learns from experience. A query will not always be optimized the same way the second time it is executed. The estimate for an equality constraint will be modified for columns that have an unusual distribution of values. This information is stored permanently in the database. If needed, the statistics can be deleted with the DROP OPTIMIZER STATISTICS command.
Since the query optimizer is guessing at the number of rows in a result based on the size of tables and particular restrictions used in the WHERE clause, it almost always makes inexact guesses. In many cases, the guess that the query optimizer makes is close enough to the real number of rows that the optimizer will have chosen the best search strategy. However, in some cases this does not occur.
The following query displays a list of order items that shipped later than the end of June, 1994:
SELECT ship_date FROM sales_order_items WHERE ship_date > '1994/06/30' ORDER BY ship_date DESC
The estimated number of rows is 274. However, the actual number of rows returned is only 12. This estimate is wrong because the query optimizer guesses that a test for greater than will succeed 25 percent of the time. In this example, the condition on the ship_date column:
ship_date > '1994/06/30'
is assumed to choose 25 percent of rows in the sales_order_items table.
If you know that a condition has a success rate that differs from the optimizer rule, you can tell the database this information by using an estimate. An estimate is formed by enclosing in brackets the expression followed by a comma and a number. The number represents the percentage of rows that the expression is estimated to select. In this case, you could estimate a success rate of one percent:
SELECT ship_date FROM sales_order_items WHERE ( ship_date > '1994/06/30', 1 ) ORDER BY ship_date DESC
With this estimate, the optimizer estimates ten rows in the query.
Incorrect estimates are only a problem if they lead to poorly optimized queries.
For further information about the optimizer and query optimization, see Query Optimization.