User's Guide
PART 4. Database Administration and Advanced Use
CHAPTER 25. Query Optimization
Storage allocations for each table or entry have a large impact on the efficiency of queries. The following points are of particular importance because each influence how fast your queries execute.
Every new row that is smaller than the page size of the database file will always be stored on a single page. If no present page has enough free space for the new row, Anywhere will write the row to a new page. For example, if the new row requires 600 bytes of space but only 500 bytes are available on a partially filled page, then Anywhere will place the row on a new page at the end of the table.
The engine locates space on pages and inserts rows in the order that it receives them. It assigns each to a page, but the locations it chooses in the table may not correspond to the order they were inserted. For example, the engine may have to start a new page in order to store a long row contiguously. Should the next row be short, it may fit in an empty location on a previous page.
The rows of all tables are unordered. If the order that you receive or process the rows is important, use an ORDER BY clause in your SELECT statement to apply an ordering to the result. Applications that rely on the order of rows in a table can fail without warning.
If you will frequently require the rows of table in a particular order, consider creating an index on those columns. Anywhere always tries to take advantage of indices when processing queries.
Whenever Anywhere inserts a row, it reserves only the space necessary to show the row with the values it contains at the time of creation. It reserves no space to store values which are NULL. It reserves no extra space to accommodate fields, such as text strings, which may enlarge.
Once assigned a home position on a page, a row is never moved. If an update changes any of the values in the row so that it will no longer fit in its assigned location, then the row is split and the extra information is inserted on another page.
This characteristic deserves special attention, especially since Anywhere allots no extra space at the time the row is inserted. For example, suppose you insert a large number of empty rows into a table, then fill in the values, one column at a time, using update statements. The result would be that almost every value in a single row will be stored on a separate page. To retrieve all the values from one row, the engine may need to read several disk pages. This simple operation would become extremely and unnecessarily slow.
You should consider filling new rows with data in the time of insertion. Once inserted, they will then have sufficient room for the data that you expect them to hold.
As you insert and delete rows from the database, the space they occupy is automatically reused. Thus, Anywhere may insert a row into space formerly occupied by another row.
Anywhere keeps a record of the amount of empty space on each page. When you ask it to insert a new row, it first searches its record of space on existing pages. If it finds enough space on an existing page, it places the new row on that page, reorganizing the contents of the page if necessary. If not, it starts a new page.
Over time, however, if a number of rows are deleted and no new rows small enough to use the empty space are inserted, the information in the database may become sparse. No utility exists to defragment the database file, as moving even one row might involve updating numerous index entries.
Since Anywhere automatically reuses empty space, the presence of these empty slots rarely affects performance. If necessary, you can reduce disk fragmentation by unloading, then reloading, the database.
Reloading also accomplishes another task. Since you are likely to reload each table in the order you frequently search them, the order the rows are stored in pages during the reload is likely to correspond closely to your preferred order. Hence, it is possible that this operation will improve database performance, much as a defragmentation utility improves disk performance by grouping all the pieces of each file together on the surface of the disk.
The page size you choose for your database can affect both the performance of your database. In general, smaller page sizes are likely to benefit operations that retrieve relatively small rows from random locations.
By contrast, larger pages tend to benefit queries that perform sequential scans, particularly when the rows are stored on pages in close to the order that the rows are retrieved via an index. In this situation, reading one page of memory to obtain the values of one row may have the side effect of loading the contents of the next few rows into memory. Often, the physical design of disks permits them to retrieve few large blocks more efficiently than more small ones.
Should you choose a larger page size, such as 4 kb pages, you may wish to increase the size of the cache. Fewer large pages can fit into the same space; for example, 1 Mb of memory can hold 1000 pages that are each 1 kb in size, but only 250 pages that are 4 kb in size. How many pages is enough depends entirely on your database and the nature of the queries your application performs. You can conduct performance tests with various cache sizes. If your cache can not hold enough pages, performance will suffer as Anywhere begins swapping frequently-used pages to disk.
Anywhere attempts to fill pages as much as possible. Empty space accumulates only when new objects are too large to fit empty space on existing pages. Consequently, adjusting the page size may not significantly affect the overall size of your database.