
Chapter 4

A Multiprocessing Kernel


Overview

When Sybase set out to create its first relational database management system, one of the principal design goals was to efficiently use operating system and hardware resources. It was clear that the use of a threaded operating system would be a big step towards realizing the goal.

Few operating systems supported threading at that time, so Sybase designed its own kernel (a collection of programs that handles resource management for the system). The SQL Server kernel takes care of all the DBMS requirements that the operating system would otherwise have to supply. Its responsibilities include creating and scheduling tasks, performing disk and network I/O, and managing memory.

Performing this work in the database kernel eliminates nearly all of the inefficient operating system overhead typically associated with many relational database management systems. It also results in very low memory requirements, achieving the goal of efficient hardware use.

This chapter describes SQL Server's multithreaded kernel, covering the kernel architecture, the cost-based optimizer, disk I/O, and network features.

Kernel Architecture

A SQL Server process consists of two components, the kernel and the DBMS. The kernel manages the tasks associated with SQL Server's clients; for example, it keeps track of each task's state, which is necessary for context switching in a multithreaded environment. The DBMS component performs query processing, resource management, and access methods, as described in Chapter 3.

The kernel provides a running environment for the DBMS. As illustrated in Figure 4-1, it creates and manages DBMS tasks, handles disk and network I/O, provides time of day, obtains memory resources from the operating system, and insulates the DBMS, providing operating system and hardware independence.

Figure 4-1: A functional view of SQL Server architecture

The few operating system services that SQL Server does require include memory allocation and low-level access to disk and network devices.

Server Structures

The kernel manages server structures: programming language data structures that hold the constructs SQL Server uses for system management. Server structures include all services needed for user connections, such as stacks and status structures; lock structures; many of the configurable aspects of SQL Server behavior; structures that keep track of database object states; and structures that track devices.

The kernel allocates data structures from a pool of shared memory that it obtains from the operating system. Server structures shared between DBMS tasks in a multiprocessor environment require synchronization so that concurrent tasks do not corrupt the data they share. The kernel uses spinlocks and semaphores (introduced in Chapter 3) to ensure mutual exclusion between DBMS tasks and engines. When a task needs a protected structure, it must first acquire the spinlock that guards it; if another task already holds the spinlock, the requester spins, repeatedly retesting the lock, until the holder releases it.
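Spinlock behavior is internal to the kernel, but you can observe its effects with sp_sysmon, System 11's performance monitoring procedure. A minimal sketch (the one-minute sample interval is arbitrary):

sp_sysmon "00:01:00"
go

Among many other statistics, the resulting report includes a spinlock contention figure for each data cache.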

Cost-Based Optimizer

SQL Server uses a cost-based optimizer to determine the best execution plan for a query. It is called a "cost-based" optimizer because it makes intelligent decisions about the fastest way to perform a query based on estimated time cost. A syntax-based optimizer, by contrast, has no knowledge of the actual data or the size of the tables; SQL Server's optimizer considers both.

Sybase was one of the first to feature time-cost optimization in its relational database products. Our developers are continually improving this feature with analysis of real-world customer installations.

The optimizer examines parsed and normalized queries and information about database objects. The input to the optimizer is a parsed SQL query; the output from the optimizer is a query plan. The query plan is the ordered set of steps required to carry out the query, including the methods (table scan, index choice, and so on) to access each table. A query plan is compiled code ready to run.
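You can examine the plan the optimizer produces by turning on the Transact-SQL showplan option. A quick sketch against the titles table used in the examples later in this chapter (the title_id value is hypothetical):

set showplan on
go
select title, pubdate
from titles
where title_id = "BU1032"
go

The showplan output describes each step of the plan, including whether SQL Server accesses titles with a table scan or an index.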

Optimizer Use of Cache

Sybase's fully integrated query optimizer evaluates the type of data being accessed and selects the best I/O size for each object. It uses the buffer sizes available in the named cache to which that object is bound, automatically optimizing I/O for each application.

The optimizer also determines when a query will access a large number of sequential pages in a table and dictates the use of sequential prefetch in these cases. The ability to choose optimal I/O size combined with asynchronous prefetches of large amounts of data gives quick results for decision-support applications, while preserving the high performance and throughput of OLTP applications running on the same system. For the first time in an open RDBMS, a single instance of the database can support mixed workloads without compromise. Also see ``Tunable Block I/O''.

When a data cache is configured to allow large I/Os, the optimizer can choose to prefetch data pages whenever that will be faster. When several pages are read into cache with a single I/O, they are treated as a unit: they age in cache together, and if any page in the unit has been changed, all of the pages are written to disk as a unit. Because much of the time required to perform an I/O operation is taken up in seeking and positioning, reading eight pages with a single 16K I/O is nearly eight times as fast as reading them with eight separate 2K I/Os.

Stored Procedures and Triggers

Stored procedures and triggers are optimized when the stored procedure or trigger object is first executed and the query plan is stored in the procedure cache. If other users execute the same procedure while the plan resides in the procedure cache, they'll receive a performance benefit because the compiled query plan is already in cache and need not be recompiled.
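For example, the first execution of the hypothetical procedure below compiles its query plan and places the plan in the procedure cache; subsequent executions reuse it:

create procedure show_title @id varchar(6)
as
    select title, pubdate
    from titles
    where title_id = @id
go

exec show_title "BU1032"
go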

Updates

SQL Server 11 implements complex query optimization that minimizes the number of logical I/Os and streamlines update operations. The optimizer handles updates in different ways depending on the changes being made to the data and the indexes used to locate the rows.

The two major types of updates are direct updates and deferred updates. The optimizer performs direct updates whenever possible because they require less overhead than deferred updates and are generally faster. Direct updates require fewer log scans, less logging, and fewer traversals of index B-trees (which reduces lock contention), and they save I/O because SQL Server does not have to refetch pages to apply modifications recorded in log records.

Direct Updates

SQL Server performs direct updates in three ways: in-place updates, cheap direct updates, and expensive direct updates.

In-Place Updates

SQL Server performs in-place updates whenever possible. In an in-place update, subsequent rows on the page do not move; the row IDs remain the same and the pointers in the row offset table do not change. Figure 4-2 shows an in-place update. The pubdate column is fixed length, so the length of the data row does not change. The access method in this example could be a table scan or a clustered or nonclustered index on title_id.

Figure 4-2: In-place update

An in-place update is the fastest type of update because SQL Server makes a single change to the data page. If any index keys change, SQL Server updates the affected index entries by deleting the old index rows and inserting the new ones. Only indexes whose keys change are affected, since the page and row locations do not change.
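A sketch of a statement that can qualify for an in-place update: pubdate is fixed length, so the new value cannot change the row's length (the values are hypothetical, and the other in-place criteria must also be met):

update titles
set pubdate = "Jun 1 1995"
where title_id = "BU1032"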

Cheap Direct Updates

If SQL Server cannot perform the update in place, it tries to perform a cheap direct update: it changes the row and rewrites it at the same offset on the page. Subsequent rows on the page move up or down so that the data remains contiguous on the page, but the row IDs remain the same. The pointers in the row offset table change to reflect the new locations.

Cheap direct updates are almost as fast as in-place updates. They require the same amount of I/O, but more processing. SQL Server makes two changes to the data page (the row and the offset table). It updates any changed index keys by deleting old values and inserting new values. This kind of update affects only indexes whose keys change, because the page and row ID do not change.
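A sketch of a statement that can qualify for a cheap direct update, assuming title is a variable-length column with no clustered index on it: the new value changes the row's length, so the row is rewritten at the same offset and its neighbors shift, provided the row still fits on the page (the values are hypothetical):

update titles
set title = "Designing Portable Software, 2nd Edition"
where title_id = "BU1032"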

Expensive Direct Updates

If the data does not fit on the same page, SQL Server performs an expensive direct update, if possible. An expensive direct update deletes the data row, including all index entries, and then inserts the modified row and index entries. SQL Server uses a table scan or index to find the row in its original location and then deletes the row. If the table has a clustered index, SQL Server uses the index to determine the new location for the row; otherwise, SQL Server inserts the new row at the end of the heap.

Expensive direct updates are the slowest type of direct update. The delete is performed on one data page and the insert is performed on a different data page. All index entries must be updated, since the row location changes.

Deferred Updates

SQL Server uses deferred updates when it cannot make a direct update. Deferred updates are the slowest type of update; they incur more overhead than direct updates because they require re-reading the transaction log to make the final changes to the data and indexes, which involves additional traversal of the index trees. For example, if there is a clustered index on title, the following query performs a deferred update, because it changes the very index key used to locate the rows:

update titles set title = "Portable C Software"
where title = "Designing Portable Software"

Other Factors that Affect the Query Plan

Sybase's optimizer estimates the costs of possible query strategies for a given query, taking a number of factors into account. Among the most important is disk I/O, discussed next.

Disk I/O

The disk I/O operations that take place when SQL Server writes data to disk or reads data from disk have a much bigger impact on system performance than any other operation, because each I/O operation includes not only the physical disk access but also the CPU and operating system time needed to process the request. I/O is therefore a potential bottleneck in relational database management systems running online applications.

Figure 4-3: Reads and writes to disk

The database management system should minimize the number of actual physical reads and writes and amortize the cost of I/O over many users. SQL Server's architecture is designed to maximize I/O performance in several ways, as the rest of this section discusses.

Tunable Block I/O

Virtually the same fixed overhead is associated with every I/O request, regardless of its size: most of the time an I/O operation consumes is spent queueing the request, seeking on disk, and physically positioning the head. Transferring more data with fewer requests therefore reduces overhead and improves throughput. Large I/O can greatly improve performance for some types of queries, such as those that scan entire tables or ranges of tables.

Large I/O for Data Caches

SQL Server 11 can perform I/O requests in various block sizes ranging from 2K to 16K for data and log pages. Each object in the database can be retrieved using different block sizes. It may, for example, be best to fetch text/image data using an 8K block size and data records using a 4K block size. SQL Server's ability to use different block sizes allows it to make very efficient use of memory. Several pages can be fetched in a single I/O, reducing the overall number of physical I/Os and increasing performance.

To perform large I/O on a table, the cache associated with that table must be configured for large I/O, which is easily done by defining buffer pools of various block sizes within the cache. You can configure the default cache and any named caches you create for large I/O by dividing them into pools of 2K, 4K, 8K, and 16K buffers. The default I/O size is 2K, the size of one SQL Server data page. For queries that can access data sequentially (because the data is stored sequentially), you can read up to eight data pages in a single I/O.

Large I/O can increase performance for queries that read large amounts of data sequentially, such as table scans and range queries.

When a cache is configured for 16K I/O and the optimizer chooses 16K I/O for the query plan, SQL Server reads an entire extent, eight 2K data pages, when it needs to access a page that is not in cache.
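A sketch of enabling large I/O with the standard System 11 configuration procedures; the cache name, sizes, and binding are hypothetical:

-- create a 20 MB named cache (takes effect when the server restarts)
sp_cacheconfig "pubs_cache", "20M"
go
-- carve an 8 MB pool of 16K buffers out of that cache
sp_poolconfig "pubs_cache", "8M", "16K"
go
-- bind the titles table to the cache so the optimizer can use the 16K pool
sp_bindcache "pubs_cache", pubs2, titles
go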

Configurable I/O for the Log

The transaction log can also be bound to its own cache and configured for the most appropriate I/O block size. Using large I/Os for the log in addition to the user log cache can significantly boost OLTP and data warehouse performance. See ``Minimizing Log Contention with the User Log Cache''.
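A sketch, again with hypothetical names and sizes; the exact argument form of sp_logiosize shown here is an assumption:

-- dedicate a log-only named cache with a 4K buffer pool
sp_cacheconfig "log_cache", "4M", "logonly"
go
sp_poolconfig "log_cache", "2M", "4K"
go
-- bind the transaction log of pubs2 to the cache
sp_bindcache "log_cache", pubs2, syslogs
go
-- from inside the database, set the log I/O size to 4K
use pubs2
go
sp_logiosize "4"
go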

Shared Buffers

Shared data buffers and shared procedure buffers are other SQL Server architectural features that can reduce the number of I/O operations. If one task's request brings data into cache, the next task that needs that data gets a performance benefit because it doesn't have to wait for disk access: the data is used straight from cache (if it is still there). Similarly, if one user's call brings a procedure into cache and the same or another user calls the procedure while it is still in cache, the second caller gets a performance benefit.

Keeping Data in Cache

SQL Server writes pages to disk only when it has spare cycles (through the housekeeper task), when buffer space is needed for new records, or when you issue a checkpoint command to write all modified pages in memory to disk. Some other database architectures require that a data page be written to disk immediately after a record is altered in order to guarantee data consistency. SQL Server's write-ahead logging and shared transaction log strategies let it keep frequently accessed pages in cache for performance gains while guaranteeing data consistency at the same time.
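For example, you can force all modified pages of a database to disk on demand (the database name is hypothetical):

use pubs2
go
checkpoint
go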

Network Features

SQL Server's network architecture is based on the ISO OSI Reference Model for the network protocol stack. SQL Server supports the most common network protocols, and you can install additional protocols if necessary. Application software interacts directly with the client APIs, so application developers need not get bogged down in the intricacies of network-level protocols.

In an SMP system, all SQL Server engines are peers; there are no master engines. As of SQL Server 11.0, every engine can perform all tasks, including network tasks. If the SMP system supports network affinity migration, SQL Server distributes network I/O tasks among all the engines, making them truly symmetric. This is a large benefit because it improves workload adaptability and reduces the need for tuning, a big contrast to Oracle and Informix.
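The number of engines is a configuration choice. A sketch using the standard configuration procedure; the value 4 is hypothetical and would normally reflect the number of available CPUs:

sp_configure "max online engines", 4
go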

Multiple Engines Handle Network Connections

``Multiple Network Engines'' introduced System 11's capability for handling more than 10,000 connections using multiple network engines. When a client logs in, SQL Server assigns the client's I/O tasks to the engine that is currently handling the smallest number of network connections. That engine handles all of the client's network operations until the client connection terminates.

In this environment, any SQL Server engine can perform the same set of user and system tasks.

The assigned network engine handles all network-related tasks for the connections it manages.

If the SMP system does not support network affinity migration, only one of the engines, engine 0, handles network management tasks. If there are many user connections, engine 0 can become disproportionately busy compared to the other engines. Also, the operating system could impose limits on the number of concurrent users possible for a single CPU.

Using network affinity migration spreads the network load evenly across all engines, avoids making engine 0 a bottleneck, and raises the practical limit on the number of concurrent user connections.

Transact-SQL and TDS

Tabular Data Stream (TDS) is Sybase's application-level protocol. TDS implements the key components of the presentation, session, and application layers of the OSI Reference Model, handling tasks such as describing the format of result data, returning status and error information, and carrying commands and data between client and server.

Application developers need not consider specifics about target data because TDS provides protocol independence for client applications. TDS handles field types, number of values returned, and other such information.

The practical result of TDS for the application programmer is that clients can also use RPC commands, package commands, bulk copy commands, and message commands, to name a few of the alternatives to Transact-SQL. Competitors do not offer this advantage.


