User's Guide
PART 4. Database Administration and Advanced Use
CHAPTER 20. Backup and Data Recovery
The following logs protect your data from system and media failure.
The checkpoint log.
The rollback log.
The transaction log.
All of these logs play a role in data recovery. Each database running on a server has its own checkpoint log, rollback log, and transaction log. Optionally, you can also maintain a mirror of the transaction log for greater protection of vital data.
The checkpoint log is used during database recovery after a system failure or improper shutdown of the server. The checkpoint log is stored in the database file.
A database file is composed of pages. Before a page is updated (made dirty), a copy of the original is always made. The copied pages are the checkpoint log.
Dirty pages are not written immediately to the database file on disk. For improved performance, they are cached in memory and only written to the database file when the cache is full or the server has no pending requests. A checkpoint is a point at which all dirty pages are written to disk. Following a checkpoint, the checkpoint log is deleted.
A checkpoint can occur for several reasons:
The server is shut down
The amount of time since the last checkpoint exceeds the database option CHECKPOINT_TIME
The estimated time to do a recovery operation exceeds the database option RECOVERY_TIME
The server is idle long enough to write all dirty pages
A transaction issues a CHECKPOINT statement
The server is running without a transaction log, and a transaction is committed
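The copy-on-write behavior of the checkpoint log can be sketched as follows. This is a minimal illustration of the mechanism described above, not the server's actual implementation; the class and method names are invented for the example.

```python
# Illustrative sketch: before a page is modified, a copy of the original
# is saved (the checkpoint log). A checkpoint flushes all dirty pages to
# the database file and deletes the checkpoint log; recovery restores
# pages to their state as of the last checkpoint.

class Database:
    def __init__(self, pages):
        self.disk = dict(pages)        # page id -> contents in the database file
        self.cache = {}                # dirty pages held in memory
        self.checkpoint_log = {}       # pre-image copies of changed pages

    def update_page(self, page_id, new_contents):
        if page_id not in self.checkpoint_log:
            # copy the original page before its first change
            self.checkpoint_log[page_id] = self.disk[page_id]
        self.cache[page_id] = new_contents   # page is now dirty

    def checkpoint(self):
        # write every dirty page to the database file, then delete the log
        self.disk.update(self.cache)
        self.cache.clear()
        self.checkpoint_log.clear()

    def recover(self):
        # after a failure: discard dirty pages, restore the pre-images
        self.cache.clear()
        self.disk.update(self.checkpoint_log)
        self.checkpoint_log.clear()
```

For example, updating page 1 saves its original contents in the checkpoint log; calling recover() before any checkpoint puts the original contents back on disk.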
The priority of writing dirty pages to the disk increases as the time and the amount of work since the last checkpoint grow. The priority is determined by the following factors:
Checkpoint Urgency The time that has elapsed since the last checkpoint, as a percentage of the checkpoint time setting of the database. The server -gc command-line option controls the maximum desired time, in minutes, between checkpoints. You can also set the desired time using the CHECKPOINT_TIME option.
Recovery Urgency A heuristic to estimate the amount of time required to recover the database if it fails right now. The server -gu command-line option controls the maximum desired time, in minutes, for recovery in the event of system failure. You can also set the desired time using the RECOVERY_TIME option.
The checkpoint and recovery urgencies are important only if the server does not have enough idle time to write dirty pages.
For a description of the command-line options, see The database server.
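The two urgency measures defined above can be expressed as simple percentages. The functions below are an illustrative sketch of those definitions, not the server's internal calculation:

```python
# Checkpoint urgency: elapsed time since the last checkpoint as a
# percentage of the CHECKPOINT_TIME setting (capped at 100 percent).
def checkpoint_urgency(minutes_since_checkpoint, checkpoint_time_minutes):
    return min(100.0, 100.0 * minutes_since_checkpoint / checkpoint_time_minutes)

# Recovery urgency: the estimated recovery time as a percentage of the
# RECOVERY_TIME setting (capped at 100 percent).
def recovery_urgency(estimated_recovery_minutes, recovery_time_minutes):
    return min(100.0, 100.0 * estimated_recovery_minutes / recovery_time_minutes)
```

For example, 30 minutes after a checkpoint with CHECKPOINT_TIME set to 60 minutes, the checkpoint urgency is 50 percent.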
Optional information
The writing of dirty pages to disk is carried out by a task within the server called the idle I/O task. This task shares processing time with other database tasks, according to a priority. The lower the priority of the idle I/O task, the less time it gets.
There is a threshold for the number of dirty pages, below which writing of database pages does not take place.
When the database is busy, the urgency is low, and the cache only has a few dirty pages, the idle I/O task runs at a very low priority and no writing of dirty pages takes place.
Once the urgency exceeds 30%, the priority of the idle I/O task is increased. At intervals, the priority is increased again. As the urgency becomes high, the engine shifts its primary focus to writing dirty pages until the number gets below the threshold again. However, the engine only writes out pages during the idle I/O task if the number of dirty pages is greater than the threshold.
If, because of other activity in the database, the number of dirty pages falls to zero, and if the urgency is 50% or more, then a checkpoint takes place automatically, since it is a convenient time.
Both the checkpoint urgency and the recovery urgency increase until a checkpoint occurs, at which point they drop to zero. They do not decrease otherwise.
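The decision rules described in this note can be summarized in a single function. The priority levels, the 30 and 50 percent cut-offs, and the dirty-page threshold follow the text above, but the function itself is an illustrative sketch, not server code:

```python
# Sketch of the idle I/O task's decision at a given moment, based on the
# current urgency (the greater of checkpoint and recovery urgency), the
# number of dirty pages in the cache, and the dirty-page threshold.
def idle_io_action(urgency_pct, dirty_pages, dirty_threshold):
    if dirty_pages == 0 and urgency_pct >= 50:
        # a convenient time: all dirty pages are already gone
        return "checkpoint"
    if dirty_pages <= dirty_threshold:
        # below the threshold, no writing of database pages takes place
        return "wait"
    if urgency_pct > 30:
        # above 30 percent, the idle I/O task's priority is increased
        return "write at raised priority"
    return "write at low priority"
```

For example, with an urgency of 20 percent and only a few dirty pages, the task does no writing; once the urgency passes 30 percent and the cache holds more dirty pages than the threshold, writing proceeds at a raised priority.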
As changes are made to the contents of tables, a rollback log is kept for the purpose of canceling changes. It is used to process the ROLLBACK statement and to recover from system failure. There is a separate rollback log for each connection. When a transaction is complete, the rollback log contents for that connection are deleted. The rollback logs are stored in the database file.
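The per-connection rollback log can be sketched as a list of undo entries: each change records the old value, ROLLBACK applies the entries in reverse order, and COMMIT discards them. This is a conceptual illustration with invented names, not the server's log format:

```python
# Each connection keeps its own rollback log of (key, old_value) undo
# entries against a shared table (modeled here as a dict).
class Connection:
    def __init__(self, table):
        self.table = table
        self.rollback_log = []

    def update(self, key, value):
        # record the old value (None if the row did not exist) before changing
        self.rollback_log.append((key, self.table.get(key)))
        self.table[key] = value

    def commit(self):
        # transaction complete: the rollback log contents are deleted
        self.rollback_log.clear()

    def rollback(self):
        # undo the changes in reverse order, then clear the log
        for key, old in reversed(self.rollback_log):
            if old is None:
                self.table.pop(key, None)
            else:
                self.table[key] = old
        self.rollback_log.clear()
```

Rolling back a transaction that updated one row and inserted another restores the table exactly as it was when the transaction began.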
All changes to the database are stored in the transaction log in the order that they occur. Inserts, updates, deletes, commits, rollbacks, and database schema changes are all logged. The transaction log is also called a forward log file.
The transaction log is stored in a separate file.
The transaction log is optional. If you run with no transaction log, a checkpoint is carried out whenever a transaction is committed. The checkpoint ensures that all committed transactions are written to disk. Checkpoints can be time-consuming, so you should run with a transaction log for improved performance as well as for protection against media failure and corrupted databases.
For greater protection, you can maintain two identical transaction logs in tandem. This is called transaction log mirroring.
For information on creating a database with a mirrored transaction log, see The Initialization utility. For information on changing an existing database to use a mirrored transaction log, see The Transaction Log utility.
The transaction log is not kept in the main database file. The filename of the transaction log can be set when the database is initialized, or at any other time when the server is not running, using the Log utility.
To protect against media failure, the transaction log should be written to a different device than the database file. Some machines that appear to have two or more hard drives actually have a single physical disk divided into several logical drives or partitions. If you want protection against media failure, make sure that you have a machine with two physical storage devices, or use a storage device on a network file server.
By default, the transaction log is put on the same device and in the same directory as the database—this does not protect against media failure.
Performance tip
Updates and deletes on tables that have no primary key or unique index cause the entire contents of the affected rows to be entered in the transaction log. If a primary key is defined, the engine needs to record only the primary key column values to uniquely identify a row. If the table contains many columns, or wide columns, and no primary key is defined, the transaction log pages fill up much faster, reducing performance, and running dbtran on the transaction log produces a very large command file.
This affects updates and deletes only; inserts must always log all column values.
If a primary key does not exist, the engine looks for a UNIQUE NOT NULL index on the table (or a UNIQUE constraint). A UNIQUE index that allows NULL values is not sufficient.
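The effect of a primary key on the size of an update log record can be sketched as follows. This is an illustration of the principle only; the function and the log record layout are invented for the example and do not reflect the server's actual log format:

```python
# With a primary key, only the key column values are needed to identify
# the row being updated; without one, the entire old row is logged.
def update_log_record(row, changed_columns, pk_columns=None):
    if pk_columns:
        identifier = {c: row[c] for c in pk_columns}   # key values only
    else:
        identifier = dict(row)                         # whole old row
    return {"identify": identifier, "set": changed_columns}

# A row with one wide column makes the difference obvious.
row = {"id": 7, "name": "Ann", "notes": "x" * 500}
with_pk = update_log_record(row, {"name": "Anne"}, pk_columns=["id"])
without_pk = update_log_record(row, {"name": "Anne"})
```

The record produced without a primary key carries the 500-character notes column along with everything else, while the record with a primary key identifies the row by its id value alone.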
The transaction log is not human-readable. The Log Translation utility (dbtran) can be used to convert a transaction log into a SQL command file, which can serve as an audit trail of changes made to the database. The following command uses dbtran to convert a transaction log:
dbtran sample.log changes.sql
You can also convert a transaction log to a SQL command file from Sybase Central.
For more information on the log translation utility, see The Log Translation utility.
The transaction log contains a record of everything, including transactions that were never committed. By converting the transaction log to a SQL command file and choosing the option to include uncommitted transactions (for example, by running dbtran with the -a switch), you can recover transactions that were accidentally canceled by a user. If the -a option is not chosen, the log translation utility omits transactions that were rolled back. While this is not a common procedure, it can prove useful in exceptional cases.