User's Guide
PART 1. Working with Databases
CHAPTER 11. Importing and Exporting Data
You can import data into your database using one of the following methods:
The LOAD TABLE statement, for efficient import of text files.
The Interactive SQL INPUT statement, for slower but more flexible import of a variety of file formats.
Interactive input using the INSERT Statement or the Interactive SQL INPUT statement.
This section describes each of these methods, and also describes some tips for dealing with incompatible data structure and conversion errors.
The LOAD TABLE statement is used for efficient importing of data from a text file into a database table. The table must exist and have the same number of columns as the input file has fields, defined on compatible data types. In order to use the LOAD TABLE statement, the user must have INSERT permission on the table.
If the department table had all its rows deleted, the following statement would load the data from the file dept.txt into the department table:
LOAD TABLE department FROM 'dept.txt'
The LOAD TABLE statement appends the contents of the file to the existing rows of the table; it does not replace the existing rows in the table. You can use the TRUNCATE TABLE statement to remove all the rows from a table.
Neither the TRUNCATE TABLE statement nor the LOAD TABLE statement fires triggers, including referential integrity actions such as cascaded deletes.
The LOAD TABLE statement has many of the same options as the UNLOAD TABLE statement.
For a description of column delimiters, use of quotes, and file names, see Unloading data using the UNLOAD TABLE statement.
The LOAD TABLE statement has the additional STRIP clause. The default setting (STRIP ON) strips trailing blanks from values before they are inserted. To keep trailing blanks, use the STRIP OFF clause in your LOAD TABLE statement.
For a full description of the LOAD TABLE statement syntax, see LOAD TABLE statement.
Data with the same structure as existing database tables can be loaded into your database from a file using the Interactive SQL INPUT statement.
The Interactive SQL INPUT statement is less efficient than the LOAD TABLE statement for importing text files. However, the INPUT statement supports several different file formats, whereas the LOAD TABLE statement can be used only for text files.
The INPUT command can be entered in Interactive SQL as follows:
INPUT INTO t1 FROM file1 FORMAT ASCII; INPUT INTO t2 FROM file2 FORMAT FIXED COLUMN WIDTHS (5, 10, 40, 40 ); ...
These statements could be put in a command file which can then be executed in Interactive SQL for modification and reference.
For more information about command files, see the tutorial chapter Command Files.
There are two commands that can be used to input data interactively. You can use the insert command:
INSERT INTO T1 VALUES ( ... )
to insert a single row at a time or you can use the input command:
INPUT INTO T1 PROMPT
which gives you a full screen to type in data in the current input format (controlled by the Interactive SQL INPUT_FORMAT option).
When you are loading data from external sources, there may be errors in the data. For example, there may be dates that are not valid dates and numbers that are not valid numbers. The CONVERSION_ERROR database option allows you to ignore conversion errors by converting them to NULL values.
For information on setting Interactive SQL database options, see SET OPTION statement.
The structure of the data to be loaded into a table does not always match the structure of the destination table itself. For example, the column data types may be different, or in different order, or there may be extra values in the data to be imported that do not match columns in the destination table.
Using the LOAD TABLE statement, load the data into a temporary table that has a structure matching that of the input file.
Use the INSERT statement with a FROM SELECT clause to extract and summarize data from the temporary table and put it into one or more of the permanent database tables.
If you are loading a set of data once and for all, you should make the temporary table using the DECLARE TEMPORARY TABLE statement. A declared temporary table exists only for the duration of a connection or, if defined inside a compound statement, of the compound statement.
If you are loading data of a similar structure repeatedly, you should make the temporary table using the CREATE TABLE statement, specifying a global temporary table. The definition of the temporary table is held in the database permanently, but the rows exist only within a given connection.
Loading large volumes of data into a database can be very time consuming. There are a few things you can do to save time.
If you are using the INPUT command, run Interactive SQL or the client application on the same machine as the server. Loading data over the network adds extra communication overhead. This might mean loading new data during off hours.
Place data files on a separate physical disk drive from the database file. This could avoid excessive disk head movement during the load.
Increase the size of the database cache. Eliminate disk cache in favor of database cache if the machine is a dedicated database server.
For a description of how to control the cache size from the server command line option see The database server.
If you are using the INPUT command, start the server with the -b switch for bulk operations mode. In this mode, the server does not keep a rollback log or a transaction log, it does not perform an automatic COMMIT before data definition commands, and it does not lock any records.
Without a rollback log, you cannot use savepoints and aborting a command always causes transactions to roll back. Without automatic COMMIT, a ROLLBACK undoes everything since the last explicit COMMIT.
Without a transaction log, there is no log of the changes. You should back up the database file before and after using bulk operations mode because, in this mode, your database is not protected against media failure. For more information, see Backup and Data Recovery.
The server allows only one connection when you use the -b switch.
If you have data that requires many commits, running with the -b option may slow database operation. At each COMMIT, the server carries out a checkpoint; this frequent checkpointing can slow the server.
Extend the size of the database file, as described in ALTER DBSPACE statement. This command allows a database file to be extended in large amounts before the space is required, rather than the normal 32 pages at a time when the space is needed. As well as improving performance for loading large amounts of data, it also serves to keep the database files more contiguous within the file system.