
User's Guide
   PART 1. Working with Databases
     CHAPTER 12. Database Collations and International Languages       

Using character set translation


This section outlines how different components of Adaptive Server Anywhere handle code pages, and how character set translation enables systems where different components have different character sets to function properly.

Character set translation can be carried out among character sets that represent the same characters, but at different values. There needs to be a degree of compatibility between the character sets for this to be possible. For example, character set translation is possible between the EUC-JIS and Shift-JIS character sets, but not between EUC-JIS and OEM code page 850.
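
As an illustration only, the following sketch uses Python's standard codecs (not Adaptive Server Anywhere itself) to show what it means for two character sets to hold the same characters at different values, and why an incompatible target such as code page 850 cannot be used. The helper name translate and the sample text are assumptions made for this example.

```python
# Illustration with Python's built-in codecs; not the server's own mechanism.

def translate(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one character set into another (hypothetical helper)."""
    return data.decode(source).encode(target)

text = "日本語"                      # sample Japanese text
euc = text.encode("euc_jp")          # EUC-style encoding of the text
sjis = text.encode("shift_jis")      # Shift-JIS encoding of the same text

# Same characters, but different byte values in each character set.
print(euc != sjis)

# Translation succeeds because both character sets contain these characters.
print(translate(euc, "euc_jp", "shift_jis") == sjis)

# Code page 850 has no Japanese characters, so translation cannot succeed.
try:
    translate(euc, "euc_jp", "cp850")
except UnicodeEncodeError:
    print("no equivalent characters in code page 850")
```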

To enable character-set translation, you must start the database server using the -ct command-line option. For example:

dbeng6 -ct asademo.db

Avoiding character-set translation

There is a performance cost associated with character set translation. If you can set up an environment such that no character set translation is required, then you do not have to pay this cost, and your setup is simpler to maintain.

If you work with a single-byte character set and are concerned only with seven-bit ASCII characters (values 1 through 127), then you do not need character set translation. Even if the code pages are different in the database and on the client operating system, they are compatible over this range of characters. Many English-language installations will meet these requirements.
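
A minimal sketch of this point, again using Python's standard codecs as a stand-in: it compares OEM code page 850 with ANSI code page 1252 over the seven-bit range and then looks at one extended character. The choice of these two code pages and the helper name is_seven_bit are assumptions for illustration.

```python
# Illustration only: compare two single-byte code pages with Python codecs.

def is_seven_bit(text: str) -> bool:
    """Return True if every character has a value below 128 (hypothetical helper)."""
    return all(ord(ch) < 128 for ch in text)

# All byte values from 1 through 127.
ascii_bytes = bytes(range(1, 128))

# Both code pages decode the seven-bit range to the same characters.
same_low_range = ascii_bytes.decode("cp850") == ascii_bytes.decode("cp1252")

# An extended character is stored at a different value in each code page.
e_acute_oem = "é".encode("cp850")
e_acute_ansi = "é".encode("cp1252")

print(same_low_range)
print(e_acute_oem != e_acute_ansi)
print(is_seven_bit("SELECT * FROM employee"), is_seven_bit("café"))
```

Data that passes a check like is_seven_bit reads the same under either code page, which is why such installations can do without translation.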

If you do require use of extended characters, there are other steps you may be able to take:

Also, recall that character set translation takes place only if the database server is started using the -ct command-line switch.

ODBC code page translation

Adaptive Server Anywhere provides an ODBC translation driver. This converts characters between OEM and ANSI code pages. It allows Windows applications using ANSI code pages to be compatible with databases that use OEM code pages in their collations.

Not needed if you use ANSI character sets    
If you use an ANSI character set in your database, and are using ANSI character set applications, you do not need to use this translation driver.

The translation driver carries out a mapping between the OEM code page in use in the "DOS box" and the ANSI code page used in the Windows operating system. If your database uses the same code page as the OEM code page, the characters are translated properly. If your database does not use the same code page as your machine's OEM code page, you will still have compatibility problems.

Embedded SQL does not provide any such code page translation mechanism.
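
The effect of the OEM to ANSI mapping, and of a mismatch between the database code page and the machine's OEM code page, can be sketched with Python's standard codecs. This does not reproduce the translation driver. The function oem_to_ansi and the code pages chosen (850 and 437 as OEM, 1252 as ANSI) are assumptions for illustration.

```python
# Illustration only: an OEM-to-ANSI mapping built from Python codecs.

def oem_to_ansi(data: bytes, oem: str = "cp850", ansi: str = "cp1252") -> bytes:
    """Map bytes from an OEM code page to an ANSI code page (hypothetical helper)."""
    return data.decode(oem).encode(ansi)

# A character as stored by a database whose collation uses code page 850.
stored = "ø".encode("cp850")

# The machine's OEM code page matches the database: the character survives.
matched = oem_to_ansi(stored, oem="cp850").decode("cp1252")

# The machine's OEM code page is 437 instead: the same byte is read as a
# different character, so the application sees the wrong text.
mismatched = oem_to_ansi(stored, oem="cp437").decode("cp1252")

print(matched == "ø")
print(mismatched == "ø")
```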

Sybase Central and Interactive SQL code page translation

Interactive SQL and Sybase Central both have OEM to ANSI internal code page translation if the database uses an OEM character set. As with the ODBC translation driver, there is an assumption that the OEM code page on the local machine is the same as that in the database.

For Interactive SQL, you can turn off the translation by setting the Interactive SQL option CHAR_OEM_TRANSLATION to OFF.

For more information on OEM to ANSI character set translation in Interactive SQL, see CHAR_OEM_TRANSLATION option.

Character translation for database messages

Error and other messages from the database software are held in a language library. Localized versions of this library are provided with localized versions of Adaptive Server Anywhere. The messages for each language assume an ANSI code page.

Some database messages have placeholders that are filled in from the database. For example, if you execute a query with a column that does not exist, the returned error message is:

Column column-name not found

where column-name is filled in from the database.

If the database collation uses a character set that is different from that of the Language DLL but compatible with it (such as EUC-JIS compared to Shift-JIS), then the database server automatically translates database messages into the database collation before sending them to the client. If necessary, further character set translation from database to client is carried out to ensure that the message reaches the client in the appropriate character set.

Messages are always translated, if necessary, into the database collation character set, regardless of whether the -ct command-line option is used. The -ct command-line option affects only the character set conversion on the way to the client.

Connection strings and character sets

Connection strings present a special case for character set translation. The connection string is parsed by the client library, in order to locate or start a database server. This parsing is done with no knowledge of the server character set or language.

The connection string is parsed as follows:

  1. It is broken down into its keyword = value components. This can be done independently of character set, as long as you do not use { curly braces } around CommLinks parameters. Instead, use the recommended ( parentheses ).

  2. The server is located. The server name must be constructed from lower page (seven-bit) ASCII characters.

  3. The DatabaseName or DatabaseFile parameter is interpreted in the server character set.

  4. Once the database is located, the remaining connection parameters are interpreted according to its character set.
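
The staged parsing described in the steps above can be sketched as follows. This is not the client library's parser: the function names are hypothetical, the sample string uses the ENG, DBN and UID keywords purely as an example, and CommLinks parameters and quoting are ignored. The sketch works on raw bytes so that values are decoded only once the server character set is known.

```python
# Illustration only: staged handling of a connection string held as bytes.

def split_connection_string(raw: bytes) -> dict:
    """Step 1: split into keyword = value pairs without decoding the values."""
    params = {}
    for part in raw.split(b";"):
        if not part.strip():
            continue
        key, _, value = part.partition(b"=")
        params[key.strip().upper()] = value.strip()
    return params

def server_name(params: dict) -> str:
    """Step 2: the server name must consist of seven-bit ASCII characters."""
    name = params.get(b"ENG", b"")
    if any(byte > 127 for byte in name):
        raise ValueError("server name must be seven-bit ASCII")
    return name.decode("ascii")

def decode_values(params: dict, charset: str) -> dict:
    """Steps 3 and 4: interpret the values once the character set is known."""
    return {k.decode("ascii"): v.decode(charset) for k, v in params.items()}

# Sample string whose database name is encoded in Shift-JIS (an assumption).
raw = "ENG=myserver;DBN=デモ;UID=dba".encode("shift_jis")
params = split_connection_string(raw)
print(server_name(params))
print(decode_values(params, "shift_jis")["DBN"])
```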
