Java database design

There is a large body of theory and practical experience available to help you design a relational database. You can find descriptions of Entity-Relationship design and other approaches not only in introductory form (see Designing Your Database) but also in more advanced books.

No comparable body of theory and practice exists for developing object-relational databases, and this certainly applies to Java-relational databases. This section offers some suggestions on how to use Java to enhance the practical usefulness of relational databases.

Entities and attributes in relational and object-oriented data

In relational database design, each table describes an entity. For example, in the sample database there are tables named Employee, Customer, Sales_order, and Department. The attributes of these entities become the columns of the tables: employee addresses, customer identification numbers, sales order dates, and so on. Each row of the table may be considered as a separate instance of the entity: a specific employee, sales order, or department.

In object-oriented programming, each class describes an entity, and the methods and fields of that class describe the attributes of the entity. Each instance of the class (each object) holds a separate instance of the entity.

It may seem unnatural, therefore, for relational columns to be based on Java classes. A more natural correspondence may seem to be between table and class.

Entities and attributes in the real world

The distinction between entity and attribute may sound clear, but a little reflection shows that it is often far from clear in practice:

An address may be seen as an attribute of a customer, but an address is also an entity, with its own attributes of street, city, and so on.
A price may be seen as an attribute of a product, but may also be seen as an entity, with attributes of amount and currency.
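
The two cases above can be sketched in Java. This is only an illustration: the class names Address and Price, their fields, and the sample values are assumptions made for this sketch and are not part of the sample database. The classes implement java.io.Serializable on the assumption that objects stored in a column need to be serializable.

```java
import java.io.Serializable;
import java.math.BigDecimal;

// Hypothetical class: an address is an attribute of a customer,
// yet it has attributes of its own (street, city).
class Address implements Serializable {
    private static final long serialVersionUID = 1L;
    final String street;
    final String city;

    Address(String street, String city) {
        this.street = street;
        this.city = city;
    }
}

// Hypothetical class: a price is an attribute of a product,
// yet it pairs an amount with a currency.
class Price implements Serializable {
    private static final long serialVersionUID = 1L;
    final BigDecimal amount;
    final String currencyCode;

    Price(BigDecimal amount, String currencyCode) {
        this.amount = amount;
        this.currencyCode = currencyCode;
    }

    @Override
    public String toString() {
        return amount.toPlainString() + " " + currencyCode;
    }
}

class EntityAttributeDemo {
    public static void main(String[] args) {
        Address address = new Address("12 Elm Street", "Waterloo");
        Price price = new Price(new BigDecimal("19.99"), "USD");
        System.out.println(address.street + ", " + address.city);
        System.out.println(price);
    }
}
```

Each object is a single value from the point of view of the row that holds it, while still carrying its own internal structure.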

The utility of the object-relational database lies in exactly the fact that there are two ways of expressing entities. You can express some entities as tables, and some entities as classes in a table. The next section describes an example.

Relational database limitations

Consider an insurance company that wishes to keep track of its customers. A customer may be considered as an entity, so it is natural to construct a single table to hold all customers of the company.

However, insurance companies handle several kinds of customer. They handle policy holders, policy beneficiaries, and people who are responsible for paying policy premiums. For each of these customer types, the insurance company needs different information. For a beneficiary, little is needed beyond an address. For a policy holder, health information is required. For the customer paying the premiums, information may be needed for tax purposes.

Is it best to handle the separate kinds of customers as separate entities, or to handle the customer type as an attribute of the customer? There are limitations to both approaches:

Building separate tables for each type of customer can lead to a very complex database design, and to multi-table queries when information relating to all customers is required.
If a single customer table is used, it is difficult to ensure that the correct information is entered for each customer. Making the columns that only some customer types require nullable permits the entry of correct data, but does not enforce it. Relational databases offer no simple way to tie default behavior to an attribute of the new entry.

Using classes to overcome relational database limitations

You can use a single customer table, with Java class columns for some of the information, to overcome the limitations of relational databases.

For example, suppose different contact information is needed for policy holders than for beneficiaries. You could approach this by defining a column based on a ContactInformation class. Then define classes named HolderContactInformation and BeneficiaryContactInformation, which are subclasses of the ContactInformation class. By entering new customers according to their type, you can be sure that the correct information is entered.
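
A minimal sketch of such a class hierarchy follows. The three class names come from the text above; the fields (an address for every customer, a phone number for policy holders only), the describe method, and the use of java.io.Serializable are assumptions chosen for illustration.

```java
import java.io.Serializable;
import java.util.Objects;

// Base class for the column: information every customer type must supply.
abstract class ContactInformation implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String address;

    protected ContactInformation(String address) {
        // Reject a missing address for every kind of customer.
        this.address = Objects.requireNonNull(address, "address is required");
    }

    String getAddress() {
        return address;
    }

    abstract String describe();
}

// Beneficiaries need little beyond an address.
class BeneficiaryContactInformation extends ContactInformation {
    private static final long serialVersionUID = 1L;

    BeneficiaryContactInformation(String address) {
        super(address);
    }

    @Override
    String describe() {
        return "Beneficiary at " + getAddress();
    }
}

// Policy holders must also supply a phone number (an assumed requirement).
class HolderContactInformation extends ContactInformation {
    private static final long serialVersionUID = 1L;
    private final String phone;

    HolderContactInformation(String address, String phone) {
        super(address);
        this.phone = Objects.requireNonNull(phone, "phone is required");
    }

    String getPhone() {
        return phone;
    }

    @Override
    String describe() {
        return "Holder at " + getAddress() + ", phone " + phone;
    }
}

class ContactInformationDemo {
    public static void main(String[] args) {
        // Values of both subclasses fit a column typed as ContactInformation.
        ContactInformation[] column = {
            new BeneficiaryContactInformation("12 Elm Street"),
            new HolderContactInformation("40 Oak Avenue", "555-0100")
        };
        for (ContactInformation c : column) {
            System.out.println(c.describe());
        }
    }
}
```

Because each constructor rejects missing values, the rules for a given customer type are enforced when the object is created rather than by nullable columns.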

Levels of abstraction for relational data

Data in a relational database can be categorized by its purpose. Which of this data belongs in a Java class, and which is best kept in simple data type columns?

Referential integrity columns: Primary key columns and foreign key columns commonly hold identification numbers. These identification numbers may be called referential data; their primary purpose is to define the structure of the database and to define the relationships between tables.

Referential data does not generally belong in Java classes. Although you can make a Java class column a primary key column, integers and other simple data types are more efficient for this purpose.
Indexed data: Columns that are commonly indexed may also belong outside a Java class. However, the dividing line between data that needs to be indexed and data that is not to be indexed is not a well-defined one.

With computed columns you can selectively index on a Java field or method (or, in fact, some other expression). If you define a Java class column and then find that it would be useful to index on a field or method of that column, you can use computed columns to make a separate column from that field or method.

For more information, see Using computed columns with Java classes.
Descriptive data: It is common for some of the data in each row to be descriptive. It is not used for referential integrity purposes, and is possibly not frequently indexed, but it is data commonly used in queries. For an employee table, this may include information such as start date, address, benefit information, salary, and so on. This data can often benefit from being combined into fewer columns of Java class data types.

Java classes are useful for abstracting at a level between that of the single relational column and the relational table.
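
As a rough illustration of this middle level, the sketch below models one employee row in plain Java. The names EmployeeProfile and EmployeeRow, their fields, and the startYear method are hypothetical. The identification numbers stay as simple integers, and the descriptive data is grouped into a single object.

```java
import java.io.Serializable;
import java.math.BigDecimal;
import java.time.LocalDate;

// Hypothetical class grouping descriptive employee data into one column value.
class EmployeeProfile implements Serializable {
    private static final long serialVersionUID = 1L;
    final LocalDate startDate;
    final String address;
    final BigDecimal salary;

    EmployeeProfile(LocalDate startDate, String address, BigDecimal salary) {
        this.startDate = startDate;
        this.address = address;
        this.salary = salary;
    }

    // A derived value; a method like this is a candidate for a computed
    // column if it later turns out to need an index.
    int startYear() {
        return startDate.getYear();
    }
}

// Sketch of one row: referential data as simple types, descriptive data as a class.
class EmployeeRow {
    final int empId;               // primary key, kept as a simple integer
    final int deptId;              // foreign key, kept as a simple integer
    final EmployeeProfile profile; // descriptive data combined in one column

    EmployeeRow(int empId, int deptId, EmployeeProfile profile) {
        this.empId = empId;
        this.deptId = deptId;
        this.profile = profile;
    }
}

class AbstractionLevelDemo {
    public static void main(String[] args) {
        EmployeeProfile profile = new EmployeeProfile(
            LocalDate.of(1998, 3, 15), "12 Elm Street", new BigDecimal("52000.00"));
        EmployeeRow row = new EmployeeRow(102, 100, profile);
        System.out.println(row.empId + " started in " + row.profile.startYear());
    }
}
```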