User's Guide
PART 1. Working with Databases
CHAPTER 6. Joins: Retrieving Data from Several Tables
A relational database stores information about different types of objects in different tables. For example, you should store information particular to employees in one table, and information that pertains to departments in another. The employee table contains information such as an employee's name and address. The department table contains information about one department, such as the name of the department and who is the department head.
Most questions can only be answered using a combination of information from the various tables; for example, the question "Who manages the Sales department?" To find the name of this person, you must identify the correct person using information from the department table, then look up that person's name in the employee table.
Joins are a means of answering such questions by forming a new virtual table that includes information from multiple tables. For example, you could create a list of the department heads by combining the information contained in the employee table and the department table. You specify which tables contain the information you need using the FROM clause.
To make the join useful, you must combine the correct columns of each table. To list department heads, each row of the combined table should contain the name of a department and the name of the employee who manages it. You control how columns are matched in the composite table either by specifying a particular type of join operation or by using the ON phrase.
The join operation is the hallmark of the relational model of database management. More than any other feature, the join distinguishes relational database management systems from other types of database management systems.
In structured database management systems, often known as network and hierarchical systems, relationships between data values are predefined. Once a database has been set up, it is difficult to make queries about unanticipated relationships among the data.
In a relational database management system, on the other hand, relationships among data values are left unstated in the definition of a database. They become explicit when the data is manipulated: when you query the database, not when you create it. You can ask any question that comes to mind about the data stored in the database, regardless of what was intended when the database was set up.
According to the rules of good database design, called normalization rules, each table should describe one kind of entity—a person, place, event, or thing. That is why, when you want to compare information about two or more kinds of entities, you need the join operation. Relationships among data stored in different tables are discovered by joining them.
A corollary of this rule is that the join operation gives you unlimited flexibility in adding new kinds of data to your database. You can always create a new table that contains data about a different kind of entity. If the new table has a field with values similar to those in some field of an existing table or tables, it can be linked to those other tables by joining.