Data Replication with SQL Remote
PART 2. Replication Design for SQL Remote
CHAPTER 7. SQL Remote Design for Adaptive Server Anywhere
There are cases where a row may need to be included in several subscriptions. For example, we may have a many-to-many relationship. In this section, we use a case study to illustrate how to handle this situation.
The Policy database illustrates why and how to partition tables when there is a many-to-many relationship in the database.
Here is a simple database that illustrates the problem.
Each sales representative sells to several customers, and some customers deal with more than one sales representative. In this case, the relationship between Customer and SalesRep is thus a many-to-many relationship.
The three tables are described in more detail as follows:
Table |
Description |
---|---|
SalesRep |
All sales representatives that work for the company. The SalesRep table has the following columns:
The SQL statement creating this table is as follows: CREATE TABLE SalesRep ( rep_key CHAR(12) NOT NULL, name CHAR(40) NOT NULL, PRIMARY KEY (rep_key) ); |
Customer |
All customers that do business with the company. The Customer table includes the following columns:
The SQL statement creating this table is as follows: CREATE TABLE Customer ( cust_key CHAR(12) NOT NULL, name CHAR(40) NOT NULL, PRIMARY KEY (cust_key) ); |
Policy |
A three-column table that maintains the many-to-many relationship between customers and sales representatives. The Policy table has the following columns:
The SQL statement creating this table is as follows. CREATE TABLE Policy ( policy_key CHAR(12) NOT NULL, cust_key CHAR(12) NOT NULL, rep_key CHAR(12) NOT NULL, FOREIGN KEY ( cust_key ) REFERENCES Customer ( cust_key ) FOREIGN KEY ( rep_key ) REFERENCES SalesRep (rep_key ), PRIMARY KEY ( policy_key ) ); |
The goals of the replication design are to provide each sales representative with the following information:
The entire SalesRep table.
Those rows from the Policy table that include sales relationships involving the sales rep subscribed to the data.
Those rows from the Customer table listing customers that deal with the sales rep subscribed to the data.
The many-to-many relationship between customers and sales representatives introduces new challenges in maintaining a proper sharing of information:
We have a table (in this case the Customer table) that has no reference to the sales representative value that is used in the subscriptions to partition the data.
Again, this problem is addressed by using a subquery in the publication.
Each row in the Customer table may be related to many rows in the SalesRep table, and shared with many sales representatives databases.
Put another way, the rows of the Contact table in Partitioning tables that do not contain the subscription expression were partitioned into disjoint sets by the publication. In the present example there are overlapping subscriptions.
To meet the replication goals we again need one publication and a set of subscriptions. In this case, we use two triggers to handle the transfer of customers from one sales representative to another.
A single publication provides the basis for the data sharing:
CREATE PUBLICATION SalesRepData ( TABLE SalesRep, TABLE Policy SUBSCRIBE BY rep_key, TABLE Customer SUBSCRIBE BY ( SELECT rep_key FROM Policy WHERE Policy.cust_key = Customer.cust_key ), );
The subscription statements are exactly as in the previous example.
The publication includes part or all of each of the three tables. To understand how the publication works, it helps to look at each article in turn:
SalesRep table There are no qualifiers to this article, so the entire SalesRep table is included in the publication.
... TABLE SalesRep, ...
Policy table This article uses a subscription expression to specify a column used to partition the data among the sales reps:
... TABLE Policy SUBSCRIBE BY rep_key, ...
The subscription expression ensures that each sales rep receives only those rows of the table for which the value of the rep_key column matches the value provided in the subscription.
The Policy table partitioning is disjoint: there are no rows that are shared with more than one subscriber.
Customer table A subscription expression with a subquery is used to define the partition. The article is defined as follows:
... TABLE Customer SUBSCRIBE BY ( SELECT rep_key FROM Policy WHERE Policy.cust_key = Customer.cust_key ), ...
The Customer partitioning is non-disjoint: some rows are shared with more than one subscriber.
The subquery in the Customer article returns a single column (rep_key) in its result set, but may return multiple rows, corresponding to all those sales representatives that deal with the particular customer. When a subscription expression has multiple values, the row is replicated to all subscribers whose subscription matches any of the values. It is this ability to have multiple-valued subscription expressions that allows non-disjoint partitionings of a table.
The problem of territory realignment (reassigning rows among subscribers) requires special attention, just as in the section Territory realignment in the Contact example.
You need to write triggers to maintain proper data throughout the installation when territory realignment (reassignment of rows among subscribers) is allowed.
In this example, we require that a customer transfer be achieved by deleting and inserting rows in the Policy table.
To cancel a sales relationship between a customer and a sales representative, a row in the Policy table is deleted. In this case, the Policy table change is properly replicated to the sales representative, and the row no longer appears in their database. However, no change has been made to the Customer table, and so no changes to the Customer table are replicated to the subscriber.
In the absence of triggers, this would leave the subscriber with incorrect data in their Customer table. The same kind of problem arises when a new row is added to the Policy table.
The solution is to write triggers that are fired by changes to the Policy table, which include a special syntax of the UPDATE statement. The special UPDATE statement makes no changes to the database tables, but does make an entry in the transaction log that SQL Remote uses to maintain data in subscriber databases.
Here is a trigger that tracks INSERTS into the Policy table, and ensures that remote databases contain the proper data.
CREATE TRIGGER InsPolicy BEFORE INSERT ON Policy REFERENCING NEW AS NewRow FOR EACH ROW BEGIN UPDATE Customer PUBLICATION SalesRepData SUBSCRIBE BY ( SELECT rep_key FROM Policy WHERE cust_key = NewRow.cust_key UNION ALL SELECT NewRow.rep_key ) WHERE cust_key = NewRow.cust_key; END;
Here is a corresponding trigger that tracks DELETES from the Policy table:
CREATE TRIGGER DelPolicy BEFORE DELETE ON Policy REFERENCING OLD AS OldRow FOR EACH ROW BEGIN UPDATE Customer PUBLICATION SalesRepData SUBSCRIBE BY ( SELECT rep_key FROM Policy WHERE cust_key = OldRow.cust_key AND rep_key <> OldRow.rep_key ) WHERE cust_key = OldRow.cust_key; END;
Some of the features of the trigger are the same as in the previous section. The major new features are that the INSERT trigger contains a subquery, and that this subquery can be multi-valued.
The subquery in the BEFORE INSERT trigger is a UNION expression, and can be multi-valued:
... SELECT rep_key FROM Policy WHERE cust_key = NewRow.cust_key UNION ALL SELECT NewRow.rep_key ...
The second part of the UNION is the rep_key value for the new sales representative dealing with the customer, taken from the INSERT statement.
The first part of the UNION is the set of existing sales representatives dealing with the customer, taken from the Policy table.
This illustrates the point that the result set of the subscription query must be all those sales representatives receiving the row, not just the new sales representatives.
The subquery in the BEFORE DELETE trigger is multi-valued:
... SELECT rep_key FROM Policy WHERE cust_key = OldRow.cust_key AND rep_key <> OldRow.rep_key ...
The subquery takes rep_key values from the Policy table. The values include the primary key values of all those sales reps who deal with the customer being transferred (WHERE cust_key = OldRow.cust_key), with the exception of the one being deleted (AND rep_key <> OldRow.rep_key).
This again emphasizes that the result set of the subscription query must be all those values matched by sales representatives receiving the row following the DELETE.
Data in the Customer table is not identified with an individual subscriber (by a primary key value, for example) and is shared among more than one subscriber. This allows the possibility of the data being updated in more than one remote site between replication messages, which could lead to replication conflicts. You can address this issue either by permissions (allowing only certain users the right to update the Customer table, for example) or by adding RESOLVE UPDATE triggers to the database to handle the conflicts programmatically.
UPDATES on the Policy table have not been described here. They should either be prevented, or a BEFORE UPDATE trigger is required that combines features of the BEFORE INSERT and BEFORE DELETE triggers shown in the example.
When the Subscribe_by_remote option is ON, operations from remote databases on rows with a subscribe by value of NULL or an empty string will assume the remote user is subscribed to the row. By default, the Subscribe_by_remote option is set to ON. In most cases, this setting is the desired setting.
The Subscribe_by_remote option solves a problem that otherwise would arise with some publications, including the Policy example. This section describes the problem, and how the option automatically avoids it.
The publication uses a subquery for the Customer table subscription expression, because each Customer may belong to several Sales Reps:
CREATE PUBLICATION SalesRepData ( TABLE SalesRep, TABLE Policy SUBSCRIBE BY rep_key, TABLE Customer SUBSCRIBE BY ( SELECT rep_key FROM Policy WHERE Policy.cust_key = Customer.cust_key ), );
Marc Dill is a Sales Rep who has just arranged a policy with a new customer. He inserts a new Customer row and also inserts a row in the Policy table to assign the new Customer to himself.
As the INSERT of the Customer row is carried out by the Message Agent at the consolidated database, Adaptive Server Anywhere records the subscription value in the transaction log, at the time of the INSERT.
Later, when the Message Agent scans the log, it builds a list of subscribers from the subscription expression, and Marc Dill is not on the list, as the row in the Policy table assigning the customer to him has not yet been applied. If Subscribe_by_remote were set to OFF, the result would be that the new Customer is sent back to Marc Dill as a DELETE operation.
As long as Subscribe_by_remote is set to ON, the Message Agent assumes the row belongs to the Sales Rep that inserted it, the INSERT is not replicated back to Marc Dill, and the replication system is intact.
If Subscribe_by_remote is set to OFF, you must ensure that the Policy row is inserted before the Customer row, with the referential integrity violation avoided by postponing checking to the end of the transaction.