Data Replication with SQL Remote
PART 2. Replication Design for SQL Remote
CHAPTER 7. SQL Remote Design for Adaptive Server Anywhere
An UPDATE conflict occurs when the following sequence of events takes place:
User 1 updates a row at remote site 1.
User 2 updates the same row at remote site 2.
The update from User 1 is replicated to the consolidated database.
The update from User 2 is replicated to the consolidated database.
When the SQL Remote Message Agent replicates UPDATE statements, it does so as a separate UPDATE for each row. Also, the message contains the old row values for comparison. When the update from user 2 arrives at the consolidated database, the values in the row are not those recorded in the message.
By default, the UPDATE still proceeds, so that the User 2 update (the last to reach the consolidated database) becomes the value in the consolidated database, and is replicated to all other databases subscribed to that row.
In general, the default method of conflict resolution is that the most recent operation (in this case that from User 2) succeeds, and no report is made of the conflict. The update from User 1 is lost. SQL Remote also allows custom conflict resolution, using a trigger to resolve conflicts in a way that makes sense for the data being changed.
Conflict resolution does not apply to primary key updates |
This section describes how you can build conflict resolution into your SQL Remote installation at the consolidated database.
SQL Remote replication messages include UPDATE statements as a set of single row updates, each with a VERIFY clause that includes values prior to updating.
An UPDATE conflict is detected by the database server as a failure of the VERIFY clause values to match the rows in the database.
Conflicts are detected and resolved by the Message Agent, but only at a consolidated database. When an UPDATE conflict is detected in a message from a remote database, the Message Agent causes the database server to take two actions:
Any conflict resolution (RESOLVE UPDATE) triggers are fired.
The UPDATE is applied.
UPDATE statements are applied even if the VERIFY clause values do not match, whether or not there is a RESOLVE UPDATE trigger.
Conflict resolution can take several forms. For example:
In some applications, resolution could mean reporting the conflict into a table.
You may wish to keep updates made at the consolidated database in preference to those made at remote sites.
Conflict resolution can be more sophisticated, for example in resolving inventory numbers in the face of goods deliveries and orders.
The method of conflict resolution is different at an Adaptive Server Enterprise consolidated database. For more information, see How SQL Remote handles conflicts.
This section describes what you need to do to implement custom conflict resolution in SQL Remote for Adaptive Server Anywhere. The concepts are the same in SQL Remote for Adaptive Server Enterprise, but the implementation is different.
SQL Remote allows you to define conflict resolution triggers to handle UPDATE conflicts. Conflict resolution triggers are fired only at a consolidated database, when messages are applied by a remote user. When an UPDATE conflict is detected at a consolidated database, the following sequence of events takes place.
Any conflict resolution triggers defined for the operation are fired.
The UPDATE takes place.
Any actions of the trigger, as well as the UPDATE, are replicated to all remote databases, including the sender of the message that triggered the conflict.
In general, SQL Remote for Adaptive Server Anywhere does not replicate the actions of triggers: the trigger is assumed to be present at the remote database. Conflict resolution triggers are fired only at consolidated databases, and so their actions are replicated to remote databases.
At remote databases, no RESOLVE UPDATE triggers are fired when a message from a consolidated database contains an UPDATE conflict.
The UPDATE is carried out at the remote databases.
At the end of the process, the data is consistent throughout the setup.
UPDATE conflicts cannot happen where data is shared for reading, but each row (as identified by its primary key) is updated at only one site. They only occur when data is being updated at more than one site.
This section describes how to use RESOLVE UPDATE, or conflict resolution triggers.
Conflict resolution triggers are fired by the failure of values in the VERIFY clause of an UPDATE statement to match the values in the database before the update. An UPDATE statement with a VERIFY clause takes the following form:
UPDATE table-list
SET column-name = expression, ...
[ FROM table-list ]
[ VERIFY (column-name, ...)
VALUES ( expression, ...) ]
[ WHERE search-condition ]
The VERIFY clause compares the values of specified columns to a set of expected values, which are the values that were present in the publisher database when the UPDATE statement was applied there.
The verify clause is useful only for single-row updates. However, multi-row update statements entered at a database are replicated as a set of single-row updates by the Message Agent, so this imposes no constraints on client applications.
The syntax for a RESOLVE UPDATE trigger is as follows:
CREATE TRIGGER trigger-name
RESOLVE UPDATE
OF column-name ON table-name
[ REFERENCING [ OLD AS old_val ]
[ NEW AS new_val ]
[ REMOTE AS remote_val ] ]
FOR EACH ROW
BEGIN
...
END
RESOLVE UPDATE triggers fire before each row is updated. The REFERENCING clause allows access to the values in the row of the table to be updated (OLD), to the values the row is to be updated to (NEW) and to the rows that should be present according to the VERIFY clause (REMOTE). Only columns present in the VERIFY clause can be referenced in the REMOTE AS clause; other columns produce a "column not found" error.
The database option VERIFY_ALL_COLUMNS is OFF by default. If it is set to ON, all columns are verified on replicated updates, and a RESOLVE UPDATE trigger is fired whenever any column is different. If it is set to OFF, only those columns that are updated are checked.
Setting this option to ON makes messages bigger, because more information is sent for each UPDATE.
If this option is set at the consolidated database before remote databases are extracted, it will be set at the remote databases also.
You can set the VERIFY_ALL_COLUMNS option either for the PUBLIC group or just for the user contained in the Message Agent connection string.
The CURRENT REMOTE USER special constant holds the user ID of the remote user sending the message. This can be used in RESOLVE UPDATE triggers that place reports of conflicts into a table, to identify the user producing a conflict.
This section describes some ways of using RESOLVE UPDATE triggers to handle conflicts.
Suppose a table in a contact management system has a column holding the most recent contact with each customer.
One representative talks with a customer on a Friday, but does not upload his changes to the consolidated database until the next Monday. Meanwhile, a second representative meets the customer on the Saturday, and updates the changes that evening.
There is no conflict when the Saturday UPDATE is replicated to the consolidated database, but when the Monday UPDATE arrives it finds the row already changed.
By default, the Monday UPDATE would proceed, leaving the column with the incorrect information that the most recent contact occurred on Friday.
Update conflicts on this column should be resolved by inserting the most recent date in the row.
The following RESOLVE UPDATE trigger chooses the most recent of the two new values and enters it in the database.
CREATE TRIGGER contact_date RESOLVE UPDATE ON contact REFERENCING OLD AS old_name NEW AS new_name FOR EACH ROW BEGIN IF new_name.contact_date < old_name.contact_date THEN SET new_name.contact_date = old_name.contact_date END IF END
If the value being updated is later than the value that would replace it, the new value is reset to leave the entry unchanged.
Consider a warehouse system for a manufacturer of sporting goods. There is a table of product information, with a quantity column holding the number of each product left in stock. An update to this column will typically deplete the quantity in stock or, if a new shipment is brought in, add to it.
A sales representative at a remote database enters an order, depleting the stock of small tank top tee shirts by five, from 28 to 23, and enters this in on her database. Meanwhile, before this update is replicated to the consolidated database, a new shipment of tee shirts comes in, and the warehouse enters the shipment, adding 40 to the quantity column to make it 68.
The warehouse entry gets added to the database: the quantity column now shows there are 68 small tank-top tee shirts in stock. When the update from the sales representative arrives, it causes a conflict-Adaptive Server Anywhere detects that the update is from 28 to 23, but that the current value of the column is 68.
By default, the most recent UPDATE succeeds, and the inventory level is set to the incorrect value of 23.
In this case the conflict should be resolved by summing the changes to the inventory column to produce the final result, so that a final value of 63 is placed into the database.
A suitable RESOLVE UPDATE trigger for this situation would add the increments from the two updates. For example:
CREATE TRIGGER resolve_quantity RESOLVE UPDATE OF quantity ON "DBA".product REFERENCING OLD AS old_name NEW AS new_name REMOTE AS remote_name FOR EACH ROW BEGIN SET new_name.quantity = new_name.quantity + old_name.quantity - remote_name.quantity END
This trigger adds the difference between the old value in the consolidated database (68) and the old value in the remote database when the original UPDATE was executed (28) to the new value being sent, before the UPDATE is implemented. Thus, new_val.quantity becomes 63 (= 23 + 68 - 28), and this value is entered into the quantity column.
Consistency is maintained at the remote database as follows:
The original remote UPDATE changed the value from 28 to 23.
The warehouse's entry is replicated to the remote database, but fails as the old value is not what was expected.
The changes made by the RESOLVE UPDATE trigger are replicated to the remote database.
In some cases, you may not want to alter the default way in which SQL Remote resolves conflicts; you may just want to report the conflicts by storing them in a table. In this way, you can look at the conflict table to see what, if any, conflicts have occurred, and if necessary take action to resolve the conflicts.
The tables in a relational database are related through foreign key references. The referential integrity constraints applied as a consequence of these references ensure that the database remains consistent. If you wish to replicate only a part of a database, there are potential problems with the referential integrity of the replicated database.
Referential integrity errors stop replication |
By paying attention to referential integrity issues while designing publications you can avoid these problems. This section describes some of the more common integrity problems and suggests ways to avoid them.
The sales publication described in Publishing a set of tables includes the sales_order table:
CREATE PUBLICATION pub_sales ( TABLE customer, TABLE sales_order, TABLE sales_order_items, TABLE product )
The sales_order table has a foreign key to the employee table. The id of the sales rep is a foreign key in the sales_order table referencing the primary key of the employee table. However, the employee table is not included in the publication.
If the publication is created in this manner, new sales orders would fail to replicate unless the remote database has the foreign key reference removed from the sales_order table.
If you use the extraction utility to create the remote databases, the foreign key reference is automatically excluded from the remote database, and this problem is avoided. However, there is no constraint in the database to prevent an invalid value from being inserted into the sales_rep_id column of the sales_order table, and if this happens the INSERT will fail at the consolidated database. To avoid this problem, you can include the employee table (or at least its primary key) in the publication.
Actions performed by triggers are not replicated: triggers that exist at one database in a SQL Remote setup are assumed by the replication procedure to exist at other databases in the setup. When an action that fires a trigger at the consolidated database is replicated at the replicate site, the trigger is automatically fired. By default, the database extraction utility extracts the trigger definitions, so that they are in place at the remote database also.
If a publication includes only a subset of a database, a trigger at the consolidated database may refer to tables or rows that are present at the consolidated database, but not at the remote databases. You can design your triggers to avoid such errors by making actions of the trigger conditional using an IF statement. The following list suggests some ways in which triggers can be designed to work on consolidated and remote databases.
Have actions of the trigger be conditional on the value of CURRENT PUBLISHER. In this case, the trigger would not execute certain actions at the remote database.
Have actions of the trigger be conditional on the object_id function not returning NULL. The object_id function takes a table or other object as argument, and returns the ID number of that object or NULL if the object does not exist.
Have actions of the trigger be conditional on a SELECT statement which determines if rows exist.
The RESOLVE UPDATE trigger is a special trigger type for the resolution of UPDATE conflicts, and is discussed in the section Conflict resolution examples. The actions of RESOLVE UPDATE triggers are replicated to remote databases, including the database that caused the conflict.