Data Replication with SQL Remote
PART 2. Replication Design for SQL Remote
CHAPTER 8. SQL Remote Design for Adaptive Server Enterprise
Users at physically distinct sites can each INSERT new rows to a table, so there is an obvious problem ensuring that primary key values are kept unique.
If two users INSERT a row using the same primary key values, the second INSERT to reach a given database in the replication system will fail. As SQL Remote is a replication system for occasionally-connected users, there can be no locking mechanism across all databases in the installation. It is necessary to design your SQL Remote installation so that primary key errors do not occur.
For primary key errors to be designed out of SQL Remote installations; the primary keys of tables that may be modified at more than one site must be guaranteed unique. There are several ways of achieving this goal. This chapter describes a general, economical and reliable method that uses a pool of primary key values for each site in the installation.
The primary key pool is a table that holds a set of primary key values for each database in the SQL Remote installation. Each remote user receives their own set of primary key values. When a remote user inserts a new row into a table, they use a stored procedure to select a valid primary key from the pool. The pool is maintained by periodically running a procedure at the consolidated database that replenishes the supply.
The method is described using a simple example database consisting of sales representatives and their customers. The tables are much simpler than you would use in a real database; this allows us to focus just on those issues important for replication.
The pool of primary keys is held in a separate table. The following CREATE TABLE statement creates a primary key pool table:
CREATE TABLE KeyPool ( table_name VARCHAR(40) NOT NULL, value INTEGER NOT NULL, location VARCHAR(6) NOT NULL, PRIMARY KEY (table_name, value), ) go
The columns of this table have the following meanings:
Column |
Description |
---|---|
table_name |
Holds the names of tables for which primary key pools must be maintained. In our simple example, if new sales representatives were to be added only at the consolidated database, only the Customer table needs a primary key pool and this column is redundant. It is included to show a general solution. |
value |
Holds a list of primary key values. Each value is unique for each table listed in table_name. |
location |
In some setups, this could be the same as the rep_key value of the SalesRep table. In other setups, there will be users other than sales representatives and the two identifiers should be distinct. |
For performance reasons, you may wish to create an index on the table:
CREATE INDEX KeyPoolLocation ON KeyPool (table_name, location, value) go
You can either incorporate the key pool into an existing publication, or share it as a separate publication. In this example, we create a separate publication for the primary key pool.
Create a publication for the primary key pool data.
sp_create_publication 'KeyPoolData' go sp_add_remote_table 'KeyPool' go sp_add_article 'KeyPoolData', 'KeyPool', NULL, 'location' go
Create subscriptions for each remote database to the KeyPoolData publication.
sp_subscription 'create', KeyPoolData, field_user, rep1 go
The subscription argument is the location identifier.
In some circumstances it makes sense to add the KeyPool table to an existing publication and use the same argument to subscribe to each publication. Here we keep the location and rep_key values distinct to provide a more general solution.
Every time a user adds a new customer, their pool of available primary keys is depleted by one. The primary key pool table needs to be periodically replenished at the consolidated database using a procedure such as the following:
CREATE PROCEDURE ReplenishPool AS BEGIN DECLARE @CurrTable VARCHAR(40) DECLARE @MaxValue INTEGER DECLARE EachTable CURSOR FOR SELECT table_name, max(value) FROM KeyPool GROUP BY table_name DECLARE @CurrLoc VARCHAR(6) DECLARE @NumValues INTEGER DECLARE EachLoc CURSOR FOR SELECT location, count(*) FROM KeyPool WHERE table_name = @CurrTable GROUP BY location OPEN EachTable WHILE 1=1 BEGIN FETCH EachTable INTO @CurrTable, @MaxValue IF @@sqlstatus != 0 BREAK OPEN EachLoc WHILE 1=1 BEGIN FETCH EachLoc INTO @CurrLoc, @NumValues IF @@sqlstatus != 0 BREAK -- make sure there are 10 values WHILE @NumValues < 10 BEGIN SELECT @MaxValue = @MaxValue + 1 SELECT @NumValues = @NumValues + 1 INSERT INTO KeyPool (table_name, location, value) VALUES (@CurrTable, @CurrLoc, @MaxValue) END END CLOSE EachLoc END CLOSE EachTable END go
This procedure fills the pool for each user up to ten values. You may wish to use a larger value in a production environment. The value you need depends on how often users are inserting rows into the tables in the database.
The ReplenishPool procedure must be run periodically at the consolidated database to refill the pool of primary key values in the KeyPool table.
The ReplenishPool procedure requires at least one primary key value to exist for each subscriber, so that it can find the maximum value and add one to generate the next set. To initially fill the pool you can insert a single value for each user, and then call ReplenishPool to fill up the rest. The following example illustrates this for three remote users and a single consolidated user named Office:
INSERT INTO KeyPool VALUES( 'Customer', 40, 'rep1' ) INSERT INTO KeyPool VALUES( 'Customer', 41, 'rep2' ) INSERT INTO KeyPool VALUES( 'Customer', 42, 'rep3' ) INSERT INTO KeyPool VALUES( 'Customer', 43, 'Office') EXEC ReplenishPool go
Cannot use a trigger to replenish the key pool |
When a sales representative wants to add a new customer to the Customer table, the primary key value to be inserted is obtained using a stored procedure. This example shows a stored procedure to supply the primary key value, and also illustrates a stored procedure to carry out the INSERT.
The procedures takes advantage of the fact that the Sales Rep identifier is the CURRENT PUBLISHER of the remote database.
NewKey procedure The NewKey procedure supplies an integer value from the key pool and deletes the value from the pool.
CREATE PROCEDURE NewKey @TableName VARCHAR(40), @Location VARCHAR(6), @Value INTEGER OUTPUT AS BEGIN DECLARE @NumValues INTEGER SELECT @NumValues = count(*), @Value = min(value) FROM KeyPool WHERE table_name = @TableName AND location = @Location IF @NumValues > 1 DELETE FROM KeyPool WHERE table_name = @TableName AND value = @Value ELSE -- Never take the last value, -- because RestorePool will not work. -- The key pool should be kept large -- enough so this never happens. SELECT @Value = NULL END
NewCustomer procedure The NewCustomer procedure inserts a new customer into the table, using the value obtained by NewKey to construct the primary key.
CREATE PROCEDURE NewCustomer @name VARCHAR(40), @loc VARCHAR(6) AS BEGIN DECLARE @cust INTEGER DECLARE @cust_key VARCHAR(12) EXEC NewKey 'Customer', @loc, @cust output SELECT @cust_key = 'cust' + convert( VARCHAR(12), @cust ) INSERT INTO Customer (cust_key, name, rep_key ) VALUES ( @cust_key, @name, @loc ) END
You may want to enhance this procedure by testing the @cust value obtained from NewKey to check that it is not NULL, and preventing the insert if it is NULL.
Re-extract a remote database using the field_user user ID.
Try this sample INSERT at the remote and consolidated sites:
EXEC NewCustomer 'Great White North', rep1
The primary key pool technique requires the following components:
Key pool table A table to hold valid primary key values for each database in the installation.
Replenishment procedure A stored procedure keeps the key pool table filled.
Sharing of key pools Each database in the installation must subscribe to its own set of valid values from the key pool table.
Data entry procedures New rows are entered using a stored procedure that picks the next valid primary key value from the pool and delete that value from the key pool.