Monday, July 10, 2006

Unidirectional synchronisation

In the newsgroups a frequent question is on how to synchronise data between two locations without the use of replication.

Once again, the set-based aproach proves to be the most transparent, the most efficient and the most easy to uderstand.

In this example we'll look at unidirectional synchronisation, which as a matter of facts isn't as much about synchronisation as much as it is about the propagation of data changes from the source to the destination.

In brief, these are the rules involved:

  • Insert the rows from the Source table that do not exist in the Destination table;
  • Update the rows in the Destination table that are different from the rows in the Source table; and
  • Delete rows from the Destination table that do not exist in the Source table.

Elementary.

First of all, here are the tables. We'll be propagating data from the Remote table to the Local table. A primary key, or at least a candidate key, is - of course - a must. In this example the Product name is the primary key.

-- Remote Table
create table dbo.RemoteProductStatus
 (
 ProductStatusId int  identity (1, 1) not null
 ,DateChanged datetime   not null
 ,ProductName varchar(512)   not null
 ,ProductStatus varchar(10)   not null
 )
go

alter table dbo.RemoteProductStatus
 add constraint pk_RemoteProductStatus_ProductName
   primary key nonclustered
   (
   ProductName
   )
go

create unique clustered index ux_RemoteProductStatus_ProductStatusId
 on dbo.RemoteProductStatus
  (
  ProductStatusId
  )
go

alter table dbo.RemoteProductStatus
 add constraint chk_RemoteProductStatus_ProductStatus
   check (ProductStatus in ('out', 'returned', 'destroyed'))
go

-- Local Table
create table dbo.LocalProductStatus
 (
 ProductStatusId int  identity (1, 1) not null
 ,DateChanged datetime   not null
 ,ProductName varchar(512)   not null
 ,ProductStatus varchar(10)   not null
 )
go

alter table dbo.LocalProductStatus
 add constraint pk_LocalProductStatus_ProductName
   primary key nonclustered
   (
   ProductName
   )
go

create unique clustered index ux_LocalProductStatus_ProductStatusId
 on dbo.LocalProductStatus
  (
  ProductStatusId
  )
go

alter table dbo.LocalProductStatus
 add constraint chk_LocalProductStatus_ProductStatus
   check (ProductStatus in ('out', 'returned', 'destroyed'))
go

Now here comes the data:

insert dbo.LocalProductStatus
 (
 DateChanged
 ,ProductName
 ,ProductStatus
 )
 select dateadd(m, -6, getdate()) as DateChanged
  ,'Winter coat' as ProductName
  ,'out' as ProductStatus
 union all
 select dateadd(m, -6, getdate())
  ,'Skis'
  ,'out'
 union all
 select dateadd(m, -6, getdate())
  ,'Helmet'
  ,'out'
 union all
 select dateadd(m, -6, getdate())
  ,'Bathing suit'
  ,'destroyed'
go

insert dbo.RemoteProductStatus
 (
 DateChanged
 ,ProductName
 ,ProductStatus
 )
 select getdate() as DateChanged
  ,'Sun glasses' as ProductName
  ,'out' as ProductStatus
 union all
 select getdate()
  ,'Beach towel'
  ,'out'
 union all
 select getdate()
  ,'Winter coat'
  ,'returned'
 union all
 select getdate()
  ,'Skis'
  ,'returned'
 union all
 select getdate()
  ,'Helmet'
  ,'destroyed'
go

The three tasks described above are extremely simple with the set-based logic:

  1. Insert - rows from the Remote table that do not exist in the Local table:
    insert dbo.LocalProductStatus
     (
     DateChanged
     ,ProductName
     ,ProductStatus
     )
     select dbo.RemoteProductStatus.DateChanged as DateChanged
      ,dbo.RemoteProductStatus.ProductName as ProductName
      ,dbo.RemoteProductStatus.ProductStatus as ProductStatus
      from dbo.RemoteProductStatus
      where (not exists (
        select *
         from dbo.LocalProductStatus
         where (dbo.LocalProductStatus.ProductName
            = dbo.RemoteProductStatus.ProductName)
        ))
  2. Update - change the rows in the Local table to correspond to the rows in the Remote table:
    update dbo.LocalProductStatus
     set dbo.LocalProductStatus.DateChanged
       = dbo.RemoteProductStatus.DateChanged
      ,dbo.LocalProductStatus.ProductStatus
       = dbo.RemoteProductStatus.ProductStatus
     from dbo.LocalProductStatus
      inner join dbo.RemoteProductStatus
        on dbo.RemoteProductStatus.ProductName
          = dbo.LocalProductStatus.ProductName
     where (dbo.LocalProductStatus.DateChanged
        != dbo.RemoteProductStatus.DateChanged)
      or (dbo.LocalProductStatus.ProductStatus
        != dbo.RemoteProductStatus.ProductStatus)
  3. Delete - simply remove rows from the Local table that no longer exist in the Remote table:
    delete dbo.LocalProductStatus
     where (not exists (
       select *
        from dbo.RemoteProductStatus
        where (dbo.RemoteProductStatus.ProductName
           = dbo.LocalProductStatus.ProductName)
       ))

No fuss.


ML

No comments: