Sunday, 7 January 2018

How to Configure Transactional Replication with Multiple Publishers, Single Subscriber



While building a replication solution for one of my clients on SQL Server 2016, I came across a scenario where the client had retail stores at various locations, and all data from those stores had to be synchronized to a central location in near real time. The aim was to let senior management analyze the performance of all stores from a single place. After considering different options, I finally came up with a solution that worked for my client.

Let’s assume we have servers RBTSERV1 and RBTSERV2, which will act as publishers for 2 stores. RBTSERV3 will act as the subscriber, and data from both RBTSERV1 and RBTSERV2 will be synchronized to RBTSERV3.

There were several tables that needed to be synchronized, but to keep it simple I will use one table in this example.

Notice that every store has a unique StoreId assigned to it, and that StoreID is part of the primary key, so rows from different stores can never collide on the subscriber. Let’s create a sample table and some data on all servers.

--RBTSERV1: PUBLISHER 1
create database DBPub1
GO
use DBPub1
GO
create table CustomerOrders(StoreID int, OrderId int, ItemName varchar(100), Qty int, OrderDate datetime default getdate()
, CONSTRAINT PK_CustomerOrders PRIMARY KEY (StoreID,OrderId))

INSERT INTO CustomerOrders VALUES (1,1,'Books',5,Getdate())
INSERT INTO CustomerOrders VALUES (1,2,'Toys',3,Getdate())
---------------------------------
--RBTSERV2: PUBLISHER 2
create database DBPub2
GO
use DBPub2
GO
create table CustomerOrders(StoreID int, OrderId int, ItemName varchar(100), Qty int, OrderDate datetime default getdate()
, CONSTRAINT PK_CustomerOrders PRIMARY KEY (StoreID,OrderId))

INSERT INTO CustomerOrders VALUES (2,1,'fans',15,Getdate())
INSERT INTO CustomerOrders VALUES (2,2,'pens',13,Getdate())
---------------------------------
--RBTSERV3: SUBSCRIBER
create database DBSubs
GO
use DBSubs
GO
create table CustomerOrders(StoreID int, OrderId int, ItemName varchar(100), Qty int, OrderDate datetime default getdate()
, CONSTRAINT PK_CustomerOrders PRIMARY KEY (StoreID,OrderId))

So publisher RBTSERV1 has 2 records in its table and publisher RBTSERV2 also has 2 records. The subscriber table is empty.

Now let’s start with our replication.

Configuring Publication on RBTSERV1:
To set up a publication on RBTSERV1, follow the steps below:
1.       Right-click the “Local Publications” folder and select “New Publication...”



2.       In the publication wizard, select the database to be used for Publication. In our case, we are using DBPub1.



3.       Since the sync has to be almost real time, choose Transactional publication.





4.       Next, pick the table to replicate. The important point here is to choose “Set Properties of All Table Articles” under “Article Properties” and set the option below, so that a snapshot from one publisher can never drop the subscriber table that also holds the other store's rows:
Action if name is in use: Keep existing object unchanged



5.       Next we leave the “Create a Snapshot Immediately...” option unchecked.




6.       Under Agent Security, choose the options below. Note that on my server, the SQL Server and SQL Server Agent services run under an administrator account.




7.       Give your publication a name (SERV1_Publication here) and click Finish.



8.       The publication wizard should finish without any error.




9.       Next, repeat the same steps on RBTSERV2 and name that publication SERV2_Publication.
10.   Now right-click the publication SERV1_Publication and choose “New Subscriptions...”. In the subscription wizard, select SERV1_Publication as the publication.





11.   Select whether you want a push or a pull subscription. Here I have used push subscriptions.






12.   Now you select your subscriber name and the subscription database. Here I have used RBTSERV3 as subscriber and DBSubs as subscriber database.



13.   Configure the Distribution Agent security.



14.   Choose to run the Distribution Agent continuously.



15.   It is important not to initialize the subscription, as we will synchronize the old data manually.




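16.   Finally, finish the subscription wizard. Then repeat steps 10 to 16 for SERV2_Publication on RBTSERV2, using the same subscriber and subscription database.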



17.   Now if you check the Log Reader Agent status on RBTSERV1 or RBTSERV2, you should see it running fine.



18.   If you now check the subscriber table using the query below, it should return no data.
SELECT * FROM [DBSubs].[dbo].[CustomerOrders]
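
For readers who prefer scripts over the wizard, the choices made in the steps above map roughly onto the replication stored procedures sketched below. Treat this as a trimmed sketch, not the script the wizard generates: it assumes the Distributor is already configured for the publisher, that the agents run under the SQL Server Agent service account (as on my server), and it leaves out the Snapshot Agent job because no snapshot is applied in this setup. Scripting out your own publication from SSMS is the reliable way to confirm the exact parameters.

```sql
-- Run on RBTSERV1. Repeat on RBTSERV2 with DBPub2 and SERV2_Publication.
-- ASSUMPTION: distribution is already configured for this publisher.
USE DBPub1;
GO

-- Enable the database for transactional publishing.
EXEC sp_replicationdboption
    @dbname = N'DBPub1', @optname = N'publish', @value = N'true';
GO

-- Log Reader Agent; NULL login/password means "run as the Agent service account".
EXEC sp_addlogreader_agent
    @job_login = NULL, @job_password = NULL, @publisher_security_mode = 1;
GO

-- Step 3 and step 5: continuous transactional publication, no immediate snapshot.
EXEC sp_addpublication
    @publication       = N'SERV1_Publication',
    @repl_freq         = N'continuous',
    @status            = N'active',
    @allow_push        = N'true',
    @independent_agent = N'true',
    @immediate_sync    = N'false';
GO

-- Step 4: 'none' corresponds to "Keep existing object unchanged".
EXEC sp_addarticle
    @publication       = N'SERV1_Publication',
    @article           = N'CustomerOrders',
    @source_owner      = N'dbo',
    @source_object     = N'CustomerOrders',
    @destination_table = N'CustomerOrders',
    @type              = N'logbased',
    @pre_creation_cmd  = N'none';
GO

-- Steps 11, 12 and 15: push subscription that is not initialized from a snapshot.
EXEC sp_addsubscription
    @publication       = N'SERV1_Publication',
    @subscriber        = N'RBTSERV3',
    @destination_db    = N'DBSubs',
    @subscription_type = N'Push',
    @sync_type         = N'replication support only',
    @article           = N'all';
GO

-- Steps 13 and 14: Distribution Agent, Windows authentication, running continuously.
EXEC sp_addpushsubscription_agent
    @publication              = N'SERV1_Publication',
    @subscriber               = N'RBTSERV3',
    @subscriber_db            = N'DBSubs',
    @subscriber_security_mode = 1,
    @frequency_type           = 64;
GO
```

The two settings that make the multi-publisher design work are @pre_creation_cmd = N'none' and @sync_type = N'replication support only': neither publisher ever drops or reloads the shared subscriber table.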

Now let’s add some new records to both servers.

--RBTSERV1
INSERT INTO CustomerOrders VALUES (1,5,'notebooks',15,Getdate())
--RBTSERV2
INSERT INTO CustomerOrders VALUES (2,6,'erasers',150,Getdate())


19.   Finally, check that the new records from both stores have arrived in the subscriber table.



20.   Now you can sync the old records from both stores, either with a SQL query or with the Import and Export Wizard. If in future you need to reinitialize just one store, you can delete that store's data from the subscriber and manually sync it again from that store, using queries or the Import and Export Wizard.
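
One way the manual sync in the last step could look in T-SQL is sketched below. It assumes a linked server named RBTSERV1 has been created on RBTSERV3 and points at the first publisher; that linked server is an assumption for illustration and is not part of the setup above. It is safest to run this in a quiet window or with that store's Distribution Agent stopped, otherwise the agent and the script may try to insert the same row.

```sql
-- Run on RBTSERV3.
-- ASSUMPTION: a linked server called RBTSERV1 exists and allows data access.
USE DBSubs;
GO

-- 1) Backfill: copy only the rows the subscriber does not have yet.
INSERT INTO dbo.CustomerOrders (StoreID, OrderId, ItemName, Qty, OrderDate)
SELECT p.StoreID, p.OrderId, p.ItemName, p.Qty, p.OrderDate
FROM RBTSERV1.DBPub1.dbo.CustomerOrders AS p
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.CustomerOrders AS s
    WHERE s.StoreID = p.StoreID
      AND s.OrderId = p.OrderId);
GO

-- 2) Reinitialize store 1 only: stage the remote rows first, so the
--    delete-and-reload below is a purely local transaction.
SELECT StoreID, OrderId, ItemName, Qty, OrderDate
INTO #Store1
FROM RBTSERV1.DBPub1.dbo.CustomerOrders
WHERE StoreID = 1;

BEGIN TRANSACTION;
    -- Rows of other stores are untouched because StoreID is part of the key.
    DELETE FROM dbo.CustomerOrders WHERE StoreID = 1;

    INSERT INTO dbo.CustomerOrders (StoreID, OrderId, ItemName, Qty, OrderDate)
    SELECT StoreID, OrderId, ItemName, Qty, OrderDate
    FROM #Store1;
COMMIT TRANSACTION;

DROP TABLE #Store1;
GO
```

For store 2 the same pattern applies with DBPub2 and StoreID = 2.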


So that's that. Hope it helped. If anything in the article is unclear, feel free to contact us and we will surely help you. Call or WhatsApp - +91 997148322

email- support@redbushtechnologies.

Author- Suresh Kumar - A seasoned SQL DBA with more than 15 years of experience working with Fortune 500 MNCs.

Thank you
Team RedBush



Sunday, 17 December 2017

Which Indexes Are Required for a SQL Server Query?



Junior DBAs, developers and students are often unsure which index should be created when tuning a query. In my last 16 years as a SQL DBA, I have conducted more than 500 interviews at different levels, and I have seen that even very senior DBAs are at times unclear about which index will properly support a query.

So let’s go through a brief and clear explanation of which index improves the performance of a query, and why.
We will use the sample table and queries below:

CREATE TABLE [dbo].[Buildings](
       [Buildingid] [int] NOT NULL,
       [BuildingName] [varchar](50) NULL,
       [BuildingLocation] [varchar](50) NULL,
PRIMARY KEY CLUSTERED
(
       [Buildingid] ASC
)
)


--INSERT SOME RECORDS
INSERT INTO Buildings values (1,'Empire State','NYC')
INSERT INTO Buildings values (2,'Building2','NDLS')
INSERT INTO Buildings values (3,'Building3','NDLS')
INSERT INTO Buildings values (4,'Building4','Mumbai')
INSERT INTO Buildings values (5,'Building5','Mumbai')
INSERT INTO Buildings values (6,'Building6','Mumbai')
INSERT INTO Buildings values (7,'Building7','LA')
INSERT INTO Buildings values (8,'Building8','LA')
INSERT INTO Buildings values (9,'Building9','LAS')
INSERT INTO Buildings values (10,'Building10','LAS')

--Clustered Index Seek
select Buildingid FROM Buildings WHERE Buildingid=1




As you can see above, the WHERE clause is on the clustered index key and the SELECT list needs only that same column, so a Clustered Index Seek happens.
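
If you want to check the operator yourself without the graphical plan, one option is to ask SQL Server for the estimated plan as text. This is only a convenience sketch; the graphical plan in SSMS shows the same information.

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch, hence the GOs.
SET SHOWPLAN_TEXT ON;
GO

-- The query is compiled but not executed; the plan text is returned instead.
SELECT Buildingid FROM Buildings WHERE Buildingid = 1;
GO

SET SHOWPLAN_TEXT OFF;
GO
```

Look for the operator name, such as Clustered Index Seek, in the returned plan text. The same wrapper works for every query in this post.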


--Clustered Index Scan
select Buildingid FROM Buildings WHERE BuildingName='Empire State'



Now the WHERE clause is on BuildingName, which is not indexed, so the query optimizer has no way to seek to the matching rows. As shown above, the entire clustered index is scanned. NOTE that a clustered index is nothing but the table itself, sorted on some column (Buildingid here).

--Create a supporting Index
create index IDX_Buildings_BuildingName on Buildings(BuildingName)

--Scan Is converted to non clustered Index Seek
select Buildingid FROM Buildings WHERE BuildingName='Empire State'


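As you can see above, the scan is converted to a nonclustered Index Seek and the new index is used. No extra lookup is needed for Buildingid, because every nonclustered index also stores the clustered key.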


--A scan or Lookup

select BuildingLocation FROM Buildings WHERE BuildingName='Empire State'


The above query will give either a Clustered Index Scan or an Index Seek with a Key Lookup, depending on the amount of data. Here the table is tiny, so the optimizer decides a scan is cheaper. Why did this happen? Because BuildingLocation is not part of IDX_Buildings_BuildingName. The index can locate the row, but the optimizer still has to go back to the clustered index to fetch BuildingLocation, either with a lookup per matching row or by scanning the whole table. To do away with this, we need to create a covering index.
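
To see the Key Lookup alternative on this small table, one option is to force the nonclustered index with a table hint. The hint is used here purely as a learning aid, not as something to put in production code. Run it with “Include Actual Execution Plan” enabled in SSMS (Ctrl+M).

```sql
-- Forces the seek on IDX_Buildings_BuildingName. The plan should now show an
-- Index Seek on that index plus a Key Lookup into the clustered index, which
-- is needed to fetch BuildingLocation.
SELECT BuildingLocation
FROM Buildings WITH (INDEX(IDX_Buildings_BuildingName))
WHERE BuildingName = 'Empire State';
```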
--Covering Index
create index IDX_Buildings_BuildingName2 on Buildings(BuildingName) INCLUDE(BuildingLocation)
--A Perfect Seek now
select BuildingLocation FROM Buildings WHERE BuildingName='Empire State'


Now the index IDX_Buildings_BuildingName2 is used and a perfect seek happens, because the index covers every column the query needs.
Do we really need IDX_Buildings_BuildingName any more? Not really. Figure out why not.
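
As a hint for that exercise, it can help to list the key and included columns of every index on the table side by side. The catalog query below is a small sketch for doing that; compare the rows for the two nonclustered indexes.

```sql
-- Lists each index on dbo.Buildings with its key columns (key_ordinal > 0)
-- and its included columns (is_included_column = 1).
SELECT i.name  AS index_name,
       i.type_desc,
       c.name  AS column_name,
       ic.key_ordinal,
       ic.is_included_column
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
     ON ic.object_id = i.object_id
    AND ic.index_id  = i.index_id
JOIN sys.columns AS c
     ON c.object_id = ic.object_id
    AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.Buildings')
ORDER BY i.index_id, ic.key_ordinal, ic.index_column_id;
```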