Troubleshooting Transactional Replication - PART 1

“Repl Talk”

--Chris Skorlinski
--Microsoft SQL Server Escalation Services

This new “Repl Talk” series of posting will cover troubleshooting Microsoft SQL Server Transactional Replication.  I’ll cover everything from setting up Tran Repl through troubleshooting common problems.

Most of the material comes from presentation and training I deliver to both internal SQL Support teams and to external Microsoft customers. My intent is for you to use this material as a step-by-step for learning Transactional Replication. All script I use will be posted in the blog.  Along with the scripts will be key “take a ways” which you can use while planning and troubleshooting Tran Replication.

Future sessions will include Peer-2-Peer and Merge Replication.

Now lets begin.

Transactional Replication Overview

Some of this content is taken from SQL BOOK ONLINE (BOL).  Sound fair since I helped write SQL BOL, now don’t blame me for it all, I didn’t write that much.

Transactional replication typically starts with a snapshot of the publication database objects and data. As soon as the initial snapshot is taken, subsequent data changes and schema modifications made at the Publisher are usually delivered to the Subscriber as they occur (in near real time). The data changes are applied to the Subscriber in the same order and within the same transaction boundaries as they occurred at the Publisher; therefore, within a publication, transactional consistency is guaranteed.

Transactional replication is typically used in server-to-server environments and is appropriate in each of the following cases:

  • You want incremental changes to be propagated to Subscribers as they occur.
  • The application requires low latency between the time changes are made at the Publisher and the changes arrive at the Subscriber.
  • The application requires access to intermediate data states. For example, if a row changes five times, transactional replication allows an application to respond to each change (such as firing a trigger), not simply the net data change to the row.
  • The Publisher has a very high volume of insert, update, and delete activity.
  • The Publisher or Subscriber is a non-SQL Server database, such as Oracle.

By default, Subscribers to transactional publications should be treated as read-only, because changes are not propagated back to the Publisher.

 

image

 
Transactional Replication Workflow

image

 

 

As I mentioned above, there are 3 Transactional Replication Roles to move data from one Microsoft SQL Server to another Microsoft SQL Server. These are the Publisher, Distributor, and Subscriber.  Actually, out-of-the-box we also support Moving from Oracle to SQL and SQL to Oracle, but I’ll not be covering that here.

To move the data SQL Server includes 2 Executables; LogRead.exe and Distrib.exe.  When troubleshooting Transactional Replication we break down the work of LogRead and Distrib each into there respective Reader Threads and Writer Threads.  This help isolate where we need to investigate.

These executable call Stored Procedures to accomplish work and record status in “system tables”. When we’re troubleshooting performance , we look to see how long certain SPs are taking to run. When error, we trace SP execution and compare SP log to data in replication tracking table.

Combination of Profile Trace and “meta data” from Replication “system tables” often provide the clues we need to understand what is occurring under-the-covers.

Once in a while we need to debug .EXE execution. 

We’ll take a closer look at different ways to configure Replication roles.

 

Replication Roles

image

There are many ways we can implement these roles for Transactional Replication.  Which configuration you select depends on your business needs.

We can have ALL roles on same SQL Server, but with different databases for each role.  Each DB/Table can have own set of indexes.  This isolates the work needed in one database vs. another database.  For example, the Publisher could have minimal indexes optimized for online web application and the Subscribe indexes for daily, weekly, or monthly reporting.

Another role would be as an ARCHIVE or AUDIT server.  You can customize Replication SPs to commend out DELETE commands and thereby only replicate INSERT/UPDATES.

It is common to the see combination of roles on the same SQL Server such as Pub/Dist or Dist/Sub.  If you combine roles, created databases on different physical drives (or LUNS) to spread the IO demands.

As you work with Replication you’ll near the terms PUSH or PULL.  Just remember, the only real difference between PUSH and PULL is on which server the EXE executes.  For example, the Distribution Server could be running PULL LogRead.exe and PUSH Distrib.exe on save server.  But be careful, depending on your setup you could have too many executables running on the same computer.  For example, 100 database PUSHING each to 20 subscribers would by default have 2000 DISTRIB.exe(s) running on one Distributor.

If you would like more information on terms used, most can be found in the SQL Server Books Online or by searching MSDN ( http://msdn.microsoft.com/en-us/default.aspx ).

Initial Snapshot

Part-2 will cover setting up Transactional Replication using a SNAPSHOT.  For this and all other walk-through scripts you’ll need to download and install the AdventureWorks “light” sample database.  The download path may change so just search WEB for AdventureWorksLT.msi and I’m sure you’ll find the link.