Divide and Conquer Transactional Replication using Tracer Tokens

Chris Skorlinski
Microsoft SQL Server Escalation Services

As we know, Transactional Replication moves data from the Publisher to the Distributor then to the Subscribers.  When working, data flows near real-time through both of these steps.  When problems occur which interrupt this flow, latency is introduced.  The challenge is to find which step is contributing to the overall latency.  Tracer Tokens help you determine of the latency problem is 1) Publisher to Distributor, or 2) Distributor to Subscriber.

Trace Tokens

Available from Replication Monitor or via TSQL statements, Tracer Tokens are special timestamp transactions written to the Publisher’s Transaction Log and picked up by the Log Reader.  They are then read by the Distribution Agent and written to the Subscriber.  Timestamps for each step are recorded in tracking tables in the Distribution Database and can be displayed in Replication Monitor or via TSQL statements.  Lets take a look.

From SQL Server Management Studio, right-click Replication, then Launch Replication Monitor

image

In Replication Monitor select the Transactional Replication publication, then “Tracer Tokens”.  To insert new Token click “Insert Tracer”.  When Tracer Token is inserted Replication Monitor indicates “Pending…”

image

When Log Reader picks up Token it records time in MStracer_tokens table in the Distribution database.  The Distribution Agent then picks up the Token and records Subscriber(s) write time in the MStracer_history tables also in the Distribution database.

Examples Tracer Token Readings

In this example latency is running about 6 seconds.  Writes to the subscribers are 1 to 2 seconds however its taking 5 seconds for the Tracer Token to move from the Publisher to the Distributor.  Focus of the Transactional Latency clearly should start with the Log Reader and not with the Distributors. 

image

In this example it took 3 seconds for the Token to move to the Distributor, 1 seconds to Subscriber 1 and 4 seconds to Subscribers 2.  While Log Readers not as fast as expected, improving latency to Subscriber 2 will cut in half overall latency.

image

TSQL Commands to explore Tracer Tokens

The TSQL command below sp_posttracertoken can be used to POST a Tracer Token.  This command can be executed in SQL Server Management Studio or for running historical reference create a SQL Agent Job and post a Tracer Token at scheduled interval.

--For a historical perspective create

--a SQL Agent JOB to insert a new

--Tracer Token in the publication database.

USE [AdventureWorksLT]

Go

EXEC sys.sp_posttracertoken @publication = 'TranProducts'

Go

--Token Tracking Tables

USE Distribution

Go

 

--publisher_commit

SELECT Top 20 * FROM MStracer_tokens

Order by tracer_id desc

 

--subscriber_commit

SELECT Top 20 * FROM MStracer_history

Order by parent_tracer_id desc

 

“Publisher to Distributor” (Log Reader) still showing “Pending”. 

“Null” in Tracer Token tables or “Pending” in Replication Monitor indicates Log Reader is not running or there are so many un-replicated transactions the Log Reader has yet to read the Tracer Token from the Transaction Log.  First, verify the Log Reader is running and no errors have been reported in the Log Reader History.

Using Replication Monitor selecting Agents we clearly see the Log Reader Agent is not running.  Last Action shows it didn’t fail, but was Stopped.

image

Right-click the Log Reader Agent to “View Details”.  Agent statistics along with errors encountered will be displayed in the “Details”.  Note the “Stop Agent” is grayed out and “Start Agent” is selectable.  Again confirming the Log Reader Agent is stopped.  This menu also provides access to the Agent Profiles and individual command line properties for customizing the Log Reader Agent settings.

image

Tracer Token Tracking Tables

You can also cross verify by looking at the last updates recorded in the MSlogreader_history table.  If you have more then one Publication include a WHERE clause for the correct agent_id.

--Check Log Reader Agents Table for correct Agent_id

SELECT id, publisher_db

FROM distribution.dbo.MSlogreader_agents

 

id          publisher_db

----------- ---------------------------------------------

1           AdventureWorksLT

 

 

--Check Agent History Table for last updates

SELECT top 10 agent_id, start_time,comments

FROM distribution.dbo.MSlogreader_history

ORDER BY time desc

 

agent_id    start_time              comments

----------- ----------------------- ---------------------------------------------

1           2010-03-07 12:46:21.503 The process was successfully stopped.

1           2010-03-07 12:46:21.503 No replicated transactions are available.

Log Reader Latency

If you see NULL showing, again this indicates the Log Reader has not yet read the Tracer Token in the published database’s transaction log. If you’ve verify the Log Reader Agent IS running but Tracer Token are still not being picked up, check out ReplTalk blog posting on Troubleshooting Replication Conversations for additional troubleshooting steps.

--publisher_commit

SELECT Top 20 * FROM MStracer_tokens

Order by tracer_id desc

 

tracer_id   publication_id publisher_commit        distributor_commit

----------- -------------- ----------------------- -----------------------

-2147483431 1              2010-03-10 20:08:00.400 NULL

 

Subscriber Offline

As this screenshot shows, even without Tracer Token I can see the Distribution Agents have either been stopped or the subscribers have been offline for a long time.  Next steps would be 1) Verify Distribution Agents are running, 2) Check Agent History in Replication Monitor for any errors, 3) Confirm Subscribers are up and running and you can connect.

image

Subscriber Latency

Tracer Token tables can help explore subscriber latency.  Queries below show end-to-end timestamps for Tracer Tokens.  The script and output look best if pasted into SQL Server Management Studio.

--End to End time

SELECT publication_id, agent_id,

       t.publisher_commit, t.distributor_commit, h.subscriber_commit

FROM MStracer_tokens t

JOIN MStracer_history h

ON t.tracer_id = h.parent_tracer_id

ORDER BY t.publisher_commit DESC

 

 

--Historical Breakdown (in seconds)

SELECT publication_id 'PubID', agent_id 'AgentID',convert(smalldatetime,t.publisher_commit) 'PubCommit',

       convert(smalldatetime,t.distributor_commit) 'DistCommit',

       convert(smalldatetime,h.subscriber_commit) 'SubCommit',

       Datediff(s,t.publisher_commit,t.distributor_commit) as 'Dist(sec)',

       Datediff(s,t.distributor_commit,h.subscriber_commit) as 'Sub(sec)'

FROM MStracer_tokens t

JOIN MStracer_history h

ON t.tracer_id = h.parent_tracer_id

ORDER BY t.publisher_commit DESC

 

/*

--Slow LogReader

PubID       AgentID     PubCommit               DistCommit              SubCommit               Dist(sec)   Sub(sec)

----------- ----------- ----------------------- ----------------------- ----------------------- ----------- -----------

1           3           2010-02-25 20:38:00     2010-02-25 20:53:00     2010-02-25 20:54:00     902         5

1           4           2010-02-25 20:38:00     2010-02-25 20:53:00     2010-02-25 20:54:00     902         7

1           3           2010-02-25 20:46:00     2010-02-25 20:53:00     2010-02-25 20:54:00     427         5

1           4           2010-02-25 20:46:00     2010-02-25 20:53:00     2010-02-25 20:54:00     427         7

1           3           2010-02-25 20:46:00     2010-02-25 20:53:00     2010-02-25 20:54:00     425         5

1           4           2010-02-25 20:46:00     2010-02-25 20:53:00     2010-02-25 20:54:00     425         7

 

 

--Slow Distributor on 1 subscriber

PubID       AgentID     PubCommit               DistCommit              SubCommit               Dist(sec)   Sub(sec)

----------- ----------- ----------------------- ----------------------- ----------------------- ----------- -----------

1           3           2010-03-04 17:10:00     2010-03-04 17:10:00     2010-03-04 17:11:00     1           42

1           4           2010-03-04 17:10:00     2010-03-04 17:10:00     2010-03-04 17:11:00     1           4

1           3           2010-03-04 17:28:00     2010-03-04 17:28:00     2010-03-04 17:29:00     6           36

1           4           2010-03-04 17:28:00     2010-03-04 17:28:00     2010-03-04 17:28:00     6           3

1           3           2010-03-04 17:30:00     2010-03-04 17:30:00     2010-03-04 17:31:00     5           33

1           4           2010-03-04 17:30:00     2010-03-04 17:30:00     2010-03-04 17:30:00     5           2

 

*/

 

--By Agent_id

SELECT publication_id 'PubID', agent_id 'AgentID',convert(smalldatetime,t.publisher_commit) 'PubCommit',

       convert(smalldatetime,t.distributor_commit) 'DistCommit',

       convert(smalldatetime,h.subscriber_commit) 'SubCommit',

       Datediff(s,t.publisher_commit,t.distributor_commit) as 'Dist(sec)',

       Datediff(s,t.distributor_commit,h.subscriber_commit) as 'Sub(sec)'

FROM MStracer_tokens t

JOIN MStracer_history h

ON t.tracer_id = h.parent_tracer_id

ORDER BY agent_id,t.publisher_commit DESC

 

/*

PubID       AgentID     PubCommit               DistCommit              SubCommit               Dist(sec)   Sub(sec)

----------- ----------- ----------------------- ----------------------- ----------------------- ----------- -----------

1           3           2010-03-03 09:38:00     2010-03-03 09:38:00     2010-03-03 09:38:00     2           3

1           3           2010-03-03 09:39:00     2010-03-03 09:39:00     2010-03-03 09:39:00     5           3

1           3           2010-03-04 16:43:00     2010-03-04 16:43:00     2010-03-04 16:43:00     3           1

1           3           2010-03-04 17:10:00     2010-03-04 17:10:00     2010-03-04 17:11:00     1           42

1           3           2010-03-04 17:28:00     2010-03-04 17:28:00     2010-03-04 17:29:00     6           36

1           3           2010-03-04 17:30:00     2010-03-04 17:30:00     2010-03-04 17:31:00     5           33

1           3           2010-03-04 17:32:00     2010-03-04 17:32:00     2010-03-04 17:33:00     2           36

*/

 

---Find worst performing subscribers.

SELECT

convert(varchar(10),agent_id) as 'agent id',

max(Datediff(s,distributor_commit,subscriber_commit)) as 'MAXTime To Sub (sec)',

avg(Datediff(s,distributor_commit,subscriber_commit)) as 'AVG Time To Sub (sec)'

FROM MStracer_tokens t

JOIN MStracer_history h

ON t.tracer_id = h.parent_tracer_id

group by agent_id

order by 2 desc

 

/*

agent id   MAXTime To Sub (sec) AVG Time To Sub (sec)

---------- -------------------- ---------------------

3          57391                21372

4          286                  9

*/

 

--Well, so who is Agent 3?

SELECT id, name, publisher_db, publication

FROM distribution.dbo.MSdistribution_agents

 

/*

id    name                                          publisher_db          publication

----- --------------------------------------------- --------------------- ------------

3     CHRISSKACER-AdventureWorksLT-TranProducts-CH  AdventureWorksLT      TranProducts

4     CHRISSKACER-AdventureWorksLT-TranProducts-CH  AdventureWorksLT      TranProducts

 

*/

 

--Use “name” from Agents to retrieve Job Properties

--     command line to get Subscriber Name

SELECT sjs.step_name, sjs.command from msdb.dbo.sysjobsteps sjs

join msdb.dbo.sysjobs sj

on sj.job_id = sjs.job_id

Where name = 'CHRISSKACER-AdventureWorksLT-TranProducts-CHRISSKACER-3'

 

/*

step_name                               command

--------------------------------------- -------------------------------------------

Distribution Agent startup message.      sp_MSadd_distribution_history @perfmon_i

Run agent.                               -Subscriber [CHRISSKACER]

   -SubscriberDB [AdventureWorks_Tran

Detect nonlogged agent shutdown.         sp_MSdetect_nonlogged_shutdown @subsystem =

 

(3 row(s) affected)

*/

 

Divide and Conquer

Know where the latency problem is occurring allows you to focus on the real problem.  Tracer Token can help divide Transactional Replication latency into Log Reader Agent problem v. Distribution Agent Problems.  Check out ReplTalk blog posting on Troubleshooting Replication Conversations for additional Log Reader and Distributor troubleshooting steps.