Distribution Agent is blocking users on Subscriber while replicating INSERT, UPDATE, and DELETE statements.

Chris Skorlinski
Microsoft SQL Server Escalation Services

In the middle of a huge phased migration from SQL 2000 to SQL 2008 we noticed the SQL 2008 publisher\distributor was causing blocking when replicating data changes to the SQL 2000 subscriber.

We discovered the SQL Server 2005\2008\2012 Distribution Agent was configured to replicate as native SQL commands instead of default parameterized stored procedures.  This option is controlled by the @ins_cmd = N'SQL', @del_cmd = N'SQL', @upd_cmd = N'SQL'  settings for sp_addarticle.

delete from [dbo].[build_pricing] where [effort_code] = 'ALFKI'

When this is sent to the subscriber, the Distribution Agent converts this to sp_executesql such as following in attempt to optimize the execution.

exec sp_executesql N'delete from [dbo].[build_pricing] where [effort_code] = @P1 N'@P1 nvarchar(5)

By default, the Distribution Agent passed parameters are nvarchar() data types. If the subscriber data type is char() and not nvarchar(), the Query Optimizer will perform a SCAN and CONVERT rows on the table values to match the passed parameter value data type. Results is high IO hit, large number of locks, high memory usage, long blocking.

We also verified the high IO demands of a single row update or delete statement by examining the query plan.

1           1                  |--Parallelism(Gather Streams)
1           8                       |--Index Scan(OBJECT:([dbo].[build_pricing]….Convert([effort_code])….

This query plan shows 8 parallel streams (MaxDop = 8) was used to Scan (read) the Index while Converting char(5) [effort_code] data type to match the @P1 nvarchar(5)”.  For a single row update or delete and NO conversion the expected query plan would be an INDEX SEEK with 5 not 50,000 reads.

“where [effort_code] = @P1 N'@P1 nvarchar(5)”

If you enable Verbose History You may also see Distribution WRITE taking much longer then READS.

Time to Apply Cmds (ms)   : 1016331  Cmds/Sec            : 2.54 

Time to Retrieve Cmds (ms): 6269  Cmds/Sec            : 411.39   

Adding "-T 1" to the Distribution Agent Job Properties reverted the behavior to SQL 2000 which does not attempt to optimize the statements using sp_executesql, but instead submits them directly as SQL INSERT, UPDATE, and DELETE statements.  In testing the IO overhead dropped from 50K READS to 5, no latency, little or no blocking.

Another option would be to replicated the data using the default Replication Stored Procedures instead of SQL commands.

Verified sp_addarticle the STATUS=0 and CMD='SQL'
http://blogs.msdn.com/b/chrissk/archive/2009/05/08/troubleshooting-slow-distribution-agent-performance-in-sql-2005-2008.aspx