Notes on Log Reader Parameters MaxCmdsInTran and ReadBatchSize

Jose Moreira B. Neto
Microsoft SQL Server Escalation Services

From SQL Server BOL we have:

-ReadBatchSize

Is the maximum number of transactions read out of the transaction log of the publishing database per processing cycle, with a default of 500.

-MaxCmdsInTran

Specifies the maximum number of statements grouped into a transaction as the Log Reader writes commands to the distribution database. Using this parameter allows the Log Reader Agent and Distribution Agent to divide large transactions (consisting of many commands) at the Publisher into several smaller transactions when applying commands at the Subscriber. Specifying this parameter can reduce contention at the Distributor and reduce latency between the Publisher and Subscriber. Because the original transaction is applied in smaller units, the Subscriber can access rows of a large logical Publisher transaction prior to the end of the original transaction, breaking strict transactional atomicity. The default is 0, which preserves the transaction boundaries of the Publisher. This parameter does not apply to Oracle Publishers.

What you should take into consideration:

MaxCmdsInTran was offered to customers as a way to decrease latency from the publisher to subscriber for very large transactions. It does this by breaking transactional consistency, and breaking up a single large transaction into smaller transactions. You may see poor logreader performance with low settings of MaxCmdsInTran because it causes the logreader to deliver less data into the distribution database per log harvesting “batch”. In other words, there is some fixed overhead cost of delivering a batch, so with respect to throughput, you probably want your batch size to be as large as possible. Lower MaxCmdsInTran values will increase latency.

A larger batch though can cause blocking, timeouts and increased transaction log size on the subscriber. Many customers work around blocking and deadlocks on the subscriber due to the locks held by the extended transaction when the change is replicated to the subscriber by implementing MaxCmdsInTran.

MaxCmdsInTran was never designed to be always turned on. It exists in the product to work around cases where someone accidentally performed a large number of DML operations in a single transaction (causing delay in distribution of commands until the entire transaction is in distribution database, locks being held, etc.). If the customer is routinely falling into this situation, they should serious look at their applications and ways to reduce transaction size.

Irrespective of ReadBatchSize, if you set a low value for MaxCmdsInTran the log reader needs to write small chunks to the distribution db and that will cause latency compared to writing one big chunk at once

Readbatchsize determines how many transactions to harvest from the transaction log per processing cycle. MaxCmdsInTran determines the maximum number of commands on a single transaction, possibly breaking transaction ACID. Suppose readbatch size is 10, and each transaction has 1,000 commands, in this example the log reader will harvest 10,000 commands contained in 10 transactions. These 10,000 commands will be applied to the subscriber as 10 individual transactions. If MaxCmdsInTran is also added to the agent profile and set to 500, there will now be 20 transactions to be applied to the subscriber, each with 500 commands, hence breaking the original transaction scope.

Summary

I would recommend not using maxcmdsintran at all since it was designed to be used temporarily for a short time in rare cases. If you see logreader going slow, try to modify your app to reduce transaction size, if transaction size is small you can then use ReadBatchSize to speed up the log reader.