Welcome to MSDN Blogs Sign in | Join | Help

TCP Chimney Offload – Possible Performance and Concurrency Impacts to SQL Server Workloads

TCP Chimney is enabled by default if you apply Windows Server 2003 Sp2.  This is an operating system feature that provides capability to offload TCP/IP packet processing from the processor to the network adapters and some other balancing options.  (For a full description of this feature see http://support.microsoft.com/kb/912222.)

TCP Chimney has been known to cause issues on SQL Server systems such as general network errors and working set trimming.  The following articles document these known issues:

http://support.microsoft.com/kb/942861

http://support.microsoft.com/kb/918483

We’ve also identified situations where TCP Chimney has impacted transaction throughput and caused delays between when a statement has been completed by the SQL engine and the time to receive the begin event of the next statement.  This impact can be significant especially in application workloads that have throughput requirements to execute a series of statements within a certain time boundary. 

For example, your application has a key transaction that consists of multiple statements.  Each individual statement on the engine side is optimized and has very short duration.  The overall duration of the transaction is short because each statement has low duration and the time in between the batches is short as well.  A profiler trace of this transaction typically shows a pattern like the following.  Note that there is very short time in between the complete of one batch and the start of the next batch:

image

However with TCP Chimney enabled, you notice there is a marked delay between a batch completed and the start of the next batch for the exact same series of statements and work.  In this example, note how there is approximately a 500 ms. delay in between the complete and start of the next batch:

image

In this scenario with the 500 ms. delay in between statements you would see the SPID spend most of its time awaiting command in sys.sysprocesses with a waittype of 0x000.

This type of delay can affect application throughput as well as concurrency.  For example if the above statements are all encompassed in an implicit transaction, with the added delay the overall duration of the implicit transaction is significantly increased, locks would then be held longer than normal and you may see unexpected blocking.  If you do a comparison test of the same implicit transaction between two systems, one with TCP Chimney enabled and the other with TCP Chimney disabled and you compare the sum of the duration of the individual statements vs. the total duration of the entire transaction, you may see that the overall transaction is significantly increased when TCP Chimney is enabled.  With TCP Chimney enabled, the delta between the sum of the statement duration from the overall transaction duration shows that the majority of time is spent awaiting the next batch/command. 

Here is an example comparison of the same workload with TCP Chimney enabled and disabled.  Note the significant increase in transaction duration and the large delta (difference between transaction duration vs. the sum duration of all statements within transaction) when TCP Chimney is enabled:

Implicit Transaction Summary TCP Chimney Enabled

spid    TransactionID  TranStart     TranEnd       TranDuration  sum_batch_duration   batch_count    delta
------- -------------- ------------- ------------  ------------- -------------------- -------------- --------
57      916972         09:40:24.450  09:41:17.623  53173         601                  516            52572
57      896243         09:39:31.620  09:40:01.840  30220         322                  301            29898
57      877227         09:39:12.120  09:39:15.293  3173          306                  161            2867
57      876313         09:38:58.590  09:38:58.603  13            0                    1              13
57      895388         09:39:18.510  09:39:18.527  16            16                   4              0
57      915675         09:40:02.653  09:40:02.670  16            16                   4              0

Implicit Transaction Summary TCP Chimney Disabled

spid    TransactionID  TranStart     TranEnd       TranDuration  sum_batch_duration   batch_count    delta
------- -------------- ------------  ------------  ------------- -------------------- -------------- --------
54      127910         11:13:47.287  11:13:52.490  5203          4060                 516            1143
54      107344         11:13:23.380  11:13:24.427  1046          382                  301            664
51      87187          11:12:50.067  11:12:50.550  483           0                    1              483
54      88182          11:13:03.987  11:13:07.237  3250          2878                 161            372
51      106432         11:13:10.487  11:13:10.487  0             0                    1              0
54      126550         11:13:25.490  11:13:26.007  516           516                  4              0

 

If you observe a similar pattern and suspect TCP Chimney, you may want to disable TCP Chimney to provide immediate relief.  Another option is to follow up with your network adapter vendor to see if they have an updated driver that will address the problem and allow for use of TCP Chimney.  For additional information see  http://support.microsoft.com/default.aspx?scid=kb;EN-US;948496

 

TCP Chimney is off by default in Windows Server 2008 - see http://support.microsoft.com/kb/951037.

Sarah Henwood | Microsoft SQL Server Escalation Services

Published Wednesday, October 01, 2008 5:06 PM by psssql

Comments

# re: Windows Scalable Networking Pack – Possible Performance and Concurrency Impacts to SQL Server Workloads

I appreciate the information on the impact of SNP on SQL Server, and the desire to disable SNP to avoid the impact as a short-term work-around.

I would assume that a feature which offloads network processing workload from the system processor(s) to the network adapter(s) would result in greater capacity and better performance of the main system workload.  Evidently, the specific implementation has flaws that block this goal.  Are there resolutions in process to correct these defects and allow the desired gains – and in what time frame?  None were mentioned in the article, but the question begs to be asked.

Since the feature leverages capabilities within the OS and within specific network adapter(s), it is possible that the defect is in the network adapter (a hardware vendor responsibility), the OS (a Microsoft responsibility), or both.

I am interested in hearing more information on the root cause(s) and possible resolution(s) for this issue.

Thanks,

Scott R.

Friday, October 03, 2008 12:55 PM by ScottR007

# 安装Windows Server 2003 SP2可能会影响SQL Server 性能

ScalableNetworkingPack(常见问题详细描述)在Server2003安装SP2后会默认被打开,打开此功能可以把对网络数据包的处理从CPU分担一部分到网卡上,但这可能会引起S...

Sunday, October 05, 2008 5:02 AM by 付博

# performance issues with SQL Server queries on Windows 2003 SP2

By now most of us are using Windows 2003 SP2 unless there are strict application requirements and your

Saturday, October 25, 2008 3:53 PM by Saurabh Srivastava
Anonymous comments are disabled
 
Page view tracker