Network Load Balancing in R2: Using ETW Tracing


Hi,

 

We are going to talk about a new feature in Windows Server 2008 R2 for Network Load Balancing (NLB): ETW tracing.  This functionality provides a mechanism for tracing why NLB has decided to drop or accept a given network packet.

 

This blog provides the following information on the new ETW tracing for NLB delivered in Windows Server 2008 R2.

 

·         Overview of ETW Tracing

·         How to Set Up ETW Tracing

·         How to Enable, Disable and View the Traces

·         How to Uninstall the NLB Tracing Manifest

·         Example of Tracing Output

 

Details on how to interpret the results and use them for advanced debugging purposes will be covered in future blog posts.

 

Overview of ETW Tracing

 

ETW is best described by the following MSDN article:

 

Event Tracing for Windows (ETW) is a general-purpose, high-speed tracing facility provided by the operating system. Using a buffering and logging mechanism implemented in the kernel, ETW provides a tracing mechanism for events raised by both user-mode applications and kernel-mode device drivers. Additionally, ETW gives you the ability to enable and disable logging dynamically, making it easy to perform detailed tracing in production environments without requiring reboots or application restarts.

 

NLB leverages this infrastructure to give end users more detailed information about why packets are accepted or rejected by NLB.  While ETW is designed and implemented with performance in mind, be aware that these logs consume storage space.  For example, a server handling 100 connections per second could generate many MB of trace data in less than a minute because of the level of detail recorded, so keep this in mind if you plan to run ETW tracing for extended periods of time.  The command-line section below shows how to find out where the ETW log file is located, so that you can delete it when you are done debugging.
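To get a feel for how quickly the trace file grows on a particular server, the file size can be sampled twice and converted into a rate. The following is a minimal Python sketch, not part of the NLB tooling. It assumes the log path that `wevtutil gl` reports in the command-line section of this post; the function names and the fallback to C:\Windows are illustrative assumptions. ETW buffers events before flushing them to disk, so treat the result as a rough indication only.

```python
import os
import time

# Log path as reported by `wevtutil gl` in this post. The C:\Windows fallback
# is only an assumption for when SystemRoot is not set.
DEFAULT_LOG = os.path.join(
    os.environ.get("SystemRoot", r"C:\Windows"),
    "System32", "Winevt", "Logs",
    "Microsoft-Windows-NLB%4Diagnostic.etl",
)


def growth_mb_per_minute(size_before, size_after, seconds):
    """Convert two file-size samples (in bytes) into a rate in MB per minute."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (size_after - size_before) / 1_000_000 / seconds * 60


def sample_growth(path=DEFAULT_LOG, seconds=10):
    """Hypothetical helper: read the log size twice, `seconds` apart."""
    before = os.path.getsize(path)
    time.sleep(seconds)
    after = os.path.getsize(path)
    return growth_mb_per_minute(before, after, seconds)
```

Calling `sample_growth()` while tracing is enabled and the scenario is running gives an approximate MB-per-minute figure that can be used to decide how long it is safe to leave tracing on.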

 

How to Set Up ETW Tracing

 

You can find a manifest file here: http://blogs.msdn.com/clustering/pages/9944942.aspx.  The text on that page should be saved as networkloadbalancing-core-diagnostic.events.man and copied to your NLB cluster nodes.  

 

Important: this is an unsupported script; please use it at your own risk.  Microsoft’s Customer Support Services (CSS/PSS) will not support issues associated with this script.

 

Then run the following command from the directory where you copied the manifest file:

 

> wevtutil im networkloadbalancing-core-diagnostic.events.man

 

Note that this needs to be done from an elevated console window.  The above command only registers the NLB manifest.  Traces are not yet being collected; the following sections describe how to start collecting them.
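One way to confirm that the registration worked is to check whether the NLB channel now appears in the list that `wevtutil el` (enumerate logs) prints. The Python sketch below illustrates this; the helper names are hypothetical, and the channel name is the one used in the command-line section of this post.

```python
import subprocess

CHANNEL = "Microsoft-Windows-NLB/Diagnostic"


def channel_listed(enum_output, channel=CHANNEL):
    """Return True if `channel` appears as its own line in `wevtutil el` output."""
    wanted = channel.lower()
    return any(line.strip().lower() == wanted
               for line in enum_output.splitlines())


def is_manifest_registered(channel=CHANNEL):
    """Run `wevtutil el` (Windows only) and look for the NLB channel."""
    result = subprocess.run(["wevtutil", "el"],
                            capture_output=True, text=True, check=True)
    return channel_listed(result.stdout, channel)
```

If `is_manifest_registered()` returns False on a cluster node, the `wevtutil im` step probably did not complete, for example because the console was not elevated.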

 

How to Enable, Disable and View the Traces

 

On the NLB cluster node you can collect traces through the UI or the command line.

 

UI (Event Viewer – eventvwr.msc)

 

·         Enable Analytic and Debug Logs (one time)

o   Make sure you’ve installed the manifest

o   Click the “View” menu and select “Show Analytic and Debug Logs”

 

 

 

·         To start tracing

o   Navigate to the channel: Event Viewer\Applications and Services Logs\Microsoft\Windows\NLB\Diagnostics.  Right click on the channel and select “Enable Log”.

 

 

 

·         Run your scenario

 

·         To stop and view collected events

o   Navigate to the channel. Right click on the channel and select Disable Log. You will now see events show up in the list.

 

 

 

·         At this point you should see the NLB ETW tracing in the Diagnostics pane in the middle of the screen.

Command Line (Event Viewer - wevtutil.exe)

 

·         To see provider information:

 

> wevtutil gl Microsoft-Windows-NLB/Diagnostic

 

 

 

The output of this command tells us that the ETW trace file being generated is stored at:

%SystemRoot%\System32\Winevt\Logs\Microsoft-Windows-NLB%4Diagnostic.etl

 

·         To see statistics:

> wevtutil gli Microsoft-Windows-NLB/Diagnostic

 

·         To start tracing, use:

> wevtutil sl Microsoft-Windows-NLB/Diagnostic /e:true /q

 

·         To stop tracing, use:

> wevtutil sl Microsoft-Windows-NLB/Diagnostic /e:false /q

 

·         To view events as a text file, first stop tracing and then use:

> wevtutil qe Microsoft-Windows-NLB/Diagnostic /f:text > events.txt
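The start, stop, and export commands above can be chained so that tracing is always switched off again, even if the scenario fails. The following Python sketch wraps the same three `wevtutil` invocations shown in this section. The function names and the `scenario` callback are hypothetical, and the script would need to run from an elevated console on the NLB node.

```python
import subprocess

CHANNEL = "Microsoft-Windows-NLB/Diagnostic"


def enable_cmd(channel=CHANNEL):
    """Command line that starts tracing on the channel."""
    return ["wevtutil", "sl", channel, "/e:true", "/q"]


def disable_cmd(channel=CHANNEL):
    """Command line that stops tracing on the channel."""
    return ["wevtutil", "sl", channel, "/e:false", "/q"]


def export_cmd(channel=CHANNEL):
    """Command line that prints the collected events as text."""
    return ["wevtutil", "qe", channel, "/f:text"]


def trace(scenario, out_path="events.txt", channel=CHANNEL):
    """Enable tracing, run `scenario()`, disable tracing, then export events."""
    subprocess.run(enable_cmd(channel), check=True)
    try:
        scenario()
    finally:
        # Always stop tracing so the log does not keep growing.
        subprocess.run(disable_cmd(channel), check=True)
    with open(out_path, "wb") as out:
        subprocess.run(export_cmd(channel), stdout=out, check=True)
```

Passing the arguments as a list avoids shell quoting issues, and the `finally` block reflects the storage concern mentioned earlier: the channel is disabled before the events are exported.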

 

How to Uninstall the NLB Tracing Manifest

 

From an elevated console window, run:

 

> wevtutil um networkloadbalancing-core-diagnostic.events.man

 

 

 

 

 

Example of Tracing Output

 

 

Node1: Node 1 accepted the connection

1.  Log Name:      Microsoft-Windows-NLB/Diagnostic

2.  Source:        Microsoft-Windows-NLB-Diagnostic

3.  Date:          10/30/2009 2:36:50 PM

4.  Event ID:      1

5.  Task Category: Filtering Receive Accept

6.  Level:         Information

7.  Keywords:      Accept,Receive,Filtering,NLB

8.  User:          N/A

9.  Computer:      G10C3N8X64N2.ctdev.nttest.microsoft.com

10. Description:

NLB cluster on interface {10000000-0000-0006-7b00-310030003000} received traffic from 10.30.4.198:63691 destined to 10.30.4.157:5001 [protocol: TCP (0x0), flags: 0x2]. This cluster node will accept this traffic (reason: Unconditional Ownership). Source port 63691, destination port 5001, and protocol TCP have been used for the accept/drop decision.

 

In the above trace from Node1, we see that the connection we defined in our user scenario is being accepted (line 5). The event ID “1” (line 4) indicates that this event pertains to an “Accepted Connection”.  The reason in the description (reason: Unconditional Ownership) shows that this connection was accepted because the packet was “unconditionally owned” by the current node.  We will see more reasons in future blog posts regarding debugging NLB with ETW tracing.
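When many such events have been exported to text, it can help to pull the endpoints, the decision, and the reason out of each description automatically. The Python sketch below does this with a regular expression. It assumes the description wording matches the sample above and that the addresses are IPv4; the pattern and function name are illustrative, not an official format definition.

```python
import re

# Pattern derived from the sample description in this post (IPv4 only).
DESCRIPTION_RE = re.compile(
    r"received traffic from (?P<src_ip>[\d.]+):(?P<src_port>\d+) "
    r"destined to (?P<dst_ip>[\d.]+):(?P<dst_port>\d+) "
    r"\[protocol: (?P<protocol>\w+) .*?\]\. "
    r"This cluster node will (?P<decision>accept|drop) this traffic "
    r"\(reason: (?P<reason>[^)]+)\)"
)


def parse_description(text):
    """Return a dict of fields from an NLB event description, or None."""
    match = DESCRIPTION_RE.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    # Ports are more useful as integers when sorting or filtering.
    fields["src_port"] = int(fields["src_port"])
    fields["dst_port"] = int(fields["dst_port"])
    return fields
```

For the Node1 description above, the returned dictionary would carry the decision "accept" and the reason "Unconditional Ownership", which makes it straightforward to group events by reason across nodes.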

 

 

Node2: Node 2 rejected the connection

1.  Log Name:      Microsoft-Windows-NLB/Diagnostic

2.  Source:        Microsoft-Windows-NLB-Diagnostic

3.  Date:          10/30/2009 2:36:50 PM

4.  Event ID:      2

5.  Task Category: Filtering Receive Drop

6.  Level:         Information

7.  Keywords:      Drop,Receive,Filtering,NLB

8.  User:          N/A

9.  Computer:      G10C3N8X64N1.ctdev.nttest.microsoft.com

10. Description:

NLB cluster on interface {10000000-0000-0006-7b00-300036004200} received traffic from 10.30.4.198:63691 destined to 10.30.4.157:5001 [protocol: TCP (0x0), flags: 0x2]. This cluster node will drop this traffic (reason: Owned Elsewhere). Source port 63691, destination port 5001, and protocol TCP have been used for the accept/drop decision.

 

Similarly, Node2 has rejected this packet, and the reason in the description shows that the packet was “Owned Elsewhere” (by Node1, as shown above).  The event ID “2” (line 4) can be used in Event Viewer to filter for only dropped packets.
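The same filtering by event ID can be done from the command line by giving `wevtutil qe` an XPath query through its `/q:` option. The Python sketch below builds such a command. The event IDs 1 (accept) and 2 (drop) come from the examples above; the function and constant names are hypothetical.

```python
import subprocess

CHANNEL = "Microsoft-Windows-NLB/Diagnostic"
ACCEPT_EVENT_ID = 1  # "Filtering Receive Accept" in the Node1 example
DROP_EVENT_ID = 2    # "Filtering Receive Drop" in the Node2 example


def query_events_cmd(event_id, channel=CHANNEL):
    """Build a `wevtutil qe` command that returns only one event ID as text."""
    xpath = f"*[System[(EventID={int(event_id)})]]"
    return ["wevtutil", "qe", channel, f"/q:{xpath}", "/f:text"]


def dropped_events_text(channel=CHANNEL):
    """Run the query for dropped packets (Windows only, tracing stopped first)."""
    result = subprocess.run(query_events_cmd(DROP_EVENT_ID, channel),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Because the command is passed as an argument list, the XPath expression does not need the extra quoting it would require when typed at the console.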

 

 

Thanks,

Rohan Mutagi & Ahmed Bisht

Clustering and High-Availability Team

Microsoft

  • can you tell me the detail how to install the manifest... ?

  • works very well, very helpful

    Thanks
