Hello All, I am continuing the "WinHTTP Questions" series with some questions on WinHTTP callbacks.

Is it correct that WinHTTP callbacks will occur *only* during an in-progress WinHTTP operation? Is it possible for an external event (such as the remote server resetting the underlying TCP connection) to result in a callback even when there's no outstanding operation?

No. For external events, WinHTTP will indicate informational callbacks only when there is a pending operation. Essentially, there are two types of callbacks in WinHTTP: completion callbacks and informational callbacks.

  • Completion callbacks are invoked when a pending API call completes; e.g., the read-complete callback is invoked when the read API completes. Similarly, the handle-closed callback is invoked in response to the close-handle API.
  • Informational callbacks, on the other hand, are invoked to inform the application about important events seen during the processing of a call.

So, when an async API is invoked and it returns TRUE, the application will receive zero or more informational callbacks and then a completion callback.

When an operation like WinHttpReadData fails immediately (i.e., the call returned FALSE), does that guarantee that the user will not receive a STATUS_REQUEST_ERROR callback?

Yes.

Does WinHTTP synchronize WinHttpSetStatusCallback with worker threads? Here is our usage scenario: We are implementing a DLL that gets dynamically loaded and implements an asynchronous Download function with progress notifications and support for a Cancel call. After Cancel or Download completion, our DLL should be able to be safely unloaded. Cancel calls WinHttpSetStatusCallback to unregister notifications, then closes the request, connection and session handles. However, we occasionally still get status callbacks after the DLL is unloaded.

No, WinHTTP does not synchronize WinHttpSetStatusCallback with worker threads. For example, if a callback originating in another thread is in progress when an application calls WinHttpSetStatusCallback, the application still receives that callback notification even after WinHttpSetStatusCallback successfully sets the callback function to NULL and returns. The worker thread will continue to process, so the application needs to wait until the in-progress callback finishes.

Please also refer to the following links for more on callbacks.

Next time we will go a bit deeper into safe request cancellation.

  -- Deepak & Ari

Hello, my name is Deepak and I'm an SDET in serviceability. We handle a bunch of questions from developers using WinHTTP, and thought we might share them in a new posting series, "WinHTTP Questions".

Can I cancel a synchronous WinHttpSendRequest call by closing the request handle from a different thread? Or, are there any requirements that I need to use the asynchronous WinHttpSendRequest if I need the ability to cancel it?

The short answer is don't do that.

Let's think about the implications of this proposed code. Thread 1 starts a synchronous request and Thread 2 comes along and issues a close. By definition, we can't synchronize between Thread 1 and Thread 2 while the call is in progress, because Thread 1's call is synchronous: we can deterministically call the close only before the request call starts or after the call has completed. So closing the handle during the call results in a race condition. If Thread 2 attempts to close the connection during the call, the handle passed in to WinHttpSendRequest may become invalid before WinHTTP has a chance to work with it. While you might get lucky, you might also non-deterministically get a crash or memory corruption.

  -- Deepak and Ari

Windows Server 2008 incorporates the completely rewritten TCP/IP stack we shipped in Windows Vista, and since this is the first server release with the stack, we'd like to ask our readers to load up Windows Server 2008 RC1 and see if your applications are ready.

The new features (compared to Windows Server 2003) that may pose potential compatibility issues include: removed filter hooks, receive window auto tuning, dual IP layer architecture for IPv6, compound TCP, ECN support, default strong host model, easier kernel mode programming, extensive protocol offload.

Windows Server 2008 is also the very first Microsoft server product to have the firewall ON by default. Third party applications that are not aware of this behavior may break. 

-- Ari and Katarzyna

Just wanted to drop a quick Happy New Year to our readers and see if there are any topics that you want to hear about. We've been pretty busy with Windows Server 2008, Windows Vista SP1 and future Windows releases. With the RTM of Windows Server 2008 and Windows Vista SP1 drawing near, I hope to get some posts out on things that are new or have changed, but I'd be happy to hear whatever people are interested in.

-- Ari

In his introductory post about the legacy Traffic Control (TC) API, Gabe discussed the host-based model that TC provides. In this post, we will see how Traffic Control APIs can be used to achieve the following for TCP/IPv4 and UDP/IPv4 traffic sent from a host:

  • Throttle (rate-limit) outgoing traffic
  • Add DSCP value in layer-3 (IPv4) header
  • Indicate that an 802.1p tag value should be added in layer-2 header
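To make the second bullet concrete, here is a minimal sketch of where the DSCP value sits in an IPv4 header: the upper six bits of the second header byte, with the lower two bits used for ECN. The function names are hypothetical and are not part of the Traffic Control API; TC sets this field for you, so this only illustrates the bit layout.

```c
#include <stdint.h>

/* Read the DSCP value (0..63) from a raw IPv4 header.
   DSCP is the upper six bits of byte 1; the lower two bits are ECN. */
uint8_t dscp_from_ipv4_header(const uint8_t *ip_header)
{
    return (uint8_t)(ip_header[1] >> 2);
}

/* Write a DSCP value (0..63) into a raw IPv4 header, preserving the ECN bits.
   Hypothetical helper for illustration only; it does not update the checksum. */
void dscp_to_ipv4_header(uint8_t *ip_header, uint8_t dscp)
{
    ip_header[1] = (uint8_t)(((dscp & 0x3F) << 2) | (ip_header[1] & 0x03));
}
```

For example, DSCP 46 ends up as the byte value 0xB8 when the ECN bits are zero. Anything that rewrites this byte in a real packet also has to recompute the IPv4 header checksum.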

At the bottom of the post, we’ve provided a link to the Networking Connect site to download full source and binaries to a tool which implements all of the above functionality.

The following are the steps involved:

1. The first step in the process is to obtain a handle to the Traffic Control subsystem through a call to TcRegisterClient().

2. Next, make a call to TcEnumerateInterfaces() using the registration handle obtained in step #1. This call returns a list of all TC enabled interfaces on the system. Iterate through the list to find the interface(s) on which you want to prioritize and/or throttle outgoing traffic.

3. For each interface of interest, issue a TcOpenInterface() using the pInterfaceName from the corresponding TC_IFC_DESCRIPTOR for that interface from the list returned in step #2. Store the handle returned by TcOpenInterface(); let’s call it the ifchandle.

4. At this point, create a TC Flow and add it to the interface(s) of interest.

a. Create the TC Flow:
A TC flow is a way of describing various QoS characteristics to be applied to a set of packets and is represented by a TC_GEN_FLOW structure. The following code snippet shows how to create a TC Flow given the DSCP value, 802.1p value and the throttle rate.

BOOL CreateFlow(PTC_GEN_FLOW * _ppTcFlowObj, USHORT DSCPValue, USHORT OnePValue, ULONG ThrottleRate)
{
  BOOL status = FALSE;

  //
  // Flow Parameters
  //
  ULONG TokenRate = QOS_NOT_SPECIFIED;
  ULONG TokenBucketSize = QOS_NOT_SPECIFIED;
  ULONG PeakBandwidth = QOS_NOT_SPECIFIED;
  ULONG Latency = QOS_NOT_SPECIFIED;
  ULONG DelayVariation = QOS_NOT_SPECIFIED;
  SERVICETYPE ServiceType = SERVICETYPE_BESTEFFORT;
  ULONG MaxSduSize=QOS_NOT_SPECIFIED;
  ULONG MinimumPolicedSize=QOS_NOT_SPECIFIED;
  PVOID pCurrentObject;
  PTC_GEN_FLOW _pTcFlowObj = NULL;
  int Length = 0;

  //
  // Calculate the memory size required for the optional TC objects
  //

  Length += (OnePValue == NOT_SPECIFIED ? 0:sizeof(QOS_TRAFFIC_CLASS)) + (DSCPValue == NOT_SPECIFIED ? 0:sizeof(QOS_DS_CLASS));

  //
  // Print the Flow parameters
  //

  printf("Flow Parameters:\n");
  DSCPValue == NOT_SPECIFIED ? printf("\tDSCP: *\n"):printf("\tDSCP: %u\n", DSCPValue);
  OnePValue == NOT_SPECIFIED ? printf("\t802.1p: *\n"):printf("\t802.1p: %u\n", OnePValue);
  ThrottleRate == QOS_NOT_SPECIFIED ? printf("\tThrottleRate: *\n"):printf("\tThrottleRate: %u\n", ThrottleRate);
  TokenRate = TokenBucketSize = ThrottleRate;

  //
  // Allocate the flow descriptor
  //
  _pTcFlowObj = (PTC_GEN_FLOW)malloc(FIELD_OFFSET(TC_GEN_FLOW, TcObjects) + Length);

  if (!_pTcFlowObj)
  {
    printf("Flow Allocation Failed\n");
    goto Exit;
  }

  _pTcFlowObj->SendingFlowspec.TokenRate = TokenRate;
  _pTcFlowObj->SendingFlowspec.TokenBucketSize = TokenBucketSize;
  _pTcFlowObj->SendingFlowspec.PeakBandwidth = PeakBandwidth;
  _pTcFlowObj->SendingFlowspec.Latency = Latency;
  _pTcFlowObj->SendingFlowspec.DelayVariation = DelayVariation;
  _pTcFlowObj->SendingFlowspec.ServiceType = ServiceType;
  _pTcFlowObj->SendingFlowspec.MaxSduSize = MaxSduSize;
  _pTcFlowObj->SendingFlowspec.MinimumPolicedSize = MinimumPolicedSize;

  //
  // Currently TC only supports QoS on the send path
  // ReceivingFlowSpec is legacy and ignored
  //

  memcpy(&(_pTcFlowObj->ReceivingFlowspec), &(_pTcFlowObj->SendingFlowspec), sizeof(_pTcFlowObj->ReceivingFlowspec));
  _pTcFlowObj->TcObjectsLength = Length;

  //
  // Add any requested objects
  //
  pCurrentObject = (PVOID)_pTcFlowObj->TcObjects;
  if(OnePValue != NOT_SPECIFIED)
  {
    QOS_TRAFFIC_CLASS *pTClassObject = (QOS_TRAFFIC_CLASS*)pCurrentObject;
    pTClassObject->ObjectHdr.ObjectType = QOS_OBJECT_TRAFFIC_CLASS;
    pTClassObject->ObjectHdr.ObjectLength = sizeof(QOS_TRAFFIC_CLASS);
    pTClassObject->TrafficClass = OnePValue; //802.1p tag to be used
    pCurrentObject = (PVOID)(pTClassObject + 1);
  }

  if(DSCPValue != NOT_SPECIFIED)
  {
    QOS_DS_CLASS *pDSClassObject = (QOS_DS_CLASS*)pCurrentObject;
    pDSClassObject->ObjectHdr.ObjectType = QOS_OBJECT_DS_CLASS;
    pDSClassObject->ObjectHdr.ObjectLength = sizeof(QOS_DS_CLASS);
    pDSClassObject->DSField = DSCPValue; //Services Type
  }

  DeleteFlow(_ppTcFlowObj);
  *_ppTcFlowObj = _pTcFlowObj;
  status = TRUE;
  Exit:

  if(!status)
  {
    printf("Flow Creation Failed\n");
    DeleteFlow(&_pTcFlowObj);
  }
  else
    printf("Flow Creation Succeeded\n");

  return status;
}

b. Add the Flow on the interface:
After obtaining a TC_GEN_FLOW structure with the desired characteristics using a function similar to the one above, issue a call to TcAddFlow() with the ifchandle (obtained in step #3) and a pointer to the TC_GEN_FLOW object (obtained in step #4a). Store the handle returned by TcAddFlow(); let’s call it the flowhandle.

5. The next step is to create a TC Filter and add it to the TC Flow created above.

a. Create the TC Filter:
A TC Filter is a way of describing which packets to apply the QoS characteristics to. The QoS characteristics defined in the TC_GEN_FLOW will only apply to the packets matching the filter(s) associated with the Flow.
The following code snippet describes how to create a TC Filter given the destination address, the destination port, and the protocol (TCP, UDP or IP).

BOOL CreateFilter(PTC_GEN_FILTER * ppFilter, SOCKADDR_STORAGE Address, USHORT Port, UCHAR ProtocolId)
{

  BOOL status = FALSE;
  USHORT AddressFamily = Address.ss_family;
  PTC_GEN_FILTER pFilter = NULL;
  PIP_PATTERN pPattern = NULL;
  PIP_PATTERN pMask = NULL;

  if(AddressFamily != AF_INET)
    goto Exit;

  //
  // Allocate memory for the filter
  //
  pFilter = (PTC_GEN_FILTER)malloc(sizeof (TC_GEN_FILTER));
  if(!pFilter)
    goto Exit;

  ZeroMemory(pFilter, sizeof(TC_GEN_FILTER));

  //
  // Allocate memory for the pattern and mask
  //
  pPattern = (PIP_PATTERN)malloc( sizeof(IP_PATTERN));
  pMask = (PIP_PATTERN)malloc( sizeof(IP_PATTERN));
  if(!pPattern || !pMask)
    goto Exit;

  memset ( pPattern, 0, sizeof(IP_PATTERN) );
  pPattern->DstAddr = ((SOCKADDR_IN *)&Address)->sin_addr.s_addr;
  pPattern->tcDstPort = htons(Port);
  pPattern->ProtocolId = ProtocolId;
  memset ( pMask, (ULONG) -1, sizeof(IP_PATTERN) );

  //
  // Set the source address and port to wildcard
  // 0 -> wildcard, 0xFF-> exact match
  //
  pMask->SrcAddr = 0;
  pMask->tcSrcPort = 0;

  //
  // if the user specified 0 for dest port, dest address or protocol
  // set the appropriate mask as wildcard
  // 0 -> wildcard, 0xFF-> exact match
  //
  if(pPattern->tcDstPort == 0)
    pMask->tcDstPort = 0;

  if(pPattern->ProtocolId == 0)
    pMask->ProtocolId = 0;

  if(pPattern->DstAddr == 0)
    pMask->DstAddr = 0;

  pFilter->AddressType = NDIS_PROTOCOL_ID_TCP_IP;
  pFilter->PatternSize = sizeof(IP_PATTERN);
  pFilter->Pattern = pPattern;
  pFilter->Mask = pMask;

  //
  // Delete any previous instances of the Filter
  //
  DeleteFilter(ppFilter);
  *ppFilter = pFilter;
  status = TRUE;

  Exit:
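  if(!status)
  {
    printf("Filter Creation Failed\n");
    // Every failure path is taken before pFilter takes ownership of the
    // pattern and mask, so free them here (free(NULL) is a no-op)
    free(pPattern);
    free(pMask);
    DeleteFilter(&pFilter);
  }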
  else
    printf("Filter Creation Succeeded\n");

  return status;
}

b. Adding the Filter to the TC Flow:
Once a TC Filter structure is obtained using a function similar to the one above, issue a call to TcAddFilter() passing the flowhandle obtained in step #4b and a pointer to the TC_GEN_FILTER structure obtained in step #5a. Store the filter handle returned by TcAddFilter(); let’s call it filterhandle.
You can add multiple filters on the same flow causing different sets of packets matching each filter to get the same QoS characteristics applied to them.
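The pattern/mask convention used in CreateFilter (0 means wildcard, 0xFF means exact match) can be sketched as a byte-wise comparison. This is only an illustration of the convention, assuming the classifier compares the masked bits of each packet against the masked bits of the pattern; masked_match is a hypothetical helper, not a TC API, and the real classifier inside the QoS subsystem may be implemented differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when every bit selected by the mask is equal in packet and
   pattern, 0 otherwise. A mask byte of 0x00 ignores that byte (wildcard);
   a mask byte of 0xFF requires an exact match on that byte. */
int masked_match(const uint8_t *packet, const uint8_t *pattern,
                 const uint8_t *mask, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if ((packet[i] & mask[i]) != (pattern[i] & mask[i]))
            return 0;
    }
    return 1;
}
```

With an all-zero mask every packet matches, which is why CreateFilter zeroes the mask for the source address and source port, and for any destination field the user left as 0.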

6. At this point, your application is applying QoS on all matching outgoing packets as specified in the TC Filter and TC Flow. Finally, once your purpose is served, make sure you call the respective close calls on all the open TC handles – TcDeleteFilter(), TcDeleteFlow(), TcCloseInterface() and TcDeregisterClient().

You can download the full source and binaries of a simple command line tool – tcmonlite, which takes the filter and flow parameters as input, creates TC Flow and Filter, and configures the QoS subsystem with them using the Traffic Control API. All the outgoing traffic on the system matching the filter gets the desired QoS characteristics as long as tcmonlite is running. Go to the Microsoft Connect website, choose Available Connections on the left-hand side of the page, and select Windows Networking from the available connections (bottom half of the page). On the left-hand side of the Windows Networking page, choose Downloads, and select TCMonLite.

This tool can be used in conjunction with the NDIS LWF driver to detect 802.1p tags in the Ethernet header and DSCP in the IP header of packets. Let us know what you think!

-- Hemant Banavar 

Disclaimer: Traffic Control (TC) APIs have been marked as deprecated, and will be phased out (eventually removed) when a suitable replacement API is available. No advancements will be made to these APIs (including adding IPv6 support) in their deprecated state; however, application compatibility will be maintained until their eventual removal.

Since the introduction of a QoS platform in Windows 2000, there have been two models for applying prioritization and/or send-rate throttling to TCP/IP and UDP/IP network traffic sent from a Windows PC: host-based and application-based. These terms have been used in QoS documentation such as this recent CableGuy article; however, I acknowledge the meaning of these terms is not immediately obvious.

An application-based model means only the application which owns the socket handle can add/remove/modify a QoS flow for its traffic. Because the application sending data onto the wire (or air) is applying throttling or priority to its own traffic (the connected socket), no elevation of privileges is required. A host-based model means some other process (not the application sending traffic through the socket) on the PC is applying prioritization or throttling to this traffic it doesn't own. Because the process doesn't own the socket handle, elevation of privileges (administrator) is required. While it is certainly possible for the process that owns the socket handle to leverage a host-based model, the added complexity is unnecessary considering socket-based QoS APIs are available for this purpose. A host-based model is a much more complex model than application-based for the following reasons:

  1. The process applying QoS properties to traffic it doesn't own has to run as a service or some other out-of-band means (in typical use cases)
  2. Administrative privileges are required
  3. Because the socket handle is not known, a filter has to be applied to match the traffic of interest, based on: source/destination IPv4 address, source/destination port, and protocol (TCP or UDP)

In Windows, only the Traffic Control (traffic.h/traffic.dll) interface provides programmatic access to a host-based model. There is value gained from this complexity, however. Because this API requires administrative privileges, the caller can specify any arbitrary layer-2 (802.1p) or layer-3 (DSCP) priority value, whereas application-based API models abstract specific priority values with traffic classes based on established industry standards. It is worth noting that a new policy-based feature has been added to Windows Vista and Windows Server 2008 which enables a host-based model for IT administrators (no coding necessary), and which enables significantly richer classification than what TC provides. Policy-based QoS, however, does not provide programmatic access and does not allow for setting layer-2 802.1p tags, only layer-3 DSCP.

Innovation has been focused on application-based APIs such as qWAVE (qos2.h/qwave.dll) to significantly simplify *safely* adding prioritization and throttling to traffic, as well as policy-based mechanisms for host-based needs.

Stay tuned for follow-up posts on how to use TC for adding 802.1p tags to the Ethernet header, DSCP to the IPv4 header, and applying throttling to outgoing traffic.

-- Gabe Frost

A number of partners who author wireless drivers for Vista have asked how they can ensure their WiFi Wireless Multimedia (WMM) implementation is correct, so I thought I'd be explicit about this very important topic. To begin, read the 4-part series WiFi QoS Support in Windows Vista, which describes how Vista internally indicates a WMM Access Category (WMM_AC), how to detect whether an Access Point supports this capability, and how to observe the behavior of prioritized traffic. Next, be sure you explicitly validate your driver does the following:

  • Confirm the QoS header exists in 802.11 data frames when DSCP and 802.1p are set independently:

    • DSCP [56, 48, 40, 32, 24, 16, 8, 0]

    • 802.1p [7, 6, 5, 4, 3, 2, 1, 0]

  • Ensure the miniport indicates NDIS_MAC_OPTION_8021P_PRIORITY in OID_GEN_MAC_OPTIONS

  • Miniport behavior on receive:

    • If WMM/11e header is stripped, WMM bit in FrameControlSubtype must be cleared

    • If WMM/11e header is not stripped, WMM bit in FrameControlSubtype must *not* be cleared

    • Must *not* strip 802.1Q tag in SNAP header (nwifi.sys will strip if necessary)

  • Miniport behavior on transmit:

    • Only use NDIS_NET_BUFFER_LIST_8021Q_INFO.WMMInfo field to ascertain correct WMM_AC (*not* UserPriority, or IP DSCP field)

    • Must not strip 802.1Q tag in SNAP header (may be added by nwifi.sys)

-- Gabe Frost

Parts 1 and 2 of this series discussed how to determine whether an 802.1p tag was added to traffic, and how to modify the NDIS light-weight-filter (LWF) sample driver source code to accomplish this task. We do know that you're all very busy and not everyone is a developer, so we've added to the package: full source code for the complete filter driver, a command-line tool for accessing the filtered packets, and compiled binaries for you non-developers. We also added the ability to validate *both* 802.1p and DSCP (from the IPv4/IPv6 header). This additional package was added to the existing download, so if you haven't already read part-2 of this series, do so now and you'll find download instructions there. For installation instructions, read the README file in the zip archive.

What do you think? Is this approach (source and tools) helpful? We do actively monitor the QoS forum, so let us know how we can improve your understanding of Windows QoS capabilities.

-- Gabe Frost

-- Huge thanks to Hemant Banavar who authored the tools

In Gabe’s last post on detecting 802.1p priority tags, he described at a relatively high-level why it is difficult to detect a priority tag using packet tracing applications, as well as the proper way to determine whether a tag was present in a packet that was sent onto the wire (or air). In this post, I’ll describe how to programmatically access this information by modifying the NDIS Light Weight Filter (LWF) sample driver found in the Windows Driver Kit (WDK). To begin, download and install the WDK and navigate to the LWF sample found in the WinDDK\6000\src\network\ndis\filter directory. The remainder of this post will describe how to modify this sample to gain access to the UserPriority value in the out-of-band (OOB) data of a Net Buffer List (NBL) structure, and whether the miniport driver actually stripped the 1Q tag from the Ethernet header like it was supposed to. Based on what part-1 of this series describes about driver (miniport/LWF) layering and framing details, the modified driver will inspect received packets (meaning the sender added the tag to outgoing traffic).

The source file in the LWF sample of particular interest is filter.c, where we’ll modify the FilterReceiveNetBufferLists() function. FilterReceiveNetBufferLists() is an optional function for filter drivers, which if provided, processes receive indications made by the underlying miniport or filter drivers beneath in the stack. If this handler is NULL, NDIS will skip calling this filter when processing a receive indication and will call the next filter (or protocol driver) above in the stack with a non-NULL FilterReceiveNetBufferLists handler. The remainder of this post goes into detail about which areas of filter.c need to be modified.

To begin, let’s explore how to access the received packets in this function so they can be inspected. The second parameter to the function, NetBufferLists, is a linked list of NetBufferList structures allocated by the underlying driver. Each NetBufferList contains one NetBuffer structure, which represents a received packet. This means, in order to inspect each packet in the NetBufferLists linked list, we need a loop of the following kind within FilterReceiveNetBufferLists():

if (pFilter->TrackReceives)
{
     FILTER_ACQUIRE_LOCK(&pFilter->Lock, DispatchLevel);
     pFilter->OutstandingRcvs += NumberOfNetBufferLists;
     Ref = pFilter->OutstandingRcvs;
     currNbl = NetBufferLists;
     while(currNbl)
     {
          // Call the function to parse the packet in each
          // NetBufferList (one net buffer per NBL)
          inspectNetBuffer(currNbl, pFilter);
          currNbl = currNbl->Next;
     }
     FILTER_LOG_RCV_REF(1, pFilter, NetBufferLists, Ref);
     FILTER_RELEASE_LOCK(&pFilter->Lock, DispatchLevel);
}

In the above code snippet, observe that the loop is run only if pFilter->TrackReceives is TRUE. The idea here is to only inspect packets if the user requests the driver to do receive-side inspections (our focus is on a debugging tool). The TrackReceives flag can be set to FALSE at FilterAttach and can be set to TRUE through an IOCTL. Look at the FilterDeviceIoControl() function in device.c for defining IOCTLs.

Now that we have a pointer to the NetBufferList structure which holds the packet information, let’s explore how to inspect the packet for the 802.1p tag (note you could also inspect the IPv4 or IPv6 header for DSCP here if you really wanted). The inspectNetBuffer function in the above code snippet starts by extracting the NetBuffer pointer from the NetBufferList. Please see the bottom of this post for instructions on where to download the full implementation of this function. Next, if an 802.1p tag is available in the OOB data, it is extracted and stored in the variable called UserPriority as follows:

if (NET_BUFFER_LIST_INFO(pNetBufferList, Ieee8021QNetBufferListInfo) != 0)
{
     Ndis8021QInfo.Value = NET_BUFFER_LIST_INFO(pNetBufferList, Ieee8021QNetBufferListInfo);
     UserPriority = (UCHAR)Ndis8021QInfo.TagHeader.UserPriority;
}

At this point, a pointer to the start of the packet is obtained and stored in packetBuffer. The Ethernet header is parsed and RecdUnStrippedPackets (which is initialized to zero at the beginning before starting to track received packets) is incremented if it is found that the Ethernet header still contains the 802.1p tag - indicating the underlying miniport did not strip the tag as per NDIS documentation.
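The Ethernet header check described above can be sketched in portable C as follows. This is not the inspectNetBuffer() implementation from the download; ethernet_user_priority is a hypothetical helper that assumes packetBuffer points at a contiguous Ethernet frame, and it only shows how an unstripped 802.1Q tag can be recognized (EtherType 0x8100 right after the two MAC addresses) and how the 3-bit priority is read from it.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_8021Q 0x8100  /* Tag Protocol Identifier of an 802.1Q tag */

/* Returns the 3-bit priority (0..7) if the frame still carries an 802.1Q
   tag, or -1 if the frame is untagged or too short to contain a tag. */
int ethernet_user_priority(const uint8_t *frame, size_t length)
{
    /* 6 bytes destination MAC + 6 bytes source MAC + 2 TPID + 2 TCI */
    if (length < 16)
        return -1;

    uint16_t tpid = (uint16_t)((frame[12] << 8) | frame[13]);
    if (tpid != ETHERTYPE_8021Q)
        return -1;

    /* The priority is the top three bits of the Tag Control Information */
    return frame[14] >> 5;
}
```

In a filter built along these lines, a return value other than -1 on the receive path would be the case where a counter like RecdUnStrippedPackets gets incremented.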

To download the full implementation of inspectNetBuffer(), as used to do the majority of parsing work, go to Microsoft Connect website and login using your passport account (create one if you don’t already have one). Once you have logged in, choose Available Connections on the left-hand side of the page, and select Windows Networking from the available connections (bottom half of the page). On the left-hand side of the Windows Networking page, choose Downloads, and select NDIS LWF Sample With Packet Priority Detection.

-- Hemant Banavar

Consider a case where a network application calls Windows QoS APIs to add a layer-2 IEEE 802.1Q UserPriority tag (almost always referred to as 802.1p) to outgoing traffic. Ascertaining whether the tag actually got added to an outgoing packet is not as simple as it seems due to the nature of how the Windows network stack is designed, and how framing actually occurs. From an internal implementation perspective, the QoS Packet Scheduler (Pacer.sys in Vista/2008 Server, and Psched.sys in XP/2003 Server) in the network stack merely updates an out-of-band structure (not the actual formed packet) to indicate that an 802.1Q UserPriority tag should be added. The specific NDIS structure is NDIS_NET_BUFFER_LIST_8021Q_INFO, which contains member variables for both VlanId and UserPriority, and is passed to the NDIS miniport driver for implementing both priority tagging (UserPriority) and VLAN (VlanId). It is up to the NDIS miniport driver to actually insert the 802.1Q tag into the frame based on these values before transmitting on the wire. A miniport driver will only insert this tag if the feature is supported and enabled in the advanced properties of the NIC driver; typically layer-2 priority tagging is disabled by default.

From a network stack layering perspective, it’s important to understand that Pacer.sys is an NDIS Lightweight Filter (LWF) driver, and will always be inserted above a miniport driver, which will always be the lowest network software in the stack because it communicates directly with the NIC hardware. Also note that network sniffing applications like NetMon and WireShark are also network stack filters, and will always be inserted above the miniport driver. This is important knowledge because it should be clear that taking a network sniff of traffic on the sending PC will never show the tag in a packet (because the tag gets added below the sniffing software). Also, the QoS Packet Scheduler can't know for absolute certainty whether the miniport driver added the tag to the outgoing packet. 

What about trying to do a network sniff on the receiving PC? Good question, but that will also show the layer-2 tag missing from packets. The reason for this is that NDIS developer documentation clearly states that miniport drivers must strip the tag when received, and populate the NDIS_NET_BUFFER_LIST_8021Q_INFO UserPriority and VlanId fields with the values in the tag. This out-of-band structure can then be used by NDIS filter drivers higher up in the stack for implementing these features. The functional reason for stripping the layer-2 tag is that Tcpip.sys will drop any received packet that contains this tag. Therefore, if a misbehaving miniport driver does not strip the tag, the packet will never be received by the user-mode application because it will be dropped internally.

In conclusion:

  • A network sniffing app on the sending PC will never see a tag
  • A network sniffing app on the receiving PC will never see a tag
    • Unless the miniport driver is misbehaving, which will result in dropped packets
  • Monitoring tagged packets from intermediate network elements (such as a switch) is hard if at all possible
    • Perhaps a clever SNMP counter could be used, but would depend on the device manufacturer

If you have not come to this conclusion already, the only way to determine whether a layer-2 tag got added on the sending PC and/or received by the receiving PC is to monitor the NDIS_NET_BUFFER_LIST_8021Q_INFO.UserPriority field. Stay tuned for a follow-up post which describes how to do this.

-- Gabe Frost

Mark Russinovich has a great post today on the what and how of the network/multimedia vista issue that people have recently been talking about. Amusingly enough a couple people on /. more or less figured it out, but are only modded 3 and lower. Go Figure.

-- Ari

Ask Perf, the blog of the Enterprise Platforms Windows Server Performance Team, is spending some time explaining a bit of how WinInet/WinHTTP and their surrounding components work with each other. Go check it out!

  -- Ari

Hi, my name is Katarzyna and I am the Program Manager within the Internet Protocols team. I have been asked a few times about the Receive Window Auto-Tuning feature on Vista and some associated issues people are having.

One of the many cool new features on Windows Vista, Receive Window Auto-Tuning enables the networking stack to receive data more efficiently than on XP. Auto-Tuning allows the operating system to continually monitor the routing conditions (bandwidth, network delay, application delay) and configure connections (scale the TCP Receiving Window) so as to maximize the network performance. In some high bandwidth, high latency links, we have seen SMB performance improvement up to 20 times!

In every TCP packet there is a "window" field, which informs the receiver how much data the sender can accept back. This window controls the flow by setting a threshold on data kept "in flight" and prevents overwhelming the receiver with data that it cannot accept.

The TCP window field is 16 bits wide, allowing for a maximum window size of 64KB, which used to meet requirements of many older networks. Nowadays, however, network interfaces can handle larger packets and keep more of them in flight at any given time. Thus, a larger TCP window has become necessary; especially on high-speed, high latency networks. To fill such a long, fat pipe and make use of the available bandwidth, the sending system can often require very large windows for good performance.

The solution to this demand is called "window scaling", described back in 1992 in RFC 1323. It introduces an eight-bit scale factor, which serves as a multiplication factor for the window width. After the factor has been negotiated, window values used by that system on a given connection will be shifted to the left by that scale factor; a window scale of zero, thus, implies no scaling at all, while a scale factor of six implies that window sizes should be shifted six bits, thus multiplied by 2^6 = 64. Now a window greater than 64KB can be easily expressed (e.g., 128KB) by setting the scale factor (e.g., 6) and keeping the window field under the original 16 bits (here, 2048).
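The arithmetic above can be written down as a small sketch. The function name effective_window is made up for illustration and this is not how the Windows stack implements it; the only assumption beyond the text is the cap of 14 on the shift count, which comes from RFC 1323.

```c
#include <stdint.h>

#define MAX_WINDOW_SHIFT 14  /* largest shift count RFC 1323 allows */

/* Compute the effective receive window in bytes from the 16-bit window
   field and the negotiated scale factor (a left-shift count). */
uint32_t effective_window(uint16_t window_field, uint8_t shift)
{
    if (shift > MAX_WINDOW_SHIFT)
        shift = MAX_WINDOW_SHIFT;
    return (uint32_t)window_field << shift;
}
```

With the numbers from the paragraph above, effective_window(2048, 6) gives 131072 bytes, i.e. 128KB, while a shift of zero leaves the window field unchanged.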

The window size included in all packets is modified by the scale factor, which is negotiated once at the very beginning of a TCP connection. The connection requestor suggests window scaling factor in its original SYN packet and if the SYN+ACK packet sent in response contains the option, then this particular value will be used on this connection. The scale factor cannot be changed after the initial setup handshake; remaining data transfers on this connection will implicitly use the negotiated value.

Older routers and firewalls, however, do not handle window scaling correctly, leaving the option in the original SYN packet but setting the connection’s scale factor to zero. Seeing the option on, the receiver responds with its own window scale factor. Believing that its scale factor has been accepted, the initiator scales the window appropriately while the receiver thinks that a scale factor of zero is applied and thus a small window of data should follow. As a result, the communication is slow at best. Sometimes, small window packets are dropped by the routers, essentially breaking the connection.

The resulting slow data transfers or loss of connectivity, users may experience as slow or hung networking applications. Remote Desktop Connection and network file copy are two scenarios particularly hurt by misbehaving routers.

If your connection from a Vista machine appears slow or hung, here are some steps to isolate the cause:

  • First, make sure that your firewall and router can support window scaling. Some devices from Linksys, Cisco, NetApp, SonicWall, Netgear, Checkpoint, and D-Link have been reported to have problems with window scaling. (Some of the incompatible devices are given here.) You can check with the manufacturer or run the connectivity diagnostic suite (especially the TCP High Performance Test) provided by Microsoft to determine your gateway device’s compliance.
  • Second, check with the manufacturer whether a firmware update that fixes the problem has been issued for your device. Replace the problematic device or update the firmware as suggested by the manufacturer. If the router cannot be replaced, or if the device is remote (e.g., a firewall of your ISP or corporation), continue with the next step.
  • Third, if the problem still persists, you can restrict autotuning by running “netsh interface tcp set global autotuninglevel=restricted” from the command prompt. We have found that restricted mode often preserves some of the benefits of autotuning with a number of problematic devices.
  • Lastly, if all else fails, in order to disable this feature, run "netsh interface tcp set global autotuninglevel=disabled".
  • (In order to reenable autotuning, run “netsh interface tcp set global autotuninglevel=normal”.)

Please refer to the following KB articles for more information:

-- Katarzyna

Updated: Broken link to KB 932170
Update 2: Changed the guidance to do restricted before disabled.
Update 3: tunning doesn't have two "n"s. :)
Update 4: no really, tuning doesn't have two "n"s.

I recently needed to generate XML from PowerShell and was disappointed to see the PowerShell blog use the old ASP model of inserting text into the middle of a big string. It might be the tester in me, or the security training, but with unexpected input you can easily end up with malformed XML or, even worse, maliciously malformed XML. The other extreme would be to use the .NET XML APIs to build the XML document from scratch, but I didn't like how much redundant code I would need to build the XML header, and I didn't find any good C# sample code for creating an XML document from scratch. So I ended up developing a middle approach: start with a string that has nothing but the XML header and an empty top-level element, convert it to an XmlDocument, use the APIs to add data, and then write the document out to disk.

For example:

# Start from a string holding only the XML header and an empty top-level element.
$doc = [xml] "<?xml version=""1.0"" encoding=""utf-8""?><ns:RoleInstance xmlns:ns=""http://namespace.microsoft.com/2007/Whatever""/>"

# Build the TestBuild element; SetAttribute takes care of encoding each value.
$elem = $doc.CreateElement("ns:TestBuild")
$elem.SetAttribute("Product", $Product)
$elem.SetAttribute("Lab", $Lab)
$elem.SetAttribute("BuildNumber", $OSBuildNumber)
$elem.SetAttribute("SPBuildNumber", $SPBuildNumber)
$elem.SetAttribute("TimeStamp", $BuildLabString.Split(".")[4])
$elem.SetAttribute("SKU", $SKU)
$elem.SetAttribute("Language", $SystemLocale.Split("-")[0].Trim())
$elem.SetAttribute("Culture", $SystemLocale.Split("-")[1].Trim())
$elem.SetAttribute("Architecture", $Processor)
$elem.SetAttribute("Type", $Type)
# Item(0) is the XML declaration, so Item(1) is the RoleInstance element.
$doc.get_ChildNodes().Item(1).AppendChild($elem) | Out-Null

# Build the Implementation element the same way.
$elem = $doc.CreateElement("ns:Implementation")
$elem.SetAttribute("type", "WTTResource")
$elem.SetAttribute("ResourceName", $Name)
$elem.SetAttribute("ResourceId", $ResourceId)
$elem.SetAttribute("ResourceConfigurationId", $Id)
$doc.get_ChildNodes().Item(1).AppendChild($elem) | Out-Null

# Stamp the root element with a fresh GUID and write the document to disk.
$doc.get_ChildNodes().Item(1).SetAttribute("GUID", [GUID]::NewGuid().ToString())
$doc.Save((Join-Path $RolePath "RoleInstance.xml"))

This code uses CreateElement and SetAttribute to build each element and then appends it as a child of the second item in the document ($doc.get_ChildNodes().Item(1)), which is my RoleInstance tag; the first item is the XML declaration. By using the XML APIs I can feel confident that the information in each attribute gets properly encoded and that my document stays well formed. The only downside is that it is slightly harder to visualize the structure of the resulting XML.

  -- Ari

Update: Bruce over at the PowerShell blog taught me a new trick using Domain Specific Languages to do a cleaner job of constructing the XML. Thanks Bruce!

I noticed that there was a demo of Rally technologies at the WinHEC keynote the other day, so I created a link to part of the keynote with the Rally demo. Enjoy.

 -- Ari Pernick


Attachment(s): Media Link.asx