In this article we will explore a practical use for ETW tracing, and discover what ETW (Event Tracing for Windows) tracing is available for a popular Windows user-mode component, Internet Explorer. In my previous article ETW Introduction and Overview, we covered what ETW tracing is and how it could be used.
The goal in this exercise is to learn about ETW tracing in general, how to self-discover what tracing is available in a component, and some ways you can leverage the tracing to self-troubleshoot issues. ETW logging is essentially allowing Microsoft code to speak for itself. It tells you what code ran, what that code did, and any errors produced. Also ETW logs can be used along with a more traditional toolset for troubleshooting that class of issue such as the SysInternals tools, Network Monitor, etc.
For this article, we are using the RC Build of Windows 7. The concepts and examples should be very similar and work fine for Windows Vista.
First we start off by opening Internet Explorer 8 whose home page is set to the NTDebugging Blog - http://blogs.msdn.com/ntdebugging
In order to find what Internet Explorer is logging with ETW, we list all of the registered ETW providers on a box, which includes all of the installed code on the machine (the component doesn’t need to be actively running).
We start off by running “logman query providers” and look through the list for relevant hits. However, the results below return more than 400+ hits on Vista, and 600+ on Win7. You will probably quickly see though that searching through this large list of providers might not always be best way to go about finding which providers Internet Explorer is logging with.
Using this list we can also filter or browse for our component. This may work just fine, however, one issue with this approach is that sometimes you may not know the particular naming convention for a process. Also, many software components use shared dlll’s to do a variety of the under the hood work. As such, you may not know what each of those pieces of software are, or how they are named.
Here we try running a built-in cmd line tool logman ‘query providers’ and searching for “Internet”, which didn’t turn up anything on this particular search. It looks like the naming convention for Internet Explorer might be a bit different than our first search.
While we could continue to browse the large list of providers, I have decided to use another useful method to find which providers IE uses, which is to filter all providers that a certain ProcessId uses. We start out by getting the ProcessId of Internet Explorer using two methods shown here, Task Manager and the command line tasklist filtering on iexplore.exe.
Now that we know Internet Explorer is running with process id 6200, we can do another query with logman to find out the ETW providers.
‘Logman query providers –pid 6200’ is used to list all of the user-mode ETW providers that Internet Explorer uses and the associated GUIDs (GUIDs are the Globally Unique ids that enable tracing for a component).
In looking at the screenshot below, notice that a wide variety of providers are shown, and at first glance, you may wonder how some relate to Internet Explorer. As you investigate further though, all these components enable certain features within Internet Explorer, but even with tracing enabled, may not log anything unless you specifically use the part of Internet Explorer which runs that code at run-time.
It is also important to note that Internet Explorer will also use a lot of kernel services to eventually do its work, such as NDIS. There is ETW tracing for these kernel components, but would not show up under a specific process as they are used by all processes at the kernel level. Here we see a specific component of interest that we want to follow – Microsoft-Windows-WinINet. A quick Bing search of WinINet turns up this on MSDN, which sounds relevant.
Extracted from MSDN: The Microsoft Windows Internet (WinINet) application programming interface (API) enables applications to access standard Internet protocols, such as FTP and HTTP. For ease of use, WinINet abstracts these protocols into a high-level interface.
So Microsoft-Windows-WinINet looks like a good component to enable tracing in.
From using SysInternals Process Explorer you can also confirm and that WININET.dll is loaded into the address space of iexplore.exe as shown below.
Now that we know the ETW Provider name and GUID, we will launch Computer Mgmt and turn on tracing for that component. Note that not every Provider can be decoded this way (due to architectural and security reasons), but many can, and at the very least, the log can be enabled and provided to Microsoft Support to fully decode.
Launch Computer Management by typing ‘compmgmt.msc’ into the start search box or right clicking Manage on Computer Management.
Once in Computer Management, Navigate to and click on Event Viewer -> Applications and Services Logs. Once there, make sure View -> Show Analytic and Debug Logs is enabled as shown below, which will provide a much greater set of logs to look at. Most logs are under Event Viewer -> Applications and Services Logs -> Microsoft -> Windows, where we will spend the majority of our time. There are generally four channels that can show up under each provider. Channels are targeted at different roles. Admin and Operational channels target IT professionals and Administrators and are enabled by default, while Analytic and Debug Channels are more in depth, and not usually enabled by default.
The logs that show up here are XML Manifest based tracing called Windows Events. ETW tracing splits up the collection and decoding of traces into two separate steps for performance and security reasons. In the manifest based tracing first available with Vista, most components and events are defined in a XML manifest compiled along with the binary, and which are defined in a resource file language dll. Most user-mode resource dlls show up under C:\Windows\System32\en-US (for English US). The good part about manifest based tracing is that many logs are self-discoverable and customer decodable!
Next we will browse to: Event Viewer -> Applications and Services Logs -> Microsoft -> Windows -> WinINet, right click on the Analytic channel, and choose ‘Enable Log’. Remember that we had to enable this log because Analytic and Debug logs are not enabled by default. Enabling the channel will automatically enable the ETW tracing for that component and logs will start being decoded similar to the Event Log.
After enabling the log, I am going to close Internet Explorer which is running code that is now logging using ETW and navigate to the website http://blogs.msdn.com/ntdebugging.
Now we can look at the result of our work and tracing, which decodes the tracing for us. Below, we are on the WININET_REQUEST_HEADER category, showing the GET request. Note that the tracing here is showing us proxy interaction, DNS requests and responses, cookies, TCP requests/responses, as well as HTTP requests and responses.
Now, one might be able to get the same information using a network capturing program such as NetMon or WireShark, but the nice thing about the ETW tracing here is that is
a) In box without having to install additional tools
b) The aggregate view of what the WinINet component is seen from its point of view (composed of DNS, TCP, Proxy, Cookies, etc).
This is a good example of using this tracing in combination with other troubleshooting tools (if required), which may or may not be useful depending on the scenario.
You can use the Event Viewer GUI, or if you prefer to view the log in a text editor (or spreadsheet). You can dump the log to text or CSV format using a couple of different methods.
The first method is using the Windows Event Log GUI to export or save the event log
Or you can output the entire log to screen in human readable text format:
We can also output to the screen filtering for any messages with error level using an XPath Query. We can start right away if you know the XPath Syntax for your query, or you can use the GUI to help build the query for you. Once you choose ‘Filter Current Log’ you can filter by Event Level, or a variety of other criteria.
The XML view will show you the raw XPATH syntax needed to use wevtutil to query for only Error events.
Now we actually run the XPath query using ‘wevtutil query-events Microsoft-Windows-WinINet/Analytic /f:Text "/q:*[System[(Level=2)]]"’
If you prefer a more complex format containing processors, threads, etc, then you can output to a csv log file. This is useful for sorting, filtering, comparing, and doing more complex analysis on the log file entries.
Here we use ‘tracerpt c:\windows\system32\winevet\logs\Microsoft-Windows-WinINet%4Analytic.etl –of CSV –o c:\temp\Microsoft-Windows-WinINet%4Analytic.csv’ to output to a CSV.
Finally, you can use PowerShell v2 to dump out the log. The PowerShell scenario is interesting, because you can write powerful scripts around the event logs.
Look at ‘get-help get-winevent –examples” for many more really cool examples about how to list and filter these logs:
We also need to explore behind the scenes of what is happening when you enable the analytic channel of the WinInet provider. Part of the definition of that channel is to enable an ETW Trace Session with certain pre-defined Keywords (Flags) and Levels. Keywords usually specify functional sub-components (HTTP, COOKIES, CONNECTION, etc) while Levels control the level of detail (Error, Informational, Verbose) events are logged. This is useful to fine-tune logging, but the channels make it easy not to have to mess with these unless required.
Under Performance -> Data Collector Sets -> Event Trace Sessions we see a new session created for us and auto-named “EventLog-Microsoft-Windows-WinINet-Analytic”. This ETW logger was automatically created for us when we enabled the WinInet Analytic channel. You can also manually create the logger, which we may cover in another blog post. If you open up the properties for this session, you'll notice our provider listed as well as all the Keywords (Any) and Level specified. The screenshot also show which Keywords/Flags or categories a provider supports.
Once you know a provider name, you can also query for it on the command line, which is useful to get all the Keywords(Flags) and Levels a provider supports, along with the processes that are using that provider.
For the curious, we wanted to finish this post and show you how to find what tracing is available, and its format. Here we again use the wevtutil to get full publishing event and message information.
‘wevtutil get-publisher Microsoft-Windows-WinInet /getevents /getmessage’
This can be combined with another version of the same utility searching for a certain string or log – e.g. “wevtutil gp Microsoft-Windows-WinINet /ge /gm | findstr /i httpopen”
To wrap-up, here is the overall architecture of ETW covered in a previous blog post, along with the specifics of the scenario we covered. In our scenario the MMC Event Viewer console has two roles – enabling the tracing and decoding the real-time delivery of events.
This was meant as a practical introduction to using and self-discovering how to use the ETW tracing that is available right now in newer Windows releases (Vista/2008 and above). There are many, many more components that use tracing – 400+ in Vista and 600+ in Win7, which you can start exploring.
For these specific examples, it is worth noting that as far as the networking aspect is concerned, similar information might be obtained from using a packet capture utility such as NetMon. What is interesting about NetMon in Win7 is that it actually uses ETW under the hood to do most of its tracing. Here is an interesting post about Network Tracing in Windows 7 from the Netmon team.
The use of ETW under the hood of many tools is a general trend in Windows and Microsoft software. You might notice different diagnostics and tracing tools in Windows using ETW under the hood. These tools can add value to the raw ETW tracing such as further parsing, filtering, and rich views into the data. You can mix and match these tools along with the tracing for powerful views into your Windows box or server.
It’s also important to note that ETW tracing isn’t just for Microsoft software. Since any software running on Windows uses many services provided by the OS, there is a variety of ETW logging that is available for all software. For example, try running ‘logman query providers –pid PROCESSID_OF_FIREFOX’ on FireFox! You should find a great deal of logging similar to Internet Explorer.
In future posts, we hope to explore other ways of using ETW tracing, and touch on some of the kernel level tracing available.
Share this post :
Pushlocks were a new locking primitive first introduced in Windows Server 2003 and are primarily used in place of spinlocks to protect key kernel data structures. Unfortunately, Pushlocks are not documented in the WDK, and are not available for public use; however, a few internal drivers do use them, so you might see them while debugging a machine. Also, I was digging around on MSDN for Pushlocks, and I found that the Filter Manager does expose certain APIs to use Pushlocks, so you are in luck if you are developing a filter driver!
Pushlocks are built on primitive gate objects, which are defined by KGATE structures. The gate object is a highly optimized version of the basic event object. By not having both the notification and synchronization versions of the basic event object, and by being the exclusive object that a thread can wait on, the code for acquiring and releasing a gate is heavily optimized. Gates even have their own dispatcher lock instead of acquiring the entire dispatcher database.
Unlike spinlocks, which must be acquired exclusively for all operations on a data structure, pushlocks can be shared by multiple “readers” and need only be acquired exclusively when a thread needs to modify the data structure.
When a thread acquires a normal pushlock, the pushlock code marks the pushlock as owned, if it is not owned already. If someone owns the pushlock exclusively, or the thread wants to own the pushlock exclusively while someone else has it in shared mode, the thread allocates a wait block on its stack, initializes a gate object in the wait block, and then add the wait block to the wait list associated with the pushlock. When the thread holding the pushlock finally releases it, it wakes the next waiter by signaling the event in the waiters wait block.
When debugging a machine, there is no easy way to figure out the current owner of the pushlock, apart from doing code review. By looking at the waitlist, you can always figure out the threads trying to get access to it, but since the gate does not keep track of the owner like a regular mutex, it is much harder to find the current owner.
For more details on the operation and structure of a pushlock, please review the Pushlocks section in Windows Internals book, under the System Mechanisms Chapter.
If a pushlock is held by one or more readers, threads that want to modify the data structure are queued for exclusive access. This queuing mechanism provides some of the same benefits of queued spinlocks—for example, FIFO ordering, elimination of race conditions, and reduced cache thrashing when more than one thread is waiting for the pushlock.
Another advantage of using a pushlock is the size. A regular resource object is 56 bytes, however a pushlock is the size of a pointer. Apart from a small memory footprint, this helps especially in the non-contended case, since pushlocks do not require lengthy operations to perform acquisition or release. Because the pushlock fits in one “machine word”, the CPU can use atomic operations to compare and exchange the old lock with the new one.
Pushlocks are also self-optimizing in the sense that the list of threads waiting on a pushlock will be periodically rearranged to provide fairer behavior when the pushlock is released.
A cache-aware pushlock adds to the basic pushlock by allocating a normal pushlock for each processor in the system and associating it with the cache-aware pushlock. When a thread wants to acquire a cache-aware pushlock for shared access, it simply acquires the pushlock on that processor; however if it needs to acquire the lock for exclusive access, it has to acquire the pushlocks for each processor in exclusive mode.
3: kd> !thread 8c9764c0
THREAD 8c9764c0 Cid 2410.1be4 Teb: 7ff9f000 Win32Thread: e5c6f298 GATEWAIT
Stack Init b386b000 Current b386a978 Base b386b000 Limit b3867000 Call 0
ChildEBP RetAddr Args to Child
b386a990 80833485 8c9764c0 8c9764e4 00000003 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])
b386a9bc 8082ffe0 b06a6a03 e11e0b18 b386aa54 nt!KiSwapThread+0x2e5 (FPO: [Non-Fpo]) (CONV: fastcall)
b386a9e4 8087d722 00000000 e11e0b08 e11e0b18 nt!KeWaitForGate+0x152 (FPO: [Non-Fpo]) (CONV: fastcall)
e11e0b18 00000000 0c050204 7346744e e37b2808 nt!ExfAcquirePushLockExclusive+0x112 (FPO: [Non-Fpo]) (CONV: fastcall)
Above is a snipped output from a dump that I was recently looking at. From the stack, you can see the ExfAcquirePushLockExclusive call trying to acquire the pushlock, which then calls KEWaitForGate. In this case, the lock was already acquired, so this thread allocated a wait block on its stack, and then added itself to the waitlist.
Also, the stack is broken due to the fastcall, therefore the debugger cannot display it entirely. So we can manually reconstruct the stack by passing parameters to the kb command.
k[b|p|P|v] = BasePtr StackPtr InstructionPtr
To get the arguments, we first dump the stack manually using the dps command with the current esp.
3: kd> dps b386a978 l50
b386a978 b386ad40
b386a97c 00000000
b386a980 8088dafe nt!KiSwapContext+0x26
b386a984 b386a9bc
b386a988 b386aa00
b386a98c f773f120
b386a990 8c9764c0
b386a994 80833485 nt!KiSwapThread+0x2e5
b386a998 8c9764c0
b386a99c 8c9764e4
b386a9a0 00000003
b386a9a4 8c9764c0
b386a9a8 00000003
b386a9ac 00000002
b386a9b0 00000002
b386a9b4 f773fa7c
b386a9b8 008c0030
b386a9bc b386a9e4
b386a9c0 8082ffe0 nt!KeWaitForGate+0x152
b386a9c4 b06a6a03
b386a9c8 e11e0b18
b386a9cc b386aa54
b386a9d0 00000000
b386a9d4 8c976504
b386a9d8 00000000
b386a9dc 0000001c
b386a9e0 00000000
b386a9e4 b386aa40
b386a9e8 8087d722 nt!ExfAcquirePushLockExclusive+0x112
b386a9ec 00000000
b386a9f0 e11e0b08
b386a9f4 e11e0b18
b386a9f8 b386aa40
b386a9fc 8096e9a9 nt!SeOpenObjectAuditAlarm+0x1cf
b386aa00 00040007
b386aa04 00000000
b386aa08 8c976568
b386aa0c 8c976568
b386aa10 b06a6a00
b386aa14 b4ee0a00
b386aa18 b127cc10
b386aa1c 00000000
b386aa20 00000001
b386aa24 80a60456 hal!KfLowerIrql+0x62
b386aa28 b386ac04
b386aa2c 8d117800
b386aa30 00000000
b386aa34 00000000
b386aa38 b386aa20
b386aa3c 01943080
b386aa40 b386aa64
b386aa44 808b7a14 nt!CmpCheckRecursionAndRecordThreadInfo+0x2a
From the output above, we can see the stack. To reconstruct the stack, we can get the ebp, esp, and eip from the stack for the ExfAcquirePushLockExclusive frame, and pass it to the kb command. Voila!
3: kd> kb = b386aa40 b386a9e4 8087d722
b386aa40 808b7a14 b386ac04 e11e0b18 e11e0b18 nt!ExfAcquirePushLockExclusive+0x112
b386aa64 808b7b09 e11e0b18 b386aa80 e101bf40 nt!CmpCheckRecursionAndRecordThreadInfo+0x2a
b386aaa4 808da118 0000001c b386ab58 00000001 nt!CmpCallCallBacks+0x6b
b386ab90 80937942 e101bf40 00000000 89f13648 nt!CmpParseKey+0xd4
b386ac10 80933a76 00000000 b386ac50 00000040 nt!ObpLookupObjectName+0x5b0
b386ac64 808bb471 00000000 8e930480 00000d01 nt!ObOpenObjectByName+0xea
b386ad50 808897bc 0243eba0 00020019 0243eb68 nt!NtOpenKey+0x1ad
b386ad50 7c8285ec 0243eba0 00020019 0243eb68 nt!KiFastCallEntry+0xfc
WARNING: Frame IP not in any known module. Following frames may be wrong.
0243eba4 00000000 00000000 00000000 00000000 0x7c8285ec
It's Venkatesh with a WMI nugget. While troubleshooting or debugging WMI issues you may come across WMI queries wherein you don’t know which provider implemented the WMI class used in the query. You may want to know the binary and the product that implemented the provider so you can contact the vendor or upgrade the binary to see if it resolves the issue you are investigating.
To help make the investigation easier, Windows Vista and all post operating systems have the cool new 'Analytic and Debug log' for WMI tracing. To get similar tracing in Windows 2003 and earlier versions, WMI has a “Verbose Logging” option enabled through the WMI control (WmiMgmt.msc). Check out this blog for detailed steps to enable the WMI Analytic and Debug -
http://blogs.msdn.com/wmi/archive/2009/05/27/is-wmiprvse-a-real-villain.aspx
After enabling WMI logging you will see events like the ones pictured below. There may be a particular WMI query relevant to the issue you're troubleshooting requiring you to know the provider binary for the WMI class in the query. Apart from the class name (InstProvSamp) in the query you need the WMI namespace to which this class belongs to such as '\\.\root\default' as shown below.
Fig 1: Event ID 1 in the Windows Vista’s Event Viewer\Windows Logs\Applications and Services Logs\Microsoft\Windows\WMI-Activity\Trace log
Now we have the WMI Namespace and the class name so let’s find the provider for it.
1. Get the provider name from the Class Name and Namespace:
Every WMI class has a qualifier “provider” containing the name of the provider to which it belongs. Using wbemtest.exe connect to the namespace we mentioned above(\\.\root\default). Then get the class information for our class, InstProvSamp using the “Open Class…” button. After clicking "OK" the Object editor for InstrProvSamp displays the class information for InstProvSamp including the name of the provider it belongs to (CIM_STRING qualifier “provider” as shown below).
2. Get the class identifier (CLSID) of the provider from the provider name:
For every WMI provider WMI maintains an instance of system class __Win32Provider which contains information about the physical implementation of the provider like CLSID which contains the GUID that represents the class identifier (CLSID) of the provider COM object. Now query the repository to get the provider information for “InstProvSamp” provider as I did below.
3. Get the provider binary from the provider’s CLSID:
During registration every COM object adds an entry under HKLM\SOFTWARE\Classes\CLSID. Using regedit.exe you can browse to provider’s COM registration information using the CLSID obtained from Step 2 as shown below.
This should help you connect the dots while debugging your WMI issue. Let me know if you have additional questions!