• Ntdebugging Blog

    Join us on Facebook

    • 0 Comments

    We’ve set up a new Facebook group named “Windows Debugging” as a community for Windows Debugging industry experts.  If Windows debugging is an interest of yours, please consider joining the group!  Many of the regular contributors to this blog will be there as well.  Just login to Facebook and search for “Windows Debugging” to find us, or go here:

    http://www.facebook.com/n/?group.php&gid=23775552268.

  • Ntdebugging Blog

    Case Study - Software Restriction Policies and Large EXE Files

    • 0 Comments

    Recently I received a debugging request for a customer having problems running large executables. On their systems, they could run most EXEs without any problems, but they had two programs that were over 1.8 GB which, when run, would display the following error:

    clip_image002

    If they tried to run them in a command prompt, they received the message “Access is denied.” Both attempts were made with an administrator account and in neither case were the processes created. Through testing, they found that the programs worked if they were scheduled to run as System and also worked when run in safe mode as an administrator.

    When the case was brought to my attention, it was noted that when the failing executables were run, the following appeared in process monitor logs:

    clip_image004

    The engineer did not see this when one of the problematic EXEs was run (successfully) on his test machine. The customer provided a VM image of their system which we set up in HyperV with a named pipe kernel debugger. I then started kernel debugging to find the cause of the INVALID PARAMETER error, hoping that resolving it would fix the issue.

    To start, I looked at the call stack within process monitor for the invalid parameter:

    clip_image005

    The problem is this isn’t exactly where we return invalid parameter. Looking at the source code for Fltmgr, it doesn’t return invalid parameter – this was just where the error was caught in procmon. This call stack did provide some ideas for good starting places to debug, however. First, I looked up the hex value for STATUS_INVALID_PARAMETER in ntstatus.h – 0xC000000D. Knowing this, I decided to set a breakpoint on nt!IofCallDriver and ran the program. Once the debugger broke in, I planned to use wt -oR. This will trace through the calls displaying the return values next to each call. From there, I would just need to find 0xC000000D on the return column. Unfortunately, what I had forgotten was wt does not display return codes in kernel debugging, only when debugging user mode.

    With wt not an option, I decided to use a combination of debugger commands to approximate the output of wt. I knew the return value I was looking for, and I was also confident that I would find that code in the EAX register after the problem occurred. As such, I needed to write a loop that would walk through the instructions until it found 0xC000000D in EAX. The debugger provides two main options for walking instructions: p and t. p (Step) will execute a single instruction and display the register values. If the instruction is a call, it will not enter that function, but just display the results after that subroutine has been executed. t (Trace) also executes a single instruction, but it will enter into the function and will display each instruction.

    In this case I wanted trace so I could see which function was returning the invalid parameter status. Tracing though that many instructions/functions would take a long time, but there are some variations on t (and p) that can help. tc (or pc)will execute instructions until a call statement is reached, where it will break and show the register values. tt (or pt) will execute instructions until a return instruction is reached. tct (or pct) will run until either a call or return is reached. In this case, I opted for tct.

    Knowing that I would use tct, I had to find a way to execute tct statements until EAX was the value I was looking for. This can be accomplished with the z (While) debugger command. The syntax is pretty easy, it’s just z(expression) and it works just like a do-while loop. Putting it all together, I used this command in the debugger:

    tct; z(eax!=0xc000000d)

    I then waited for the debugger to break in so I could see where this status was being thrown. Regrettably, the code called ended up going in to some recursion which made my while loop take way too long. To resolve this, I set a new breakpoint just before we entered the recursion, reran the program, used p to step past the call then ran the tct loop.

    This time I was quickly brought to the code I was looking for. As soon as it broke in, I ran k to view the callstack:

    kd> k

    ChildEBP RetAddr
    b9541a3c f7b7fab9 Ntfs!NtfsCommonDeviceControl+0x40
    b9541aa0 f7b8b02f Ntfs!NtfsFsdDispatchSwitch+0xe4
    b9541bbc 8081df85 Ntfs!NtfsFsdDispatchWait+0x1c
    b9541bd0 f7876d28 nt!IofCallDriver+0x45
    b9541bfc 8081df85 fltmgr!FltpDispatch+0x152
    b9541c10 f7876d28 nt!IofCallDriver+0x45
    b9541c3c 8081df85 fltmgr!FltpDispatch+0x152
    b9541c50 808f5437 nt!IofCallDriver+0x45
    b9541c64 808f61bf nt!IopSynchronousServiceTail+0x10b
    b9541d00 808eed08 nt!IopXxxControlFile+0x5e5
    b9541d34 808897bc nt!NtDeviceIoControlFile+0x2a
    b9541d34 7c82860c nt!KiFastCallEntry+0xfc
    0012e960 7c826fe9 ntdll!KiFastSystemCallRet
    0012e964 77e416f9 ntdll!NtDeviceIoControlFile+0xc
    0012e9c8 77e6738d kernel32!DeviceIoControl+0x137
    0012ec44 77e67032 kernel32!GetVolumeNameForRoot+0x16d
    0012ec68 77e67782 kernel32!BasepGetVolumeNameForVolumeMountPoint+0x73
    0012ecd0 7d20b01d kernel32!GetVolumePathNameW+0x1c7
    0012ef18 7d20ae2c ADVAPI32!CodeAuthzFullyQualifyFilename+0xbc
    0012eff8 7d20b33f ADVAPI32!__CodeAuthzpIdentifyOneCodeAuthzLevel+0x19f
    0012f07c 77e6df9e ADVAPI32!SaferIdentifyLevel+0x163
    0012f278 77e6ce03 kernel32!BasepCheckWinSaferRestrictions+0x60c
    0012fa90 77e424b0 kernel32!CreateProcessInternalW+0xc0e
    0012fac8 4ad0256f kernel32!CreateProcessW+0x2c
    0012fc24 4ad01a2b cmd!ExecPgm+0x221
    0012fc58 4ad019b3 cmd!ECWork+0x84
    0012fc70 4ad03c58 cmd!ExtCom+0x40
    0012fe9c 4ad01447 cmd!FindFixAndRun+0xa9
    0012fee0 4ad0c30b cmd!Dispatch+0x137
    0012ff44 4ad07786 cmd!main+0x216
    0012ffc0 77e6f23b cmd!mainCRTStartup+0x12f
    0012fff0 00000000 kernel32!BaseProcessStart+0x23

    If we look at the assembly around Ntfs!NtfsCommonDeviceControl+0x40, we see that only if our return from NtfsDecodeFileObject is not equal to 4 it would move 0xC000000D in to esi, and then move esi to eax, :

    f7b7faf9 e8e904fdff call Ntfs!NtfsDecodeFileObject (f7b4ffe7)
    f7b7fafe 83f804 cmp eax,4
    f7b7fb01 0f848873ffff je Ntfs!NtfsCommonDeviceControl+0x54 (f7b76e8f)

    Ntfs!NtfsCommonDeviceControl+0x40:
    f7b7fafe 83f804 cmp eax,4
    f7b7fb01 0f848873ffff je Ntfs!NtfsCommonDeviceControl+0x54 (f7b76e8f)
    f7b7fb07 be0d0000c0 mov esi,0C000000Dh
    f7b7fb0c 56 push esi
    f7b7fb0d 53 push ebx
    f7b7fb0e 57 push edi
    f7b7fb0f e83506fdff call Ntfs!NtfsCompleteRequest (f7b50149)
    f7b7fb14 8bc6 mov eax,esi

    I looked at the source code for these functions, and it didn’t make sense that a failure here would cause the problems we were seeing; especially specific to large executables. Out of curiosity I ran notepad on the VM again with procmon and found that it too displayed INVALID PARAMETER, but the program started and ran correctly:

    clip_image007

    Since this wasn’t the problem, I stopped reviewing the code and decided on a new approach. We knew that when running the EXE in a command prompt we received an “Access is denied message”. At that point it made sense to switch to user mode debugging and take a look at the cmd.exe process that was trying to launch install.exe

    Doing user mode debugging in a VM can be a bit of a challenge, especially if you are trying to minimize changes to the VM (and in my case, avoid putting any symbols on the customer’s VM image). Since I already had a kernel debugger attached, one option would be to run ntsd.exe (debugger provided in the Debugging Tools for Windows) on the VM with the -p switch specifying the PID of the cmd.exe process I wanted to debug and -d switch which forwards the i/o of ntsd to the kernel debugger. The problem with this approach is the kernel debugger just becomes a front end for issuing commands and seeing the output from ntsd. That means all symbol resolution is still done on the target system running ntsd.

    I didn’t want to give the customer VM Internet or corporate network access, so I instead opted to run dbgsrv.exe on the VM. Running “dbgsrv -t tcp:port=9999” tells the debug server to listen on TCP port 9999 for debugger connections. Then, on the HyperV server computer I could just run windbg -premote tcp:server=(IP of VM),port=9999 -p (PID of cmd on VM) to debug it.

    I suspected that we may be calling CreateProcess but it was failing, so I set a breakpoint on kernel32!CreateProcessW. Sure enough, it hit when I tried to run install.exe in the command prompt:

    0:000> k
    ChildEBP RetAddr
    0012fac8 4ad0256f kernel32!CreateProcessW
    0012fc24 4ad01a2b cmd!ExecPgm+0x221
    0012fc58 4ad019b3 cmd!ECWork+0x84
    0012fc70 4ad03c58 cmd!ExtCom+0x40
    0012fe9c 4ad01447 cmd!FindFixAndRun+0xa9
    0012fee0 4ad0c30b cmd!Dispatch+0x137
    0012ff44 4ad07786 cmd!main+0x216
    0012ffc0 77e6f23b cmd!mainCRTStartup+0x12f
    0012fff0 00000000 kernel32!BaseProcessStart+0x23

    This time I could use wt -oR since this was a usermode debug. Looking in ntstatus.h again, the code for STATUS_ACCESS_DENIED is 0xC0000022. Running wt can take a very long time, so I used the -l switch, which limits the number of levels deep it will display. This would be something like using tct as I did above until you were a few calls deep then using pct. Using wt -l 3 -oR gave me the following:


    575      291       [ 1] kernel32!CreateProcessInternalW
    35        0           [ 2] kernel32!BasepCheckWinSaferRestrictions
    25        0           [ 3] ntdll!RtlEnterCriticalSection eax = 0
    48        25         [ 2] kernel32!BasepCheckWinSaferRestrictions
    1          0           [ 3] ntdll!NtOpenThreadToken
    3          0           [ 3] ntdll!ZwOpenThreadToken eax = ffffffff`c000007c
    57        29         [ 2] kernel32!BasepCheckWinSaferRestrictions
    1          0           [ 3] ntdll!ZwOpenProcessToken
    3          0           [ 3] ntdll!NtOpenProcessToken eax = 0
    116      33         [ 2] kernel32!BasepCheckWinSaferRestrictions
    113      0           [ 3] ADVAPI32!SaferIdentifyLevel eax = 0
    130      146       [ 2] kernel32!BasepCheckWinSaferRestrictions
    4           0          [ 3] ntdll!ZwClose eax = 0
    132      150       [ 2] kernel32!BasepCheckWinSaferRestrictions
    22        0           [ 3] ntdll!RtlLeaveCriticalSection eax = 0
    138      172       [ 2] kernel32!BasepCheckWinSaferRestrictions
    5           0          [ 3] kernel32!__security_check_cookie eax = ffffffff`c0000022

     Now we are getting close! I set a new breakpoint for kernel32!BasepCheckWinSaferRestrictions and reran the test. This gave me the following line:

    63 0 [ 3] ADVAPI32!__CodeAuthzpCheckIdentityHashRules eax = ffffffff`c0000022

    One last run with a new breakpoint at ADVAPI32!__CodeAuthzpCheckIdentityHashRules and I found what I was looking for:

    58 218 [ 1] ADVAPI32!__CodeAuthzpEnsureMapped eax = ffffffff`c0000022

    The depth is shown in brackets. As this call was 1 deep from __CodeAuthzpCheckIdentityHashRules and I was using 3 as my maximum depth in wt, I knew this is where the STATUS_ACCESS_DENIED was coming from. I reviewed the source code and found that this is the code that performs Software Restriction Policy checking. Specifically, we are attempting to map the executable into memory to perform hash checking on it. Since there isn’t 1.8 GB of contiguous available memory, it failed. Looking at the VM, I discovered that the customer had implemented a number of software restriction policies. As a test, I removed their restrictions on the VM, and the programs ran successfully. A search of the KB revealed that a hotfix was published for this problem: 973825. Installing the hotfix in the article also resolved the issue with their policies intact.

    -Matt Burrough

  • Ntdebugging Blog

    We Need Your XPERF Feedback

    • 0 Comments

    Hi,

     

    This is Tate asking for your direct and valuable feedback on XPerf (The Windows Performance Toolkit).  We are knee deep in working with the creators of the tool (The Windows Fundamentals Team) to provide end user feedback which will drive features for the next version. Now that you as our readers have seen several blog posts on the utility and hopefully have used it for a few of your own debugs, what would you like to see in terms of new features, fixes, data collection steps, etc?

     

    Particular areas of interest to the team are:

     

    -        Data Collection – easy or difficult, what would you change?

    -        Usage Scenarios – stack walking is powerful, what would you like the tool to expose that it does not?

    -        Any blocking factors keeping you from using the toolkit, what are they, how could we remedy that for you?

    -        Data Analysis – what would make your life as a debugger using the toolkit better?

     

    Please send us your feedback below and thanks in advance.

     

    -Tate

  • Ntdebugging Blog

    Working on an Application Compatibility Issue? Let us Help!

    • 0 Comments

    This isn’t our typical debugging type article however I found the information very useful for the developer community.

    Windows 7 is slated to launch in the next few weeks. If your applications aren’t quite ready for Windows 7 and having you’re having issues we may be able to help you out!

    Last Monday we launched a new pilot program in our Advisory Services space.  Advisory Services is a consultative support option that provides support beyond standard break-fix issues.  The new program involves remote, phone-based support for issues such as product migration, code review or new program development.  The service is intended for Developers and IT Professionals for shorter engagements that don’t require traditional onsite consulting or sustained account management services available via other Microsoft support options.

    For the Application Compatibility engagements, we’ll start off with some basic scoping questions such as whether the application is 16-, 32-, or 64-bit.  Is it a client-server application?  What compatibility issues are you experiencing?  Slow Performance?  Hang or Crash?  Installation problems?  The support engineers will be using tools such as the Application Compatibility Toolkit, the the Standard User Analyzer Wizard, and the Setup Analysis Tool.

    The KB Article referenced below has more details about the program and how to engage Microsoft.  So, if you’re working on a pesky Windows Vista or Windows 7 Application Compatibility issue, give us a call – we can help!

     

    Additional Resources:

    Cheers,

    Ronsto

     

    Share this post :
  • Ntdebugging Blog

    Push Locks – What are they?

    • 0 Comments

    Pushlocks were a new locking primitive first introduced in Windows Server 2003 and are primarily used in place of spinlocks to protect key kernel data structures. Unfortunately, Pushlocks are not documented in the WDK, and are not available for public use; however, a few internal drivers do use them, so you might see them while debugging a machine. Also, I was digging around on MSDN for Pushlocks, and I found that the Filter Manager does expose certain APIs to use Pushlocks, so you are in luck if you are developing a filter driver!

     

    Gate objects

     

    Pushlocks are built on primitive gate objects, which are defined by KGATE structures. The gate object is a highly optimized version of the basic event object. By not having both the notification and synchronization versions of the basic event object, and by being the exclusive object that a thread can wait on, the code for acquiring and releasing a gate is heavily optimized. Gates even have their own dispatcher lock instead of acquiring the entire dispatcher database.

     

    Unlike spinlocks, which must be acquired exclusively for all operations on a data structure, pushlocks can be shared by multiple “readers” and need only be acquired exclusively when a thread needs to modify the data structure.

     

    Operation

     

    When a thread acquires a normal pushlock, the pushlock code marks the pushlock as owned, if it is not owned already. If someone owns the pushlock exclusively, or the thread wants to own the pushlock exclusively while someone else has it in shared mode, the thread allocates a wait block on its stack, initializes a gate object in the wait block, and then add the wait block to the wait list associated with the pushlock. When the thread holding the pushlock finally releases it, it wakes the next waiter by signaling the event in the waiters wait block.

     

    When debugging a machine, there is no easy way to figure out the current owner of the pushlock, apart from doing code review. By looking at the waitlist, you can always figure out the threads trying to get access to it, but since the gate does not keep track of the owner like a regular mutex, it is much harder to find the current owner.

    For more details on the operation and structure of a pushlock, please review the Pushlocks section in Windows Internals book, under the System Mechanisms Chapter.

     

    Advantages of using a PushLock

     

    If a pushlock is held by one or more readers, threads that want to modify the data structure are queued for exclusive access. This queuing mechanism provides some of the same benefits of queued spinlocks—for example, FIFO ordering, elimination of race conditions, and reduced cache thrashing when more than one thread is waiting for the pushlock.

     

    Another advantage of using a pushlock is the size. A regular resource object is 56 bytes, however a pushlock is the size of a pointer. Apart from a small memory footprint, this helps especially in the non-contended case, since pushlocks do not require lengthy operations to perform acquisition or release. Because the pushlock fits in one “machine word”, the CPU can use atomic operations to compare and exchange the old lock with the new one.

     

    Pushlocks are also self-optimizing in the sense that the list of threads waiting on a pushlock will be periodically rearranged to provide fairer behavior when the pushlock is released.

     

    Cache Aware Pushlocks

     

    A cache-aware pushlock adds to the basic pushlock by allocating a normal pushlock for each processor in the system and associating it with the cache-aware pushlock. When a thread wants to acquire a cache-aware pushlock for shared access, it simply acquires the pushlock on that processor; however if it needs to acquire the lock for exclusive access, it has to acquire the pushlocks for each processor in exclusive mode.

     

    What does a Pushlock look like?

     

    3: kd> !thread 8c9764c0

    THREAD 8c9764c0  Cid 2410.1be4  Teb: 7ff9f000 Win32Thread: e5c6f298 GATEWAIT

    Stack Init b386b000 Current b386a978 Base b386b000 Limit b3867000 Call 0

    ChildEBP RetAddr  Args to Child             

    b386a990 80833485 8c9764c0 8c9764e4 00000003 nt!KiSwapContext+0x26 (FPO: [Uses EBP] [0,0,4])

    b386a9bc 8082ffe0 b06a6a03 e11e0b18 b386aa54 nt!KiSwapThread+0x2e5 (FPO: [Non-Fpo]) (CONV: fastcall)

    b386a9e4 8087d722 00000000 e11e0b08 e11e0b18 nt!KeWaitForGate+0x152 (FPO: [Non-Fpo]) (CONV: fastcall)

    e11e0b18 00000000 0c050204 7346744e e37b2808 nt!ExfAcquirePushLockExclusive+0x112 (FPO: [Non-Fpo]) (CONV: fastcall)

     

    Above is a snipped output from a dump that I was recently looking at. From the stack, you can see the ExfAcquirePushLockExclusive call trying to acquire the pushlock, which then calls KEWaitForGate. In this case, the lock was already acquired, so this thread allocated a wait block on its stack, and then added itself to the waitlist.

    Also, the stack is broken due to the fastcall, therefore the debugger cannot display it entirely. So we can manually reconstruct the stack by passing parameters to the kb command.

    k[b|p|P|v] = BasePtr StackPtr InstructionPtr

     

    To get the arguments, we first dump the stack manually using the dps command with the current esp.

    3: kd> dps b386a978 l50

    b386a978  b386ad40

    b386a97c  00000000

    b386a980  8088dafe nt!KiSwapContext+0x26

    b386a984  b386a9bc

    b386a988  b386aa00

    b386a98c  f773f120

    b386a990  8c9764c0

    b386a994  80833485 nt!KiSwapThread+0x2e5

    b386a998  8c9764c0

    b386a99c  8c9764e4

    b386a9a0  00000003

    b386a9a4  8c9764c0

    b386a9a8  00000003

    b386a9ac  00000002

    b386a9b0  00000002

    b386a9b4  f773fa7c

    b386a9b8  008c0030

    b386a9bc  b386a9e4

    b386a9c0  8082ffe0 nt!KeWaitForGate+0x152

    b386a9c4  b06a6a03

    b386a9c8  e11e0b18

    b386a9cc  b386aa54

    b386a9d0  00000000

    b386a9d4  8c976504

    b386a9d8  00000000

    b386a9dc  0000001c

    b386a9e0  00000000

    b386a9e4  b386aa40

    b386a9e8  8087d722 nt!ExfAcquirePushLockExclusive+0x112

    b386a9ec  00000000

    b386a9f0  e11e0b08

    b386a9f4  e11e0b18

    b386a9f8  b386aa40

    b386a9fc  8096e9a9 nt!SeOpenObjectAuditAlarm+0x1cf

    b386aa00  00040007

    b386aa04  00000000

    b386aa08  8c976568

    b386aa0c  8c976568

    b386aa10  b06a6a00

    b386aa14  b4ee0a00

    b386aa18  b127cc10

    b386aa1c  00000000

    b386aa20  00000001

    b386aa24  80a60456 hal!KfLowerIrql+0x62

    b386aa28  b386ac04

    b386aa2c  8d117800

    b386aa30  00000000

    b386aa34  00000000

    b386aa38  b386aa20

    b386aa3c  01943080

    b386aa40  b386aa64

    b386aa44  808b7a14 nt!CmpCheckRecursionAndRecordThreadInfo+0x2a

     

    From the output above, we can see the stack. To reconstruct the stack, we can get the ebp, esp, and eip from the stack for the ExfAcquirePushLockExclusive frame, and pass it to the kb command. Voila!

     

    3: kd> kb = b386aa40 b386a9e4 8087d722

    ChildEBP RetAddr  Args to Child             

    b386aa40 808b7a14 b386ac04 e11e0b18 e11e0b18 nt!ExfAcquirePushLockExclusive+0x112

    b386aa64 808b7b09 e11e0b18 b386aa80 e101bf40 nt!CmpCheckRecursionAndRecordThreadInfo+0x2a

    b386aaa4 808da118 0000001c b386ab58 00000001 nt!CmpCallCallBacks+0x6b

    b386ab90 80937942 e101bf40 00000000 89f13648 nt!CmpParseKey+0xd4

    b386ac10 80933a76 00000000 b386ac50 00000040 nt!ObpLookupObjectName+0x5b0

    b386ac64 808bb471 00000000 8e930480 00000d01 nt!ObOpenObjectByName+0xea

    b386ad50 808897bc 0243eba0 00020019 0243eb68 nt!NtOpenKey+0x1ad

    b386ad50 7c8285ec 0243eba0 00020019 0243eb68 nt!KiFastCallEntry+0xfc

    WARNING: Frame IP not in any known module. Following frames may be wrong.

    0243eba4 00000000 00000000 00000000 00000000 0x7c8285ec

     

     

    Share this post :

  • Ntdebugging Blog

    Part 2 - Exploring and Decoding ETW Providers using Event Log Channels

    • 0 Comments

    Introduction and Overview

    In this article we will explore a practical use for ETW tracing, and discover what ETW (Event Tracing for Windows) tracing is available for a popular Windows user-mode component, Internet Explorer. In my previous article ETW Introduction and Overview, we covered what ETW tracing is and how it could be used.

    The goal in this exercise is to learn about ETW tracing in general, how to self-discover what tracing is available in a component, and some ways you can leverage the tracing to self-troubleshoot issues. ETW logging is essentially allowing Microsoft code to speak for itself. It tells you what code ran, what that code did, and any errors produced. Also ETW logs can be used along with a more traditional toolset for troubleshooting that class of issue such as the SysInternals tools, Network Monitor, etc.

    For this article, we are using the RC Build of Windows 7. The concepts and examples should be very similar and work fine for Windows Vista.

    Getting Started “Exploring”

    First we start off by opening Internet Explorer 8 whose home page is set to the NTDebugging Blog - http://blogs.msdn.com/ntdebugging

    clip_image001

    In order to find what Internet Explorer is logging with ETW, we list all of the registered ETW providers on a box, which includes all of the installed code on the machine (the component doesn’t need to be actively running).

    We start off by running “logman query providers” and look through the list for relevant hits. However, the results below return more than 400+ hits on Vista, and 600+ on Win7. You will probably quickly see though that searching through this large list of providers might not always be best way to go about finding which providers Internet Explorer is logging with.

    clip_image003

    Using this list we can also filter or browse for our component. This may work just fine, however, one issue with this approach is that sometimes you may not know the particular naming convention for a process. Also, many software components use shared dlll’s to do a variety of the under the hood work. As such, you may not know what each of those pieces of software are, or how they are named.

    Here we try running a built-in cmd line tool logman ‘query providers’ and searching for “Internet”, which didn’t turn up anything on this particular search. It looks like the naming convention for Internet Explorer might be a bit different than our first search.

    clip_image005

    While we could continue to browse the large list of providers, I have decided to use another useful method to find which providers IE uses, which is to filter all providers that a certain ProcessId uses. We start out by getting the ProcessId of Internet Explorer using two methods shown here, Task Manager and the command line tasklist filtering on iexplore.exe.

    clip_image007

    clip_image009

    Now that we know Internet Explorer is running with process id 6200, we can do another query with logman to find out the ETW providers.

    Internet Explorer ETW Providers

    ‘Logman query providers –pid 6200’ is used to list all of the user-mode ETW providers that Internet Explorer uses and the associated GUIDs (GUIDs are the Globally Unique ids that enable tracing for a component).

    In looking at the screenshot below, notice that a wide variety of providers are shown, and at first glance, you may wonder how some relate to Internet Explorer. As you investigate further though, all these components enable certain features within Internet Explorer, but even with tracing enabled, may not log anything unless you specifically use the part of Internet Explorer which runs that code at run-time.

    It is also important to note that Internet Explorer will also use a lot of kernel services to eventually do its work, such as NDIS. There is ETW tracing for these kernel components, but would not show up under a specific process as they are used by all processes at the kernel level. Here we see a specific component of interest that we want to follow – Microsoft-Windows-WinINet. A quick Bing search of WinINet turns up this on MSDN, which sounds relevant.

    Extracted from MSDN: The Microsoft Windows Internet (WinINet) application programming interface (API) enables applications to access standard Internet protocols, such as FTP and HTTP. For ease of use, WinINet abstracts these protocols into a high-level interface.

    So Microsoft-Windows-WinINet looks like a good component to enable tracing in.

    clip_image011

    From using SysInternals Process Explorer you can also confirm and that WININET.dll is loaded into the address space of iexplore.exe as shown below.

    clip_image013

    Now that we know the ETW Provider name and GUID, we will launch Computer Mgmt and turn on tracing for that component. Note that not every Provider can be decoded this way (due to architectural and security reasons), but many can, and at the very least, the log can be enabled and provided to Microsoft Support to fully decode.

    Launch Computer Management by typing ‘compmgmt.msc’ into the start search box or right clicking Manage on Computer Management.

    clip_image014

    Enabling ETW Logging

    Once in Computer Management, Navigate to and click on Event Viewer -> Applications and Services Logs. Once there, make sure View -> Show Analytic and Debug Logs is enabled as shown below, which will provide a much greater set of logs to look at. Most logs are under Event Viewer -> Applications and Services Logs -> Microsoft -> Windows, where we will spend the majority of our time. There are generally four channels that can show up under each provider. Channels are targeted at different roles. Admin and Operational channels target IT professionals and Administrators and are enabled by default, while Analytic and Debug Channels are more in depth, and not usually enabled by default.

    The logs that show up here are XML Manifest based tracing called Windows Events. ETW tracing splits up the collection and decoding of traces into two separate steps for performance and security reasons. In the manifest based tracing first available with Vista, most components and events are defined in a XML manifest compiled along with the binary, and which are defined in a resource file language dll. Most user-mode resource dlls show up under C:\Windows\System32\en-US (for English US). The good part about manifest based tracing is that many logs are self-discoverable and customer decodable!

    clip_image016

    Next we will browse to: Event Viewer -> Applications and Services Logs -> Microsoft -> Windows -> WinINet, right click on the Analytic channel, and choose ‘Enable Log’. Remember that we had to enable this log because Analytic and Debug logs are not enabled by default. Enabling the channel will automatically enable the ETW tracing for that component and logs will start being decoded similar to the Event Log.

    clip_image017

    Capture and view WinINet logging for our scenario

    After enabling the log, I am going to close Internet Explorer which is running code that is now logging using ETW and navigate to the website http://blogs.msdn.com/ntdebugging.

    clip_image001[1]

    Now we can look at the result of our work and tracing, which decodes the tracing for us. Below, we are on the WININET_REQUEST_HEADER category, showing the GET request. Note that the tracing here is showing us proxy interaction, DNS requests and responses, cookies, TCP requests/responses, as well as HTTP requests and responses.

    Now, one might be able to get the same information using a network capturing program such as NetMon or WireShark, but the nice thing about the ETW tracing here is that is

    a) In box without having to install additional tools

    b) The aggregate view of what the WinINet component is seen from its point of view (composed of DNS, TCP, Proxy, Cookies, etc).

    This is a good example of using this tracing in combination with other troubleshooting tools (if required), which may or may not be useful depending on the scenario.

    clip_image019

    Dumping the trace log out using other methods:

    You can use the Event Viewer GUI, or if you prefer to view the log in a text editor (or spreadsheet). You can dump the log to text or CSV format using a couple of different methods.

    The first method is using the Windows Event Log GUI to export or save the event log

    clip_image020

    Or you can output the entire log to screen in human readable text format:

    clip_image022

    We can also output to the screen filtering for any messages with error level using an XPath Query. We can start right away if you know the XPath Syntax for your query, or you can use the GUI to help build the query for you. Once you choose ‘Filter Current Log’ you can filter by Event Level, or a variety of other criteria.

    clip_image024

    The XML view will show you the raw XPATH syntax needed to use wevtutil to query for only Error events.

    clip_image025

    Now we actually run the XPath query using ‘wevtutil query-events Microsoft-Windows-WinINet/Analytic /f:Text "/q:*[System[(Level=2)]]"’

    clip_image027

    If you prefer a more complex format containing processors, threads, etc, then you can output to a csv log file. This is useful for sorting, filtering, comparing, and doing more complex analysis on the log file entries.

    clip_image029

    Here we use ‘tracerpt c:\windows\system32\winevet\logs\Microsoft-Windows-WinINet%4Analytic.etl –of CSV –o c:\temp\Microsoft-Windows-WinINet%4Analytic.csv’ to output to a CSV.

    clip_image031

    Finally, you can use PowerShell v2 to dump out the log. The PowerShell scenario is interesting, because you can write powerful scripts around the event logs.

    clip_image033

    Look at ‘get-help get-winevent –examples” for many more really cool examples about how to list and filter these logs:

    clip_image035

    Behind the Scenes

    We also need to explore behind the scenes of what is happening when you enable the analytic channel of the WinInet provider. Part of the definition of that channel is to enable an ETW Trace Session with certain pre-defined Keywords (Flags) and Levels. Keywords usually specify functional sub-components (HTTP, COOKIES, CONNECTION, etc) while Levels control the level of detail (Error, Informational, Verbose) events are logged. This is useful to fine-tune logging, but the channels make it easy not to have to mess with these unless required.

    Under Performance -> Data Collector Sets -> Event Trace Sessions we see a new session created for us and auto-named “EventLog-Microsoft-Windows-WinINet-Analytic”. This ETW logger was automatically created for us when we enabled the WinInet Analytic channel. You can also manually create the logger, which we may cover in another blog post. If you open up the properties for this session, you'll notice our provider listed as well as all the Keywords (Any) and Level specified. The screenshot also show which Keywords/Flags or categories a provider supports.

    clip_image037

    Once you know a provider name, you can also query for it on the command line, which is useful to get all the Keywords(Flags) and Levels a provider supports, along with the processes that are using that provider.

    clip_image039

    For the curious, we wanted to finish this post and show you how to find what tracing is available, and its format. Here we again use the wevtutil to get full publishing event and message information.

    ‘wevtutil get-publisher Microsoft-Windows-WinInet /getevents /getmessage’

    clip_image041

    This can be combined with another version of the same utility searching for a certain string or log – e.g. “wevtutil gp Microsoft-Windows-WinINet /ge /gm | findstr /i httpopen”

    clip_image043

    To wrap-up, here is the overall architecture of ETW covered in a previous blog post, along with the specifics of the scenario we covered. In our scenario the MMC Event Viewer console has two roles – enabling the tracing and decoding the real-time delivery of events.

    clip_image045

    Where to go from here?

    This was meant as a practical introduction to using and self-discovering how to use the ETW tracing that is available right now in newer Windows releases (Vista/2008 and above). There are many, many more components that use tracing – 400+ in Vista and 600+ in Win7, which you can start exploring.

    For these specific examples, it is worth noting that as far as the networking aspect is concerned, similar information might be obtained from using a packet capture utility such as NetMon. What is interesting about NetMon in Win7 is that it actually uses ETW under the hood to do most of its tracing. Here is an interesting post about Network Tracing in Windows 7 from the Netmon team.

    The use of ETW under the hood of many tools is a general trend in Windows and Microsoft software. You might notice different diagnostics and tracing tools in Windows using ETW under the hood. These tools can add value to the raw ETW tracing such as further parsing, filtering, and rich views into the data. You can mix and match these tools along with the tracing for powerful views into your Windows box or server.

    It’s also important to note that ETW tracing isn’t just for Microsoft software. Since any software running on Windows uses many services provided by the OS, there is a variety of ETW logging that is available for all software. For example, try running ‘logman query providers –pid PROCESSID_OF_FIREFOX’ on FireFox! You should find a great deal of logging similar to Internet Explorer.

    In future posts, we hope to explore other ways of using ETW tracing, and touch on some of the kernel level tracing available.

     

     

    Share this post :

  • Ntdebugging Blog

    WMI Nugget: How to Gather the Provider Binary from a WMI Class Name

    • 0 Comments

    It's Venkatesh with a WMI nugget. While troubleshooting or debugging WMI issues you may come across WMI queries wherein you don’t know which provider implemented the WMI class used in the query. You may want to know the binary and the product that implemented the provider so you can contact the vendor or upgrade the binary to see if it resolves the issue you are investigating.

    To help make the investigation easier, Windows Vista and all post operating systems have the cool new 'Analytic and Debug log' for WMI tracing. To get similar tracing in Windows 2003 and earlier versions, WMI has a “Verbose Logging” option enabled through the WMI control (WmiMgmt.msc). Check out this blog for detailed steps to enable the WMI Analytic and Debug -

    http://blogs.msdn.com/wmi/archive/2009/05/27/is-wmiprvse-a-real-villain.aspx

    After enabling WMI logging you will see events like the ones pictured below. There may be a particular WMI query relevant to the issue you're troubleshooting requiring you to know the provider binary for the WMI class in the query. Apart from the class name (InstProvSamp) in the query you need the WMI namespace to which this class belongs to such as '\\.\root\default' as shown below.

    image

    Fig 1: Event ID 1 in the Windows Vista’s Event Viewer\Windows Logs\Applications and Services Logs\Microsoft\Windows\WMI-Activity\Trace log

    Now we have the WMI Namespace and the class name so let’s find the provider for it.

    1. Get the provider name from the Class Name and Namespace:

    Every WMI class has a qualifier “provider” containing the name of the provider to which it belongs. Using wbemtest.exe connect to the namespace we mentioned above(\\.\root\default). Then get the class information for our class, InstProvSamp using the “Open Class…” button. After clicking "OK" the Object editor for InstrProvSamp displays the class information for InstProvSamp including the name of the provider it belongs to (CIM_STRING qualifier “provider” as shown below).

    image

    image

    2. Get the class identifier (CLSID) of the provider from the provider name:

    For every WMI provider WMI maintains an instance of system class __Win32Provider which contains information about the physical implementation of the provider like CLSID which contains the GUID that represents the class identifier (CLSID) of the provider COM object. Now query the repository to get the provider information for “InstProvSamp” provider as I did below.

    image

    image 

    image

    3. Get the provider binary from the provider’s CLSID:

    During registration every COM object adds an entry under HKLM\SOFTWARE\Classes\CLSID. Using regedit.exe you can browse to provider’s COM registration information using the CLSID obtained from Step 2 as shown below.

    image

    This should help you connect the dots while debugging your WMI issue. Let me know if you have additional questions!

    Share this post :
  • Ntdebugging Blog

    Ntdebugging on Twitter

    • 0 Comments

    We're now on Twitter @ http://twitter.com/ntdebugging. If you want to chime in with feedback or suggestions for future articles feel free to drop us a "tweet". I thought this was would give us a better venue to interact with you guys.

     Cheers,

    Ron Stock

  • Ntdebugging Blog

    Debug Nugget: DumpConfigurator Utility

    • 0 Comments

    Hi - my name is Naresh and I am a Sr. Escalation Engineer on the Microsoft GES platforms team. Today I'm discussing a simple, yet powerful GUI tool used to configure a Windows system locally or remotely for a memory dump. The name of the tool is DumpConfigurator.hta and it can be accessed from CodePlex.  Check out the following Microsoft KB article which references the use of the tool.

    969028  How to generate a kernel or a complete memory dump file in Windows Server 2008

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;969028

     

    The tool can be used with all currently supported versions of the Windows Operating System. Once you download it, launch it with Administrator privileges to get the following UI: 

    GUI

    The GUI is self-explanatory and all the settings can be edited and saved by clicking Save Settings. The system will have to be rebooted for the settings to take effect.
    NOTE: Read the Warranty Disclaimer for the tool before use:)

     

    Share this post :
  • Ntdebugging Blog

    Debugging a Bugcheck 0xF4

    • 0 Comments

    My name is Ryan Mangipano (ryanman) and I am a Sr. Support Escalation Engineer at Microsoft.  Today’s blog will be a quick walkthrough of the analysis of a bugcheck 0xF4 and how I determined that the action plan going forward should consist of enabling pool tagging on this system.

     

    I began my review with !analyze –v. From the output I can see that a process required for the system to function properly unexpectedly exited or was terminated. The goal of this debugging session will be to determine what failed and why.

     

    0: kd> !analyze -v

    *******************************************************************************

    *                                                                             *

    *                        Bugcheck Analysis                                    *

    *                                                                             *

    *******************************************************************************

     

    CRITICAL_OBJECT_TERMINATION (f4)

    A process or thread crucial to system operation has unexpectedly exited or been

    terminated.

    Several processes and threads are necessary for the operation of the

    system; when they are terminated (for any reason), the system can no

    longer function.

    Arguments:

    Arg1: 00000003, Process                                A value of 0x3 in this parameter indicated that it was a process that terminated, not a thread

    Arg2: 8a03ada0, Terminating object        This value is a pointer to the _EPROCESS object that terminated

    Arg3: 8a03af14, Process image file name              Process Name

    Arg4: 805d1204, Explanatory message (ascii)         text message about the problem

     

    We shall begin by dumping out all the parameters of the bugcheck. Let’s dump out the "Terminating Object" below

    0: kd> !object 8a03ada0

    Object: 8a03ada0  Type: (8a490900) Process

        ObjectHeader: 8a03ad88 (old version)

        HandleCount: 3  PointerCount: 228

     

    First, let’s dump out the process image file name from the bugcheck parameter 3 above.

    0: kd> dc 8a03af14

    8a03af14  73727363 78652e73 00000065 00000000  csrss.exe

     

     

    0: kd> dt _EPROCESS 8a03ada0 imageFileName

    CSRSRV!_EPROCESS

       +0x174 ImageFileName : [16]  "csrss.exe"

     

    Notice that if we add the base of the _EPROCESS object (8a03ada0- Parameter 2)to the offset of the imageFileName field (+0x174) we get parameter 3. The imageFileName field.

    0: kd> ? 8a03ada0+0x174

    Evaluate expression: -1979470060 = 8a03af14

     

    0: kd> dc 8a03af14

    8a03af14  73727363 78652e73 00000065 00000000  csrss.exe.......

     

    Let’s dump out the ASCII message from parameter number 4

    0: kd> dc 805d1204

    805d1204  6d726554 74616e69 20676e69 74697263  Terminating crit

    805d1214  6c616369 6f727020 73736563 25783020  ical process

     

    Let’s review the debugger help file for more information. We can see that this bugcheck occurs when a critical process or thread terminates. “Several processes and threads are necessary for the operation of the system. When they are terminated for any reason, the system can no longer function. “

    0: kd> .hh bug check 0xf4

     

     

     

     

     

     

    Next, we need to determine why this process terminated. !analyze –v also provided us with an exception record which provides us with an error code:

     

    PROCESS_NAME:  csrss.exe

     

    EXCEPTION_RECORD:  9a85e9d8 -- (.exr 0xffffffff9a85e9d8)

    ExceptionAddress: 7c92c375 (ntdll!RtlFindMessage+0x0000007c)

       ExceptionCode: c0000006 (In-page I/O error)

      ExceptionFlags: 00000000

    NumberParameters: 3

       Parameter[0]: 00000000

       Parameter[1]: 7c99c3d8

       Parameter[2]: c000009a

    Inpage operation failed at 7c99c3d8, due to I/O error c000009a

     

    EXCEPTION_CODE: (NTSTATUS) 0xc0000006 - The instruction at 0x%p referenced memory at 0x%p. The required data was not placed into memory because of an I/O error status of 0x%x.

     

    Since we have an error code, let’s investigate that error code. We can quickly perform this operation from within the debugger using the !error command

     

    0: kd> !error c000009a

    Error code: (NTSTATUS) 0xc000009a (3221225626) - Insufficient system resources exist to complete the API.

     

    0: kd > .hh !error

     

     

    Let’s check the output of the !vm command

     

    0: kd> !vm 2

     

    *** Virtual Memory Usage ***

                    Physical Memory:      760875 (   3043500 Kb)

                    Page File: \??\C:\pagefile.sys

                      Current:   4190208 Kb  Free Space:   4156380 Kb

                      Minimum:   4190208 Kb  Maximum:      4190208 Kb

                    Available Pages:      579241 (   2316964 Kb)

                    ResAvail Pages:       673481 (   2693924 Kb)

                    Locked IO Pages:          69 (       276 Kb)

                    Free System PTEs:     115226 (    460904 Kb)

                    Free NP PTEs:              0 (         0 Kb)

                    Free Special NP:           0 (         0 Kb)

                    Modified Pages:          221 (       884 Kb)

                    Modified PF Pages:       219 (       876 Kb)

                    NonPagedPool Usage:    65534 (    262136 Kb)

                    NonPagedPool Max:      65536 (    262144 Kb)

                    ********** Excessive NonPaged Pool Usage *****

                    PagedPool 0 Usage:     24167 (     96668 Kb)

                    PagedPool 1 Usage:       967 (      3868 Kb)

                    PagedPool 2 Usage:       967 (      3868 Kb)

                    PagedPool 3 Usage:       984 (      3936 Kb)

                    PagedPool 4 Usage:       977 (      3908 Kb)

                    PagedPool Usage:       28062 (    112248 Kb)

                    PagedPool Maximum:     92160 (    368640 Kb)

     

                    ********** 2075 pool allocations have failed **********

     

                    Session Commit:         1562 (      6248 Kb)

                    Shared Commit:          2526 (     10104 Kb)

                    Special Pool:              0 (         0 Kb)

                    Shared Process:         4821 (     19284 Kb)

                    PagedPool Commit:      28062 (    112248 Kb)

                    Driver Commit:          5138 (     20552 Kb)

                    Committed pages:      153449 (    613796 Kb)

                    Commit limit:        1767229 (   7068916 Kb)

     

    0: kd> !poolused

    unable to get PoolTrackTable - pool tagging is disabled, enable it to use this command

    Use gflags.exe and check the box that says "Enable pool tagging".

     

    The output above has informed us that pool tagging is disabled. Let’s demonstrate how you can verify that it is disabled:

     

    0: kd> dd nt!NtGlobalFlag L1

    805597ec  00000000

     

    0: kd> !gflag

    Current NtGlobalFlag contents: 0x00000000

     

    Let’s explore the debugging help file entry on the !poolused command

     

    0: kd > .hh !poolused

     

     

     

    Reading the text above, we are informed that “Pool tagging is permanently enabled on Windows Server 2003 and later versions of Windows. On Windows XP and earlier versions of Windows, you must enable pool tagging by using Gflags.”

     

     

    Using the vertarget command, I can see that this system was running Windows XP.

    0: kd> vertarget

    Windows XP Kernel Version 2600 (Service Pack 2) MP (2 procs) Free x86 compatible

     

     

     

    0: kd > .hh !gflag

     

     

     

    By reviewing the help file entry for the !gflag extension, I was able to determine that if pooltagging was set, the following bit would have been set:

     

    0x400 "ptg"         Enable pool tagging.

     

    0: kd> .formats 0x400

    Evaluate expression:

    ….

      Binary:  00000000 00000000 00000100 00000000                0x00000400

     

    Gflags is included in the Debugging Tools for Windows. The screenshot below is from a Windows 7 system. Notice that Pool Tagging is enabled permanently as described above.

     

     

     

    Summary: This system bugchecked when the critical process csrss.exe failed an I/O operation due to insufficient non-paged pool. For an action plan, we recommended the use of gflags to enable pool tagging in order to obtain more information about pool consumption.

     

     

    Share this post :

Page 21 of 24 (235 items) «1920212223»