Hey there NTDEBUGGERS my name is Randy Monteleone and today we are going to talk about IRPs. In the past we have talked about the IRP structure in passing and showed a field here and there that can be pulled out and used to find answers to stalled IO. I was recently working on a debugger extension and found something interesting in the IRP I was looking at. I had been looking at a !irp output much like the one pictured below. I found that I was asking myself what do the "Success Error Cancel" fields mean?
After doing some digging and working with a few of my co-workers we found the mystery to the meaning behind these words and why we see them in our output. Lets break this IRP down starting with the ">" marker that indicates the current stack frame. In the output below you see this marker is indicating that we are working on something in partmgr.
1: kd> !irp fffffa809a1f3af0
Irp is active with 9 stacks 5 is current (= 0xfffffa809a1f3ce0)
Mdl=fffffa814e9c4f40: No System Buffer: Thread fffffa80d05b67e0: Irp stack trace.
cmd flg cl Device File Completion-Context
[ 0, 0] 0 0 00000000 00000000 00000000-00000000
Args: 00000000 00000000 00000000 00000000
>[ 4,34] 1c e0 fffffa80920c0060 00000000 fffff880010301b0-00000000 Success Error Cancel <-- What are you trying to tell me?
Args: 00001000 00000000 14625000 00000000
[ 4, 0] 1c e0 fffffa80920c0b90 00000000 fffff88001063010-fffffa80b01eec00 Success Error Cancel
Args: 3993d69568d 00000000 14625000 00000000
[ 4, 0] c e0 fffffa80b01eeab0 00000000 fffff88001d59150-fffffa80d05b2180 Success Error Cancel
Args: 00001000 00000000 3993d69568a 00000000
[ 4, 0] c e1 fffffa80d05b2030 00000000 fffff88001845344-fffff8800d52bb38 Success Error Cancel pending
Args: 00001000 00000000 14615000 00000000
[ 4, 0] 0 0 fffffa80c2829030 fffffa80d04f18a0 00000000-00000000
Args: 00001000 00000000 00002000 00000000
In the example below we see the “Success Error Cancel” fields and in this case we can also see that the pending flag has been set. The pending field indicates that STATUS_PENDING was returned to the caller. This is used so that I/O completion can determine whether or not to fully complete the I/O operation requested by the packet. Drivers can do this by calling IoMarkIrpPending.
Now look at the "cl" column as it holds the key to unlocking what "Success Error Cancel pending" really means.
1: kd> !irp 0xfffffa80`920c2340
Irp is active with 3 stacks 3 is current (= 0xfffffa80920c24a0) <--- _IO_STACK_LOCATION
Mdl=fffffa814e9c4f40: No System Buffer: Thread 00000000: Irp stack trace.
[ f, 0] 1c 0 fffffa809209a060 00000000 fffff880010061a0-fffffa80920bfcc0
Args: fffffa80920c22b0 00000000 00000000 fffffa80920bfcc0
>[ f, 0] 1c e1 fffffa809209a060 00000000 fffff88001a01a00-fffffa80920c2190 Success Error Cancel pending
Args: fffffa80920c22b0 00000000 00000000 fffffa80920bfcc0
Focusing on the cl column we see that our active stack frame is working in MPIO, but what is it trying to tell us about its status and intent? Well to figure that out we need to break down the value "e1" we see listed in our cl or Control column. This field is being used as a flag where the high and low parts represent two different values. We get this value in !irp from an _IO_STACK_LOCATION Control member. This can be found by dumping the stack location displayed by !irp.
1: kd> dt_IO_STACK_LOCATION 0xfffffa80920c24a0
+0x000 MajorFunction : 0xf ''
+0x001 MinorFunction : 0 ''
+0x002 Flags : 0x1c ''
+0x003 Control : 0xe1 '' < -- Control flag
+0x008 Parameters : <unnamed-tag>
+0x028 DeviceObject : 0xfffffa80`9209a060 _DEVICE_OBJECT
+0x030 FileObject : (null)
+0x038 CompletionRoutine : 0xfffff880`01a01a00 long CLASSPNP!TransferPktComplete+0
+0x040 Context : 0xfffffa80`920c2190 Void
So now that we know where we get this value we still have to decode what it means and how this value results in what we see in our !IRP output. Let's split our number into two, our high and low parts. In our example above we had the value e1. Taking our value apart gives us an E and a 1. The first bit of our value will indicates the last returned value as our IRP was processed. In our case this is a 1, meaning that our IRP is Pending. Thus we see the word pending at the end of current stack frame in !irp. There are three possible values for this bit, 0, 1 or 2. 0 being nothing, 1 being pending and 2 being Error Returned.
Moving to our high bits we are left with the "e". Lets take this number and convert it to binary. We end up with 1110. This upper number indicates which invoke types were requested for the completion routine for the driver listed on that stack frame. If we look at MSDN we see that IoSetCompletionRoutine takes in three bool values to set this flag. These options specify whether the completion routine is called if the IRP is completed with that corresponding status.
_In_ PIRP Irp,
_In_opt_ PIO_COMPLETION_ROUTINE CompletionRoutine,
_In_opt_ PVOID Context,
_In_ BOOLEAN InvokeOnSuccess,
_In_ BOOLEAN InvokeOnError,
_In_ BOOLEAN InvokeOnCancel
Doing some source review I was able to pin together how these values translate to the !irp output. If we look back at our binary value of "e" we see that we have a set of bits that get set based on the what the driver wanted to do when an IRP was completed with one of our defined status values.
Cancel = 2
Success = 4
Error = 8
Add each of these values up and the sum is 14 or e in hex. Going back to our binary "1110" we see that our first bit is set to zero or off. This would be the pending and Error Returned values that I mentioned above. The next three bits represent the corresponding Cancel, Success and Error bool values passed to the driver at the time IoSetCompletionRoutine was called for this stack location.
Example : IoSetCompletionRoutine(pirp, pCompletionRoutine,pContex, True,False,True); would yield a value of 6.
Remember the important thing here is not that the !irp output is trying to tell us that one of these things happened. It's telling us that this driver would like to be notified if one of those things does happen. This area also provides us with information if pending or or an error is returned.
Well that's all I have for now, for more documentation for the _IRP and _IO_STACK_LOCATION structures please see the following links to MSDN.
More on IRP: http://msdn.microsoft.com/en-us/library/windows/hardware/ff550694(v=vs.85).aspx
More on _IO_STACK_LOCATION: http://msdn.microsoft.com/en-us/library/windows/hardware/ff550659(v=vs.85).aspx
I was working on a debug the other day when I ran the “!vm” command and saw that the system had some 48,000 commit requests that failed. This was strange as the system was not out of memory and the page file was not full. I was left scratching my head and thinking “I wish I knew where !vm got that information from.” So I went on a quest to find out, here is what I found.
13: kd> !vm 1
*** Virtual Memory Usage ***
Physical Memory: 12580300 ( 50321200 Kb)
Page File: \??\C:\pagefile.sys
Current: 50331648 Kb Free Space: 50306732 Kb
Minimum: 50331648 Kb Maximum: 50331648 Kb
Available Pages: 4606721 ( 18426884 Kb)
ResAvail Pages: 12189247 ( 48756988 Kb)
Locked IO Pages: 0 ( 0 Kb)
Free System PTEs: 33460257 ( 133841028 Kb)
Modified Pages: 20299 ( 81196 Kb)
Modified PF Pages: 6154 ( 24616 Kb)
NonPagedPool 0 Used: 19544 ( 78176 Kb)
NonPagedPool 1 Used: 22308 ( 89232 Kb)
NonPagedPool Usage: 53108 ( 212432 Kb)
NonPagedPool Max: 9408956 ( 37635824 Kb)
PagedPool 0 Usage: 168921 ( 675684 Kb)
PagedPool 1 Usage: 4149241 ( 16596964 Kb)
PagedPool 2 Usage: 17908 ( 71632 Kb)
PagedPool Usage: 4336070 ( 17344280 Kb)
PagedPool Maximum: 33554432 ( 134217728 Kb)
Session Commit: 3438 ( 13752 Kb)
Shared Commit: 6522 ( 26088 Kb)
Special Pool: 0 ( 0 Kb)
Shared Process: 53597 ( 214388 Kb)
PagedPool Commit: 4336140 ( 17344560 Kb)
Driver Commit: 5691 ( 22764 Kb)
Committed pages: 5565215 ( 22260860 Kb)
Commit limit: 25162749 ( 100650996 Kb)
********** 48440 commit requests have failed **********
It turns out that this calculation is from a global ULONG array named “nt!MiChargeCommitmentFailures”. The array has 4 members and they are used to trace the types of commit failures that have taken place. This is done by first calculating the new commit size NewCommitValue = CurrentCommitValue + SystemReservedMemory. Based on this calculation commit errors are tracked in a few different ways, which are listed below with the corresponding member in the array that is incremented.
MiChargeCommitmentFailures - If the system failed a commit request and an expansion of the pagefile has failed.
MiChargeCommitmentFailures - If the system failed a commit and we have already reached the maximum pagefile size.
MiChargeCommitmentFailures - If the system failed a commit while the pagefile lock is held.
MiChargeCommitmentFailures - If the system failed a commit and the NewCommitValue is less than or equal to CurrentCommitValue.
In order to calculate the count of failures, "!vm" adds up the values stored in each array member of the array. Members 0 and 1 are always counted, member 2 is counted if the OS version is Windows 2003/XP and member 3 is counted if the build version is newer than Windows 2003/XP.
Let's look at the array in the dump I was debugging:
13: kd> dc nt!MiChargeCommitmentFailures L4
fffff800`01e45ce0 00000000 0000bd38 00000000 00000000 ....8...........
Converting this to decimal we find the 48000+ commit failures I was seeing the in output of !VM.
13: kd> ?0000bd38
Evaluate expression: 48440 = 00000000`0000bd38
Since I now had my answer, “where does the number come from?”, I was left wanting to know a bit more about the overall flow of why a VirtualAlloc fails to commit.
When memory is allocated by VirtualAlloc the newly allocated memory is not committed to physical memory. Only when the memory is accessed by a read or write is it backed by physical memory.
When this newly allocated memory is accessed for the first time it will need to be backed by commit space. Under normal conditions this is a smooth process, however when the system hits what’s called the commit limit and can’t expand this limit we see commit failures.
So how is the commit limit calculated? Let’s say we have a system with 4GB of physical memory and a pagefile that is 6GB in size. To determine the commit limit we add physical memory and the pagefile size together - in this example the commit limit would be 10GB. Since memory manger will not let any user mode allocation consume every last morsel of commit space it keeps a small amount of the commit space for the system to avoid hangs. When the limit is reached the system tries to grow the page file. If there is no more room to grow the pagefile or the pagefile has reached its configured maximum size, the system will try and free some committed memory to make room for more requests. If expansion of the page file or the attempt to free memory do not allow the allocation to complete, the allocation fails and MiChargeCommitmentFailures is incremented.
To sum it all up, commit limit is RAM + pagefile, commit failures happen when we hit the commit limit and the system is unable to grow the pagefile because it is already at its max. It’s that simple, well almost.
For those that will want to know more about how memory manger works please see the post from Somak: The Memory Shell Game.
The Windows Support team has a new YouTube channel, “Windows Troubleshooting”. The first set of videos cover debugging blue screens.
In this video, Bob Golding, Senior Escalation Engineer, describes how the Special Pool Windows diagnostics tool catches drivers that corrupt memory. Bob also introduces how memory is organized in the system for allocating memory for drivers.
Hello NTDebuggers… I look at a lot of dump files every day. This being the case I like to take full advantage of the customizable look and feel of windbg. I actually have an association setup between DMP files and a CMD file that loads my customized COLOR workspace each time I double click on a dump file. I like a black background with colorized source and debug command window output.
Here is a snapshot of my typical debug session.
Here is how you set it up.
1) Create the following CMD file and put it in your path. It’s called D.CMD on my system..
echo off Title kd.exe -z %1 C: CD\debuggers start C:\Debuggers\windbg.exe -z %1 -W color
2) Load windbg and under view \ options configure your color options. Note: you will want to set every option in the list box.
3) Save the workspace by selecting FILE \ Save Workspace, I saved mine as COLOR
4) If you don’t have a dump file create one by opening a process in the debugger and doing a .dump /ma C:\test\test.dmp (I’ll assume you have a directory called C:\test)
5) Select the C:\test\test.dmp file in explorer and right click on the properties. (note I used badwindow.dmp)
6) Change the dump file association via drilling down to Change \ Browse \ (Select D.CMD)
7) That’s all there is to it. Now you can simply double click on any DMP file and windbg will start and load with your COLOR workspace.
Good luck and happy debugging.
It's Venkatesh with a WMI nugget. While troubleshooting or debugging WMI issues you may come across WMI queries wherein you don’t know which provider implemented the WMI class used in the query. You may want to know the binary and the product that implemented the provider so you can contact the vendor or upgrade the binary to see if it resolves the issue you are investigating.
To help make the investigation easier, Windows Vista and all post operating systems have the cool new 'Analytic and Debug log' for WMI tracing. To get similar tracing in Windows 2003 and earlier versions, WMI has a “Verbose Logging” option enabled through the WMI control (WmiMgmt.msc). Check out this blog for detailed steps to enable the WMI Analytic and Debug -
After enabling WMI logging you will see events like the ones pictured below. There may be a particular WMI query relevant to the issue you're troubleshooting requiring you to know the provider binary for the WMI class in the query. Apart from the class name (InstProvSamp) in the query you need the WMI namespace to which this class belongs to such as '\\.\root\default' as shown below.
Fig 1: Event ID 1 in the Windows Vista’s Event Viewer\Windows Logs\Applications and Services Logs\Microsoft\Windows\WMI-Activity\Trace log
Now we have the WMI Namespace and the class name so let’s find the provider for it.
1. Get the provider name from the Class Name and Namespace:
Every WMI class has a qualifier “provider” containing the name of the provider to which it belongs. Using wbemtest.exe connect to the namespace we mentioned above(\\.\root\default). Then get the class information for our class, InstProvSamp using the “Open Class…” button. After clicking "OK" the Object editor for InstrProvSamp displays the class information for InstProvSamp including the name of the provider it belongs to (CIM_STRING qualifier “provider” as shown below).
2. Get the class identifier (CLSID) of the provider from the provider name:
For every WMI provider WMI maintains an instance of system class __Win32Provider which contains information about the physical implementation of the provider like CLSID which contains the GUID that represents the class identifier (CLSID) of the provider COM object. Now query the repository to get the provider information for “InstProvSamp” provider as I did below.
3. Get the provider binary from the provider’s CLSID:
During registration every COM object adds an entry under HKLM\SOFTWARE\Classes\CLSID. Using regedit.exe you can browse to provider’s COM registration information using the CLSID obtained from Step 2 as shown below.
This should help you connect the dots while debugging your WMI issue. Let me know if you have additional questions!
Hello everyone, I'm Ray and I'm here to talk a bit about a dump I recently looked at and a little-referenced memory counter called ResAvail Pages (resident available pages).
The problem statement was: The server hangs after a while.
Not terribly informative, but that's where we start with many cases. First some good housekeeping:
0: kd> vertarget
Windows 7 Kernel Version 7601 (Service Pack 1) MP (2 procs) Free x64
Product: Server, suite: TerminalServer SingleUserTS
Built by: 7601.18113.amd64fre.win7sp1_gdr.130318-1533
Machine Name: "ASDFASDF1234"
Kernel base = 0xfffff800`01665000 PsLoadedModuleList = 0xfffff800`018a8670
Debug session time: Thu Aug 8 09:39:26.992 2013 (UTC - 4:00)
System Uptime: 9 days 1:08:39.307
Of course Windows 7 Server == Server 2008 R2.
One of the basic things I check at the beginning of these hang dumps with vague problem statements is the memory information.
0: kd> !vm 21
Physical Memory: 2097038 ( 8388152 Kb)
Page File: \??\C:\pagefile.sys
Current: 12582912 Kb Free Space: 12539700 Kb
Minimum: 12582912 Kb Maximum: 12582912 Kb
Available Pages: 286693 ( 1146772 Kb)
ResAvail Pages: 135 ( 540 Kb)
********** Running out of physical memory **********
Locked IO Pages: 0 ( 0 Kb)
Free System PTEs: 33526408 ( 134105632 Kb)
******* 12 system cache map requests have failed ******
Modified Pages: 4017 ( 16068 Kb)
Modified PF Pages: 4017 ( 16068 Kb)
NonPagedPool Usage: 113241 ( 452964 Kb)
NonPagedPool Max: 1561592 ( 6246368 Kb)
PagedPool 0 Usage: 35325 ( 141300 Kb)
PagedPool 1 Usage: 28162 ( 112648 Kb)
PagedPool 2 Usage: 24351 ( 97404 Kb)
PagedPool 3 Usage: 24350 ( 97400 Kb)
PagedPool 4 Usage: 24516 ( 98064 Kb)
PagedPool Usage: 136704 ( 546816 Kb)
PagedPool Maximum: 33554432 ( 134217728 Kb)
********** 222 pool allocations have failed **********
Session Commit: 6013 ( 24052 Kb)
Shared Commit: 6150 ( 24600 Kb)
Special Pool: 0 ( 0 Kb)
Shared Process: 1214088 ( 4856352 Kb)
Pages For MDLs: 67 ( 268 Kb)
PagedPool Commit: 136768 ( 547072 Kb)
Driver Commit: 15548 ( 62192 Kb)
Committed pages: 1648790 ( 6595160 Kb)
Commit limit: 5242301 ( 20969204 Kb)
So we're failing to allocate pool, but we aren't out of virtual memory for paged pool or nonpaged pool. Let's look at the breakdown:
0: kd> dd nt!MmPoolFailures l?9
fffff800`01892160 000001be 00000000 00000000 00000002
fffff800`01892170 00000000 00000000 00000000 00000000
yellow = Nonpaged high/medium/low priority failures
green = Paged high/medium/low priority failures
cyan = Session paged high/medium/low priority failures
So we actually failed both nonpaged AND paged pool allocations in this case. Why? We're "Running out of physical memory", obviously. So where does this running out of physical memory message come from? In the above example this is from the ResAvail Pages counter.
ResAvail Pages is the amount of physical memory there would be if every working set was at its minimum size and only what needs to be resident in RAM was present (e.g. PFN database, system PTEs, driver images, kernel thread stacks, nonpaged pool, etc).
Where did this memory go then? We have plenty of Available Pages (Free + Zero + Standby) for use. So something is claiming memory it isn't actually using. In this type of situation one of the things I immediately suspect is process working set minimums. Working set basically means the physical memory used by a process.
So let's check.
0: kd> !process 0 1
<a lot of processes in this output>.
SessionId: 0 Cid: 0adc Peb: 7fffffda000 ParentCid: 0678
DirBase: 204ac9000 ObjectTable: 00000000 HandleCount: 0.
VadRoot 0000000000000000 Vads 0 Clone 0 Private 1. Modified 3. Locked 0.
ElapsedTime 9 Days 01:08:00.134
Working Set Sizes (now,min,max) (5, 50, 345) (20KB, 200KB, 1380KB)
VirtualSize 65 Mb
PeakVirtualSize 84 Mb
I have only shown one example process above for brevity's sake, but there were thousands returned. 241,423 to be precise. None had abnormally high process working set minimums, but cumulatively their usage adds up.
The “now” process working set is lower than the minimum working set. How is that possible? Well, the minimum and maximum are not hard limits, but suggested limits. For example, the minimum working set is honored unless there is memory pressure, in which case it can be trimmed below this value. There is a way to set the min and/or max as hard limits on specific processes by using the QUOTA_LIMITS_HARDWS_MIN_ENABLE flag via SetProcessWorkingSetSize.
You can view if the minimum and maximum working set values are configured in the _EPROCESS->Vm->Flags structure. Note these numbers are from another system as this structure was already torn down for the processes we were looking at.
0: kd> dt _EPROCESS fffffa8008f76060 Vm
+0x398 Vm : _MMSUPPORT
0: kd> dt _MMSUPPORT fffffa8008f76060+0x398
+0x000 WorkingSetMutex : _EX_PUSH_LOCK
+0x008 ExitGate : 0xfffff880`00961000 _KGATE
+0x010 AccessLog : (null)
+0x018 WorkingSetExpansionLinks : _LIST_ENTRY [ 0x00000000`00000000 - 0xfffffa80`08f3c410 ]
+0x028 AgeDistribution :  0
+0x044 MinimumWorkingSetSize : 0x32
+0x048 WorkingSetSize : 5
+0x04c WorkingSetPrivateSize : 5
+0x050 MaximumWorkingSetSize : 0x159
+0x054 ChargedWslePages : 0
+0x058 ActualWslePages : 0
+0x05c WorkingSetSizeOverhead : 0
+0x060 PeakWorkingSetSize : 0x5ae
+0x064 HardFaultCount : 0x41
+0x068 VmWorkingSetList : 0xfffff700`01080000 _MMWSL
+0x070 NextPageColor : 0x2dac
+0x072 LastTrimStamp : 0
+0x074 PageFaultCount : 0x65c
+0x078 RepurposeCount : 0x1e1
+0x07c Spare :  0
+0x084 Flags : _MMSUPPORT_FLAGS
0: kd> dt _MMSUPPORT_FLAGS fffffa8008f76060+0x398+0x84
+0x000 WorkingSetType : 0y000
+0x000 ModwriterAttached : 0y0
+0x000 TrimHard : 0y0
+0x000 MaximumWorkingSetHard : 0y0
+0x000 ForceTrim : 0y0
+0x000 MinimumWorkingSetHard : 0y0
+0x001 SessionMaster : 0y0
+0x001 TrimmerState : 0y00
+0x001 Reserved : 0y0
+0x001 PageStealers : 0y0000
+0x002 MemoryPriority : 0y00000000 (0)
+0x003 WsleDeleted : 0y1
+0x003 VmExiting : 0y1
+0x003 ExpansionFailed : 0y0
+0x003 Available : 0y00000 (0)
How about some more detail?
0: kd> !process fffffa8008f76060
DirBase: 204ac9000 ObjectTable: 00000000 HandleCount: 0.
No active threads
0: kd> !object fffffa8008f76060
Object: fffffa8008f76060 Type: (fffffa8006cccc90) Process
ObjectHeader: fffffa8008f76030 (new version)
HandleCount: 0 PointerCount: 1
The highlighted information shows us that this process has no active threads left but the process object itself (and its 20KB working set use) were still hanging around because a kernel driver had a reference to the object that it never released. Sampling other entries shows the server had been leaking process objects since it was booted.
Unfortunately trying to directly track down pointer leaks on process objects is difficult and requires an instrumented kernel, so we tried to check the easy stuff first before going that route. We know it has to be a kernel driver doing this (since it is a pointer and not a handle leak) so we looked at the list of 3rd party drivers installed. Note: The driver names have been redacted.
0: kd> lm
start end module name
fffff880`04112000 fffff880`04121e00 driver1 (no symbols) <-- no symbols usually means 3rd party
fffff880`04158000 fffff880`041a4c00 driver2 (no symbols)
0: kd> lmvm driver1
Browse full module list
fffff880`04112000 fffff880`04121e00 driver1 (no symbols)
Loaded symbol image file: driver1.sys
Image path: \SystemRoot\system32\DRIVERS\driver1.sys
Image name: driver1.sys
Browse all global symbols functions data
Timestamp: Wed Dec 13 12:09:32 2006 (458033CC)
Translations: 0000.04b0 0000.04e4 0409.04b0 0409.04e4
0: kd> lmvm driver2
Loaded symbol image file: driver2.sys
Image path: \??\C:\Windows\system32\drivers\driver2.sys
Image name: driver2.sys
Timestamp: Thu Nov 30 12:12:07 2006 (456F10E7)
Fortunately for both the customer and us we turned up a pair of drivers that predated Windows Vista (meaning they were designed for XP/2003) that raised an eyebrow. Of course we need a more solid evidence link than just "it's an old driver", so I did a quick search of our internal KB. This turned up several other customers who had these same drivers installed, experienced the same problem, then removed them and the problem went away. That sounds like a pretty good evidence link. We implemented the same plan for this customer successfully.
Hi my name is Bob, I’m an Escalation engineer with the Microsoft critical problem resolution team. We had one of our readers ask how much we deal with hardware. Well in response we recently worked on an interesting problem I thought I would share with you. In this case it was interesting because it demonstrated an issue specific to multi-processor machines and something that probably sounded innocuous to the driver writer who caused the problem.
What was the problem?
The problem was that everything worked except the time in the system was not being updated. The RTC had stopped. We spent time reviewing the semantics of how the RTC should function and even hooked up an oscilloscope to the RTC on the motherboard and were able to turn it off and on with the debugger by writing out the correct port. The trace on the scope validated our understanding of what had to be written to the port to turn the clock off. One we had a clear understanding of this we understood what we were looking for in a driver that might cause the problem. Note the clock typically fires every 10ms so you do not need a fast scope to do this.
Special keyboard driver written
In order to catch a dump in state we had to modify the keyboard driver. It would cause an “Int 3” in its ISR instead of calling “bug check” for an E2 stop. Because the RTC was not running the idle thread was not getting quantums and as a result a normal break in would not work. However the system would respond to ISRs.
What was found
All RTC interrupts were stopped - the clock was not running. We checked all the obvious places to see if the RTC was disabled.
We looked at the ICR in the I/O APIC. This is the interrupt configuration register. There is a register for every interrupt pin on the APIC. These registers are used to tell the APIC what vector to send the processor so the processor can service the interrupt. It also has configuration information about level and if it edge triggered and a mask bit. The mask bit was not set.
Below is a reenactment.
0: kd> ed ffd01000
ffd01000 00000034 20 ß Select register 20 which is pin 8.
0: kd> ed ffd01010
ffd01010 000008d1 ß Contents ß Vect D1 Bit 16 the interrupt mask bit is not set so it is OK.
Next check the RTC status register which are I/O ports 70 and 71. Port 70 is the address port. Port 71 is the data port. This information is from an old BIOS book.
0: kd> ob 70 b ß ‘B’ is a control register.
0: kd> ib 71
00000071: 42 ß The value 42 means that the RTC is enabled. Bit 6 is the enable.
So what was it?
The way the RTC works is it will interrupt at a certain interval. When the interrupt is serviced, the status register has to be read to start it again.
We discovered another driver that was reading the clock, this was done by disassembly various drivers in the dump and looking for the I/O operation to ports 70 or 71. The lower addresses selected by port 70 will yield the time when read. That is what the driver was doing.
You would think that simply reading the time in this way would not hurt anything. However, in a multi-processor system, access has to be serialized. There is only one set of I/O ports for the system.
Since it takes two accesses to perform an operation on the clock, one address & one data, a collision between two processors can cause undetermined results.
Below is a timing diagram of the issue;
Proc 0 running OS RTC handler Proc 1 running XYZ driver
T1 Set register select to status register
T2 Set register select to read time
T3 Read status register to restart clock
So at T3 the OS RTC handler reads the wrong register so the clock does not start.
I thought this was an interesting problem that illustrates the need for serialization. And it demonstrates what to look out for in a multi-proc environment. You always have to think “What happens if the other processor does…….”
For more information consult any BIOS developer manuals you may have lying around or this link we found http://www.geocities.com/SiliconValley/Campus/1671/docs/rtc.htm
See the “Status Register C” section, “All future interrupts are disabled until this register is read - your interrupt handler *must* do it.”
Recently I was working with a developer who had created performance counters that work in Performance Monitor but are never collected in a user defined data collector set. The customer explained that their counters update named shared memory inside the application which should be read by perfmon or the data collector set.
Putting counter data in shared memory is a common technique for performance counter developers. A programmer can update performance data in a block of shared memory in their application and then use a performance extension dll (aka an “extensible counter”) to read from the shared memory.
Shared memory is created by calling CreateFileMapping and MapViewOfFile. This memory is then accessed by another application by calling OpenFileMapping. All applications which use this shared memory must pass the same lpName to CreateFileMapping or OpenFileMapping. An example of using these APIs to implement shared memory is available on MSDN.
Based on the customer’s explanation that they are populating shared memory in their application, and their counters work in Performance Monitor but do not work in a user defined data collector set, I suspected that OpenFileMapping was failing for the data collector set.
User defined data collector sets run in a rundll32.exe process. If you have multiple rundll32.exe processes you may need to identify which one is related to your data collector set. The relevant process has a command line similar to “rundll32.exe C:\Windows\system32\pla.dll,PlaHost”. There are several tools that can be used to identify the command line of the process such as tlist.exe, which is included with the Debugging Tools for Windows.
After attaching a debugger to rundll32.exe, I wanted to break on the ret instruction at the end of the OpenFileMappingW function. This would allow me to determine if the function succeeds or fails. According to MSDN “If the function fails, the return value is NULL. To get extended error information, call GetLastError.”
The uf command is an easy way to unassemble a function and find the ret instruction to break on.
0:001> uf kernelbase!OpenFileMappingW
75b88e0d 8bff mov edi,edi
75b88e79 c9 leave
75b88e7a c20c00 ret 0Ch
0:001> bp 75b88e7a
Breakpoint 0 hit
eax=00000000 ebx=00008022 ecx=7ffd8000 edx=00000002 esi=05abf03c edi=00000000
eip=75b88e7a esp=05abeb20 ebp=05abeb3c iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
In the above output we can see that the eax register is NULL, indicating that the call to OpenFileMapping failed. The !gle command will show the last error and last status.
LastErrorValue: (Win32) 0x2 (2) - The system cannot find the file specified.
LastStatusValue: (NTSTATUS) 0xc0000034 - Object Name not found.
The failure is that OpenFileMapping cannot find the file. The file name is the third parameter to OpenFileMapping. We can get the first three parameters from the kb command.
ChildEBP RetAddr Args to Child
05abf0d0 6abae355 0002001f 00000000 05abeb7c kernelbase!OpenFileMappingW+0x90
WARNING: Stack unwind information not available. Following frames may be wrong.
05abf0f8 7784fe67 02a7ae90 05abf224 05abf254 ninjaprf+0x10edb
05abf110 7784fc97 00472158 02a7ae90 05abf224 advapi32!CallExtObj+0x17
05abf270 7784efaf 05abf2bc 60fcfa02 05abf778 advapi32!QueryExtensibleData+0x735
05abf654 75ff0468 80000004 05abf778 00000000 advapi32!PerfRegQueryValue+0x5da
05abf748 75ffd505 80000004 05abf778 05abf790 kernel32!LocalBaseRegQueryValue+0x366
05abf7b4 61247dc5 80000004 02a7ae90 00000000 kernel32!RegQueryValueExW+0xb7
05abf830 61250595 80000004 02a7ae58 02a7ae90 pdh!GetSystemPerfData+0x92
05abf89c 6124c753 02a407d0 05abf8e8 61241928 pdh!GetQueryPerfData+0xa4
05abf8b8 61254463 02a407d0 05abf8e8 60fcf32f pdh!PdhiCollectQueryData+0x32
05abf90c 611c6d04 02a58f08 00000000 75ffc3e0 pdh!PdhUpdateLogW+0xa2
05abf9bc 611be128 0045c968 00000000 00000000 pla!HPerformanceCounterDataCollector::Cycle+0x48
05abf9bc 00000000 0045c968 00000000 00000000 pla!PlaiCollectorControl+0x3b7
0:008> da 05abeb7c
The user defined data collector set is failing to open the file "Local\NINJAPERF_S-1-5-18". This is the name that the performance extension dll ninjaprf.dll has given to its shared memory.
Based on the customer’s description this operation works in Performance Monitor. Next, I attached a debugger to perfmon and set the same breakpoint.
eax=000009f8 ebx=00008022 ecx=a7330000 edx=080ee678 esi=06798070 edi=00000000
eip=760be9bb esp=0a84e564 ebp=0a84e580 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
760be9bb c20c00 ret 0Ch
In the above output we can see that eax is a handle number, indicating that the function succeeded.
Looking at the file being opened we can see why this works in one scenario and not in another.
0:016> kb 1
0a84e580 698e4ab9 0002001f 00000000 0a84e5c0 kernelbase!OpenFileMappingW+0x90
0:016> da 0a84e5c0
The file name used is unique for each user. The ninjaprf dll has chosen a file name which includes the SID of the current user. This works for Performance Monitor because the user who starts the application is the same as the user who runs Performance Monitor. However, a user defined data collector set runs in rundll32.exe as the Local System account.
It is possible to run the user defined data collector set as a different user, however the file name being used will not work in that scenario either. By using the “Local\” prefix, the file is created in the local session namespace. The application runs in the user’s session, while rundll32.exe is started by the Task Scheduler service and runs in session 0. This prevents the user defined data collector set from seeing the file created by the application.
If your performance counter uses shared memory to communicate, be aware that your performance extension dll may be run in the context of a different user account and a different user session. Your inter process communication techniques must account for this, or your customers will ask you why your counters do not work outside of Performance Monitor.
Hi, my name is Todd Webb, I am an Escalation Engineer from the Global Escalation Services OEM Team.
I recently worked a case where we were trying to troubleshoot an issue and the system would hang with USB debugging enabled. This was a problem since USB debugging was the only kernel debugging option available on this particular hardware setup. I needed a quick and easy way to find the source of the problem without using many of the standard methods. Basically I had to debug a debugger issue with no access to a debugger. I knew the problem was likely USB related, but had no other information. The method I used to troubleshoot this issue was to write some macros that would make a call to HalMakeBeep to generate audible indications of where we were in the code. The primary reason for using this method was I could quickly and easily move the location in the code where I made the calls as I narrowed down where the problem occurred.
The alternative would have been to add a bunch of logging and find some way to force a dump. My initial assessment was this would be more time consuming and less flexible then the beep code method. Others may have other methods to do this, but this just gives people another method of debugging driver start-up problems on systems that are not debuggable.
The main beep macro allows you to pass in a frequency and duration in milliseconds for a single beep. Then based on the example below you can extend the single beep macro to create a series of beeps.
Here is an example of the macros I used:
// beep macros for troubleshooting without debugger
#define EHCI_BEEP1(f, d) \
LARGE_INTEGER duration; \
duration.QuadPart = Int32x32To64(d, -10000); \
KeDelayExecutionThread(KernelMode, FALSE, &duration); \
duration.QuadPart = Int32x32To64(500, -10000); \
#define EHCI_BEEP2(f, d) \
EHCI_BEEP1(f, d); \
EHCI_BEEP1(f, d); \
#define EHCI_BEEP3(f, d) \
EHCI_BEEP2(f, d); \
Here is an example of a long beep followed by three short beeps:
// debug - 1 long : 3 short
Microsoft is looking for five Windows Internals subject matter experts to come work on a very special five to eight day project on the Redmond campus during the month of May 2008. Candidates must have good communications skills, be non Microsoft employees, have 5+ years experience with windows, be familiar with the Windows Internals book, have kernel and user mode debugging experience, and be C literate.
We prefer candidates strongly represent the Windows IT Professional or Windows development industry. These typically would be 2nd or 3rd tier escalation resources or Windows developers (developing Win32 code, or device drivers) that are working for Gold Partners, ISVs or are independent consultants or MVPs.
A Microsoft NDA (Non Disclosure Agreement) will be required.
Please contact Jeff Dailey at email@example.com with the subject line “Windows Internals Project”
Please include a copy of your resume and details explaining your Windows background, along with your contact info, and your location.
We will cover air travel, lodging and food for non local candidates attending from within the US.
The Widows Internals experts selected for this project will be brought to Redmond to aid in developing a new certification program based on the Windows Internals. This is not a hiring effort or product testing session. Though the candidates should have C literacy they will not be writing or reviewing any Microsoft source code.