May, 2012

  • The Old New Thing

    Charles Petzold is back with another edition of Programming Windows


    Back in the day (and perhaps still true today), Charles Petzold's Programming Windows was the definitive source for learning to program Windows. The book is so old that even I used it to learn Windows programming, back when everything was 16-bit and uphill both ways. The most recent edition is Programming Windows, 5th Edition, which was published way back in 1998. What has he been doing since then? My guess would have been "sitting on a beach in Hawaiʻi," but apparently he's been writing books on C# and Windows Forms and WPF and Silverlight. Hey, I could still be right: Maybe he writes the books while sitting on a beach in Hawaiʻi.

    It appears that Windows 8 has brought Mr. Petzold back to the topic of Windows programming, and despite his earlier claims that he has no plans to write a sixth edition of Programming Windows, it turns out that he's writing a sixth edition of Programming Windows specifically for Windows 8. (Perhaps he could subtitle his book The New Old Thing.)

    Here's where it gets interesting.

    Before the book officially releases (target date November 15), there will be two pre-release versions in eBook form, one based on the Consumer Preview of Windows 8 and one based on the Release Preview.

    Now it gets really interesting: If you order the Consumer Preview eBook, it comes with free upgrades to the Release Preview eBook as well as the final eBook. (If you order the Release Preview eBook, then it comes with a free upgrade to the final eBook.)

    Can it get even more interesting than that? You bet! Because the price of getting in on the action increases the longer you wait. Act now, and you can get the Consumer Preview eBook (and all the free upgrades that come with it) for just $10. Wait a few weeks, and it'll cost you $20. Wait another few months, and it'll cost you $30; after another few weeks the price goes up to $40, and if you are a lazy bum and wait until the final eBook is released, it'll cost you $50.

    But in order to take advantage of this offer, you have to follow the instructions on this blog entry from Microsoft Press (and read the mandatory legal mumbo-jumbo, because the lawyers always get their say).

    Bonus chatter: One publisher asked me if I wanted to write a book on programming Windows 8, but I told them that I was too busy shipping Windows 8 to have any extra time to write a book about it. And it's a good thing I turned them down, because imagine if I decided to write the book and found that Charles Petzold was coming out of retirement to write his own book. My book would have done even worse than my first book, which didn't even have any competition!

    Bonus disclaimer: Charles Petzold did not pay me to write this, nor did he offer me a cut of his royalties for shilling his book. But that doesn't mean I won't accept it! (Are you listening, Charles?)

  • The Old New Thing

    How to view the stack of threads that were terminated as part of process teardown from the kernel debugger


    As we saw some time ago, process shutdown is a multi-phase affair. After you call ExitProcess, all the threads are forcibly terminated. After that's done, each DLL is sent a DLL_PROCESS_DETACH notification. You may be debugging a problem with DLL_PROCESS_DETACH handling that suggests that some of those threads were not cleaned up properly. For example, you might assert that a reference count is zero, and you find during process shutdown that this assertion sometimes fires. Maybe you terminated a thread before it got a chance to release its reference? How can you test this theory if the thread is already gone?

    It so happens that when all the threads are terminated during the early phase of process shutdown, the kernel is a bit lazy and doesn't free their stacks. It figures, hey, the entire process is going away soon, so the stack memory is going to be cleaned up as part of process termination. (It's sort of the kernel equivalent of not bothering to sweep the floor of a building that's about to be demolished.) You can use this to your advantage by grovelling the stacks that were left behind.

    Hey, this is why you get called in to debug the hard stuff, right?

    Before continuing, I need to emphasize that this information is for debugging purposes only. The structures and offsets are all implementation details which can change from release to release.

    The first step is to identify where all the stacks are. The direct approach is difficult because the stacks can be all different sizes, so it's not easy to pick them out of a line-up. But one thing does come in a consistent size: The TEB.

    From the kernel debugger, use the !process command to dump the process you are interested in, and from the header information, extract the VadRoot.

    1: kd> !process -1
    PROCESS 8731bd40  SessionId: 1  Cid: 0748    Peb: 7ffda000  ParentCid: 0620
        DirBase: 4247b000  ObjectTable: 96f66de0  HandleCount: 104.
        Image: oopsie.exe
        VadRoot 893de570 Vads 124 Clone 0 Private 518. Modified 643. Locked 0.
        DeviceMap 995628c0

    Dump this VAD root with the !vad command, and pay attention only to the entries which say 1 Private READWRITE.

    1: kd> !vad 893de570
    VAD     level      start      end    commit
    ... ignore everything except "1 Private READWRITE" ...
    8730a5f0 ( 6)         50       50         1 Private      READWRITE
    9ab0cb40 ( 5)         60       7f         1 Private      READWRITE
    893978b0 ( 6)         80       9f         1 Private      READWRITE
    87302d30 ( 5)        110      110         1 Private      READWRITE
    889693f8 ( 6)        120      121         1 Private      READWRITE
    872f3fb8 ( 6)        170      170         1 Private      READWRITE
    87089a80 ( 6)        1a0      1a0         1 Private      READWRITE
    8cbf1cb0 ( 5)        1c0      1df         1 Private      READWRITE
    88c079d0 ( 6)        1e0      1e0         1 Private      READWRITE
    9abc33e0 ( 6)        410      48f         1 Private      READWRITE
    873173b0 ( 7)        970      970         1 Private      READWRITE
    8ca1c158 ( 7)      7ffd5    7ffd5         1 Private      READWRITE
    88c02a78 ( 6)      7ffd6    7ffd6         1 Private      READWRITE
    872f9298 ( 5)      7ffd7    7ffd7         1 Private      READWRITE
    8750d210 ( 7)      7ffd8    7ffd8         1 Private      READWRITE
    87075ce8 ( 6)      7ffda    7ffda         1 Private      READWRITE
    87215da0 ( 4)      7ffdc    7ffdc         1 Private      READWRITE
    872f2200 ( 6)      7ffdd    7ffdd         1 Private      READWRITE
    8730a670 ( 5)      7ffdf    7ffdf         1 Private      READWRITE

    (If you are debugging from user mode, then you can use !vadump but the output format is different.)

    Each of these is a candidate TEB. In practice, TEBs tend to be allocated at the high end of memory, so the ones with a low start value are probably red herrings. Therefore, you should investigate these candidates in reverse order.

    For each candidate, take the start address and append three zeroes. (Each page on x86 is 4KB, which conveniently maps to 1000 in hex.) Dump the first seven pointers of the TEB with the dp xxxxx000 L7 command.

    1: kd> dp 7ffdf000 L7
    7ffdf000  0016fbb0 00170000 0016b000 00000000
    7ffdf010  00001e00 00000000 7ffdf000 ← hit

    If the TEB is valid, then the seventh pointer points back to the start of the TEB. In a valid TEB, the second and third values are the stack limits; in this case, the candidate stack lives between 0016b000 and 00170000. (As a double-check, you can verify that the upper limit of the stack, 00170000 in this case, matches up with the end of a VAD allocation in the !vad output above.)
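    The validity test above can be written down as a small check. This is only a sketch for the 32-bit layout described in the text: looks_like_teb is a hypothetical helper, not a debugger command, and the comparison of the two stack limits is an extra sanity check of mine rather than something the technique requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Positions within the first seven pointer-sized slots of a 32-bit TEB,
   as described in the text: slots 1 and 2 hold the stack limits and
   slot 6 points back at the TEB itself. */
enum { TEB_STACK_BASE = 1, TEB_STACK_LIMIT = 2, TEB_SELF = 6 };

/* Hypothetical helper: decide whether the seven values dumped by
   "dp xxxxx000 L7" look like the start of a TEB at teb_address. */
bool looks_like_teb(uint32_t teb_address, const uint32_t slots[7])
{
    assert(slots != NULL);

    /* The seventh pointer must point back to the start of the TEB. */
    if (slots[TEB_SELF] != teb_address)
        return false;

    /* Extra sanity check (an assumption, not from the text): the stack
       grows downward, so the lower limit lies below the upper limit. */
    return slots[TEB_STACK_LIMIT] < slots[TEB_STACK_BASE];
}
```

    Fed the seven values from the dump above together with the address 7ffdf000, the helper reports a hit, and the candidate stack is the range from slot 2 up to slot 1.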

    Now that you know where the stack is, you can dps it and look for EBP frames. (I usually start about two to four pages below the upper limit of the stack.) Test out each candidate EBP frame with the k= command until you find one that seems to be solid. Record this candidate stack trace in a text file for further study.

    Repeat for each candidate TEB, and you will eventually reconstruct what each thread in the process was doing at the moment it was terminated. If you're really lucky, you might even see the code that incremented the reference count but was terminated before it could release it.

    The above discussion also applies to debugging 64-bit processes. However, instead of looking for 1 Private READWRITE pages, you want to look for 2 Private READWRITE pages. As an additional wrinkle, if you are debugging ia64, then converting a page frame to a linear address is sadly not as simple as appending three zeroes. Pages on ia64 are 8KB, not 4KB, so you need to shift the value left by 13 bits: Add three zeroes and then multiply by two.

    And finally, if you are debugging a 32-bit process on x64, then you want to look for 3 Private READWRITE pages, but add 2 before appending the three zeroes. That's because the TEB for a 32-bit process on x64 is really two TEBs glued together: A 64-bit TEB followed by a 32-bit TEB.
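    The conversions from a !vad start value to a candidate TEB address can be collected in one place. The following is a sketch under the rules stated above (4KB pages on x86 and x64, 8KB pages on ia64, and a two-page skip for a 32-bit process on x64); the Target enumeration and the function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    TARGET_X86,         /* 32-bit process on x86: 1 private page, 4KB pages   */
    TARGET_X64,         /* 64-bit process on x64: 2 private pages, 4KB pages  */
    TARGET_IA64,        /* 64-bit process on ia64: 8KB pages                  */
    TARGET_X86_ON_X64   /* 32-bit process on x64: 3 private pages             */
} Target;

/* Convert the "start" column of a !vad entry into the address to dump. */
uint64_t candidate_teb_address(Target target, uint64_t vad_start)
{
    switch (target) {
    case TARGET_IA64:
        /* Append three zeroes (12 bits), then multiply by two. */
        return vad_start << 13;
    case TARGET_X86_ON_X64:
        /* Skip the 64-bit TEB to reach the 32-bit TEB that follows it. */
        return (vad_start + 2) << 12;
    case TARGET_X86:
    case TARGET_X64:
        /* Append three zeroes. */
        return vad_start << 12;
    }
    assert(!"unknown target");
    return 0;
}
```

    For the x86 example above, a start value of 7ffdf becomes 7ffdf000, which is the address passed to dp.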

    Note: I did not come up with this debugging technique on my own. I learned it from an even greater debugging genius.

    Next time, we'll look at debugging this issue from a user-mode debugger.

    Trivia: The informal term for these terminated-but-not-yet-completely-destroyed threads is ghost threads. The term was coined by the Exchange support team, because they often have to study server failures that require them to do this type of investigation, and they needed a cute name for it.

  • The Old New Thing

    Sure, we do that: Context menu edition


    A customer reported a problem that occurred only when they installed a particular application. If they uninstalled it, then the problem went away. After installing the application, the "Run As" context menu option stopped working. The customer didn't provide any other details, but we were able to make an educated guess as to what was going on.

    A common programming error in context menu extensions occurs in extensions which add only one menu item. These extensions ignore the parameters to the IContextMenu::InvokeCommand and simply assume that the only reason the method can be called is if the user selected their menu item. After all, if you have only one invokable item, there's no need to figure out which one the user selected, because you have only one to begin with!

    The problem is that a context menu extension can be invoked not because the user selected an item under its control but because a verb is being invoked programmatically, and each handler is being asked, "Do you know how to do this?"

    The result is that the context menu host calls the extension to say, "If you know how to do runas, then please do so," and the extension says "Sure, we do that" and starts doing its thing. If you are unlucky and the grabby extension is asked the question before the actual runas extension, the runas command winds up being hijacked by the grabby extension.
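    A well-behaved single-item extension has to look at what it was asked to do before doing it. The sketch below models that decision in portable C rather than against the real shell headers: InvokeInfo is a simplified stand-in for CMINVOKECOMMANDINFO, the verb name is made up, and the offset test mirrors what the IS_INTRESOURCE macro checks on Windows. A real handler would return a failure code from InvokeCommand in the cases where this function returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the lpVerb member of CMINVOKECOMMANDINFO:
   either a pointer to a verb name, or a small menu-item offset stored
   in the pointer value itself (all high-order bits zero). */
typedef struct {
    const char *lpVerb;
} InvokeInfo;

/* Hypothetical verb name for this extension's single menu item. */
static const char kMyVerb[] = "myextension.dothing";

/* True when the pointer value is really a menu-item offset. */
static bool verb_is_offset(const char *verb)
{
    return ((uintptr_t)verb >> 16) == 0;
}

/* Returns true only if the invocation is for our one menu item. */
bool should_handle_invoke(const InvokeInfo *info)
{
    assert(info != NULL);

    if (verb_is_offset(info->lpVerb)) {
        /* The user picked a menu item; our only item is offset 0. */
        return (uintptr_t)info->lpVerb == 0;
    }

    /* Invoked by name: accept our own verb and decline everything else,
       so that "runas" and "open" reach the handlers that own them. */
    return strcmp(info->lpVerb, kMyVerb) == 0;
}
```

    The grabby extensions described here behave as if this function always returned true.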

    (This is the same mistake that causes the Copy To and Move To commands to behave strangely if you add them to the context menu: They assume that the only reason they are invoked is that the user invoked their command, because they weren't designed to be hosted by context menus to begin with! They were designed to go into the toolbar, and the toolbar hosting code never invoked commands by name. It's like taking a ladder and using it as a bridge between two tall buildings. Sure, you can now cross from one building to another, but you also run a serious risk of falling to your death.)

    A variation on the initial problem is "I found that after installing a particular program, I can't run anything from the Start menu." I know of at least two programs which install context menu extensions which steal the "open" command on executables.

    This problem is sufficiently prevalent that there is a special compatibility flag that can be set on a shell extension to say, "This is a grabby shell extension that steals commands. Never ask it if it supports anything, because it will always say yes!"

    Notice that the "MoveTo CopyTo Context Menu" is on the list, which I find interesting because MoveTo/CopyTo was never meant to go on the context menu in the first place. Going back to our analogy, it'd be as if the ladder company issued a safety bulletin to warn people of problems that can occur if you use it as a bridge between two tall buildings!

  • The Old New Thing

    Microspeak: The parking lot


    Mike Dunn wonders what the Microspeak term parking lot means.

    I'm not familiar with this term either, and the first document I turned up during my search was a PowerPoint presentation that said "Avoid using Microsoft jargon terms, such as parking lot and dogfood."

    Yeah, that wasn't much help.

    From what I can gather, the term parking lot started out as a term used during brainstorming sessions. You've got a bunch of people in a conference room tossing out all sorts of ideas. The traditional way of organizing the ideas is to write each one on a Post-It® note and stick it on the whiteboard. As more and more notes appear, you start to organize them by grouping together similar ideas.

    Every so often, you'll run into an idea that, while good, isn't really relevant to the problem you're trying to solve. You don't want to throw it away, so instead, you designate a corner of the whiteboard to be the place to "park" those ideas for later consideration. That corner of the whiteboard is nicknamed the parking lot.

    The term parking lot then began to be applied to the document that collected all of these "parked" ideas, so they could be circulated to a more appropriate audience.

    The term then expanded to refer to any document which served as the official repository of assorted suggestions for future work or discussion. (Known to some people simply as The List.) For example, there is a SharePoint List titled Active Issues and the subtitle parking lot for discussion topics in weekly XYZ meeting. Each item on the list is assigned to a particular person and assigned a priority.

    I can't find any citations for parking lot being used as a way to say something like "we'll talk about this after the meeting is over," but I can see how it could be related to the sense of parking lot I was able to turn up: The parking lot is the list of things that aren't really relevant to the topic at hand but which are still worth discussing. We just won't discuss them here.

  • The Old New Thing

    What is the historical reason for MulDiv(1, -0x80000000, -0x80000000) returning 2?


    Commenter rs asks, "Why does Windows (historically) return 2 for MulDiv(1, -0x80000000, -0x80000000) while Wine returns zero?"

    The MulDiv function multiplies the first two parameters and divides by the third. Therefore, the mathematically correct answer for MulDiv(1, -0x80000000, -0x80000000) is 1, because a × b ÷ b = a for all nonzero b.

    So both Windows and Wine get it wrong. I don't know why Wine gets it wrong, but I dug through the archives to figure out what happened to Windows.

    First, some background. What's the point of the MulDiv function anyway?

    Back in the days of 16-bit Windows, floating point was very expensive. Most people did not have math coprocessors, so floating point was performed via software emulation. And the software emulation was slow. First, you issued a floating point operation on the assumption that you had a floating point coprocessor. If you didn't, then a coprocessor not available exception was raised. The handler for this exception had a lot of work to do.

    It decoded the instruction that caused the exception and then emulated the operation. For example, if the bytes at the point of the exception were d9 45 08, the exception handler would have to figure out that the instruction was fld dword ptr ds:[di][8]. It then had to simulate the operation of that instruction. In this case, it would retrieve the caller's di register, add 8 to that value, load four bytes from that address (relative to the caller's ds register), expand them from 32-bit floating point to 80-bit floating point, and push them onto a pretend floating point stack. Then it advanced the instruction pointer three bytes and resumed execution.
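    To give a feel for the decoding step, here is a sketch that handles just the one instruction form in the example: opcode d9 with a ModR/M byte selecting a base register plus an 8-bit displacement in 16-bit addressing mode. The structure and function names are hypothetical, and a real emulator would of course cover far more encodings and would also load and convert the operand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t bx, bp, si, di;   /* the registers 16-bit addressing can use */
} Regs16;

typedef struct {
    int      is_fld_m32;   /* nonzero if the bytes encode fld dword ptr [...] */
    uint16_t offset;       /* effective offset of the operand */
    int      length;       /* bytes to advance the instruction pointer */
} DecodedFld;

DecodedFld decode_fld_disp8(const uint8_t code[3], const Regs16 *regs)
{
    DecodedFld out = { 0, 0, 0 };
    assert(code != NULL && regs != NULL);

    uint8_t mod = code[1] >> 6;         /* addressing mode */
    uint8_t reg = (code[1] >> 3) & 7;   /* opcode extension for d9 */
    uint8_t rm  = code[1] & 7;          /* base/index selection */

    /* d9 /0 with mod == 01 is "fld dword ptr [base + disp8]". */
    if (code[0] != 0xD9 || reg != 0 || mod != 1)
        return out;

    uint16_t base = 0;
    switch (rm) {
    case 0: base = (uint16_t)(regs->bx + regs->si); break;
    case 1: base = (uint16_t)(regs->bx + regs->di); break;
    case 2: base = (uint16_t)(regs->bp + regs->si); break;
    case 3: base = (uint16_t)(regs->bp + regs->di); break;
    case 4: base = regs->si; break;
    case 5: base = regs->di; break;
    case 6: base = regs->bp; break;
    case 7: base = regs->bx; break;
    }

    int8_t disp = (int8_t)code[2];      /* sign-extended displacement */
    out.is_fld_m32 = 1;
    out.offset = (uint16_t)(base + disp);
    out.length = 3;
    return out;
}
```

    For the bytes d9 45 08, the ModR/M byte 45 selects di plus an 8-bit displacement, so the operand lives at di + 8 and the instruction pointer advances by three bytes.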

    This took an instruction that with a coprocessor would take around 40 cycles (already slow) and ballooned its total execution time to a few hundred, probably a few thousand, cycles. (I didn't bother counting. Those who are offended by this horrific laziness on my part can apply for a refund.)

    It was in this sort of floating point-hostile environment that Windows was originally developed. As a result, Windows has historically avoided using floating point and preferred to use integers. And one of the things you often have to do with integers is scale them by some ratio. For example, a horizontal dialog unit is ¼ of the average character width, and a vertical dialog unit is ⅛ of the average character height. If you have a value of, say, 15 horizontal dlu, the corresponding number of pixels is 15 × average character width ÷ 4. This multiply-then-divide operation is quite common, and that's the model that the MulDiv function is designed to help out with.

    In particular, MulDiv took care of three things that a simple a × b ÷ c didn't. (And remember, we're in 16-bit Windows, so a, b and c are all 16-bit signed values.)

    • The intermediate product a × b was computed as a 32-bit value, thereby avoiding overflow.
    • The result was rounded to the nearest integer instead of truncated toward zero.
    • If c = 0 or if the result did not fit in a signed 16-bit integer, it returned INT_MAX or INT_MIN as appropriate.
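    Those three behaviors fit in a few lines of portable C. The function below is my own sketch of the intended semantics for 16-bit operands, not the original assembly: the name muldiv16 is hypothetical, the sign of the saturated result is taken from the signs of all three inputs, and exact halves round away from zero because the work is done on magnitudes.

```c
#include <assert.h>
#include <stdint.h>

/* Magnitude of a 16-bit value, widened so that -32768 is representable. */
static uint32_t magnitude16(int16_t v)
{
    return v < 0 ? (uint32_t)(-(int32_t)v) : (uint32_t)v;
}

/* Sketch of a * b / c with a 32-bit intermediate product, rounding to
   nearest, and saturation on overflow or division by zero. */
int16_t muldiv16(int16_t a, int16_t b, int16_t c)
{
    int negative = (a ^ b ^ c) < 0;         /* sign of the result */
    int16_t saturated = negative ? INT16_MIN : INT16_MAX;

    uint32_t ua = magnitude16(a);
    uint32_t ub = magnitude16(b);
    uint32_t uc = magnitude16(c);
    if (uc == 0)
        return saturated;

    /* At most 32768 * 32768 + 16384, which fits in 32 bits. */
    uint32_t quotient = (ua * ub + uc / 2) / uc;

    uint32_t limit = negative ? 32768u : 32767u;
    if (quotient > limit)
        return saturated;

    assert(quotient <= 32768u);
    return (int16_t)(negative ? -(int32_t)quotient : (int32_t)quotient);
}
```

    With an average character width of 7 pixels, muldiv16(15, 7, 4) converts 15 horizontal dlu to 26 pixels, and muldiv16(300, 300, 10) yields 9000 even though the intermediate product does not fit in 16 bits.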

    The MulDiv function was written in assembly language, as was most of GDI at the time. Oh right, the MulDiv function was exported by GDI in 16-bit Windows. Why? Probably because they were the people who needed the function first, so they ended up writing it.

    Anyway, after I studied the assembly language for the function, I found the bug. A shr instruction was accidentally coded as sar. The problem manifests itself only for the denominator −0x8000, because that's the only one whose absolute value has the high bit set.

    The purpose of the sar instruction was to divide the denominator by two, so it can get the appropriate rounding behavior when there is a remainder. Reverse-compiling back into C, the function goes like this:

    int16 MulDiv(int16 a, int16 b, int16 c)
    {
     int16 sign = a ^ b ^ c; // sign of result

     // make everything positive; we will apply sign at the end
     if (a < 0) a = -a;
     if (b < 0) b = -b;
     if (c < 0) c = -c;

     //  add half the denominator to get rounding behavior
     uint32 prod = UInt16x16To32(a, b) + c / 2;
     if (HIWORD(prod) >= c) goto overflow;
     int16 result = UInt32Div16To16(prod, c);
     if (result < 0) goto overflow;
     if (sign < 0) result = -result;
     return result;

    overflow:
     return sign < 0 ? INT_MIN : INT_MAX;
    }

    Given that I've already told you where the bug is, it should be pretty easy to spot in the code above.

    Anyway, when this assembly language function was ported to Win32, it was ported as, well, an assembly language function. And the port was so successful, it even preserved (probably by accident) the sign extension bug.
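    To see how the wrong shift turns 1 into 2, here is a sketch that models the 32-bit arithmetic with the shift made selectable. It assumes the Win32 port has the same structure as the 16-bit code above; the function names are hypothetical and the overflow checks are left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Logical shift right by one: what the code meant (x86 "shr"). */
static uint32_t shr1(uint32_t x) { return x >> 1; }

/* Arithmetic shift right by one: what the code did (x86 "sar").
   The old high bit is copied back into the top position. */
static uint32_t sar1(uint32_t x) { return (x >> 1) | (x & 0x80000000u); }

/* Negate in unsigned arithmetic so that INT32_MIN maps to 0x80000000. */
static uint32_t magnitude32(int32_t v)
{
    return v < 0 ? 0u - (uint32_t)v : (uint32_t)v;
}

/* Model of the multiply, add-half, divide sequence. Overflow checks
   are omitted, so callers must pick inputs whose result fits. */
int32_t muldiv32_model(int32_t a, int32_t b, int32_t c, int use_sar)
{
    assert(c != 0);
    int negative = (a ^ b ^ c) < 0;

    uint32_t ua = magnitude32(a);
    uint32_t ub = magnitude32(b);
    uint32_t uc = magnitude32(c);

    /* Half the denominator, for rounding. This is where the bug lives. */
    uint32_t half = use_sar ? sar1(uc) : shr1(uc);

    uint64_t product = (uint64_t)ua * ub + half;
    int64_t quotient = (int64_t)(product / uc);
    return (int32_t)(negative ? -quotient : quotient);
}
```

    For MulDiv(1, -0x80000000, -0x80000000), the denominator's magnitude is 0x80000000. Halving it with sar gives 0xC0000000 instead of 0x40000000, so the sum becomes 0x140000000, and dividing that by 0x80000000 gives 2. With shr the sum is 0xC0000000 and the quotient is 1. For every other denominator the high bit of the magnitude is clear and the two shifts agree.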

    Mind you, it's a bug with amazing seniority.

  • The Old New Thing

    Warum deine Mutter Deutsch spricht


    This upcoming Sunday is Mother's Day in the United States. In recognition of the holiday last year, a local church displayed the following message on its message board: "God couldn't be / everywhere / so God made mothers / German speaking."

    This explains why your mother speaks German.


    The church in question has an evening German-language service, and the advertisement for that service juxtaposed against the Jewish proverb produced an unexpected result.

  • The Old New Thing

    When you crash on a mov ebx, eax instruction, there aren't too many obvious explanations, so just try what you can


    A computer running some tests encountered a mysterious crash:

    eax=ffffffff ebx=00000000 ecx=038ef548 edx=17b060b4 esi=00000000 edi=038ef6f0
    eip=14ae1b77 esp=038ef56c ebp=038ef574 iopl=0         nv up ei pl nz na po nc
    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010202
    14ae1b77 8bd8            mov     ebx,eax 

    A colleague of mine quickly diagnosed the proximate cause.

    *Something* marked the code page PAGE_READWRITE, instead of PAGE_EXECUTE_READ. I suspect a bug in a driver. FOO is just a victim here.

    0:002> !vprot 14ae1b77 
    BaseAddress:       14ae1000
    AllocationBase:    14ae0000
    AllocationProtect: 00000080  PAGE_EXECUTE_WRITECOPY
    RegionSize:        00001000
    State:             00001000  MEM_COMMIT
    Protect:           00000004  PAGE_READWRITE
    Type:              01000000  MEM_IMAGE

    This diagnosis was met with astonishment. "Wow! What made you think to check the protection on the code page?"

    Well, let's see. We're crashing on a mov ebx, eax instruction. This does not access memory; it's a register-to-register operation. There's no way a properly functioning CPU can raise an exception on this instruction.

    At this point, what possibilities remain?

    • NX, which prevents the CPU from executing data.
    • Overclocking, which will cause all sorts of "impossible" things.
    • A root kit.

    (Note that the second and third options involve rejecting the assumption that the CPU is behaving properly.)

    These are in increasing order of paranoia, so you naturally start with the least paranoid possibility.

    Then, of course, there's the non-psychic solution: Ask the debugger for the exception record.

    EXCEPTION_RECORD:  ffffffff -- (.exr 0xffffffffffffffff)
    ExceptionAddress: 14ae1b77 (FOO!CFrameWnd::GetAssociatedWidget+0x00000047)
       ExceptionCode: c0000005 (Access violation)
      ExceptionFlags: 00000000
    NumberParameters: 2
       Parameter[0]: 00000008
       Parameter[1]: 14ae1b77
    Attempt to execute non-executable address 14ae1b77

    That last line pretty much hands it to you on a silver platter.
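    Both clues can be read mechanically. The sketch below classifies an access violation from the first exception parameter (0 for a read, 1 for a write, 8 for an attempt to execute) and tests whether a page protection value includes any of the executable protections. The numeric values are the documented Windows constants; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { AV_READ, AV_WRITE, AV_EXECUTE, AV_UNKNOWN } AvKind;

/* Classify an access violation from its exception parameters.
   parameters[0] says what kind of access faulted, and parameters[1]
   is the address that was accessed. */
AvKind classify_access_violation(const uint32_t *parameters, unsigned count)
{
    assert(parameters != NULL);
    if (count < 2)
        return AV_UNKNOWN;

    switch (parameters[0]) {
    case 0:  return AV_READ;
    case 1:  return AV_WRITE;
    case 8:  return AV_EXECUTE;   /* execution of a non-executable page */
    default: return AV_UNKNOWN;
    }
}

/* PAGE_EXECUTE (0x10), PAGE_EXECUTE_READ (0x20),
   PAGE_EXECUTE_READWRITE (0x40), and PAGE_EXECUTE_WRITECOPY (0x80)
   are the protections that permit execution. */
int protection_allows_execute(uint32_t protect)
{
    return (protect & 0xF0u) != 0;
}
```

    In the crash above, the parameters 00000008 and 14ae1b77 classify as an execute fault at the faulting instruction itself, and the page's current protection of 00000004 (PAGE_READWRITE) does not allow execution even though the allocation protection of 00000080 does.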

  • The Old New Thing

    Cheap amusement: Searching for spelling errors in the registry


    One source of cheap amusement is searching for spelling errors in the registry. For example, one program tried to register a new file extension, except that they spelled Extension wrong.

    And they wonder why that feature never worked.

    My discovery was that my registry contained the mysterious key HKEY_CURRENT_USER\S. After some debugging, I finally found the culprit. There was a program on my computer that did the equivalent of this:

    RegCreateKeyA(HKEY_CURRENT_USER, (PCSTR)L"Software\\...", &hk);

    One of my colleagues remarked, "With enough force, any peg will fit in any hole."

    I suspect that the code was not that aggressively wrong. It was probably something more subtle.
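    Whatever the actual code looked like, the reason the key comes out as S is easy to reproduce. The following sketch spells out the UTF-16 code units by hand (because wchar_t is not 16 bits on every platform), reinterprets them as a narrow string, and copies out what an ANSI function would see. It assumes a little-endian machine, and ansi_view_of_wide is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy into out what a narrow-string API would read if handed a
   pointer to UTF-16 text. Returns the number of characters seen. */
size_t ansi_view_of_wide(const uint16_t *wide, char *out, size_t out_size)
{
    assert(wide != NULL && out != NULL && out_size > 0);

    /* The cast is the whole bug: the bytes of the first code unit
       are read as two separate narrow characters. */
    const char *bytes = (const char *)wide;

    /* On a little-endian machine, 'S' is stored as 53 00, so the
       narrow string ends right after the first character. */
    size_t length = strlen(bytes);
    if (length >= out_size)
        length = out_size - 1;

    memcpy(out, bytes, length);
    out[length] = '\0';
    return length;
}
```

    Given the code units for Software followed by a terminating zero, the function reports a one-character string consisting of just S, which matches the key that showed up under HKEY_CURRENT_USER.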

  • The Old New Thing

    How do I hide a window without blocking on it?


    A customer was working on improving their application startup performance. They found that if their application was launched immediately after a fresh boot, the act of dismissing their splash screen was taking over 5% of their boot time. Their code removed the splash screen by calling ShowWindow(hwndSplash, SW_HIDE). They suspected that the splash screen thread had, for some reason, stopped responding to messages, and while an investigation into that avenue was undertaken, a parallel investigation into reducing the cost of hiding the splash screen was also begun.

    One of the things they tried was to remove the WS_EX_TOOLWINDOW style and call ITaskbarList::DeleteTab(hwndSplash) but they found that it wasn't helping.

    The reason it wasn't helping is that editing the window style generates WM_STYLECHANGING/WM_STYLECHANGED messages to the target window, and now you're back where you started.

    A better way is to use ShowWindowAsync(hwndSplash, SW_HIDE). The -Async version of the ShowWindow function is the SendNotifyMessage version of ShowWindow: If the window belongs to another thread group, then it schedules the operation but does not wait for it to complete.

  • The Old New Thing

    Why can't I use the file sharing wizard if I exclude inheritable permissions from a folder's parent?


    In Windows Vista and Windows Server 2008, if you go to the advanced security settings for a directory and uncheck "include inheritable permissions from this object's parent", then go back to the Sharing tab, you'll find that the "Share" button is disabled. Why is this? We don't see this behavior on Windows 7 or Windows Server 2008 R2.

    (Yes, a customer actually noticed and asked the question.)

    The sharing wizard in Windows Vista and Windows Server 2008 does not support folders with the SE_DACL_PROTECTED security descriptor control bit because it isn't sure that it can restore the ACL afterward.

    And as the customer noted, this restriction was lifted in Windows 7 and Windows Server 2008 R2.
