March, 2011

  • The Old New Thing

    How do I create a topmost window that is never covered by other topmost windows?

    • 74 Comments

    We already know that you can't create a window that is always on top, even in the presence of other windows marked always-on-top. An application of the What if two programs did this? rule demonstrates that it's not possible, because whatever trick you use to be on-top-of-always-on-top, another program can use the same trick, and now you have two on-top-of-always-on-top windows, and what happens?

    A customer who failed to understand this principle asked for a way to establish their window as "super-awesome topmost". They even discovered the answer to the "and what happens?" rhetorical question posed above.

    We have overridden the OnLostFocus and OnPaint methods to re-assert the TopLevel and TopMost window properties, as well as calling BringToFront and Activate. The result is that our application and other applications end up fighting back and forth because both applications are applying similar logic. We tried installing a global hook and swallowing paint and focus events for all applications aside from our own (thereby preventing the other applications from having the opportunity to take TopMost ahead of us), but we found that this causes the other applications to crash. We're thinking of setting a timer and re-asserting TopMost when the timer fires. Is there a better way?

    This is like saying, "Sometimes I'm in a hurry, and I want to make sure I am the next person to get served at the deli counter. To do this, I find whoever has the lowest number, knock them unconscious, and steal their ticket. But sometimes somebody else comes in who's also in a hurry. That person knocks me unconscious and steals my ticket. My plan is to set my watch alarm to wake me up periodically, and each time it wakes me up, I find the person with the lowest number, knock them unconscious, and steal their ticket. Is there a better way?"

    The better way is to stop knocking each other unconscious and stealing each other's tickets.

    The customer (via the customer liaison) provided context for their question.

    This is not a general-purpose application. This application will be run on dedicated machines which operate giant monitors in retail stores. There are already other applications running on the computer which rotate through advertisements and other display information.

    The customer is writing another application which will also run on the machine. Most of the time, the application does nothing, but every so often, their application needs to come to the front and display its message, regardless of whatever the other applications are displaying. (For example, there might be a limited-time in-store promotion that needs to appear on top of the regular advertisements.)

    Unfortunately, all of these different programs were written by different vendors, and there is no coordination among them for who gets control of the screen. We were hoping that there was some way we could mark our window as "super topmost" so that when it came into conflict with another application running on the machine, it would win and the other application would lose.

    I'm thinking of recommending that the vendors all come up with some way of coordinating access to the screen so they can negotiate among themselves and not get into focus fights. (Easier said than done, since all the different applications running on the machine come from different vendors...)

    Since there is no coordination among the various applications, you're basically stuck playing a game of walls and ladders, hoping that your ladder is taller than everybody else's wall. The customer has pretty much found the tallest ladder which the window manager provides. There is no "super topmost" flag.

    Sure, you can try moving to another level of the system, like say creating a custom desktop, but all that does is give you a taller ladder. And then one of the other applications is going to say, "I need to display a store-wide page (manager to the deli please, manager to the deli), overriding all other messages, even if it's a limited-time in-store promotion." And they'll try something nastier, like enumerating all the windows in the system and calling ShowWindow(SW_HIDE).

    And then another application will say, "I need to display an important store-wide security announcement (Will the owner of a white Honda Civic, license plate 037-MSN, please return to your vehicle), overriding all other messages, even if it's a store-wide page." And it'll try something nastier, like setting their program as the screen saver, disabling the mouse and keyboard devices, and then invoking the screen saver on the secure desktop.

    And then another application will say, "I need to display a critical store-wide announcement (Fire in the automotive department. Everybody evacuate the building immediately), overriding all other messages, even if it's an important store-wide security announcement." And it'll try something nastier, like enumerating all the processes on the system, attaching to each one with debug privilege, and suspending all the threads.

    Stop the madness. The only sane way out is to have the programs coöperate to determine who is in control of the screen at any particular time.

    In response to my hypothetical game of walls and ladders, one of my colleagues wrote, "Note to self: Do not get into a walls-and-ladders contest with Raymond."

  • The Old New Thing

    How can I generate a consistent but unique value that can coexist with GUIDs?

    • 18 Comments

    A customer needed to generate a GUID for each instance of a hardware device they encounter:

    The serial number for each device is 36 bits long (four and a half bytes). We need to generate a GUID based on each device, subject to the constraints that when a device is reinserted, we generate the same GUID for it, that no two devices generate the same GUID, and that the GUIDs we generate not collide with GUIDs generated by other means. One of our engineers suggested just running uuidgen and substituting the serial number for the final nine hex digits. Is this a viable technique?

    This is similar to the trap of thinking that half of a GUID is just as unique as the whole thing. Remember that all the pieces of a GUID work together to establish uniqueness. If you just take parts of it, then the algorithm breaks down.

    For this particular case, you're in luck. The classic Type 1 GUID uses 60 bits to encode the timestamp and 48 bits to identify the location (computer). You can take a network card, extract the MAC address, then smash the card with a hammer. Now you have a unique location. Put your 36 bits of unique data as the timestamp, and you have a Type 1 GUID that is guaranteed never to collide with another GUID.
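
    As a rough sketch of that packing, the following C function lays a serial number into the timestamp fields of a version 1 UUID, using the field layout from RFC 4122. The node bytes in kRetiredNode are made up for illustration (use the MAC address of the card you actually retired), the function name is hypothetical, and setting the clock sequence to zero is an assumption of this sketch rather than something the technique requires.

```c
#include <stdint.h>
#include <stdio.h>

/* HYPOTHETICAL node ID: stands in for the MAC address of the network card
   you retired, so that no other generator can ever use this "location". */
static const uint8_t kRetiredNode[6] = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55 };

/* Write the 36-character text form of a version 1 UUID whose 60-bit
   timestamp field carries the serial number. buf needs room for 37 chars. */
void serial_to_uuid_v1(uint64_t serial, char buf[37])
{
    uint64_t ts = serial & 0x0FFFFFFFFFFFFFFFULL;   /* only 60 bits fit */
    unsigned time_low = (unsigned)(ts & 0xFFFFFFFFu);
    unsigned time_mid = (unsigned)((ts >> 32) & 0xFFFFu);
    /* The top four bits of this field hold the version number (1). */
    unsigned time_hi_and_version = (unsigned)((ts >> 48) & 0x0FFFu) | 0x1000u;

    /* 0x80 puts the RFC 4122 variant bits (10) above a clock sequence of 0
       (the zero clock sequence is an assumption of this sketch). */
    snprintf(buf, 37, "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x",
             time_low, time_mid, time_hi_and_version,
             0x80u, 0x00u,
             (unsigned)kRetiredNode[0], (unsigned)kRetiredNode[1],
             (unsigned)kRetiredNode[2], (unsigned)kRetiredNode[3],
             (unsigned)kRetiredNode[4], (unsigned)kRetiredNode[5]);
}
```

    The same serial number always produces the same text, and two different serial numbers differ in the timestamp fields, which is what the customer asked for.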

    If you have more than 60 bits of unique data, then this trick won't work. Fortunately, RFC4122 explains how to create a so-called name-based UUID, which is a UUID that can be reliably regenerated from the same source data. Section 4.3 explains how it's done. The result is either a type 3 or type 5 UUID, depending on which variant of the algorithm you chose.

  • The Old New Thing

    Why does my TIME_ZONE_INFORMATION have the wrong DST cutover date?

    • 30 Comments

    Public Service Announcement: Daylight Saving Time begins in most parts of the United States this weekend. Other parts of the world may change on a different day from the United States.

    A customer reported that they were getting incorrect values from the GetTimeZoneInformationForYear function.

    I have a program that calls GetTimeZoneInformationForYear, and it looks like it's returning incorrect DST transition dates. For example, GetTimeZoneInformationForYear(2010, NULL, &tzi) is returning March 2nd as the tzi.DaylightDate value, instead of the expected March 14th date. The current time zone is Pacific Time.

    The value returned by GetTimeZoneInformationForYear (and GetTimeZoneInformation) is correct; you're just reading it wrong.

    As called out in the documentation for the TIME_ZONE_INFORMATION structure, the wDay field in the StandardDate and DaylightDate changes meaning depending on whether the wYear is zero or nonzero.

    If the wYear is nonzero, then the wDay has its usual meaning.

    But if the wYear is zero (and it is for most time zones), then the wDay encodes the week number of the cutover rather than the day number.

    In other words, that 2 does not mean "March 2nd". It means "the second week in March".
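
    To turn the week-number form into a calendar date, you count occurrences of the cutover's day of the week within the month, with the value 5 meaning "the last one". Here is a portable sketch of that arithmetic in plain C. It does not call any Windows function; the helper names are hypothetical, and the parameters stand in for the wMonth, wDayOfWeek (0 = Sunday) and wDay fields.

```c
/* Day of the week for a Gregorian date, 0 = Sunday (Sakamoto's method). */
static int day_of_week(int year, int month, int day)
{
    static const int offsets[12] = { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };
    if (month < 3) year -= 1;
    return (year + year / 4 - year / 100 + year / 400
            + offsets[month - 1] + day) % 7;
}

static int days_in_month(int year, int month)
{
    static const int days[12] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
    int leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    return (month == 2 && leap) ? 29 : days[month - 1];
}

/* Resolve "the week-th occurrence of wanted_dow in the month" to a day of
   the month. week is 1..5, where 5 means the last occurrence. */
int cutover_day_of_month(int year, int month, int wanted_dow, int week)
{
    int first_dow = day_of_week(year, month, 1);
    int day = 1 + (wanted_dow - first_dow + 7) % 7;  /* first occurrence */
    day += 7 * (week - 1);
    while (day > days_in_month(year, month))
        day -= 7;                                    /* clamp 5 to "last" */
    return day;
}
```

    For the customer's case, cutover_day_of_month(2010, 3, 0, 2) resolves "second Sunday in March 2010" to the 14th.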

  • The Old New Thing

    Why is there the message '!Do not use this registry key' in the registry?

    • 40 Comments

    Under Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders, there is a message to registry snoopers: The first value is called "!Do not use this registry key" and the associated data is the message "Use the SHGetFolderPath or SHGetKnownFolderPath function instead."

    I added that message.

    The long and sad story of the Shell Folders key explains that the registry key exists only to retain backward compatibility with four programs written in 1994. There's also a TechNet version of the article which is just as sad but not as long.

    One customer saw this message and complained, "That registry key and that TechNet article explain how to obtain the current locations of those special folders, but they don't explain how to change them. This type of woefully inadequate documentation only makes the problem worse."

    Hey, wow, a little message in a registry key and a magazine article are now "documentation"! The TechNet article is historical background. And the registry key is just a gentle nudge. Neither is documentation. It's not like I'm going to put a complete copy of the documentation into a registry key. Documentation lives in places like MSDN.

    But it seems that some people need more than a nudge; they need a shove. Let's see, we're told that the functions for obtaining the locations of known folders are SHGetFolderPath and its more modern counterpart SHGetKnownFolderPath. I wonder what the names of the functions for modifying those locations might be?

    Man that's a tough one. I'll let you puzzle that out for a while.

    Okay, here, I'll tell you: The corresponding functions go by the completely unobvious names SHSetFolderPath and SHSetKnownFolderPath.

    Sorry you had to use your brain. I'll let you get back to programming now.

  • The Old New Thing

    What's up with the mysterious inc bp in function prologues of 16-bit code?

    • 24 Comments

    A little while ago, we learned about the EBP chain. The EBP chain in 32-bit code is pretty straightforward because there is only one type of function call. But in 16-bit code there are two types of function calls, the near call and the far call.

    A near call pushes a 16-bit return address on the stack before branching to the function entry point, which must reside in the same code segment as the caller. The function then uses a ret instruction (a near return) when it wants to return to the caller, indicating that the CPU should resume execution at the specified address within the same code segment.

    By comparison, a far call pushes both the segment (or selector if in protected mode) and the offset of the return address on the stack (two 16-bit values), and the function being called is expected to use a retf instruction (a far return) to indicate that the CPU should pop two 16-bit values off the stack to determine where execution should resume.

    When Windows was first introduced, it ran on an 8086 with 384KB of RAM. This posed a challenge because the 8086 processor had no memory manager, had no CPU privilege levels, and had no concept of task switching. And in order to squeeze into 384KB of RAM, Windows needed to be able to load code from disk on demand and discard it when memory pressure required it.

    One of the really tricky parts of the real-mode memory manager was fixing up all the function pointers when code was loaded and unloaded. When you unloaded a function, you had to make sure that any existing code in memory that called that function didn't actually call it, because the function wasn't there. If you had a memory manager, you could mark the segment or page not present, but there is no such luxury on the 8086.

    There are multiple parts to the solution, but the part that leads to the answer to the title question is the way the memory manager patched up all the stacks in the system. After all, if you discarded a function, you had to make sure that any reference to that function as a return address on somebody's stack got fixed up before the code tried to execute that retf instruction and found itself returning to a function that didn't exist.

    And that's where the mysterious inc bp came from.

    The first rule of stack frames in real-mode Windows is that you must have a bp-based stack frame. FPO was not permitted. (Fortunately, FPO was also not very tempting because the 16-bit instruction set made it cumbersome to access stack memory by means other than the bp register, so the easiest way to do something was also the right way.) In other words, the first rule required that every stack have a valid bp chain at all times.

    The second rule of stack frames in real-mode Windows is that if you are going to return with a retf, then you must increment the bp register before you push it (and must therefore perform the corresponding decrement after you pop it). This second rule means that code which walks the bp chain can find the next function up the stack. If bp is even, then the function will use a near return, so it looks at the 16-bit value stored on the stack after the bp and doesn't change the cs register. On the other hand, if the bp is odd, then it knows to look at both the 16-bit offset and the 16-bit segment that were pushed on the stack.
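
    The two rules can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the real-mode kernel's code: the stack segment is modeled as an array of 16-bit words indexed by offset / 2, a saved bp of zero ends the chain, and the type and function names are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int      is_far;   /* nonzero if this frame returns with retf */
    uint16_t offset;   /* return offset                           */
    uint16_t segment;  /* return segment, meaningful only if far  */
} ReturnAddress;

/* Walk the bp chain starting at bp and collect the return addresses.
   stack[i] is the 16-bit word at stack offset 2 * i. */
size_t walk_bp_chain(const uint16_t *stack, size_t words, uint16_t bp,
                     ReturnAddress *out, size_t max)
{
    size_t n = 0;
    while (bp != 0 && n < max) {
        size_t i = bp / 2u;
        if (i + 2 >= words) break;          /* frame runs off the stack */
        uint16_t saved = stack[i];          /* caller's bp, maybe +1    */
        out[n].is_far  = saved & 1;         /* odd means far return     */
        out[n].offset  = stack[i + 1];      /* word just above saved bp */
        out[n].segment = out[n].is_far ? stack[i + 2] : 0;
        n++;
        bp = (uint16_t)(saved & ~1u);       /* undo the inc bp          */
    }
    return n;
}
```

    With the return addresses located and classified this way, a memory manager knows exactly which words on the stack would need patching.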

    Okay, so let's put it all together: When code got discarded, the kernel walked all the stacks in the system (which it could now do due to these two rules), and if it saw that a return address corresponded to a function that got discarded, it patched the return address to point to a chunk of code which called back into the memory manager to reload the function, re-patch all the return addresses so they now point to the new address where the function got loaded (probably different from where the function was when it was discarded), and then jumped back to the original code as if nothing had happened.

    I continue to be amazed at how much Windows 1.0 managed to accomplish given that it had so little to work with. It even used an LRU algorithm to choose which functions to discard by implementing a software version of the "accessed bit", something that modern CPUs manage in hardware.

  • The Old New Thing

    How to rescue a broken stack trace: Recovering the EBP chain

    • 15 Comments

    When debugging, you may find that the stack trace falls apart:

    ChildEBP RetAddr
    001af118 773806a0 ntdll!KiFastSystemCallRet
    001af11c 7735b18c ntdll!ZwWaitForSingleObject+0xc
    001af180 7735b071 ntdll!RtlpWaitOnCriticalSection+0x154
    001af1a8 2f6db1a9 ntdll!RtlEnterCriticalSection+0x152
    001af1b4 2fe8d533 ABC!CCriticalSection::Lock+0x12
    001af1d0 2fe8d56a ABC!CMessageList::Lock+0x24
    001af234 2f6e47ac ABC!CMessageWindow::UpdateMessageList+0x231
    001af274 2f6f040e ABC!CMessageWindow::UpdateContents+0x84
    001af28c 2f6e4474 ABC!CMessageWindow::Refresh+0x1a8
    001af360 2f6e4359 ABC!CMessageWindow::OnChar+0x4c
    001af384 761a1a10 ABC!CMessageWindow::WndProc+0xb31
    00000000 00000000 USER32!GetMessageW+0x6e
    

    This can't possibly be the complete stack. I mean, where's the thread procedure? That should be at the start of the stack for any thread.

    What happened is that the EBP chain got broken, and the debugger can't walk the stack any further. If the code was compiled with frame pointer omission (FPO), then the compiler will not create EBP frames, permitting it to use EBP as a general purpose register instead. This is great for optimization, but it causes trouble for the debugger when it tries to take a stack trace through code compiled with FPO for which it does not have the necessary information to decode these types of stacks.

    Begin digression: Traditionally, every function began with the sequence

            push ebp      ;; save caller's EBP
            mov ebp, esp  ;; set our EBP to point to this "frame"
            sub esp, n    ;; reserve space for local variables
    

    and ended with

            mov esp, ebp  ;; discard local variables
            pop ebp       ;; recover caller's EBP
            ret n
    

    This pattern is so common that the x86 has dedicated instructions for it. The ENTER n,0 instruction does the push / mov / sub, and the LEAVE instruction does the mov / pop. (In C/C++, the value after the comma is always zero.)

    If you look at what this does to the stack, you see that this establishes a linked list of what are called EBP frames. Suppose you have the following code fragment:

    void Top(int a, int b)
    {
     int toplocal = b + 5;
     Middle(a, toplocal);
    }
    
    void Middle(int c, int d)
    {
     Bottom(c+d);
    }
    
    void Bottom(int e)
    {
     int bottomlocal1, bottomlocal2;
     ...
    }
    

    When execution reaches the ... inside function Bottom the stack looks like the following. (I put higher addresses at the top; the stack grows downward. I also assume that the calling convention is __stdcall and that the code is compiled with absolutely no optimization.)

    Top's stack frame (during execution of Top, EBP = 0040F8EC)
    0040F8F8  parameter b passed to Top
    0040F8F4  parameter a passed to Top
    0040F8F0  return address of Top's caller
    0040F8EC  EBP of Top's caller
    0040F8E8  toplocal

    Middle's stack frame (during execution of Middle, EBP = 0040F8D8)
    0040F8E4  parameter d passed to Middle
    0040F8E0  parameter c passed to Middle
    0040F8DC  return address of Middle's caller
    0040F8D8  0040F8EC = EBP of Middle's caller

    Bottom's stack frame (during execution of Bottom, EBP = 0040F8CC)
    0040F8D4  parameter e passed to Bottom
    0040F8D0  return address of Bottom's caller
    0040F8CC  0040F8D8 = EBP of Bottom's caller
    0040F8C8  bottomlocal1
    0040F8C4  bottomlocal2

    Each stack frame is identified by the EBP value which the function uses during its execution.

    The structure of each stack frame is therefore

    [ebp+n]  Offsets greater than 4 access parameters
    [ebp+4]  Offset 4 is the return address
    [ebp+0]  Zero offset accesses caller's EBP
    [ebp-n]  Negative offsets access locals

    And the stack frames are all connected to each other in the form of a linked list threaded through the EBP values. This linked list is known as the EBP chain. End digression.
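
    Walking that linked list is what a debugger's stack trace amounts to. Here is a minimal sketch over a simulated stack, assuming the memory is given as an array of 32-bit words starting at address base. The function name and the stopping rule (give up when the saved EBP does not point further up the stack) are choices made for this example, not a description of any particular debugger.

```c
#include <stddef.h>
#include <stdint.h>

/* Follow the EBP chain through a simulated stack and collect the return
   address stored in each frame. stack[i] is the word at base + 4 * i. */
size_t walk_ebp_chain(const uint32_t *stack, size_t words, uint32_t base,
                      uint32_t ebp, uint32_t *ret_addrs, size_t max)
{
    size_t n = 0;
    while (n < max && ebp >= base && (ebp - base) % 4u == 0) {
        size_t i = (ebp - base) / 4u;
        if (i + 1 >= words) break;        /* frame lies outside the dump */
        ret_addrs[n++] = stack[i + 1];    /* [ebp+4] is the return address */
        uint32_t caller_ebp = stack[i];   /* [ebp+0] is the caller's EBP  */
        if (caller_ebp <= ebp) break;     /* chain must move up the stack */
        ebp = caller_ebp;
    }
    return n;
}
```

    Fed the Top/Middle/Bottom stack from the table, starting at Bottom's EBP of 0040F8CC, the walk visits 0040F8CC, then 0040F8D8, then 0040F8EC, yielding one return address per frame.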

    To recover from the broken EBP chain, start dumping the stack a little before things go bad (in this case, I would start at 001af384-80) and then look for something that looks like a valid stack frame. Since the parameters and locals to a function can be pretty much anything, all you have left to work with is the EBP and the return address. In other words, you are looking for pairs of values of the form

    «pointer a little higher up the stack»
    «code address»
    

    In this case, I got lucky and didn't have to go very far:

      001af474  00000000
     -001af478  001af494
    / 001af47c  14f4fba8 DEF!SubclassBase::CallOriginalWndProc+0x1a
    | 001af480  2f6e4317 ABC!CMessageWindow::WndProc
    | 001af484  00970338
    | 001af488  0000000f
    | 001af48c  00000000
    \ 001af490  00000000
     >001af494  001af4f0
      001af498  14f4fcd6 DEF!SubclassBase::ForwardMessage+0x23
      001af49c  00970338
      001af4a0  0000000f
      001af4a4  00000000
      001af4a8  00000000
      001af4ac  00000000
      001af4b0  2f6e4317 ABC!CMessageWindow::WndProc
      001af4b4  ed758311
      001af4b8  00000000
      001af4bc  15143f70
      001af4c0  00000000
      001af4c4  14f4fb8e DEF!CView::SortItems+0x96
      001af4c8  00000000
      001af4cc  2f6e4317 ABC!CMessageWindow::WndProc
      001af4d0  00000000
    

    At stack address 001af478, we have a pointer to memory higher up the stack followed by a code address. If you follow that pointer, it points to another instance of the same pattern: A pointer higher up the stack followed by the code address.
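
    The eyeball search can be approximated in code. The sketch below scans a stack dump for a slot that holds a pointer further up the stack followed by a code address, and accepts it only if the slot it points to shows the same pattern. Everything here is an assumption for illustration: the names are hypothetical, the caller supplies the stack limit, and "code address" is judged against address ranges that you would have to take from the loaded module list yourself.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t lo, hi; } CodeRange;   /* half-open [lo, hi) */

static int is_code(uint32_t v, const CodeRange *code, size_t ncode)
{
    for (size_t k = 0; k < ncode; k++)
        if (v >= code[k].lo && v < code[k].hi) return 1;
    return 0;
}

/* Does slot i look like "saved EBP, return address"? */
static int plausible_frame(const uint32_t *dump, size_t words, uint32_t base,
                           uint32_t stack_limit, size_t i,
                           const CodeRange *code, size_t ncode)
{
    if (i + 1 >= words) return 0;
    uint32_t addr = base + 4u * (uint32_t)i;
    return dump[i] > addr && dump[i] < stack_limit && dump[i] % 4u == 0
        && is_code(dump[i + 1], code, ncode);
}

/* Return the address of the first slot that starts a two-link EBP chain
   inside the dump, or 0 if there is none. dump[i] is at base + 4 * i. */
uint32_t find_ebp_chain(const uint32_t *dump, size_t words, uint32_t base,
                        uint32_t stack_limit,
                        const CodeRange *code, size_t ncode)
{
    for (size_t i = 0; i + 1 < words; i++) {
        if (!plausible_frame(dump, words, base, stack_limit, i, code, ncode))
            continue;
        size_t next = (dump[i] - base) / 4u;      /* slot the pointer names */
        if (plausible_frame(dump, words, base, stack_limit, next, code, ncode))
            return base + 4u * (uint32_t)i;
    }
    return 0;
}
```

    A heuristic like this can be fooled by stale data left on the stack, so treat its answer as a candidate to check by hand rather than as the truth.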

    Once you find where the EBP chain resumes, you can ask the debugger to resume its stack trace from that point with the =n option to the k command.

    0:000> k=001af478
    ChildEBP RetAddr
    001af478 14f4fba8 ntdll!KiFastSystemCallRet
    001af494 14f4fcd6 DEF!SubclassBase::CallOriginalWndProc+0x1a
    001af4f0 14f4fc8b DEF!SubclassBase::ForwardMessage+0x23
    001af514 14f32dd1 DEF!SubclassBase::ForwardChar+0x59
    001af530 14f4fcd6 DEF!SubclassBase::OnChar+0x3c
    001af58c 14f4fd76 DEF!HelpSubclass::WndProc+0x51
    001af5e4 761a1a10 DEF!SubclassBase::s_WndProc+0x1b
    001af610 761a1ae8 USER32!GetMessageW+0x6e
    001af688 761a1c03 USER32!GetMessageW+0x146
    001af6e4 761a3656 USER32!GetMessageW+0x261
    001af70c 77380e6e USER32!OffsetRect+0x4d
    001af784 761a2a98 ntdll!KiUserCallbackDispatcher+0x2e
    001af794 698fd0aa USER32!DispatchMessageW+0xf
    001af7a4 2f7bf15c ABC!CThread::DispatchMessageW+0x23
    001af7e0 2f7befc9 ABC!CMessageWindow::MessageLoop+0x3a2
    001af808 2ff56d20 ABC!CMessageWindow::ThreadProc+0x9f
    001af898 75c2384b ABC!CMessageWindow::s_ThreadProc+0x10
    001af8a4 7735a9bd kernel32!BaseThreadInitThunk+0x12
    001af8e4 00000000 ntdll!LdrInitializeThunk+0x4d
    

    When you do this, make sure to ignore the first line of the resumed stack trace, since that is based on your current EIP, not the return address stored in the stack frame.

    Today was really just a warm-up for another debugging technique that I haven't finished writing up yet, so you're just going to be in suspense for another two years or so, though if you attended my TechEd China talk, you already know where I'm going.

    Bonus reading: In Ryan Mangipano's two-part series on kernel mode stack overflows, the second part does a bit of EBP chain chasing. (Feel free to read the first part, as well as earlier discussion on the subject of stack overflows.)

  • The Old New Thing

    Why did Win32 define BOOL as a signed int instead of an unsigned int?

    • 48 Comments

    Igor Levicki wants somebody from Microsoft to explain why BOOL was defined as a signed int instead of an unsigned int.

    You don't need to work for Microsoft to figure this out. All the information you need is publicly available.

    Quoting from K&R Classic, which was the operative C standards document at the time Windows was being developed:

    7.6 Relational Operators

    The [relational operators] all yield 0 if the specified relation is false and 1 if it is true. The type of the result is int.

    Win32 defined BOOL as synonymous with int because Brian and Dennis said so. If you want to know why Brian and Dennis decided to have the result of relational operators be signed instead of unsigned, you'll have to ask them.
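
    If you want to see the rule for yourself, a C11 compiler can report the type of a comparison. This is just a small sketch: the macro and function names are made up, and _Generic is a C11 feature that did not exist when K&R Classic was written.

```c
/* C11 _Generic selects a branch based on the static type of its operand. */
#define IS_PLAIN_INT(expr) _Generic((expr), int: 1, default: 0)

/* Returns 1 if a comparison of two unsigned values has type int and
   yields exactly 1 for true and 0 for false. */
int relational_result_is_int(void)
{
    unsigned a = 1u, b = 2u;
    /* Even with unsigned operands, the comparison itself has type int. */
    return IS_PLAIN_INT(a < b) && (a < b) == 1 && (a > b) == 0;
}
```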

  • The Old New Thing

    No, not that M, the other M, the one called Max

    • 29 Comments

    Code names are rampant at Microsoft. One of the purposes of a code name is to impress upon the people who work with the project that the name is only temporary, and that the final name will come from the marketing folks (who sometimes pull through with a catchy name like Zune, and who sometimes drop the ball with a dud like Bob, and who sometimes cough up monstrosities like Microsoft WinFX Software Development Kit for Microsoft® Pre-Release Windows Operating System Code-Named "Longhorn", Beta 1 Web Setup).

    What I find amusing are the projects which change their code names. I mean, the code name is already a placeholder; why replace a placeholder with another placeholder?

    One such example is the experimental project released under the code name Max. The project founders originally named it M. Just the letter M. Not to be confused with this thing code named M or this other thing code named M.

    In response to a complaint from upper management about single-letter code names, the name was changed to Milkshake, and the team members even made a cute little mascot figure, with a straw coming out the top of his head like a milkshake.

    I'm not sure why the name changed a second time. Perhaps those upper level managers didn't think Milkshake was a dignified-enough name. For whatever reason, the name changed yet again, this time to Max. (Wikipedia claims that the project was named after the pet dog of one of the team members; I have been unable to confirm this. Because I haven't bothered trying.)

    There's no real punch line here, sorry. Just one example of the naming history of a project that went by many names.

    Bonus chatter: Apparently the upper management folks who complained about the single-letter code name M were asleep when another product was code-named Q (now known as Windows Home Server).

  • The Old New Thing

    Why can't Explorer decide what size a file is?

    • 42 Comments

    If you open Explorer and highlight a file whose size is a few kilobytes, you can find some file sizes where the Explorer Size column shows a size different from the value shown in the Details pane. What's the deal? Why can't Explorer decide what size a file is?

    The two displays use different algorithms.

    The values in the Size column are always given in kilobytes, regardless of the actual file size. File is 15 bytes? Show it in kilobytes. File is 2 gigabytes? Show it in kilobytes.

    The value shown in the Size column is rounded up to the nearest kilobyte. Your 15-byte file shows up as 1KB. This has been the behavior since Explorer was first introduced back in Windows 95. Why? I don't know; the reasons may have been lost to the mists of time. Though I suspect one of the reasons is that you don't want a file to show up as 0KB unless it really is an empty file.

    On the other hand, the value shown in the Details pane uses adaptive units: For a tiny file, it'll show bytes, but for a large file, it'll show megabytes or gigabytes or whatever. And the value is shown to three significant digits.

    The result is that a file which is, say, 19465 bytes in size (19.0088 kilobytes) shows up in the Size column as 20KB, since the Size column rounds up. On the other hand, the Details pane shows 19.0KB since it displays the value to three significant digits.
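
    The two rounding rules are easy to mimic. The sketch below is not Explorer's code: the function names are hypothetical, the Details pane formatter only handles sizes that display in kilobytes, and dropping the extra digits (rather than rounding them) is an assumption of the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Size column: always kilobytes, rounded up, so only an empty file is 0KB. */
uint64_t size_column_kb(uint64_t bytes)
{
    return bytes / 1024 + (bytes % 1024 != 0);
}

/* Details pane, kilobyte range only: three significant digits.
   ASSUMPTION: the extra digits are truncated rather than rounded. */
void details_pane_kb(uint64_t bytes, char *buf, size_t n)
{
    unsigned long long hundredths = (unsigned long long)(bytes * 100 / 1024);
    unsigned long long whole = hundredths / 100;
    unsigned long long frac = hundredths % 100;

    if (whole >= 100)
        snprintf(buf, n, "%lluKB", whole);                 /* e.g. 512KB  */
    else if (whole >= 10)
        snprintf(buf, n, "%llu.%lluKB", whole, frac / 10); /* e.g. 19.0KB */
    else
        snprintf(buf, n, "%llu.%02lluKB", whole, frac);    /* e.g. 1.50KB */
}
```

    For the 19465-byte file, size_column_kb gives 20 while details_pane_kb writes 19.0KB, which is the disagreement you see on screen.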

    It looks like Explorer can't make up its mind, and perhaps it can't, but the reason is that the two places on the screen which show the size round in different ways.

  • The Old New Thing

    Charlie Sheen v Muammar Gaddafi: Whose line is it anyway?

    • 23 Comments

    I got seven out of ten right.
