• The Old New Thing

    A complex family calculus

    • 16 Comments

    I spent the other night at a relative's house, and I was woken the next morning by my young niece who politely asked me to make her breakfast. (Her technique for waking me up is to come in and call my name. If the door is closed, she pounds on the bedroom door and shouts, "Wake up! Wake up!" If I fail to open the door, she opens it herself. If the door is locked, she jiggles the handle until she forces the door open. I just leave the door open now. Making the best of a bad situation.)

    Anyway, later that morning, the following conversation took place between my niece and an adult family member (which conversation I have taken the liberty of translating into English):

    "Why did you wake up Uncle Raymond?"

    I wanted cereal for breakfast.

    "Why didn't you ask Mommy?"

    Mommy was still sleeping.

  • The Old New Thing

    Why does my window style change when I call SetWindowText?

    • 9 Comments

    A customer observed some strange behavior with window styles:

    We ran some weird behavior: Calling the Set­Window­Text function causes a change in window styles. Specifically, calling Set­Window­Text results in the WM_STYLE­CHANGING and WM_STYLE­CHANGED messages, and sometimes the result is that the WS_TAB­STOP style is removed. Is this a bug? What would cause this?

    The Set­Window­Text message sends the WM_SET­TEXT message to the control, at which point anything that happens is the window's own responsibility. If it wants to change styles based on the text you sent, then that's what happens. The window manager doesn't do anything special to force it to happen or to prevent it.

    That's weird, because I'm not even listening for WM_SET­TEXT messages. I also verified that there is no call into my code during the call to the the Set­Window­Text function.

    I'm assuming that the window belongs to the same process as the caller. If the window belongs to another process, then the rules are different.

    I'm changing the text of a window created by the same thread.

    Okay, so let's see what we have so far. The customer is calling the Set­Window­Text function to change the text of a window created on the same thread. There is no handler for the WM_SET­TEXT message, and yet the window style is changing. At this point, you might start looking for more obscure sources for the problem, like say a global hook of some sort. While I considered the possibilities, the customer added,

    It may be worth noting that I'm using the Sys­Link.

    Okay, now things are starting to make sense, and it didn't help that the customer provided misleading information in the description of the problem. For example, when the customer wrote, "There is no handler for the WM_SET­TEXT message," the customer was not referring to the window whose window text is changing but to some other unrelated window.

    It's like responding to the statement "A confirmation letter should have been sent to the account holder" with "I never got the confirmation letter," and then the person spends another day trying to figure out why the confirmation letter was never sent before you casually mention, "Oh, I'm not the account holder."

    The WM_SET­TEXT message is sent to the window you passed to Set­Window­Text; in this case, it's the Sys­Link window. It is therefore the window procedure of the Sys­Link window that is relevant here.

    The Sys­Link control remembers whether it was originally created with the WS_TAB­STOP, and if the markup it is given has no tab stops, then it removes the style; if the markup has tab stops, then it re-adds the style.

    How do I add a tab stop to a string? I couldn't find any reference to it and all my guesses failed.

    The tab stops in question are the hyperlinks you added when you used the <A>...</A> notation. If the text has no hyperlinks, then the control removes the WS_TAB­STOP style because it is no longer something you can tab to.

  • The Old New Thing

    West Bank Story, the movie that sells itself in five seconds

    • 5 Comments

    This weekend, I attended an Oscar-watching party, and when the clips from the nominees for Live Action Short Film were run, I was completely won over by West Bank Story. Five seconds of dancing, finger-snapping Jews and Arabs is all I needed.

    When filling out your Oscar party ballot, "Live Action Short Film" is one of those categories you just close your eyes and pick randomly since you don't know anything about any of the movies. It felt like everybody else at the party felt the same way I did: If we had seen that clip, we would all have voted for it. In our minds, it won the award before the envelope was even torn open.

    Update: Interview with Ari Sandel, director and co-writer.

  • The Old New Thing

    Get Sea-Tac flight information (including gate and baggage claim) via email

    • 3 Comments

    There are lots of flight status sites out there. (My favorite is FlightAware because it's the geekiest of them.) Many of them will send you email alerts when flight information changes, but the one from the Port of Seattle is the only one I know of that will also tell you when the arrival gate and baggage claim carousel number change, which is handy when you're picking up someone at the airport. (Main entry page here.)

    A useful site if you're departing from the airport is the TSA's Security Checkpoint Wait Times site, which gives you estimates of wait times based on data collected in the most recent few weeks.

    When you arrive at Seattle-Tacoma International Airport, a friendly voice welcomes you via a pre-recorded message. Historically, the voice has been that of the mayor of the city of Seattle, but a few years ago, it changed to that of the President of the Port Commission. I don't know what prompted the change, but I guess some sort of gentleman's agreement fell apart.

  • The Old New Thing

    How to rescue a broken stack trace on x64: Recovering the stack pointer

    • 1 Comments

    Recovering a broken stack on x64 machines on Windows is trickier because the x64 uses unwind codes for stack walking rather than a frame pointer chain. When you dump the stack, all you're going to see is return addresses sprinkled in amongst the stack data.

    Begin digression: According to the x64 ABI, each function must begin with a prologue which sets up the stack frame. It traditionally goes something like this:

        push rbx ;; save registers
        push rsi ;; save registers
        push rdi ;; save registers
        sub rsp, 0x20 ;; allocate space for local variables and outbound calls
    

    Suppose we have functions

    void Top(int a, int b)
    {
     int toplocal = b + 5;
     Middle(a, local);
    }
    
    void Middle(int c, int d)
    {
     Bottom(c+d);
    }
    
    void Bottom(int e)
    {
     int bottomlocal1, bottomlocal2;
     ...
    }
    

    When execution reaches the ... inside function Bottom the stack looks like the following. (I put higher addresses at the top; the stack grows downward. I also assume that the code is compiled with absolutely no optimization.)

          0040F8E8 parameter 4 (unused)
    0040F8E0 parameter 3 (unused)
    0040F8D8 parameter b passed to Top
    0040F8D0 parameter a passed to Top
    Top's stack frame     0040F8C8 return address of Top's caller During execution of Top,
    rsp = 0040F8A0
    0040F8C0 toplocal
        0040F8B8 parameter 4 (unused)
    0040F8B0 parameter 3 (unused)
    0040F8A8 parameter d passed to Middle
    0040F8A0 parameter c passed to Middle
    Middle's stack frame     0040F898 return address of Middle's caller During execution of Middle,
    rsp = 0040F870
    0040F890 padding for alignment
        0040F888 parameter 4 (unused)
    0040F880 parameter 3 (unused)
    0040F878 parameter 2 (unused)
    0040F870 parameter e passed to Bottom
    Bottom's stack frame     0040F868 return address of Bottom's caller During execution of Bottom,
    rsp = 0040F830
    0040F860 padding for alignment
    0040F858 bottomlocal1
    0040F850 bottomlocal2
        0040F848 parameter 4
    0040F840 parameter 3
    0040F838 parameter 2
    0040F830 parameter 1

    Of course, once the optimizer kicks in, there will also be saved registers in the stack frame, the unused space will start getting used as scratch variables, and the parameters will almost certainly not be spilled into their home locations. End digression.

    Consider this crash where we started executing random instructions (data in the code segment) and finally trapped.

    0:000> r
    rax=0000000000000000 rbx=0000000000000005 rcx=0000000000000042
    rdx=0000000000000010 rsi=00000000000615d4 rdi=00000000043f48e0
    rip=0000000000000000 rsp=00000000001ebf68 rbp=00000000043f32d0
     r8=00000000001ebfd0  r9=0000000000000000 r10=000000007fff3cae
    r11=0000000000000000 r12=0000000000000002 r13=0000000000517050
    r14=0000000000000000 r15=00000000043f55c0
    iopl=0         nv up ei pl nz na pe nc
    cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
    ABC!RandomFunction+0x1234:
    00000000`ff6ebaad test    byte ptr [rax+rdx*4],ah ds:00000000`00000040=??
    0:000> k
    Child-SP          RetAddr           Call Site
    00000000`001ebf70 00000000`00000004 ABC!RandomFunction+0x1234
    00000000`001ebf78 00000000`0000000e 0x4
    00000000`001ebf80 00000000`00000000 0xe
    

    Not very helpful. Let's try to reconstruct the call stack. Here's what we have right now:

    001ebf70  00000000`00000004
    001ebf78  00000000`0000000e
    001ebf80  00000000`00000000
    001ebf88  00000000`77ba21bc ntdll!RtlAllocateHeap+0x16c
    001ebf90  00000000`00000000
    001ebf98  00000000`ff6e1fa1 ABC!operator new[]+0x20
    001ebfa0  00000000`00000000
    001ebfa8  00000000`ff6e28ae ABC!DoesUserPreferMetricUnits+0x2a
    001ebfb0  00000000`000615d4
    001ebfb8  00000000`043f48e0
    001ebfc0  00000000`00000002
    001ebfc8  00000000`00517050
    001ebfd0  00000000`00000010
    001ebfd8  00000000`00000000
    001ebfe0  00000000`00000005
    001ebfe8  00000000`ff6e2b9b ABC!CUIController::UpdateTwoLineDisplay+0x156
    001ebff0  00000000`00000002
    001ebff8  00000000`005170f0
    001ec000  00000000`043f55c0
    001ec008  00000000`00510000
    001ec010  00000000`00000000
    001ec018  00000000`00000000
    001ec020  00000000`00000002
    001ec028  00000000`005170f0
    001ec030  00000000`00000000
    001ec038  00000000`00000000
    001ec040  00000000`00000002
    001ec048  00000000`005170f0
    001ec050  00000000`005170f8
    001ec058  00000000`ff6e2a94 ABC!CUIController::displayEvent+0xea
    001ec060  00000000`00750ed0
    001ec070  00000000`00517118
    001ec078  00000000`043f5aa0
    001ec080  00000000`005170f8
    001ec088  00000000`ff6e2f70 ABC!CEventRegistry::fire+0x34
    001ec090  00000000`00518090
    001ec098  00000000`00517118
    001ec0a0  00000000`043f5aa0
    001ec0a8  00000000`0000000e
    001ec0b0  00000000`00000000
    001ec0b8  00000000`00000000
    001ec0c0  00000000`043f2f00
    001ec0c8  00000000`ff6e2eef ABC!CCalculatorState::storeAndFire+0x126
    001ec0d0  00000000`043f5aa0
    001ec0d8  00000000`00000000
    001ec0e0  00000000`001ec180
    001ec0e8  00000000`00000000
    

    (Note that this dump shows addresses increasing downward, whereas the previous diagram had them increasing upward. Being able to read stack dumps comfortably in both directions is one of those skills you develop as you gain experience.)

    There is no frame pointer chain here to help you see if what you found is a call frame. You just have to use your intuition based on the function names. For example, it sounds perfectly reasonable for operator new[] to call Rtl­Allocate­Heap (to allocate memory), but DoesUserPreferMetric­Units is probably not going to call operator new[].

    Some disassembling around of candidate return addresses suggests that the DoesUserPreferMetric­Units is the one likely to have jumped into space, because it is calling through a function pointer variable, whereas the other candidate return addresses used a direct call (or a call to an import table entry, which is unlikely to be invalid).

    How do we reconstruct the stack based on this assumption? You trick the debugger into thinking that execution stopped inside the DoesUserPreferMetric­Units just before or after the fateful jump. It's easier to do "just after", since that's just the return address. We're going to pretend that instead of jumping into space, we jumped to a ret instruction.

    Since we don't know what the junk code did before it finally crashed, the current value of rsp is probably not accurate. We'll have to think backward to a point in time whose stack pointer we can infer, and then replay the code forward.

    From our knowledge of stack frames, we see that the rsp register had the value 001ebfb0 during the execution of DoesUserPreferMetric­Units just before it called the bad function pointer. Let's temporarily set our rsp and rip to simulate the return from the function.

    0:000> r rsp=1ebfb0
    0:000> r rip=ff6e28ae 
    0:000> k
    Child-SP          RetAddr           Call Site
    00000000`001ebfb0 00000000`ff6e2b9b ABC!DoesUserPreferMetricUnits+0x2a
    00000000`001ebff0 00000000`ff6e2a94 ABC!CUIController::UpdateDisplay+0x156
    00000000`001ec060 00000000`ff6e2f70 ABC!CUIController::displayEvent+0xea
    00000000`001ec090 00000000`ff6e2eef ABC!CEventRouter::fire+0x34
    00000000`001ec0d0 00000000`ff6e3469 ABC!CEngineState::storeAndFire+0x126
    00000000`001ec110 00000000`ff6e4149 ABC!CEngine::SetDisplayText+0x39
    00000000`001ec140 00000000`ff6ea48d ABC!CEngine::DisplayResult+0x648
    00000000`001ec3c0 00000000`ff6e49c6 ABC!CEngine::ProcessCommandWorker+0xa1a
    00000000`001ec530 00000000`ff6e4938 ABC!CEngine::ProcessCommand+0x2a
    00000000`001ec560 00000000`ff6e460a ABC!CUIController::ProcessInput+0xaa
    00000000`001ec5a0 00000000`ff6e4744 ABC!CContainer::ProcessInputs+0x7a1
    00000000`001ec700 00000000`77a6c3c1 ABC!CContainer::WndProc+0xa12
    00000000`001ecbe0 00000000`77a6a6d8 USER32!UserCallWinProcCheckWow+0x1ad
    00000000`001ecca0 00000000`77a6a85d USER32!SendMessageWorker+0x682
    00000000`001ecd30 00000000`ff70c5d8 USER32!SendMessageW+0x5c
    00000000`001ecd80 00000000`77a5e53b ABC!CMainDlgFrame::MainDlgProc+0x87
    00000000`001ecdc0 00000000`77a5e2f2 USER32!UserCallDlgProcCheckWow+0x1b6
    00000000`001ece80 00000000`77a5e222 USER32!DefDlgProcWorker+0xf1
    00000000`001ecf00 00000000`77a6c3c1 USER32!DefDlgProcW+0x36
    00000000`001ecf40 00000000`77a6a6d8 USER32!UserCallWinProcCheckWow+0x1ad
    00000000`001ed000 00000000`77a6a85d USER32!SendMessageWorker+0x682
    00000000`001ed090 000007fe`fc890ba3 USER32!SendMessageW+0x5c
    00000000`001ed0e0 000007fe`fc8947e2 COMCTL32!Button_ReleaseCapture+0x157
    00000000`001ed120 00000000`77a6c3c1 COMCTL32!Button_WndProc+0xcde
    00000000`001ed1e0 00000000`77a6c60a USER32!UserCallWinProcCheckWow+0x1ad
    00000000`001ed2a0 00000000`ff6e1a76 USER32!DispatchMessageWorker+0x3b5
    00000000`001ed320 00000000`ff6fa00f ABC!WinMain+0x1db4
    00000000`001efa10 00000000`7794f33d ABC!__mainCRTStartup+0x18e
    00000000`001efad0 00000000`77b82ca1 kernel32!BaseThreadInitThunk+0xd
    00000000`001efb00 00000000`00000000 ntdll!RtlUserThreadStart+0x1d
    0:000> r rsp=001ebf68
    0:000> r rip=ff6ebaad
    

    After getting what we want, we restore the registers to their original values at the time of the crash so that future investigation won't be misled by our editing.

  • The Old New Thing

    How do you prevent the linker from discarding a function you want to make available for debugging?

    • 22 Comments

    We saw some time ago that you can ask the Windows symbolic debugger engine to call a function directly from the debugger. To do this, of course, the function needs to exist.

    But what if you want a function for the sole purpose of debugging? It never gets called from the main program, so the linker will declare the code dead and remove it.

    One sledgehammer solution is to disable discarding of unused functions. This the global solution to a local problem, since you are now preventing the discard of any unused function, even though all you care about is one specific function.

    If you are comfortable hard-coding function decorations for specific architectures, you can use the /INCLUDE directive.

    #if defined(_X86_)
    #define DecorateCdeclFunctionName(fn) "_" #fn
    #elif defined(_AMD64_)
    #define DecorateCdeclFunctionName(fn) #fn
    #elif defined(_IA64_)
    #define DecorateCdeclFunctionName(fn) "." #fn
    #elif defined(_ALPHA_)
    #define DecorateCdeclFunctionName(fn) #fn
    #elif defined(_MIPS_)
    #define DecorateCdeclFunctionName(fn) #fn
    #elif defined(_PPC_)
    #define DecorateCdeclFunctionName(fn) ".." #fn
    #else
    #error Unknown architecture - don't know how it decorates cdecl.
    #endif
    #pragma comment(linker, "/include:" DecoratedCdeclFunctionName(TestMe))
    EXTERN_C void __cdecl TestMe(int x, int y)
    {
        ...
    }
    

    If you are not comfortable with that (and I don't blame you), you can create a false reference to the debugging function that cannot be optimized out. You do this by passing a pointer to the debugging function to a helper function outside your module that doesn't do anything interesting. Since the helper function is not in your module, the compiler doesn't know that the helper function doesn't do anything, so it cannot optimize out the debugging function.

    struct ForceFunctionToBeLinked
    {
      ForceFunctionToBeLinked(const void *p) { SetLastError(PtrToInt(p)); }
    };
    
    ForceFunctionToBeLinked forceTestMe(TestMe);
    

    The call to Set­Last­Error merely updates the thread's last-error code, but since this is not called at a time where anybody cares about the last-error code, it is has no meaningful effect. The compiler doesn't know that, though, so it has to generate the code, and that forces the function to be linked.

    The nice thing about this technique is that the optimizer sees that this class has no data members, so no data gets generated into the module's data segment. The not-nice thing about this technique is that it is kind of opaque.

  • The Old New Thing

    The mystery of the garbage lady

    • 11 Comments

    Last year, my good friend and colleague Sarah transfered from the Redmond offices to Microsoft UK in Reading. One of her most popular lunchtime stories is the mystery of the garbage lady, which she finally got around to posting on her blog.

    Some of my other favorite stories from her blog:

    A colleague of mine experienced the phenomenon of clouded geography in reverse. He was temporarily assigned to Microsoft UK and while living there had occasion to drive out to Wales. He pulled out his handy road map and studied it: "Okay, I need to take this highway west, over the mountain range, and then take that exit, and then I'll be there." He hopped in his car and started driving.

    After a while he started getting nervous. It was getting late, and he still hadn't reached the mountain range yet. He started worrying that the people he was meeting at the destination would be concerned when he failed to show up on time. (I guess he picked up the British habit of worrying about other people being worried.)

    And then he saw the exit, and boom, he was at his destination.

    Afterwards, he went back to the map to see what happened.

    The first issue was one of scale. His map was of all of Great Britain, and he assumed that the scale of such a map was comparable to maps of large areas of the United States. A route that goes halfway across a large map, say a map of the state of Washington, will take a few hours to cover. The UK is comparatively much more compact. From Reading, you can get to the Welsh border in 90 minutes.

    The second issue was one of geography. What was notated on the map as a mountain range was, to someone more familiar with the mountains of western North America, just a hill.

  • The Old New Thing

    There is no complete list of all notifications balloon tips in Windows

    • 33 Comments

    A customer wanted a complete list of all notifications balloon tips in Windows.

    There is no such list. Each component is responsible for its own balloon tips and is not required to submit their list of balloon tips to any central authority for cataloging. In order to create such a list, somebody would have to go contact every component team and ask them for a list of all their balloon tips, and that component team would probably undertake a search of their code base looking for balloon tips. And figuring out the text of each balloon tip can be tricky since the text may be built dynamically. (And the customer didn't ask for an explanation of the conditions under which each balloon tip may appear, but that's probably going to be their follow-up question.)

    It's like publishing instructions on how to display a message on the message board, and then somebody asking the message board manufacturer, "Please send me a list of all messages that might appear on the message board." The message board manufacturer doesn't know how its customers are using the message board. They would have to go survey their customers and ask each one to build an inventory of every message that could potentially be shown.

    In other words, this customer was asking for a research project to be spun up to answer their question.

    I suspected that this was a case of the for-if antipattern being applied to custom support questions, and I asked what the customer intended to do with this information.

    It turns out that they didn't even want this list of balloon tips. They just had a couple of balloon tips they were interested in, and they wanted to know if there were settings to disable them.

    But even though that was their stated goal, they still reiterated their request for a list of balloon tips. The customer liaison asked, "Is there a possibility is getting a list of balloon tips generated by Windows itself? Even a partial list would be helpful. Can we help with this?"

    What the customer liaison can do is contact each Windows component team and ask them, "Can you give me a list of the balloon tips your component generates? Even a partial list would be helpful." I wished him luck.

    The customer liaison replied, "I passed this information to the customer and will follow up if they have any further questions." No follow-up ever appeared.

  • The Old New Thing

    Who decides what can be done with an object or a control?

    • 8 Comments

    This is one of those things that is obvious to me, but perhaps is not obvious to everyone. An object establishes what can be done with it. Any rights granted by the object go to the creator. The creator can in turn grant rights to others. But if you're a third party to the object/creator relationship, you can't just step in and start messing around without the permission of both the object and the creator.

    For example, unless you have permission of the creator of a list view control, you can't go around adding, removing, and changing items in the list view. The creator of the list view decides what goes in it, and at most you can look at it to see what the creator decided to put in it. I say "at most" because you often can't even do that: If the fact that the item is a list view control is not public, then the program that created the list view might decide to use some other control in a future version, and then your code that tries to look at the list view will stop working.

    Naturally, any private data associated with a control (such as the LPARAM associated with a list item) is entirely under the purview of the control's creator. The creator of the control decides what the item data means, and that can change from version to version if not explicitly documented. (For example, the meaning of the item data associated with the main list view in Explorer windows has changed at pretty much every major release of Windows.)

    Generally speaking, without the permission of the creator of a control and the control itself, you can't do anything to it. You can't hide it, show it, move it, change its scroll bars, mess with its menus (assuming it even uses menus at all), change its text, or destroy it. The code that created the control and the control itself presumably maintain their own state (the control maintains state about itself, and the creator maintains state about what it has done to the control). If you start messing with the control yourself, you may find that your changes seem to work for a while and then are suddenly lost when the control decides to re-synchronize its internal state with its external state. Or worse, things will just be permanently out of sync, and the program will start acting strange. If the control and the control's creator have not provided a way to coordinate your actions with them, then you can't mess with the control.

    If you're tempted to go mess with somebody else's controls, think about how you would like it if the tables were turned. How would you feel if somebody wrote a program that took the controls in your program and started messing with them, say adding items to your list box and using the item data to extract information out of your program? What if they started filing bugs against you for changing the way your program operates internally?

  • The Old New Thing

    The three things you need to know about tsunamis

    • 9 Comments

    One of my friends is involved with Science on Tap, a free, informal monthly get-together in the Seattle area covering science topics for the general public. (Recent coverage in the Seattle-PI.) The topic for July 30th is "The three things you need to know about tsunamis", and a title like that pretty much sells itself.

Page 370 of 447 (4,468 items) «368369370371372»