  • The Old New Thing

    Why does the mouse cursor jump a few pixels if you click on the very bottom row of pixels on the taskbar?

    ABCDSchuetze discovered that if you click on the very bottom row of pixels of the taskbar, the mouse cursor jumps up a few pixels. Why is that?

    In order to take advantage of Fitts's Law, the bottom-most row of pixels on a bottom-docked taskbar is clickable. Even though those pixels are technically on the dead border of the window, the taskbar redirects the click to the button immediately above the border. But then you have this problem:

    • User clicks on the border pixel between the button and the edge of the screen.
    • The taskbar remaps the click to the button, thereby activating the button.
    • The button takes capture because that's what buttons do when you click on them. This allows you to drag off the button to cancel the click.
    • But wait: Since the mouse is on the border, it is already outside the button.
    • Result: The button cancels immediately.

    The short version: Clicking on the Fitts's edge causes the button to be pressed and then immediately canceled.

    The fix is to nudge the mouse back inside the button when the click begins.
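
    The nudge can be sketched as a clamp of the click position into the button rectangle. This is only an illustration of the idea, not the taskbar's actual code: the Point and Rect types, the function names, and the screen coordinates in the usage note are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical types for illustration; not the taskbar's real data structures.
struct Point { int x; int y; };
struct Rect  { int left; int top; int right; int bottom; };  // right/bottom are exclusive

// True if pt lies inside rc.
bool Contains(const Rect& rc, Point pt)
{
    return pt.x >= rc.left && pt.x < rc.right &&
           pt.y >= rc.top  && pt.y < rc.bottom;
}

// Moves pt the minimum distance needed to land inside rc.
Point NudgeInside(const Rect& rc, Point pt)
{
    assert(rc.left < rc.right && rc.top < rc.bottom);  // rc must not be empty
    Point result;
    result.x = std::max(rc.left, std::min(pt.x, rc.right - 1));
    result.y = std::max(rc.top,  std::min(pt.y, rc.bottom - 1));
    return result;
}
```

    With a made-up button occupying rows 1042 through 1077 and a click on border row 1079, Contains reports the click as outside the button, which is the state that makes the button cancel. NudgeInside moves the point to row 1077, so the press begins inside the button and capture behaves normally.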

  • The Old New Thing

    When asking about the capacity of a program, you also need to consider what happens when you decide to stop using the program

    An internal customer had a question about a tool, let's call it Program Q.

    As part of our gated checkin system, the system creates a new record in a Program Q table to record details of the checkin. (What tests were run, who did the code review, that sort of thing.) We are considering incorporating the Program Q record number as part of our version number: major.minor.service­pack.hot­fix.record­number. What is the maximum number of records per table supported by Program Q?

    Now, the easy way out is to just answer the question: The theoretical maximum number of records per table is 2³²−1. Even if your gated checkin system's throughput is one checkin per second, that gives you over a century of record numbers.
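
    The "over a century" figure is easy to check. The sketch below assumes a 365-day year and ignores leap days; the names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Largest record number, per the 2^32 - 1 figure quoted above.
constexpr std::uint64_t kMaxRecords = (std::uint64_t{1} << 32) - 1;

// Seconds in a 365-day year (leap days ignored).
constexpr std::uint64_t kSecondsPerYear = 365ull * 24 * 60 * 60;

// Whole years until the record numbers run out at a steady checkin rate.
// checkinsPerSecond must be nonzero.
constexpr std::uint64_t YearsUntilExhausted(std::uint64_t checkinsPerSecond)
{
    return kMaxRecords / checkinsPerSecond / kSecondsPerYear;
}

static_assert(YearsUntilExhausted(1) == 136,
              "one checkin per second lasts roughly 136 years");
```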

    But answering the question misses the big picture: The limiting factor is not the capacity of Program Q. The limiting factor is how long you plan to keep using Program Q! Before the century is up, you probably won't be using Program Q. What is your transition plan?

    In this case, it's probably not that complicated. Suppose that at the time Program Q is retired and replaced with Program R, the highest record number is 314159. You could just say that the version number in the binary is the Program R record number plus 400000.
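
    A minimal sketch of that transition, using the numbers from the example above. The function names are hypothetical; the only requirement is that the offset be at least as large as the last Program Q record number, so that version numbers keep increasing across the switch.

```cpp
#include <cassert>
#include <cstdint>

// Highest Program Q record number at retirement, from the example above.
constexpr std::uint32_t kLastProgramQRecord = 314159;

// Offset applied to Program R record numbers, from the example above.
constexpr std::uint32_t kProgramROffset = 400000;

static_assert(kProgramROffset >= kLastProgramQRecord,
              "offset must clear every number Program Q handed out");

// Version field while Program Q is in use: the record number itself.
constexpr std::uint32_t VersionFieldFromProgramQ(std::uint32_t record)
{
    return record;
}

// Version field after the switch: the Program R record number plus the offset.
constexpr std::uint32_t VersionFieldFromProgramR(std::uint32_t record)
{
    return record + kProgramROffset;
}
```

    The first Program R record becomes 400001, which sorts after the last Program Q record, 314159.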

    If you're clever, you can time the switch from Program Q to Program R to coincide with a change to the major.minor.service­pack.hot­fix, at which point you can reset the record­number to 1.

  • The Old New Thing

    Microspeak: Light up

    In Microspeak, a feature lights up if it becomes available in an application when run on an operating system that supports it.

    The idea is that you write your application to run on, say, Windows versions N and N + 1. Windows version N + 1 has a new feature, and the application offers new functionality when its code detects that the underlying Windows feature is available.

    Here are a few citations:

    I have had some requests lately for details on the path to Windows 7 compatibility and how to light up key features.
    Top 7 Ways to light Up Your Apps on Windows Server 2008.

    The idea is that the program takes advantage of new Windows features when running on new versions of Windows. If run on older versions of Windows without those features, nothing happens.
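
    The pattern can be sketched as a run-time lookup followed by a fallback. The sketch below is a portable simulation, not real Windows code: FeatureTable stands in for whatever the operating system exports, and "ShowFancyThumbnail" is a made-up feature name. On Windows, the lookup would typically be done with something like GetProcAddress.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Stand-in for the set of entry points an OS version provides (hypothetical).
using FeatureTable = std::map<std::string, std::function<std::string()>>;

// Builds a description of the UI. The extra feature lights up only when the
// (made-up) "ShowFancyThumbnail" entry point is present; otherwise the
// application quietly shows its baseline UI.
std::string RenderToolbar(const FeatureTable& os)
{
    std::string ui = "basic toolbar";
    const auto it = os.find("ShowFancyThumbnail");
    if (it != os.end() && it->second) {
        ui += " + " + it->second();
    }
    return ui;
}
```

    Called with an empty table (version N), RenderToolbar returns just the baseline. Called with a table that contains the entry point (version N + 1), the same binary shows the extra feature.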

    Inside the product group, discussion about "lighting up" often takes the form of deciding how much new hotness will be automatically applied to old applications and how much needs to be explicitly opted in. Applying new features to old applications makes the old applications and the new feature more valuable because the user sees the feature everywhere, as opposed to working only in new applications designed to take advantage of it. On the other hand, applying them automatically to old applications creates a compatibility risk, because the application may observe a behavior that hadn't occurred before, and it may begin behaving erratically as a result.

  • The Old New Thing

    Why was the replacement installer for recognized 16-bit installers itself a 32-bit program instead of a 64-bit program?

    Even though 64-bit Windows does not support 16-bit applications, there is a special case for 16-bit installers for 32-bit applications. Windows detects this scenario and substitutes a 32-bit replacement installer which replicates the actions of the 16-bit installer. Commenter Karellen is horrified at the fact that the replacement installer is a 32-bit program. "You're writing a program that will run exclusively on 64-bit systems. Why not built it to run natively on the OS it's designed for? Why is this apparently not the "obvious" Right Thing(tm) to do? What am I missing?"

    Recall that a science project is a programming project that is technically impressive but ultimately impractical. For example, it might be a project that nobody would actually use, or it might attempt to add a Gee-Whiz feature that nobody is really clamoring for.

    But at least a science project is trying to solve a problem. This proposal doesn't even solve any problems! Indeed, this proposal creates problems. One argument in favor of doing it this way is that it satisfies some obsessive-compulsive requirement that a 64-bit operating system have no 32-bit components beyond the 32-bit emulation environment itself.

    Because! Because you're running a 64-bit system, and running apps native to that system is just more elegant.

    Okay, it's not obsessive-compulsive behavior. It's some sort of aesthetic ideal, postulated for its own sake, devoid of practical considerations.

    Remember the problem space. We have a bunch of 32-bit applications that use a 16-bit installer. Our goal is to get those applications installed on 64-bit Windows. By making the replacement installer a 32-bit program, you get the emulator to do all the dirty work for you. Things like registry redirection, file system redirection, and 32-bit application compatibility.

    Suppose the original installer database says

    • Copy X.DLL into the %ProgramFiles%\AppName directory.
    • Copy Y.DLL into the %windir%\System32 directory.
    • If the current version of C:\Program Files\Common Files\Adobe\Acrobat\ActiveX\AcroPDF.dll is 7.5 or higher, then set this registry key.

    If you write the replacement installer as a 32-bit program, then other parts of the 32-bit emulation engine do the work for you.

    • The environment manager knows that 64-bit processes get the environment variable Program­Files pointing to C:\Program Files, whereas 32-bit processes get Program­Files pointing to C:\Program Files (x86).
    • The file system redirector knows that if a 32-bit process asks for %windir%\System32, it should really get %windir%\SysWOW64.
    • The registry redirector knows that if a 32-bit process tries to access certain parts of the registry, they should be sent to the Wow6432Node instead.

    If you had written the replacement installer as a 64-bit program, you would have to replicate all of these rules and make sure your copy of the rules exactly matched the rules used by the real environment manager, file system redirector, and registry redirector.
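
    To get a feel for what a 64-bit replacement installer would have to duplicate, here is a toy model of two of the rules listed above. It is deliberately incomplete and is not how the real redirectors are implemented: the names are invented, the comparison is case-sensitive, and the real rules have exceptions that this sketch does not know about. That gap is exactly the fragility being described.

```cpp
#include <cassert>
#include <string>

enum class Bitness { k32, k64 };

// Toy version of the environment rule: which directory ProgramFiles names.
std::string ProgramFilesFor(Bitness bitness)
{
    return bitness == Bitness::k32 ? "C:\\Program Files (x86)"
                                   : "C:\\Program Files";
}

// Toy version of the file system rule: a 32-bit caller asking for something
// under <windir>\System32 is sent to <windir>\SysWOW64 instead.
std::string RedirectSystemPath(const std::string& path,
                               const std::string& windir,
                               Bitness bitness)
{
    const std::string system32 = windir + "\\System32";
    const bool hasPrefix = path.compare(0, system32.size(), system32) == 0;
    const bool endsAtComponent =
        path.size() == system32.size() || path[system32.size()] == '\\';
    if (bitness == Bitness::k32 && hasPrefix && endsAtComponent) {
        return windir + "\\SysWOW64" + path.substr(system32.size());
    }
    return path;  // 64-bit callers and unrelated paths are left alone
}
```

    A 32-bit installer gets all of this for free from the emulation layer. A 64-bit one would need its own copy of every such rule, kept in sync indefinitely.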

    Now you have to keep two engines in sync: the 32-bit emulation engine and the 64-bit replacement installer for 32-bit applications. This introduces fragility, because any behavior change in the 32-bit emulation engine must be accompanied by a corresponding change in the 64-bit replacement installer.

    Suppose the application compatibility folks add a rule that says, "If a 32-bit installer tries to read the version string from C:\Program Files\Common Files\Adobe\Acrobat\ActiveX\AcroPDF.dll, return the version string from C:\Program Files (x86)\Common Files\Adobe\Acrobat\ActiveX\AcroPDF.dll instead." And suppose that rule is not copied to the 64-bit replacement installer. Congratulations, your 64-bit replacement installer will incorrectly install any program that changes behavior based on the currently-installed version of AcroPDF.

    I don't know for sure, but I wouldn't be surprised if some of these installers support plug-ins, so that the application developer can run custom code during installation. It is possible for 16-bit applications to load 32-bit DLLs via a technique known as generic thunking, and the 16-bit stub installer would use a generic thunk to call into the 32-bit DLL to do whatever custom action was required. On the other hand, 64-bit applications cannot load 32-bit DLLs, so if the 64-bit replacement installer encountered a 32-bit DLL plug-in, it would have to run a 32-bit helper application to load the plug-in and call into it. So you didn't escape having a 32-bit component after all.

    And the original obsessive-compulsive reason for requiring the replacement installer to be 64-bit was flawed anyway. This is a replacement installer for a 32-bit application. Therefore, the replacement installer is part of the 32-bit emulation environment, so it is allowed to be written as a 32-bit component.

    Let's look at the other arguments given for why the replacement installer for a 32-bit application should be written as a 64-bit application.

    Because complexity is what will be our undoing in the end, and reducing it wherever we can is always a win.

    As we saw above, writing the replacement installer as a 64-bit application introduces complexity. Writing it as a 32-bit application reduces complexity. So this statement itself argues for writing the replacement installer as a 32-bit application.

    Because we can't rewrite everything from scratch at once, but we can create clean new code one small piece at a time, preventing an increase to our technical debt where we have the opportunity to do so at negligible incremental cost to just piling on more cruft.

    As noted above, the incremental cost is hardly negligible. Indeed, writing the replacement installer as a 64-bit application is not merely more complex, it creates an ongoing support obligation, because any time there is a change to the 32-bit emulation environment, that change needs to be replicated in the 64-bit replacement installer. This is a huge source of technical debt: Fragile coupling between two seemingly-unrelated components.

    And writing the replacement installer as a 32-bit application does not create a future obligation to port it to 64 bits when support for 32-bit applications is dropped in some future version of Windows. Because when support for 32-bit applications disappears (as it already has on Server Core), there will be no need to port the replacement installer to 64-bit because there's no point writing an installer for a program that cannot run!

    Writing the replacement installer as a 32-bit program was the right call.

  • The Old New Thing

    If you wonder why a function can't be found, one thing to check is whether the function exists in the first place

    One of my colleagues was frustrated trying to get some code to build. "Is there something strange about linking variadic functions? Because I keep getting an unresolved external error for the function, but if I move the function definition to the declaration point, then everything works fine."

    // blahblah.h
    
    ... other declarations ...
    
    void LogWidget(Widget* widget, const char* format, ...);
    
    ...
    
    // widgetstuff.cpp
    ...
    #include "blahblah.h"
    ...
    
    // some code that calls LogWidget
    void foo(Widget* widget)
    {
        LogWidget(widget, "starting foo");
        ...
    }
    
    // and then near the end of the file
    
    void LogWidget(Widget* widget, const char* format, ...)
    {
        ... implementation ...
    }
    
    ...
    

    "With the above code, the linker complains that Log­Widget cannot be found. But if I move the implementation of Log­Widget to the top of the file, then everything builds fine."

    // widgetstuff.cpp
    ...
    #include "blahblah.h"
    ...
    
    // move the code up here
    void LogWidget(Widget* widget, const char* format, ...)
    {
        ... implementation ...
    }
    
    // some code that calls LogWidget
    void foo(Widget* widget)
    {
        LogWidget(widget, "starting foo");
        ...
    }
    
    ...
    

    "I tried putting an explicit calling convention in the declaration, I tried using extern "C", nothing seems to help."

    We looked at the resulting object file and observed that in the case where the error occurred, there was an external reference to Log­Widget but no definition. I asked, "Is the definition of the function #ifdef'd out by mistake? You can use this technique to find out."

    That was indeed the problem. The definition of the function was inside some sort of #ifdef that prevented it from being compiled.

    Sometimes, the reason a function cannot be found is that it doesn't exist in the first place.
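
    Here is a minimal sketch of how this kind of failure arises. The macro name WIDGET_LOGGING_ENABLED, the Widget layout, and the g_lastLog capture variable are all invented for the example; the original code's #ifdef condition is not known.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

struct Widget { int id; };  // hypothetical stand-in for the real Widget

// The declaration (as in blahblah.h) is always visible, so callers compile.
void LogWidget(Widget* widget, const char* format, ...);

// Hypothetical switch. Remove this line and the definition below vanishes:
// foo still compiles against the declaration, and the linker then reports
// LogWidget as an unresolved external.
#define WIDGET_LOGGING_ENABLED

std::string g_lastLog;  // captures the message so the sketch is easy to check

void foo(Widget* widget)
{
    LogWidget(widget, "starting foo");
}

#ifdef WIDGET_LOGGING_ENABLED
void LogWidget(Widget* widget, const char* format, ...)
{
    char message[256];
    va_list args;
    va_start(args, format);
    std::vsnprintf(message, sizeof(message), format, args);
    va_end(args);
    g_lastLog = "widget " + std::to_string(widget->id) + ": " + message;
}
#endif
```

    The variadic signature, the calling convention, and extern "C" are all red herrings here. The only thing that matters is whether the preprocessor lets the definition through.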

  • The Old New Thing

    Tip for trying to boost morale: Don't brag about your overseas trips

    Once upon a time, a senior manager held a team meeting to address low team morale. Attempting to highlight how important the project was, he opened by saying, "I just got back from ⟨faraway country⟩, meeting with some of our important clients, and..."

    This remark had exactly the opposite effect from what the manager intended. Instead of being revitalized, the team became even more despondent. "Here we are, working late and coming in on weekends, and this senior manager is telling us about his recent overseas junket."

    (After he left, I heard that he still pulled this stunt with his new team. "Over the past two weeks, I've been all over ⟨continent⟩...")

  • The Old New Thing

    Finding the leaked object reference by scanning memory: Example

    An assertion failure was hit in some code.

        // There should be no additional references to the object at this point
        assert(m_cRef == 1);
    

    But the reference count was 2. That's not good. Where is that extra reference and who took it?

    This was not code I was at all familiar with, so I went back to first principles: Let's hope that the reference was not leaked but rather was taken and not yet released, so that whoever holds it still has a pointer to the object somewhere in memory. And let's hope that the memory hasn't been paged out. (Because debugging is an exercise in optimism.)

    1: kd> s 0 0fffffff 00 86 ec 00
    04effacc  00 86 ec 00 c0 85 ec 00-00 00 00 00 00 00 00 00  ................ // us
    0532c318  00 86 ec 00 28 05 00 00-80 6d 32 05 03 00 00 00  ....(....m2..... // rogue
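
    The search pattern 00 86 ec 00 is the object's address, 0x00ec8600, written least-significant byte first, which is how x86 stores it in memory. A rough equivalent of the search, run over a byte buffer rather than a live address space, might look like this. FindPointer is a made-up helper for the example, not a debugger API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns every offset in `memory` at which the 32-bit value `pointer`
// appears in little-endian byte order. Every byte offset is checked,
// not just aligned ones.
std::vector<std::size_t> FindPointer(const std::vector<std::uint8_t>& memory,
                                     std::uint32_t pointer)
{
    const std::uint8_t pattern[4] = {
        static_cast<std::uint8_t>(pointer),
        static_cast<std::uint8_t>(pointer >> 8),
        static_cast<std::uint8_t>(pointer >> 16),
        static_cast<std::uint8_t>(pointer >> 24),
    };
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i + 4 <= memory.size(); ++i) {
        if (std::equal(pattern, pattern + 4, memory.begin() + i)) {
            hits.push_back(i);
        }
    }
    return hits;
}
```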
    

    The first hit is the reference to the object from the code raising the assertion. The second hit is the interesting one. That's probably the rogue reference. But who is it?

    1: kd> ln 532c318
    1: kd>
    

    It is not reported as belonging to any module, so it's not a global variable.

    Is it a reference from a stack variable? If so, then a stack trace of the thread with the active reference may tell us who is holding the reference and why.

    1: kd> !process -1 4
    PROCESS 907ef980  SessionId: 2  Cid: 06cc    Peb: 7f4df000  ParentCid: 0298
        DirBase: 9e983000  ObjectTable: a576f560  HandleCount: 330.
        Image: contoso.exe
    
            THREAD 8e840080  Cid 06cc.0b78  Teb: 7f4de000 Win32Thread: 9d04b3e0 WAIT
            THREAD 91e24080  Cid 06cc.08d8  Teb: 7f4dd000 Win32Thread: 00000000 WAIT
            THREAD 8e9a3580  Cid 06cc.09f8  Teb: 7f4dc000 Win32Thread: 9d102cc8 WAIT
            THREAD 8e2be080  Cid 06cc.0878  Teb: 7f4db000 Win32Thread: 9d129978 WAIT
            THREAD 82c08080  Cid 06cc.0480  Teb: 7f4da000 Win32Thread: 00000000 WAIT
            THREAD 90552400  Cid 06cc.0f5c  Teb: 7f4d9000 Win32Thread: 9d129628 WAIT
            THREAD 912c9080  Cid 06cc.02ec  Teb: 7f4d8000 Win32Thread: 00000000 WAIT
            THREAD 8e9e8680  Cid 06cc.0130  Teb: 7f4d7000 Win32Thread: 9d129cc8 READY on processor 0
            THREAD 914b8b80  Cid 06cc.02e8  Teb: 7f4d6000 Win32Thread: 9d12d568 WAIT
            THREAD 9054ab00  Cid 06cc.0294  Teb: 7f4d5000 Win32Thread: 9d12fac0 WAIT
            THREAD 909a2b80  Cid 06cc.0b54  Teb: 7f4d4000 Win32Thread: 00000000 WAIT
            THREAD 90866b80  Cid 06cc.0784  Teb: 7f4d3000 Win32Thread: 93dbb4e0 RUNNING on processor 1
            THREAD 90cfcb80  Cid 06cc.08c4  Teb: 7f3af000 Win32Thread: 93de0cc8 WAIT
            THREAD 90c39a00  Cid 06cc.0914  Teb: 7f3ae000 Win32Thread: 00000000 WAIT
            THREAD 90629480  Cid 06cc.0bc8  Teb: 7f3ad000 Win32Thread: 00000000 WAIT
    

    Now I have to dump the stack boundaries to see whether the address in question lies within the stack range.

    1: kd> dd 7f4de000 l3
    7f4de000  ffffffff 00de0000 00dd0000
    1: kd> dd 7f4dd000 l3
    7f4dd000  ffffffff 01070000 01060000
    ...
    1: kd> dd 7f4d7000 l3
    7f4d7000  ffffffff 04e00000 04df0000 // our stack
    ...
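
    Each three-value dump above is read here as the start of the thread's TEB, where the second value is the stack base (upper bound) and the third is the stack limit (lower bound). Checking an address against those bounds is a simple range test. The StackRange and OnStack names are made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Stack bounds as read from the second and third values of each dump above.
struct StackRange {
    std::uint32_t base;   // upper bound (exclusive); stacks grow downward
    std::uint32_t limit;  // lower bound (inclusive)
};

// True if address lies within [limit, base).
constexpr bool OnStack(StackRange stack, std::uint32_t address)
{
    return address >= stack.limit && address < stack.base;
}

// The rogue reference at 0532c318 is outside all three ranges shown above.
static_assert(!OnStack(StackRange{0x00de0000, 0x00dd0000}, 0x0532c318), "");
static_assert(!OnStack(StackRange{0x01070000, 0x01060000}, 0x0532c318), "");
static_assert(!OnStack(StackRange{0x04e00000, 0x04df0000}, 0x0532c318), "");
```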
    

    The rogue reference did not land in any of the stack ranges, so it's probably on the heap. Fortunately, since it's on the heap, it's probably part of some larger object. And let's hope (see: optimism) that it's an object with virtual methods.

    0532c298  73617453
    0532c29c  74654d68
    0532c2a0  74616461
    0532c2a4  446e4961
    0532c2a8  00007865
    0532c2ac  00000000
    0532c2b0  76726553 USER32!_NULL_IMPORT_DESCRIPTOR  (USER32+0xb6553)
    0532c2b4  44497265
    0532c2b8  45646e49
    0532c2bc  41745378 contoso!CMumble::CMumble+0x4c
    0532c2c0  00006873
    0532c2c4  00000000
    0532c2c8  4e616843
    0532c2cc  79546567
    0532c2d0  4e496570
    0532c2d4  00786564
    0532c2d8  2856662a
    0532c2dc  080a9b87
    0532c2e0  00f59fa0
    0532c2e4  05326538
    0532c2e8  00000000
    0532c2ec  00000000
    0532c2f0  0000029c
    0532c2f4  00000001
    0532c2f8  00000230
    0532c2fc  fdfdfdfd
    0532c300  45ea1370 contoso!CFrumble::`vftable'
    0532c304  45ea134c contoso!CFrumble::`vftable'
    0532c308  00000000
    0532c30c  05b9a040
    0532c310  00000002
    0532c314  00000001
    0532c318  00ec8600
    

    Hooray, there is a vtable a few bytes before the pointer, and the contents of the memory do appear to match a CFrumble object, so I think we found our culprit.

    I was able to hand off the next stage of the investigation (why is a Frumble being created with a reference to the object?) to another team member with more expertise with Frumbles.

    (In case anybody cared, the conclusion was that this was a variation of a known bug.)

  • The Old New Thing

    No good deed goes unpunished: Marking a document as obsolete

    I was contacted by a customer support liaison who was hoping that I could help them understand Feature X.

    I saw your name on a "Feature X technical specification" document in the Windows specification repository, and I was hoping you could answer a few questions for me, or redirect me to somebody who can.

    I was puzzled why this person saw my name on the "Feature X technical specification" document in the Windows specification repository, because I was not the author of that specification. I went to the specification repository, opened the document in question, and nope, my name appears nowhere in it.

    I asked, "What gave you the impression that I had anything to do with Feature X? XYZ can help you with your questions; he's the one listed as the author of the document."

    The response was, "Oh, I'm sorry. I didn't actually read the specification. I merely did a search through the entire repository for Feature X, and the "Feature X technical specification" is the one that showed up as most recently updated by you. In the past, this technique has been pretty good at finding someone who can help with a feature. Sorry about that."

    I went back and took another look at the document, and then I remembered why I updated it: My duties at the time included reviewing all documents that met certain criteria, such as this particular document. I had some feedback about the document for the author, who told me, "Oh, that's an obsolete version of the document, but it's retained for historical purposes. The current one is over there." To save the next person some time, I edited the obsolete document by inserting in big letters at the top, "TECHNICAL DOCUMENTATION FOR THIS FEATURE HAS MOVED TO ⟨new location⟩. THIS DOCUMENT IS OBSOLETE." I could've asked the author to do this, but I had the document open already, so I figured I'd save a few steps (ask author to update document, wait for reply, reopen document to verify that edit occurred) and just do it myself.

    Boom, no good deed goes unpunished. My update was made long after the real technical specification was completed. As a result, of all the documents on Feature X, not only is it the obsolete one that shows up as most recently updated, but I am the one listed as the person who made that most recent update.

    Next time, I'll try to remember to do things the long way, even though it is a big hassle for everybody.

    "Please update the document to indicate that it is obsolete and redirect the reader to the current document."

    — Could you do that? You've already got the document open.

    "No, I used to do that, but it came back and bit me, because I became the last person to edit the document, and then everybody came to me with questions about the document instead of you."

    — You do realize that in the time you tried to convince me to do it, you could've just done it.

    Follow-up: I tried it, and sometimes the response was "I'm really busy now, I'll get around to it in a few weeks." Now I have to create a reminder task in two weeks to follow up. More hassle for everybody.

    I think the next time this happens, I'll write back, "I'm coming over to your office. I'll make the one-line edit on your computer so that your name is the one attached to the edit."

  • The Old New Thing

    2014 year-end link clearance

    Another round of the semi-annual link clearance.

  • The Old New Thing

    Even the publishing department had its own Year 2000 preparedness plan

    On December 31, 1999, Microsoft Product Support Services were ready in case something horrible happened as the calendar rolled over into the new year.

    I'm told that Microsoft Press also had its own Year 2000 plan. They staffed their helpline continuously from Friday evening, December 31, 1999, all the way through Sunday, January 2, 2000. They did this even though Microsoft Press did not normally staff its helpline outside normal business hours, and even though all sample code in all publications comes with a disclaimer that it is provided "as is" with no warranty.

    I do not know if they took any calls, but I suspect not.