• The Old New Thing

    How did the invalid floating point operand exception get raised when I disabled it?

    Last time, we learned about the dangers of uninitialized floating point variables but left with a puzzle: Why wasn't this caught during internal testing?

    I dropped a hint when I described how SNaNs work: You have to ask the processor to raise an exception when it encounters a signaling NaN, and the program disabled that exception. Why was an exception being raised when it had been disabled?

    The clue to the cause was that the customer that was encountering the crash reported that it tended to happen after they printed a report. It turns out that the customer's printer driver was re-enabling the invalid operand exception in its DLL_PROCESS_ATTACH handler. Since the exception was enabled, the SNaN exception, which was previously masked, was now live, and it crashed the program.

    I've also seen DLLs change the floating point rounding state in their DLL_PROCESS_ATTACH handler. This behavior can be traced back to old versions of the C runtime library which reset the floating point state as part of their DLL_PROCESS_ATTACH; this behavior was corrected as long ago as 2002 (possibly even earlier; I don't know for sure). Obviously that printer driver was even older. Good luck convincing the vendor to fix a bug in a driver for a printer they most likely don't even manufacture any more. If anything, they'll probably just treat it as incentive for you to buy a new printer.

    When you load external code into your process, you implicitly trust that the code won't screw you up. This is just another example of how a DLL can inadvertently screw you up.
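
    A program that has to call into code it does not control can at least limit the damage by capturing the floating point environment before the call and restoring it afterwards. What follows is a minimal sketch in standard C++, not anything taken from the incident above. FloatingPointStateGuard and CallPreservingFloatingPointState are hypothetical names. Whether std::fesetenv restores the exception masks as well as the rounding mode depends on the implementation; the common x86 runtimes keep the control word in fenv_t, so there it typically does.

```cpp
#include <cfenv>
#include <utility>

// RAII guard (hypothetical name): captures the floating point environment
// on construction and puts it back when the scope ends.
class FloatingPointStateGuard {
public:
    FloatingPointStateGuard() { std::fegetenv(&saved_); }
    ~FloatingPointStateGuard() { std::fesetenv(&saved_); }

    FloatingPointStateGuard(const FloatingPointStateGuard&) = delete;
    FloatingPointStateGuard& operator=(const FloatingPointStateGuard&) = delete;

private:
    std::fenv_t saved_;
};

// Hypothetical helper: runs code that might alter the floating point state
// (for example, a call that loads a DLL) and undoes any such change.
template <typename Callable>
void CallPreservingFloatingPointState(Callable&& untrusted) {
    FloatingPointStateGuard guard;
    std::forward<Callable>(untrusted)();
}
```

    This only helps for calls you know about and can wrap. A driver that gets loaded on your behalf somewhere inside a print call is not something the application gets to bracket.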

    Sidebar

    One might argue that the LoadLibrary function should save the floating point state before loading a library and restore it afterwards. This is an easy suggestion to make in retrospect. Writing software would be so much easier if people would just extend the courtesy of coming up with a comprehensive list of "bugs applications will have that you should protect against" before you design the platform. That way, when a new class of application bugs is found, and they say "You should've protected against this!", you can point to the list and say, "Nuh, uh, you didn't put it on the list. You had your chance."

    As a mental exercise for yourself: Come up with a list of "all the bugs that the LoadLibrary function should protect against" and how the LoadLibrary function would go about doing it.

  • The Old New Thing

    Why was Pinball removed from Windows Vista?

    Windows XP was the last client version of Windows to include the Pinball game that had been part of Windows since Windows 95. There is apparently speculation that this was done for legal reasons.

    No, that's not why.

    One of the things I did in Windows XP was port several millions of lines of code from 32-bit to 64-bit Windows so that we could ship Windows XP 64-bit Edition. But one of the programs that ran into trouble was Pinball. The 64-bit version of Pinball had a pretty nasty bug where the ball would simply pass through other objects like a ghost. In particular, when you started the game, the ball would be delivered to the launcher, and then it would slowly fall towards the bottom of the screen, through the plunger, and out the bottom of the table.

    Games tended to be really short.

    Two of us tried to debug the program to figure out what was going on, but given that this was code written several years earlier by an outside company, and that nobody at Microsoft ever understood how the code worked (much less still understood it), and that most of the code was completely uncommented, we simply couldn't figure out why the collision detector was not working. Heck, we couldn't even find the collision detector!

    We had several million lines of code still to port, so we couldn't afford to spend days studying the code trying to figure out what obscure floating point rounding error was causing collision detection to fail. We just made the executive decision right there to drop Pinball from the product.

    If it makes you feel better, I am saddened by this as much as you are. I really enjoyed playing that game. It was the location of the one Windows XP feature I am most proud of.

    Update: Hey everybody asking that the source code be released: The source code was licensed from another company. If you want the source code, you have to go ask them.

  • The Old New Thing

    Psychic debugging: IP on heap

    Somebody asked the shell team to look at this crash in a context menu shell extension.

    IP_ON_HEAP:  003996d0
    
    ChildEBP RetAddr
    00b2e1d8 68f79ca6 0x3996d0
    00b2e1f4 7713a7bd ATL::CWindowImplBaseT<
                               ATL::CWindow,ATL::CWinTraits<2147483648,0> >
                         ::StartWindowProc+0x43
    00b2e220 77134be0 USER32!InternalCallWinProc+0x23
    00b2e298 7713a967 USER32!UserCallWinProcCheckWow+0xe0
    ...
    
    eax=68f79c63 ebx=00000000 ecx=00cade10 edx=7770df14 esi=002796d0 edi=000603cc 
    eip=002796d0 esp=00cade4c ebp=00cade90 iopl=0         nv up ei pl nz na pe nc 
    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010206 
    002796d0 c744240444bafb68 mov     dword ptr [esp+4],68fbba44
    

    You should be able to determine the cause instantly.

    I replied,

    This shell extension is using a non-DEP-aware version of ATL. They need to upgrade to ATL 8 or disable DEP.

    This was totally obvious to me, but the person who asked the question met it with stunned amazement. I guess the person forgot that older versions of ATL are notorious DEP violators. You see a DEP violation, you see that it's coming from ATL, and bingo, you have your answer. When DEP was first introduced, the base team sent out mail to the entire Windows division saying, "Okay, folks, we're turning it on. You're going to see a lot of application compatibility problems, especially this ATL one."

    Psychic powers sometimes just means having a good memory.

    Even if you forgot that information, it's still totally obvious once you look at the scenario and understand what it's trying to do.

    The fault is IP_ON_HEAP which is precisely what DEP protects against. The next question is why IP ended up on the heap. Was it a mistake or intentional?

    Look at the circumstances surrounding the faulting instruction again. The faulting instruction is the window procedure for a window, and the action is storing a constant into the stack. The symbols of the caller tell us that it's some code in ATL, and you can even go look up the source code yourself:

    template <class TBase, class TWinTraits>
    LRESULT CALLBACK CWindowImplBaseT< TBase, TWinTraits >
      ::StartWindowProc(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM lParam) {
        CWindowImplBaseT< TBase, TWinTraits >* pThis =
                  (CWindowImplBaseT< TBase, TWinTraits >*)
                      _AtlWinModule.ExtractCreateWndData();
        pThis->m_hWnd = hWnd; 
        pThis->m_thunk.Init(pThis->GetWindowProc(), pThis); 
        WNDPROC pProc = pThis->m_thunk.GetWNDPROC(); 
        ::SetWindowLongPtr(hWnd, GWLP_WNDPROC, (LONG_PTR)pProc);
        return pProc(hWnd, uMsg, wParam, lParam);
    } 
    

    Is pProc corrupted and we're jumping to a random address on the heap? Or was this intentional?

    ATL is clearly generating code on the fly (the window procedure thunk), and it is in execution of the thunk that we encounter the DEP exception.

    Now, you didn't need to have the ATL source code to realize that this is what's going on. It is a very common pattern in framework libraries to put a C++ wrapper around window procedures. Since C++ functions have a hidden this parameter, the wrappers need to sneak that parameter in somehow, and one common technique is to generate some code on the fly that sets up the hidden this parameter before calling the C++ function. The value at [esp+4] is the window handle, something that can be recovered from the this pointer, so it's a handy thing to replace with this before jumping to the real C++ function.
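
    To make the "code generated on the fly" concrete, here is a sketch of what such a thunk looks like as bytes on 32-bit x86. It only builds the bytes and never executes them. BuildWindowProcThunk and its parameters are hypothetical names, and the layout (a mov into [esp+4] followed by a relative jmp) is an assumption based on the faulting instruction in the dump and on the technique described above, not a copy of the ATL source.

```cpp
#include <array>
#include <cstdint>

constexpr std::uint32_t kThunkSize = 13;  // 8-byte mov + 5-byte jmp

// Writes a 32-bit value in little-endian byte order.
inline void StoreLittleEndian32(std::uint8_t* out, std::uint32_t value) {
    for (int i = 0; i < 4; ++i) {
        out[i] = static_cast<std::uint8_t>(value >> (8 * i));
    }
}

// Builds the bytes of a thunk that will live at thunkAddress:
//   mov dword ptr [esp+4], thisPointer   ; C7 44 24 04 imm32
//   jmp targetProc                       ; E9 rel32
inline std::array<std::uint8_t, kThunkSize> BuildWindowProcThunk(
    std::uint32_t thunkAddress,
    std::uint32_t thisPointer,
    std::uint32_t targetProc) {
    std::array<std::uint8_t, kThunkSize> code{};

    // Overwrite the first stack argument (the window handle) with "this".
    code[0] = 0xC7;
    code[1] = 0x44;
    code[2] = 0x24;
    code[3] = 0x04;
    StoreLittleEndian32(&code[4], thisPointer);

    // Jump to the real procedure. The displacement is relative to the
    // address of the instruction that follows the jmp.
    code[8] = 0xE9;
    StoreLittleEndian32(&code[9], targetProc - (thunkAddress + kThunkSize));
    return code;
}
```

    Feeding in the this value from the dump, 68fbba44, produces the same first eight bytes the debugger displayed: c7 44 24 04 44 ba fb 68. For the thunk to actually run on a system with DEP enabled, those bytes have to sit in memory that is marked executable, which ordinary heap memory is not.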

    The address being stored as the this parameter is 68fbba44, which is inside the DLL in question. (You can tell this because the return address, which points to the ATL thunk code, is at 68f79ca6 which is in the same neighborhood as the mystery pointer.) Therefore, this is almost certainly an ATL thunk for a static C++ object.

    In other words, this is extremely unlikely to be a jump to a random address. The code at the address looks too good. It's probably jumping there intentionally, and the fact that it's coming from a window procedure thunk confirms it.

    But our tale is not over yet. The plot thickens. We'll continue next time.

  • The Old New Thing

    What other programs are filtered from the Start menu's list of frequently-used programs?

    We already saw that programs in the pin list are pruned from the most-frequently-used programs list because they would be redundant. Another fine-tuning rule was introduced after the initial explorations with the new Windows XP Start menu: Programs with specific "noise" names are removed from consideration.

    Many "noise" programs were showing up as frequently used because they happened to be shortcuts to common helper programs like Notepad or Wordpad to display a "Read Me" document. These shortcuts needed to be filtered out so that they couldn't be nominated as, say, the Notepad representative. The list of English "poison words" is given in Knowledge Base article 282066.

    (Incidentally, a program can also register itself as not eligible for inclusion in the front page of the Start menu by creating a NoStartPage value in its application registration.)

    We'll see in the epilogue that Windows Vista uses an improved method for avoiding the "unwanted representative" problem.

  • The Old New Thing

    Points are earned by programs, not by shortcuts

    The first subtlety of the basic principle that determines which programs show up in the Start menu is something you may not have noticed when I stated it:

    Each time you launch a program, it "earns a point", and the longer you don't launch a program, the more points it loses.

    Notice that the rule talks about programs, not shortcuts.

    The "points" for a program are tallied from all the shortcuts that exist on the All Programs section of the Start menu. Many programs install multiple shortcuts, say one to the root of the All Programs menu and another to a deep folder. It doesn't matter how many shortcuts you have; if they all point to the same program, then it is that program that earns the points when you use any of the shortcuts.

    Once the Start menu decides that a program has earned enough points to make it to the front page, it then has to choose which shortcut to use to represent that program. This is an easy decision if there's only one shortcut. If there are multiple shortcuts to the same program, then the most-frequently-used shortcut is selected as the one to appear on the front page of the Start menu.
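
    The bookkeeping described here can be modeled in a few lines. This is only an illustrative sketch: LaunchTally and its members are hypothetical names, the loss of points over time is left out, and the tie-breaking rule (first shortcut in alphabetical order) is an arbitrary assumption, since the text does not say how ties are resolved.

```cpp
#include <map>
#include <string>

// Hypothetical model: points belong to the program, while usage counts
// are kept per shortcut so that a representative can be chosen later.
class LaunchTally {
public:
    // Records one launch of `program` through `shortcut`.
    void RecordLaunch(const std::string& program, const std::string& shortcut) {
        Entry& entry = programs_[program];
        ++entry.points;                  // the program earns the point
        ++entry.shortcutUses[shortcut];  // remember which shortcut was used
    }

    // Total points earned by the program across all of its shortcuts.
    int Points(const std::string& program) const {
        auto it = programs_.find(program);
        return it == programs_.end() ? 0 : it->second.points;
    }

    // The most frequently used shortcut for the program, or an empty
    // string if the program has never been launched.
    std::string Representative(const std::string& program) const {
        auto it = programs_.find(program);
        if (it == programs_.end()) return std::string();
        std::string best;
        int bestUses = 0;
        for (const auto& use : it->second.shortcutUses) {
            if (use.second > bestUses) {
                best = use.first;
                bestUses = use.second;
            }
        }
        return best;
    }

private:
    struct Entry {
        int points = 0;
        std::map<std::string, int> shortcutUses;
    };
    std::map<std::string, Entry> programs_;
};
```

    Launching the same program once from a root shortcut and twice from a deep one gives the program three points, and the deep shortcut becomes its representative.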

    If you paid really close attention, you may have noticed a subtlety to this subtlety. We'll take that up next time.

    Please hold off your questions until the (two-week!) series is complete, because I suspect a later entry will answer them. (This series is an expansion upon the TechNet column on the same topic. If you've read the TechNet article, then a lot of this series will be review.)*

    Footnotes

    *I wrote this last time, but that didn't stop people from asking questions anyway. I don't expect it'll work today either, but who knows, maybe you'll surprise me.

  • The Old New Thing

    Why doesn't String.Format throw a FormatException if you pass too many parameters?

    Welcome to CLR Week 2009. As always, we start with a warm-up.

    The String.Format method doesn't throw a FormatException if you pass too many parameters, but it does if you pass too few. Why the asymmetry?

    Well, this is the type of asymmetry you see in the world a lot. You need a ticket for each person that attends a concert. If you have too few tickets, they won't let you in. If you have too many, well, that's a bit wasteful, but you can still get in; the extras are ignored. If you create an array with 10 elements and use only the first five, nobody is going to raise an ArrayBiggerThanNecessary exception. Similarly, the String.Format method doesn't mind if you pass too many parameters; it just ignores the extras. There's nothing harmful about it, just a bit wasteful.

    Besides, you probably don't want this to be an error:

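    string format;
    if (verbose) {
      format = "{0} is not {1} (because of {2})";
    } else {
      format = "{0} not {1}";
    }
    // With the short format, the extra argument "Two" is simply ignored.
    string message = String.Format(format, "Zero", "One", "Two");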
    

    Think of the format string as a SELECT clause from the dataset provided by the remaining parameters. If your table has fields ID and NAME and you select just the ID, there's nothing wrong with that. But if you ask for DATE, then you have an error.

  • The Old New Thing

    Why can't I pass a reference to a derived class to a function that takes a reference to a base class by reference?

    "Why can't I pass a reference to a derived class to a function that takes a reference to a base class by reference?" That's a confusing question, but it's phrased that way because the simpler phrasing is wrong!

    The misleading simplified phrasing of the question is "Why can't I pass a reference to a derived class to a function that takes a base class by reference?" And in fact the answer is "You can!"

    class Base { }
    class Derived : Base { }
    
    class Program {
      static void f(Base b) { }
    
      public static void Main()
      {
          Derived d = new Derived();
          f(d);
      }
    }
    

    Our call to f passes a reference to the derived class to a function that takes a reference to the base class. This is perfectly fine.

    When people ask this question, they are typically wondering about passing a reference to the base class by reference. There is a double indirection here. You are passing a reference to a variable, and the variable is a reference to the base class. And it is this double reference that causes the problem.

    class Base { }
    class Derived : Base { }
    
    class Program {
      static void f(ref Base b) { }
    
      public static void Main()
      {
          Derived d = new Derived();
          f(ref d); // error
      }
    }
    

    Adding the ref keyword to the parameter results in a compiler error:

    error CS1503: Argument '1': cannot convert from 'ref Derived' to 'ref Base'
    

    The reason this is disallowed is that it would allow you to violate the type system. Consider:

      static void f(ref Base b) { b = new Base(); }
    

    Now things get interesting. Your call to f(ref d) passes a reference to a Derived by reference. When the f function modifies its formal parameter b, it's actually modifying your variable d. What's worse, it's putting a Base in it! When f returns, your variable d, which is declared as being a reference to a Derived, is actually a reference to the base class Base.

    At this point everything falls apart. Your program calls some method like d.OnlyInDerived(), and the CLR ends up executing a method on an object that doesn't even support that method.
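
    The same rule shows up outside the CLR. As an analogy only (the discussion here is about C#), the following C++17 sketch uses a pointer in place of a C# reference and a reference-to-pointer in place of ref. The function names are hypothetical. The static_assert lines check at compile time that the by-value call is accepted and the by-reference call is rejected.

```cpp
#include <type_traits>

struct Base {};
struct Derived : Base {
    void OnlyInDerived() {}
};

// Analogous to f(Base b): the callee receives its own copy of the pointer.
inline void TakesPointer(Base*) {}

// Analogous to f(ref Base b): the callee can overwrite the caller's variable.
inline void ReplaceWithBase(Base*& b) {
    static Base plainBase;
    b = &plainBase;  // legal for a Base*, disastrous if b aliased a Derived*
}

// A Derived* converts to Base* when passed by value.
static_assert(std::is_invocable_v<decltype(TakesPointer)&, Derived*&>,
              "passing a Derived* where a Base* is expected is allowed");

// A Derived* variable cannot be bound to a Base*& parameter.
static_assert(!std::is_invocable_v<decltype(ReplaceWithBase)&, Derived*&>,
              "passing a Derived* variable as a Base*& is rejected");
```

    If the second call were allowed, ReplaceWithBase would leave the caller's Derived* variable pointing at a plain Base, which is exactly the situation described above.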

    You actually knew this; you just didn't know it. Let's start from the easier cases and work up. First, passing a reference into a function:

    void f(SomeClass s);
    
    ...
       T t = new T();
       f(t);
    

    The function f expects to receive a reference to a SomeClass, but you're passing a reference to a T. When is this legal?

    "Duh. T must be SomeClass or a class derived from SomeClass."

    What's good for the goose is good for the gander. When you pass a parameter as ref, it not only goes into the method, but it also comes out. (Not strictly true but close enough.) You can think of it as a bidirectional parameter to the function call. Therefore, the rule "If a function expects a reference to a class, you must provide a reference to that class or a derived class" applies in both directions. When the parameter goes in, you must provide a reference to that class or a derived class. And when the parameter comes out, it also must be a reference to that class or a derived class (because the function is "passing the parameter" back to you, the caller).

    But the only time that S can be T or a subclass, while simultaneously having T be S or a subclass is when S and T are the same thing. This is just the law of antisymmetry for partially-ordered sets: "if a ≤ b and b ≤ a, then a = b."

  • The Old New Thing

    2007 Q3 link clearance: Microsoft blogger edition

    A few random links that I've collected from other Microsoft bloggers.

  • The Old New Thing

    If you pin a program, it doesn't show up in the frequently-used programs list

    After the initial explorations with the Windows XP Start menu, we had to add a rule that fine-tuned the results: If a program is pinned, then it is removed from consideration as a frequently-used program.

    For example, if you right-click Lotus Notes and select "Pin to Start menu", then it goes into the pin list and will never show up in the dynamic portion of the front page of the Start menu. This tweak was added to avoid the ugly situation where you have two icons for the same program on the front page of the Start menu, when only one would do the job.
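
    As a rough sketch of the rule, assuming the frequently-used candidates are already ranked and each program is identified by a string: RemovePinnedPrograms is a hypothetical name, and the real Start menu data structures are not described here.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns the ranked candidates with every pinned program removed,
// preserving the original order of the remaining programs.
inline std::vector<std::string> RemovePinnedPrograms(
    const std::vector<std::string>& rankedPrograms,
    const std::set<std::string>& pinned) {
    std::vector<std::string> result;
    for (const std::string& program : rankedPrograms) {
        if (pinned.count(program) == 0) {
            result.push_back(program);
        }
    }
    return result;
}
```

    With Lotus Notes pinned, a ranking of Lotus Notes, Notepad, Backgammon leaves only Notepad and Backgammon competing for the dynamic portion of the front page.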

    This is another manifestation of the "Don't show me something I already know" principle, which we saw earlier when we discussed why the All Programs list doesn't use Intellimenus. After all, you pinned the program to your Start menu because you run it often. There's no point in showing it again at the top of your "frequently-used" list; you knew that already! Use that scarce real estate to show the user something that is actually of value.

    Next time, another fine-tuning rule that tries to filter the noise from the results.

  • The Old New Thing

    The program doesn't have to be run from the Start menu to earn Start menu points

    There's a second subtlety to the basic principle that determines which programs show up in the Start menu:

    Each time you launch a program, it "earns a point", and the longer you don't launch a program, the more points it loses.

    Since programs earn points and not shortcuts, a program can earn points even if you don't use the Start menu to run it.

    In usability studies, we often see people who run programs by digging through their Program Files directory until they find an icon that looks promising and then double-click it. If there is a shortcut on the All Programs section of the Start menu that points to the same program, then that shortcut will eventually work its way onto the front page, assuming the user runs the program often enough.

    This is why you will see a program appear on the front page of the Start menu even though you never ran it from the Start menu. The program earned points because you ran the program manually, or because you opened a document that is associated with that program. Promoting a program run this way helps users realize that they can run Backgammon from the Start menu instead of having to open My Computer, then click on my C drive, then click on Program Files, then MSN Gaming Zone, then Windows, and then double-click the icon with the strange name bckgzm. I've seen usability sessions where the users did this repeatedly, and they considered it perfectly normal, albeit frustrating. "Computers are so hard to use."

    Next time, we'll look at how the pin list influences the list of frequently-used programs.
