August, 2007

  • The Old New Thing

    Just because you're a control doesn't mean that you're necessarily inside a dialog box

    • 25 Comments

    Prerequisites: Moderate to advanced understanding of the window and dialog managers.

    When you're implementing a control, you need to be aware that you aren't necessarily being hosted inside a dialog box. One commenter suggested handling WM_KEYDOWN and closing the dialog box as a way to prevent multi-line edit controls from eating the Enter key. But the edit control can't do that because people create edit controls outside of dialog boxes. How do you "close the dialog box" when there isn't one?

    This leads to a related topic brought up by another comment:

    Doesn't ES_WANTRETURN do exactly this? The MSDN states the following (emphasis mine): "ES_WANTRETURN: Specifies that a carriage return be inserted when the user presses the ENTER key while entering text into a multiple-line edit control in a dialog box. Without this style, pressing the ENTER key has the same effect as pressing the dialog box's default pushbutton. This style has no effect on a single-line edit control."

    I remarked that ES_WANTRETURN is a messy subject. Now I'm going to show you the mess. It's sort of like visiting your friend's house when they're not expecting you and wandering into their bedroom where they haven't tidied up and there's clothes all over the floor.

    The authors of the edit control back in 1981 didn't follow the above guidance. Probably¹ because back in the days when the edit control was first written, the window manager was still in a state of flux and its design hadn't settled down. You can't blame the edit control for not following guidance that didn't exist.

    The edit control implements ES_WANTRETURN as you might expect: It includes DLGC_WANTALLKEYS in its response to WM_GETDLGCODE, which causes all keys, including Enter, to go to the edit control.

    What's more interesting is how the edit control implemented the absence of ES_WANTRETURN: It still includes DLGC_WANTALLKEYS, but when it receives the Enter key, it first attempts to detect whether it's inside a dialog box, and if so, it tries to mimic what the dialog box would have done: It asks its parent dialog box for the default ID, sets focus to the corresponding control, and simulates input via PostMessage to make that control act as if the user had pressed Enter. Since only button controls can be the default ID, the edit control "knows" that the recipient of the simulated input is the button control. The author of the edit control then went in and modified the button control so that it didn't rely on virtualized input state when handling the WM_KEYDOWN message.

    This is ugly no matter how you slice it, and it violates so many principles of control design it isn't funny. For one thing, the way it detects whether the control is hosted inside a dialog is fragile and can be tricked into guessing wrong. Next, its mimicry of the IsDialogMessage function is incorrect. When it wants to invoke the default button, it does so by simulating input, which we already know is wrong. And before it does so, it sets focus to the control, which is also wrong; the IsDialogMessage function generates a WM_COMMAND message without changing focus. And finally, it totally misses the boat if the edit control is inside a nested dialog.

    As I noted, all these mistakes are obvious in retrospect, but when the control was first written, these mistakes might not¹ even have been mistakes. (For example, nested dialogs didn't appear on the scene until Windows 95.) Why haven't these mistakes been fixed? Well, how can you prove that there aren't any programs that rely on the mistakes? One thing you quickly learn in application compatibility is that a bug once shipped gains the status of a feature, because you can be pretty sure that some program somewhere relies on it. (I've seen a plugin that relies on a memory leak in Explorer, for example.) This is doubly true for core controls like the edit control. Any change to the edit control must be undertaken with a great deal of trepidation, because your change affects pretty much every single Windows program on the entire planet. With that high a degree of risk, the prudent choice is often to let sleeping dogs lie.

    Nitpicker's Corner

    ¹Note weasel words. This is my educated guess as to what happened based on personal observation and thought. It is not a statement of the official position of Microsoft Corporation, and this guess may ultimately prove incorrect.

  • The Old New Thing

    Do you have a Starbucks name?

    • 36 Comments

    Annabelle Gurwitch has found what may be one of the few remaining places where you can be anybody you want: Starbucks. Check out the part towards the end where people on the street are asked to share their Starbucks names.

    I'm reminded of a time many years ago when Schultzy's Sausage had expanded to a second location in Redmond. (They closed the location after maybe a year, not because business was bad but because it was just too much work for the owner.¹) A group of us stopped in for dinner, and as each of us placed our order, the man behind the counter asked for our names so he could call us when the order was ready. Most of us gave our names without a fuss, but one of the group (whom I shall call "X") decided to play around a bit:

    X: I'll have a dooey.

    Man: (taking down the order) What's your name?

    X: (being a bit of a smart aleck) What's yours?

    Man: Schultzy.

    X: Oh.

    Nitpicker's Corner

    ¹This statement of fact is actually an interpretation of events based on hearsay evidence. It does not establish a statement of the official position of Microsoft Corporation, and that interpretation may very well prove incorrect given the unreliability of the nature of hearsay evidence.

  • The Old New Thing

    What are these spurious nop instructions doing in my C# code?

    • 31 Comments

    Prerequisites: Basic understanding of assembly language.

    When you debug through some managed code at the assembly level, you may find that there are an awful lot of nop instructions scattered throughout your method. What are they doing there; isn't the JIT smart enough to remove them? Isn't this going to slow down execution of my program?

    It is my understanding that¹ these nop instructions are inserted by the JIT because you're running the program under the debugger. They are emitted specifically so that the debugger can set breakpoints in locations that you normally wouldn't be able to. (For example, they might represent a line of code that got optimized out or merged with another line of code.)

    Don't worry. If there's no debugger, the JIT won't generate the dummy nops.

    Nitpicker's Corner

    ¹As with all statements of alleged fact, this statement is an interpretation of events based on observation and thought and does not establish a statement of the official position of the CLR JIT compiler team or Microsoft Corporation, and that interpretation may ultimately prove incorrect.

  • The Old New Thing

    The Radioactive Boy Scout is back in the news

    • 25 Comments

    The Radioactive Boy Scout appears to be back to his old tricks.¹

    Nitpicker's Corner

    ¹Although this statement is written as if it were a fact, it is actually my interpretation of a newspaper article and is not an official position of Microsoft Corporation.

  • The Old New Thing

    What are these strange cmp [ecx], ecx instructions doing in my C# code?

    • 41 Comments

    When you debug through some managed code at the assembly level, you'll find a whole lot of seemingly pointless instructions that perform a comparison but ignore the result. What's the point of comparing two values if you don't care what the result is?

    In C++, invoking an instance method on a NULL pointer results in undefined behavior. In other words, if you do it, the compiler is allowed to do anything it wants. And what most compilers do is, um, nothing. They don't take any special steps if the this pointer is NULL; they just generate code on the assumption that it isn't. In practice, this often means that everything seems to run just fine until you access a member variable or call a virtual function, and then you crash.

    The C# language, by comparison, is quite explicit about what happens if you invoke an instance method on a null object reference:

    The value of E is checked to be valid. If the value of E is null, a System.NullReferenceException is thrown and no further steps are executed.

    The null reference exception must be thrown before the method can be called. That's what the strange cmp [ecx], ecx comparison is for.¹ The compiler doesn't actually care what the result of the comparison is; it just wants to raise an exception if ecx is null. If ecx is null, the attempt to dereference it (in order to perform the comparison) will raise an access violation, which the runtime inspects and turns into a NullReferenceException.

    The test is usually against the ecx register since the CLR happens to use² the fastcall calling convention, which for instance methods passes the this pointer in the ecx register. The pointer the compiler wants to test is going to wind up in the ecx register sooner or later,³ so it's not surprising that the test, when it happens, is made against the ecx register.
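
    You can observe the language-level guarantee without a disassembler. What follows is a minimal sketch of my own, not anything from the CLR sources; the Widget and NullCheckDemo names are made up for illustration. The Poke method never touches this, so the equivalent C++ call might well appear to work, yet in C# the exception is raised before the method body gets a chance to run.

```csharp
using System;

class Widget {
 public static bool BodyRan = false;

 // Deliberately uses no instance state, so nothing inside the
 // method would ever dereference "this".
 public void Poke() { BodyRan = true; }
}

class NullCheckDemo {
 public static void Main()
 {
  Widget.BodyRan = false;
  Widget w = null;
  try {
   w.Poke(); // the null check happens before the call is made
  } catch (NullReferenceException) {
   Console.WriteLine("NullReferenceException; body ran: {0}", Widget.BodyRan);
  }
 }
}
```

    When I trace through this by hand, I expect it to print "NullReferenceException; body ran: False".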

    Nitpicker's Corner

    ¹Although this statement is written as if it were a fact, it is actually my interpretation based on observation and thinking about how language features are implemented. It is not an official position of the CLR team nor Microsoft Corporation, and that interpretation may ultimately prove incorrect.

    ²"Happens to use" means that this is an implementation detail, not a contractual guarantee.¹

    ³Unless the call is optimized. For example, the function might be inlined.

  • The Old New Thing

    For $15, you can purchase incorrect information, and to prevent people from getting it, you have to renew every three months

    • 27 Comments

    Given what I know about Naveen Jain, I basically view everything he does with enormous skepticism.¹ I mean, I trust lawyers more than I trust that guy, that's how bad it is.

    After being booted from InfoSpace, Jain moved across the street and founded Intelius, a company that does basically the same thing: Selling directory information.² Recently, the company launched a cell phone look-up service: For $15 you can obtain the cell phone number of anybody in their directory. Mind you, the information is cobbled together from various private sources, and it can even be wrong, but if the result is incorrect, you won't get a refund. Cellphone industry lobbyist Steve Largent calls it a "scam", and that's saying a lot, coming from somebody whose own job doesn't rank very high on the trust scale either.

    They claim to have collected this information from private sources. Did you give the pizza delivery store your cell phone number? They may have sold it to Intelius. The auto mechanic shop? They may have sold it to Intelius. Ironically, the very first sentence in the Intelius Privacy Policy is "Intelius respects your right to privacy, and we are committed to protecting it." And yet they make it difficult to protect your privacy: Read on.

    I called their customer support line to remove my cell phone number from their database. You can try it, too: +1-425-974-6100, then 1, then 1; but I'll save you the trouble and tell you the answer. (If you don't trust me, you can call and confirm this information for yourself.)

    • Make a cell phone search for yourself on their Web site.
    • Proceed as if you actually wanted to pay $15 to get the information, but don't do it.
    • Print out the Web page that shows the information that they are offering to sell for $15. (Your cell phone number, your unlisted telephone number, etc.)
    • Send a fax to +1-425-974-6194 containing that screenshot, your name, address, and date of birth.
    • Wait seven to ten days for the change to take effect.
    • The request is valid for three months; after three months, you must repeat the process.

    Such is their commitment to privacy that they make you jump through these hoops four times a year. Even the opt-out requests for the dreaded Direct Marketing Association are good for five years.

    According to the privacy statement, you can direct any questions or concerns regarding their Privacy Policy to privacy@intelius.com. Good luck. (They brag about the blog they started in April, but if you follow the link you find no blog.)

    As Scott McNealy famously put it, "You have zero privacy anyway. Get over it."

    Nitpicker's Corner

    ¹The opinions expressed herein are my own and are not an official position of Microsoft Corporation.

    ²Although this statement is written as if it were a fact, it is actually my interpretation based on what I remember and may be incorrect.

  • The Old New Thing

    C# static constructors are called on demand, not at startup

    • 55 Comments

    One of the differences between C++ and C# is when static constructors run. In C++, static constructors are the first thing run in a module, even before the DllMain function runs.¹ In C#, however, static constructors don't run until you use the class for the first time. If your static constructor has side effects, you may find yourself experiencing those side effects in strange ways.

    Consider the following program. It's rather contrived and artificial, but it's based on an actual program that encountered the same problem.

    using System;
    using System.Runtime.InteropServices;
    
    class Program {
     [DllImport("kernel32.dll", SetLastError=true)]
     public static extern bool SetEvent(IntPtr hEvent);
    
     public static void Main()
     {
      if (!SetEvent(IntPtr.Zero)) {
       System.Console.WriteLine("Error: {0}", Trace.GetLastErrorFriendlyName());
      }
     }
    }
    

    This program tries to set an invalid event, so the call to SetEvent is expected to fail with an invalid handle error. We print the last error code using a function in the helper class below. The details of this method aren't important; in fact, for illustrative purposes, I'm going to skip the call to FormatMessage and just return an ugly name.²

    class Trace {
     public static string GetLastErrorFriendlyName()
     {
      return Marshal.GetLastWin32Error().ToString();
     }
    }
    

    Run this program, and you should get this output:

    Error: 6
    

    Six is the expected error code, since that is the numeric value of ERROR_INVALID_HANDLE.

    You don't think much of this program until one day you run it and instead of getting error 6, you get something like this:

    Error: 126
    

    What happened?

    While you weren't paying attention, somebody decided to do some enhancements to the Trace class, maybe added some new methods and stuff, and in particular, a static constructor got added:

    class Trace {
     public static string GetLastErrorFriendlyName()
     {
      return Marshal.GetLastWin32Error().ToString();
     }
    
     [DllImport("kernel32.dll", SetLastError=true, CharSet=CharSet.Auto)]
     public static extern IntPtr LoadLibrary(string dll);
     static Trace() { LoadLibrary("enhanced_logging.dll"); }
    }
    

    It's not important what the static constructor does; the point is that we have a static constructor now. In this case, the static constructor tries to load a helper DLL which presumably does something fancy so we can get better trace logging, something like that, the details aren't important.

    The important thing is that the constructor has a side effect. Since it uses a p/invoke, the value reported by Marshal.GetLastWin32Error() is overwritten by the error code from the failed call to LoadLibrary, which in our case is error 126, ERROR_MOD_NOT_FOUND.

    Now let's look at what happens in our program.

    First, we call SetEvent, which fails and sets the Win32 error code to 6. Next, we call Trace.GetLastErrorFriendlyName, but wait! This is the first call to a method in the Trace class, so we have to run the static constructor first.

    The static constructor tries to load the enhanced_logging.dll module, and it fails, setting the last error code to 126. This overwrites the previous value.

    After the static constructor returns, we return to our program already in progress and call Trace.GetLastErrorFriendlyName, but it's too late. The damage has been done. The last error code has been corrupted.

    And that's why we get 126 instead of 6.
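
    If you want to watch the timing without involving any p/invoke, here is a minimal sketch of my own. The Journal and StaticCtorDemo names are made up for illustration, and Trace here is a stripped-down stand-in for the class above. Each step records a line in a shared list, so the position of the static constructor's entry shows when it actually ran.

```csharp
using System;
using System.Collections.Generic;

static class Journal {
 // Shared list that records the order in which things happen.
 public static readonly List<string> Events = new List<string>();
}

class Trace {
 // Explicit static constructor: runs on first use of the class.
 static Trace() { Journal.Events.Add("Trace static constructor"); }

 public static string GetFavoriteColor() { return "blue"; }
}

class StaticCtorDemo {
 public static void Main()
 {
  Journal.Events.Add("Main started");
  Journal.Events.Add("about to call Trace");
  string color = Trace.GetFavoriteColor(); // first use of Trace
  Journal.Events.Add("got " + color);

  foreach (string e in Journal.Events) Console.WriteLine(e);
 }
}
```

    On a fresh run, I expect the lines "Main started", "about to call Trace", "Trace static constructor", and "got blue", in that order. The static constructor's entry lands in the middle of Main rather than at the top, which is exactly the window in which the last error code got clobbered.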

    What's really scary is that problems with static constructors running at inopportune times are often extremely hard to identify. For one thing, there is no explicit indication in the source code that there's any static constructor funny business going on. Indeed, somebody could just recompile the assembly containing the Trace class without modifying your program, and the problem will rear its head. "But I didn't change anything. The timestamp on program.exe is the same as the one that still works!"

    A side effect you might not consider is synchronization. If the static constructor takes any locks, you have to keep an eye on your lock hierarchy, or one of those locks might trigger a deadlock. This is insidious, because you can stare at the code all you want; you won't see anything. You'll have a method like

    class Trace {
     ...
     public static string GetFavoriteColor() { return "blue"; }
    }
    

    and yet when you try to step over a call to Trace.GetFavoriteColor, your program hangs! "This makes no sense. How can Trace.GetFavoriteColor hang? It just returns a constant!"

    Another factor that makes this problem baffling is that the problem occurs only the first time you call a method in the Trace class. We saw it here only because the very first thing we did with Trace was display an error. If you happened to call, say, Trace.GetFavoriteColor() before calling Trace.GetLastErrorFriendlyName(), then you wouldn't have seen this problem. In fact, that's how the program that inspired today's entry stumbled across this problem. They deleted a call into the Trace class from some unrelated part of the program, which meant that the static constructor ran at a different time than it used to, and unfortunately, the new time was less hospitable to static construction.

    "I'm sorry, did I call you at a bad time?"

    Footnotes³

    ¹This is not strictly true. In reality, it's a bit of sleight-of-hand performed by the C runtime library.⁴

    ²For a less ugly name, you can use this class instead:

    class Trace {
     [DllImport("kernel32.dll", SetLastError=true)]
     public static extern IntPtr LocalFree(IntPtr hlocal);
    
     [DllImport("kernel32.dll", SetLastError=true, CharSet=CharSet.Auto)]
     public static extern int FormatMessage(int flags, IntPtr unused1,
        int error, int unused2, ref IntPtr result, int size, IntPtr unused3);
     static int FORMAT_MESSAGE_ALLOCATE_BUFFER = 0x00000100;
     static int FORMAT_MESSAGE_IGNORE_INSERTS  = 0x00000200;
     static int FORMAT_MESSAGE_FROM_SYSTEM     = 0x00001000;
    
     public static string GetLastErrorFriendlyName()
     {
      string result = null;
      IntPtr str = IntPtr.Zero;
      if (FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
                        FORMAT_MESSAGE_IGNORE_INSERTS  |
                        FORMAT_MESSAGE_FROM_SYSTEM, IntPtr.Zero,
                        Marshal.GetLastWin32Error(), 0,
                        ref str, 0, IntPtr.Zero) > 0) {
       try {
        result = Marshal.PtrToStringAuto(str);
       } finally {
        LocalFree(str);
       }
      }
      return result;
     }
    }
    

    Note that there may be better ways of accomplishing this. I'm not the expert here.

    ³Boring footnote symbols from now on. You guys sure know how to take the fun out of blogging. (I didn't realize that blogs were held to academic writing standards. Silly me.) Now you can go spend your time telling Scoble that he wrote a run-on sentence or something.

    ⁴Although this statement is written as if it were a fact, it is actually my interpretation of how the C runtime works and is not an official position of the Visual Studio team nor Microsoft Corporation, and that interpretation may ultimately prove incorrect. Similar remarks apply to other statements of fact in this article.

    Postscript: Before you start pointing fingers and saying, "Hah hah, we don't have this problem in Win32!"—it turns out that you do! As we noted in the introduction, static constructors run when the DLL is loaded. The granularity in Win32 is not as fine, being at the module level rather than the class level, but the problem is still there. If you use delay-loading, then the first call to a function in a delay-loaded DLL will load the target DLL, and its static constructors will run, possibly when your program wasn't expecting it.

  • The Old New Thing

    SIFF 2007 wrap-up: Grandhotel, The Boss of It All, Vacation

    • 7 Comments

    Sorry, SIFF fans, but this article got stuck in the queue. But now it's unstuck.

    3 stars out of 5 Grandhotel: A sweet story about a shy, innocent, weather-obsessed hotel employee and the even stranger people who surround him. I wasn't quite sure what to expect, but I was quite pleased with what I got. Part comedy, part drama, the movie creates touching moments while remaining true to the quirky nature of its characters. I give it a 3 out of 5.

    4.5 stars out of 5 The Boss of It All: A company's founder blames all unpopular decisions on his imaginary boss, but when he enters negotiations to sell the company, he must produce this elusive boss and hires an incompetent actor to play him. Experiencing the story from the point of view of the hapless actor heightens the comedy, since we don't know what the heck is going on either! Everybody in this goofball comedy is insane, save for the Icelandic interpreter. I give it a 4½ out of 5; it loses a half point for the maddening framing and editing. (This movie also served a quasi-linguistic purpose, since I anticipated that my knowledge of German and Swedish might make the Danish semi-comprehensible, but all I learned was that I was right to remove Danish from my list of modern Germanic languages I want to learn. I did get a kick out of the American, or maybe British, employee whose Danish was atrociously bad. That's what I would sound like if I studied Danish.)

    The Seattle Film group is bringing the movie back for a summer run, so if you want to see it and missed it, you have a second chance.

    3 stars out of 5 Vacation: Laura takes her family on a summer vacation at her mother's country home. Things are tense for reasons we learn later, and they get even more uncomfortable when her grandmother and estranged sister drop in for a visit. Scenes were held much, much longer than typical in modern moviemaking, making peaceful moments even more peaceful and uncomfortable moments even more uncomfortable. Dialogue was extremely sparse and often nonexistent, with emotions conveyed through silence and long pauses; rather than being boring, these moments carried poignancy, even if it's just a simple scene of two kids picking wildflowers. Other people may hate the slow pacing of this movie, but I really enjoyed it, which totally messes up my rating system. I'd give it a 4 for me, but a 3 for everybody else. (And it worked great from a "learning German" standpoint: The dialogue was very sparse, giving me plenty of time to work out what was said.)

    Legend:

    5 stars out of 5 Would pay money to see again by myself.
    4 stars out of 5 Would see again if it were free or if seeing it with others.
    3 stars out of 5 Would recommend to others.
    2 stars out of 5 Okay, but wouldn't recommend to someone not already interested.
    1 star out of 5 Would advise against.
    0 stars out of 5 Waste of my time.

  • The Old New Thing

    What is the order of evaluation in C#?

    • 68 Comments

    The C and C++ languages leave the order of evaluation generally unspecified aside from specific locations called sequence points. Side effects of operations performed prior to the sequence point are guaranteed visible to operations performed after it.¹ For example, the C comma operator introduces a sequence point. When you write f(), g(), the language guarantees that any changes to program state made by the function f can be seen by the function g; f executes before g. On the other hand, the multiplication operator does not introduce a sequence point. If you write f() * g() there is no guarantee which side will be evaluated first.

    (Note that order of evaluation is not the same as associativity and operator precedence. Given the expression f() + g() * h(), operator precedence says that it should be evaluated as if it were written f() + (g() * h()), but that doesn't say what order the three functions will be evaluated. It merely describes how the results of the three functions will be combined.)

    In the C# language, the order of evaluation is spelled out more explicitly. The order of evaluation for operators is left to right. If you write f() + g() in C#, the language guarantees that f() will be evaluated first. The example in the linked-to page is even clearer. The expression F(i) + G(i++) * H(i) is evaluated as if it were written like this:

    temp1 = F(i);
    temp2 = i++;
    temp3 = G(temp2);
    temp4 = H(i);
    return temp1 + temp3 * temp4;
    

    The side effects of each part of the expression take effect in left-to-right order. Even the order of evaluation of function arguments is strictly left-to-right.
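
    To see the ordering for yourself, here is a minimal sketch of my own. The EvalOrderDemo class and its Calls list are made up for illustration; F, G, and H simply record their argument and return it. It evaluates the same expression and logs each call as it happens.

```csharp
using System;
using System.Collections.Generic;

class EvalOrderDemo {
 // Records each call in the order it is actually made.
 public static readonly List<string> Calls = new List<string>();

 static int F(int n) { Calls.Add("F(" + n + ")"); return n; }
 static int G(int n) { Calls.Add("G(" + n + ")"); return n; }
 static int H(int n) { Calls.Add("H(" + n + ")"); return n; }

 public static int Evaluate()
 {
  Calls.Clear();
  int i = 1;
  // Precedence says the multiplication binds tighter, but the
  // operands are still evaluated left to right.
  return F(i) + G(i++) * H(i);
 }

 public static void Main()
 {
  int result = Evaluate();
  Console.WriteLine(string.Join(", ", Calls));
  Console.WriteLine(result);
 }
}
```

    Working it through by hand: F sees 1, G sees the old value 1 while i becomes 2, and H sees 2, so the log reads "F(1), G(1), H(2)" and the result is 1 + 1 * 2 = 3.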

    Note that the compiler has permission to evaluate the operands in a different order if it can prove that the alternate order of evaluation has the same effect as the original one (in the absence of asynchronous exceptions).

    Why does C# take a much more restrictive view of the order of evaluation? I don't know, but I can guess.²

    My guess is that the language designers wanted to reduce the frequency of a category of subtle bugs (in this case, order-of-evaluation dependency). There are many other examples of this in the language design. Consider:

    class A {
     void f()
     {
      int i = 1;
      if (true) {
       int i = 2; // error - redeclaration
      }
     }
    
     int x;
     void g()
     {
      x = 3; // error - using variable before declared
      int x = 2;
     }
    }
    

    The language designers specified that the scope of a local variable in C# extends to the entire block in which it is declared. As a first consequence of this, the second declaration of i in the function f() is illegal since its scope overlaps with the scope of the first declaration. This removes a class of bugs that can be traced to one local variable masking another with the same name.

    In the function g() the assignment x = 3; is illegal because the x refers not to the member variable but to the local variable declared below it. Notice that the scope of the local variable begins with the entire block, and not with the point of declaration as it would have been in C++.

    Nitpicker's Corner

    ¹This is a simplified definition of sequence point. For more precise definitions, consult the relevant standards documents.

    ²I have not historically included the sentence "I don't know but I can guess" because this is a blog, not formal documentation. Everything is my opinion, recollection, or interpretation. But it seems that people take what I say to establish the official Microsoft position on things, so now I have to go back and add explicit disclaimers.

  • The Old New Thing

    Math is hard, let's go shopp—oops

    • 60 Comments

    (The title is another variation on "Math is hard, let's go shopping!", which appears to be a popular catchphrase over in Michael Kaplan's neck of the woods. The history of the phrase was researched on Language Log.)

    Last spring, I was at a local crafts store and paid for a $2.15 item with a $5 bill and two dimes. The teenage salesclerk rang up the sale and began to give me $17.90 in change.

    "Um, I gave you $5.20." You'd think the salesclerk would notice something strange when the amount of change exceeded the amount of cash tendered!

    "Oh, right." The salesclerk had entered $20.05 instead of $5.20. But now came the hard part: Computing the correct amount of change.

    Apparently kids these days aren't taught how to make change. They just punch the number into the register and trust what comes out. "In my day," we learned to make change by rewriting the formula "change = tendered - cost" as "cost + change = tendered". In other words, you start with the cost of the item, then add money to bring the total to the amount of money you received. For example, if somebody paid for a $3.45 item with a $20 bill, you'd make change as follows:

    You give the customer...    You say...
                                three forty-five
    a nickel ($0.05)            three fifty
    a quarter ($0.25)           three seventy-five
    a quarter ($0.25)           four
    a $1 bill                   five
    a $5 bill                   ten
    a $10 bill                  twenty

    Adding up the change you created yields $0.05 + $0.25 + $0.25 + $1 + $5 + $10 = $16.55, which is the correct amount of change for $20 - $3.45.
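
    For the programmers in the audience, the counting-up technique can be sketched in a few lines. This is my own rough rendering, not a point-of-sale algorithm: the ChangeMaker class, the CountUp method, and the list of US denominations are all assumptions made for illustration. At each step it hands over the largest coin or bill that evenly divides the running total without overshooting the amount tendered, which is roughly what your brain does when it says "a nickel gets me to three fifty."

```csharp
using System;
using System.Collections.Generic;

class ChangeMaker {
 // US denominations in cents, largest first (an assumption for this sketch).
 static readonly int[] Denominations = { 2000, 1000, 500, 100, 25, 10, 5, 1 };

 // Counts up from the cost to the amount tendered and returns the
 // coins and bills handed over, in order.
 public static List<int> CountUp(int costCents, int tenderedCents)
 {
  var given = new List<int>();
  int running = costCents;
  while (running < tenderedCents) {
   foreach (int d in Denominations) {
    // Largest denomination that divides the running total evenly
    // and does not overshoot. The penny always qualifies, so the
    // loop is guaranteed to make progress.
    if (running % d == 0 && running + d <= tenderedCents) {
     given.Add(d);
     running += d;
     break;
    }
   }
  }
  return given;
 }

 public static void Main()
 {
  List<int> given = CountUp(345, 2000);
  int total = 0;
  foreach (int d in given) total += d;
  Console.WriteLine(string.Join(", ", given));
  Console.WriteLine(total);
 }
}
```

    For the $3.45 item and the $20 bill, this hands over 5, 25, 25, 100, 500, 1000 (in cents), totaling 1655, which matches the table. Note that the rule always produces the correct amount, but it doesn't always produce the fewest coins.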

    Even if kids aren't taught this technique nowadays, at least they should be able to do subtraction the traditional way. $5.20 - $2.15 is not a particularly difficult computation, seeing as I specifically added the extra twenty cents to avoid the borrow from the units position.

    But the salesclerk sat there and stared at the numbers for several seconds, unsure what to do next. I had to say, "$5.20 minus $2.15 is $3.05."

    Going shopping won't let you escape math.
