August, 2007

  • The Old New Thing

    Things I've written that have amused other people, Episode 4

    • 66 Comments

    One of my colleagues pointed out that my web site is listed in the references section of this whitepaper. It scares me that I'm being used as formal documentation because that is explicitly what this web site isn't. I wrote back,

    I really need to put a disclaimer on my web site.
    FOR ENTERTAINMENT PURPOSES ONLY

    Remember, this is a blog. The opinions (and even some facts) expressed here are those of the author and do not necessarily reflect those of Microsoft Corporation. Nothing I write here creates an obligation on Microsoft or establishes the company's official position on anything. I am not a spokesperson. I'm just this guy who strings people along in the hopes that they might hear a funny story once in a while.

    You'd think this was obvious, but apparently there are people who think that somehow what I write has the weight of official Microsoft policy and take my sentences apart as if they were legal documents or who take my articles and declare them to be official statements from Microsoft Corporation.

  • The Old New Thing

    What are these strange cmp [ecx], ecx instructions doing in my C# code?

    • 41 Comments

    When you debug through some managed code at the assembly level, you'll find a whole lot of seemingly pointless instructions that perform a comparison but ignore the result. What's the point of comparing two values if you don't care what the result is?

    In C++, invoking an instance method on a NULL pointer results in undefined behavior. In other words, if you do it, the compiler is allowed to do anything it wants. And what most compilers do is, um, nothing. They don't take any special steps if the this pointer is NULL; they just generate code on the assumption that it isn't. In practice, this often means that everything seems to run just fine until you access a member variables or call a virtual functions, and then you crash.

    The C# language, by comparison, is quite explicit about what happens if you invoke an instance method on a null object reference:

    The value of E is checked to be valid. If the value of E is null, a System.NullReferenceException is thrown and no further steps are executed.

    The null reference exception must be thrown before the method can be called. That's what the strange cmp [ecx], ecx comparison is for.¹ The compiler doesn't actually care what the result of the comparison is; it just wants to raise an exception if ecx is null. If ecx is null, the attempt to dereference it (in order to perform the comparison) will raise an access violation, which the runtime inspects and turns into a NullReferenceException.

    The test is usually against the ecx register since the CLR happens to use² the fastcall calling convention, which for instance methods passes the this pointer in the ecx register. The pointer the compiler wants to test is going to wind up in the ecx register sooner or later,³ so it's not surprising that the test, when it happens, is made against the ecx register.

    Nitpicker's Corner

    ¹Although this statement is written as if it were a fact, it is actually my interpretation based on observation and thinking about how language features are implemented. It is not an official position of the CLR team nor Microsoft Corporation, and that interpretation may ultimately prove incorrect.

    ²"Happens to use" means that this is an implementation detail, not a contractual guarantee.¹

    ³Unless the call is optimized. For example, the function might be inlined.

  • The Old New Thing

    What are these spurious nop instructions doing in my C# code?

    • 31 Comments

    Prerequisites: Basic understanding of assembly language.

    When you debug through some managed code at the assembly level, you may find that there are an awful lot of nop instructions scattered throughout your method. What are they doing there; isn't the JIT smart enough to remove them? Isn't this going to slow down execution of my program?

    It is my understanding that¹ this nop instructions are inserted by the JIT because you're running the program under the debugger. They are emitted specifically so that the debugger can set breakpoints in locations that you normally wouldn't be able to. (For example, they might represent a line of code that got optimized out or merged with another line of code.)

    Don't worry. If there's no debugger, the JIT won't generate the dummy nops.

    Nitpicker's Corner

    ¹As with all statements of alleged fact, this statement is an interpretation of events based on observation and thought and does not establish a statement of the official position of the CLR JIT compiler team or Microsoft Corporation, and that interpretation may ultimately prove incorrect.

  • The Old New Thing

    Disclaimers and such

    • 2 Comments

    Statements made in a general sense may have exceptions even if such exceptions are not explicitly acknowledged. Example: "Dogs have four legs." There are dogs which do not have four legs, but as a general rule, dogs have four legs.

    Statements are not independently fact-checked. They are based on personal experience and recollection, augmented by informed guesswork. Statements may even be intentionally incorrect for rhetorical purposes, for example, to avoid getting distracted by a side topic, or because it's a joke.

    Not all quotation marks indicate literal quotation; some may represent an imaginary conversation or a fictionalization of a real conversation. All quotations are subject to editing, for example for reasons of space or privacy, but such editing is not meant to alter the basic sense of the original statement.

    Phrases such as "some people" do not exclude the possibility that those people may be Microsoft employees. Microsoft employees are people, too. Similarly, "some programs" might include Microsoft programs.

    There is no correction policy.

    Statements do not establish the official position of Microsoft Corporation. Recommendations and advice are those of the author (or people and organizations the author trusts).

    Contents are provided "AS IS" with no warranties and confer no rights.

    In summary, readers are expected to employ critical thinking skills to evaluate statements in context.

  • The Old New Thing

    C# static constructors are called on demand, not at startup

    • 55 Comments

    One of the differences between C++ and C# is when static constructors run. In C++, static constructors are the first thing run in a module, even before the DllMain function runs.¹ In C#, however, static constructors don't run until you use the class for the first time. If your static constructor has side effects, you may find yourself experiencing those side effects in strange ways.

    Consider the following program. It's rather contrived and artificial, but it's based on an actual program that encountered the same problem.

    using System;
    using System.Runtime.InteropServices;
    
    class Program {
     [DllImport("kernel32.dll", SetLastError=true)]
     public static extern bool SetEvent(IntPtr hEvent);
    
     public static void Main()
     {
      if (!SetEvent(IntPtr.Zero)) {
       System.Console.WriteLine("Error: {0}", Trace.GetLastErrorFriendlyName());
      }
     }
    }
    

    This program tries to set an invalid event, so the call to SetEvent is expected to fail with an invalid handle error. We print the last error code using a function in this helper class: The details of this method aren't important. In fact, for illustrative purposes, I'm going to skip the call to FormatMessage and just return an ugly name.²

    class Trace {
     public static string GetLastErrorFriendlyName()
     {
      return Marshal.GetLastWin32Error().ToString();
     }
    }
    

    Run this program, and you should get this output:

    Error: 6
    

    Six is the expected error code, since that is the numeric value of ERROR_INVALID_HANDLE.

    You don't think much of this program until one day you run it and instead of getting error 6, you get something like this:

    Error: 126
    

    What happened?

    While you weren't paying attention, somebody decided to do some enhancements to the Trace class, maybe added some new methods and stuff, and in particular, a static constructor got added:

    class Trace {
     public static string GetLastErrorFriendlyName()
     {
      return Marshal.GetLastWin32Error().ToString();
     }
    
     [DllImport("kernel32.dll", SetLastError=true, CharSet=CharSet.Auto)]
     public static extern IntPtr LoadLibrary(string dll);
     static Trace() { LoadLibrary("enhanced_logging.dll"); }
    }
    

    It's not important what the static constructor does; the point is that we have a static constructor now. In this case, the static constructor tries to load a helper DLL which presumably does something fancy so we can get better trace logging, something like that, the details aren't important.

    The important thing is that the constructor has a side effect. Since it uses a p/invoke, the value of Marshal.GetLastWin32Error() is overwritten by the error code returned by the LoadLibrary, which in our case is error 126, ERROR_MOD_NOT_FOUND.

    Now let's look at what happens in our program.

    First, we call SetEvent, which fails and sets the Win32 error code to 6. Next, we call Trace.GetLastErrorFriendlyName, but wait! This is the first call to a method in the Trace class, so we have to run the static constructor first.

    The static constructor tries to load the enhanced_logging.dll module, and it fails, setting the last error code to 126. This overwrites the previous value.

    After the static constructor returns, we return to our program already in progress and call Trace.GetLastErrorFriendlyName, but it's too late. The damage has been done. The last error code has been corrupted.

    And that's why we get 126 instead of 6.

    What's really scary is that problems with static constructors running at inopportune times are often extremely hard to identify. For one thing, there is no explicit indication in the source code that there's any static constructor funny business going on. Indeed, somebody could just recompile the assembly containing the Trace class without modifying your program, and the problem will rear its head. "But I didn't change anything. The timestamp on program.exe is the same as the one that still works!"

    A side effect you might not consider is synchronization. If the static constructor takes any locks, you have to keep an eye on your lock hierarchy, or one of those locks might trigger a deadlock. This is insidious, because you can stare at the code all you want; you won't see anything. You'll have a method like

    class Trace {
     ...
     public static string GetFavoriteColor() { return "blue"; }
    }
    

    and yet when you try to step over a call to Trace.GetFavoriteColor, your program hangs! "This makes no sense. How can Trace.GetFavoriteColor hang? It just returns a constant!"

    Another factor that makes this problem baffling is that the problem occurs only the first time you call a method in the Trace class. We saw it here only because the very first thing we did with Trace was display an error. If you happened to call, say, Trace.GetFavoriteColor() before calling Trace.GetLastErrorFriendlyName(), then you wouldn't have seen this problem. In fact, that's how the program that inspired today's entry stumbled across this problem. They deleted a call into the Trace class from some unrelated part of the program, which meant that the static constructor ran at a different time than it used to, and unfortunately, the new time was less hospitable to static construction.

    "I'm sorry, did I call you at a bad time?"

    Footnotes³

    ¹This is not strictly true. In reality, it's a bit of sleight-of-hand performed by the C runtime library.⁴

    ²For a less ugly name, you can use this class instead:

    class Trace {
     [DllImport("kernel32.dll", SetLastError=true)]
     public static extern IntPtr LocalFree(IntPtr hlocal);
    
     [DllImport("kernel32.dll", SetLastError=true, CharSet=CharSet.Auto)]
     public static extern int FormatMessage(int flags, IntPtr unused1,
        int error, int unused2, ref IntPtr result, int size, IntPtr unused3);
     static int FORMAT_MESSAGE_ALLOCATE_BUFFER = 0x00000100;
     static int FORMAT_MESSAGE_IGNORE_INSERTS  = 0x00000200;
     static int FORMAT_MESSAGE_FROM_SYSTEM     = 0x00001000;
    
     public static string GetLastErrorFriendlyName()
     {
      string result = null;
      IntPtr str = IntPtr.Zero;
      if (FormatMessage(FORMAT_MESSAGE_ALLOCATE_BUFFER |
                        FORMAT_MESSAGE_IGNORE_INSERTS  |
                        FORMAT_MESSAGE_FROM_SYSTEM, IntPtr.Zero,
                        Marshal.GetLastWin32Error(), 0,
                        ref str, 0, IntPtr.Zero) > 0) {
       try {
        result = Marshal.PtrToStringAuto(str);
       } finally {
        LocalFree(str);
       }
      }
      return result;
     }
    }
    

    Note that there may be better ways of accomplishing this. I'm not the expert here.

    ³Boring footnote symbols from now on. You guys sure know how to take the fun out of blogging. (I didn't realize that blogs were held to academic writing standards. Silly me.) Now you can go spend your time telling Scoble that he wrote a run-on sentence or something.

    ⁴Although this statement is written as if it were a fact, it is actually my interpretation of how the C runtime works and is not an official position of the Visual Studio team nor Microsoft Corporation, and that interpretation may ultimately prove incorrect. Similar remarks apply to other statements of fact in this article.

    Postscript: Before you start pointing fingers and saying, "Hah hah, we don't have this problem in Win32!"—it turns out that you do! As we noted in the introduction, static constructors run when the DLL is loaded. The granularity in Win32 is not as fine, being at the module level rather than the class level, but the problem is still there. If you use delay-loading, then the first call to a function in a delay-loaded DLL will load the target DLL, and its static constructors will run, possibly when your program wasn't expecting it.

  • The Old New Thing

    What is the order of evaluation in C#?

    • 68 Comments

    The C and C++ languages leave the order of evaluation generally unspecified aside from specific locations called sequence points. Side effects of operations performed prior to the sequence point are guaranteed visible to operations performed after it.¹ For example, the C comma operator introduces a sequence point. When you write f(), g(), the language guarantees that any changes to program state made by the function f can be seen by the function g; f executes before g. On the other hand, the multiplication operator does not introduce a sequence point. If you write f() * g() there is no guarantee which side will be evaluated first.

    (Note that order of evaluation is not the same as associativity and operator precedence. Given the expression f() + g() * h(), operator precedence says that it should be evaluated as if it were written f() + (g() * h()), but that doesn't say what order the three functions will be evaluated. It merely describes how the results of the three functions will be combined.)

    In the C# language, the order of evaluation is spelled out more explicitly. The order of evaluation for operators is left to right. if you write f() + g() in C#, the language guarantees that f() will be evaluated first. The example in the linked-to page is even clearer. The expression F(i) + G(i++) * H(i) is evaluated as if it were written like this:

    temp1 = F(i);
    temp2 = i++;
    temp3 = G(temp2);
    temp4 = H(i);
    return temp1 + temp3 * temp4;
    

    The side effects of each part of the expression take effect in left-to-right order. Even the order of evaluation of function arguments is strictly left-to-right.

    Note that the compiler has permission to evaluate the operands in a different order if it can prove that the alternate order of evaluation has the same effect as the original one (in the absence of asynchronous exceptions).

    Why does C# take a much more restrictive view of the order of evaluation? I don't know, but I can guess.²

    My guess is that the language designers wanted to reduce the frequency of a category of subtle bugs (in this case, order-of-evaluation dependency). There are many other examples of this in the language design. Consider:

    class A {
     void f()
     {
      int i = 1;
      if (true) {
       int i = 2; // error - redeclaration
      }
     }
    
     int x;
     void g()
     {
      x = 3; // error - using variable before declared
      int x = 2;
     }
    }
    

    The language designers specified that the scope of a local variable in C# extends to the entire block in which it is declared. As a first consequence of this, the second declaration of i in the function f() is illegal since its scope overlaps with the scope of the first declaration. This removes a class of bugs that can be traced to one local variable masking another with the same name.

    In the function g() the assignment x = 3; is illegal because the x refers not to the member variable but to the local variable declared below it. Notice that the scope of the local variable begins with the entire block, and not with the point of declaration as it would have been in C++.

    Nitpicker's Corner

    ¹This is a simplified definition of sequence point. For more precise definitions, consult the relevant standards documents.

    ²I have not historically included the sentence "I don't know but I can guess" because this is a blog, not formal documentation. Everything is my opinion, recollection, or interpretation. But it seems that people take what I say to establish the official Microsoft position on things, so now I have to go back and add explicit disclaimers.

  • The Old New Thing

    It rather involved being on the other side of this airtight hatchway: Executable corruption

    • 22 Comments

    In the category of dubious vulnerability, I submit the following (paraphrased) report:

    I discovered that if I take an EXE file and corrupt its header, then when I try to run the EXE file, the process starts up and then crashes. I used the information in the crash dialog to direct further investigations, noting that the specific crash location could be controlled by modifying particular bytes in the EXE. Finally, I was able to put all the details together to form an exploit: I modified a block of bytes in the EXE file to consist of code which opens a network socket and connects it to a command shell, then modified the header to point to those bytes. When I run the EXE, the exploit code runs, and I can connect to the network socket from another computer and control the command shell.

    Yeah, that's great, but what's the vulnerability? What you did was take a program that you have write permission to and change the code in it to run your exploit. If you can modify an EXE file, then you may as well just replace the entire contents of the file with the bytes of PWNZ0RD.EXE. In other words, modifying bytes here and there is just a very slow, inefficient, and unnecessarily complicated way of doing this:

    copy pwnz0rd.exe victim.exe
    
    Then when the user runs the infected program, they're really running the PWNZ0RD.EXE program, and your so-called exploit can do whatever it wants. That's a lot easier than trying to modify a dozen bytes here, a dozen bytes there.

    In order to trigger the vulnerability, the user has to run the compromised program, but a program is already arbitrary code. No need to be so sneaky about it. It's sort of a tautology: "Here's my clever way to get the user to run my code. Step 1: Write some code. Step 2: Get the user to run it."

    Of course, if this corrupted EXE file created other types of problems, such as crashing Explorer or triggering a buffer overflow when the user tried to view its properties, then you'd be onto something. Or if you could somehow avoid detection by not altering the digital signature, then that'd be interesting as well. But if the only way to trigger code injection is to run the injected code, then that's not really all that interesting. You just found a roundabout way of creating a Trojan horse.

  • The Old New Thing

    What is the difference between the Folder and Directory (and other special) progids?

    • 23 Comments

    When you're installing your shell extension, you need to know which progid to hang it off of inside HKEY_CLASSES_ROOT. We'll start with the title question and then move on to other predefined (but perhaps not well-known) progids.

    • "Folder" is the progid for any shell folder. It could be a virtual folder (like Control Panel) or a file system folder (like C:\WINDOWS).
    • "Directory" is the progid for file system folders. This is a subset of "Folder".
    • "*" is the progid for all files. Doesn't matter what the extension is.
    • "." (that's a single period) is the progid for files without any extension.
    • "AllFileSystemObjects" is the union of "*" and "Directory". It is the progid for all files and for file system directories.
  • The Old New Thing

    Yes indeed, all Microsoft files are (or should be) digitally signed

    • 52 Comments

    Yes indeed, all Microsoft files are (or should be) digitally signed (as far as I'm aware). So I'm not quite sure what commenter Dave is getting at:

    The Microsoft file should have embedded vendor/product information saying it's from Microsoft and will be cryptographically signed by Microsoft. Similarly-named malware won't be signed by Microsoft, unless Verisign slipped up *again* and issued another bogus certificate.

    Wow, this is such a great idea, that it's been true for many years now. All Microsoft files are digitally signed. They have to be; otherwise, Windows File Protection wouldn't be able to tell whether this new version of shell32.dll is a security update or just some malware trying to replace a system file. As I noted quite some time ago, you can run the sigverif program to validate the digital signatures of all system files.

    Long descriptive names are just as much an opportunity to malware makers as they are to legit software developers. Gee, why would you want to stop a file named "Critical Security Update Service.exe" for example?

    And that's why Windows XP Service Pack 2 added the ability to mark a file as "I got this from the Internet." (Internet Explorer applies this marking when you download a file from the Internet.) Before you run such a file, Explorer will prompt you with a warning and show you the digital signature information so that you can confirm that the download is what it purports to be and has not been tampered with.

    This is just one example of a commenter who suggests that Windows do things that it already does: Drop down lists do let you type multiple characters. Here's another example. Windows has been multilingual since Windows 2000.

    For now, I'm going to keep ignoring them. Just because you say something in my presence and I don't raise an objection doesn't mean that I agree. Or that what you said is even true.

  • The Old New Thing

    Nested fly-out menus are a usability nightmare

    • 74 Comments

    The Windows Vista Start menu abandoned the flyout model for the "All Programs" menu because nested fly-out menus are a usability nightmare, and not just for novices.

    Research has shown that once you have menus more than one level deep, you have the problem that the slightly wiggle of the mouse can take the big, complicated menu hierarchy that the user spent enormous attention to build and make it all disappear in a flash. I run into this a lot. "File, Open Multiple, By Searching..." oops I moved my mouse too far upwards and tickled the View menu and boom my menu vanishes and I have to start all over again. Menu navigation has turned into one of those mouse dexterity games where you have to guide your character through a maze without hitting any of the walls or you die and have to start over.

    The All Programs menu has turned into a unwieldy mess thanks to all the programs that shove themselves into every nook and cranny. As a result, navigating it as a hierarchical menu has turned into a common source of frustration due to the "collapsing menu" problem. The solution? Reframe it as a tree view which acts only when you click. The hierarchy is still there, but it's much easier to navigate.

    This ease of use comes at a cost: If you're one of those people who can guide a mouse with pixel-perfect precision, then you're going to find mouse-based menu navigation a bit slower due to the extra clicking. But if it's speed you're after, then put down the mouse and stick to the keyboard: Type the name of what you want into the Start menu search box, and you'll be taken straight to it.

    (No nitpicker's corner today. We'll see what happens.)

Page 1 of 5 (46 items) 12345