• The Old New Thing

    If you're looking for the code that displays a particular dialog box, the most direct way to find it is to look for the dialog box


    Suppose you are working in a large or unfamiliar code base and you want to know where the code is that displays a particular dialog box or message box or something. Probably the most direct way of figuring this out is to look for the strings.

    Say there is a message box that asks for user confirmation. "Are you sure you want to frobulate the flux capacitor?" Search for that string in your source code. It will probably be in a resource file.

    resource.rc:IDS_CONFIRMFROBULATE "Are you sure you want to frobulate the flux capacitor?"

    Great, now you have the string ID for that message. You can perform a second search for that ID.

    resource.h:#define IDS_CONFIRMFROBULATE 1024
    resource.rc:IDS_CONFIRMFROBULATE "Are you sure you want to frobulate the flux capacitor?"
    maintenance.cpp:   strPrompt.LoadString(IDS_CONFIRMFROBULATE);

    If the thing you are searching for is a dialog box or menu item, then be aware that there may be an accelerator in the string, so a straight grep won't find it.

    No matches for "Enter the new name of the frobulator:"

    For a dialog box, you can tap the Alt key to make the accelerator show up, so you can search for the right string. For a menu, you invoke the menu via the keyboard. Or in either case, you can disable the "Hide underlined letters for keyboard navigation" setting.

    resource.rc:  LTEXT "Enter the ne&w name of the frobulator:",

    I tend to be lazy and instead of using any of those tricks to make the underlines show up, I just search for a shorter string and hope that the accelerator isn't in it.

    resource.rc:  LTEXT "Enter the ne&w name of the frobulator:",

    "But Raymond, hitting the Alt key is just a quick tap on the keyboard. Surely you can't be that lazy!"

    Right. If the dialog box were right in front of me, then I could tap the Alt key and be done. But usually, when I am investigating this sort of thing, it's because somebody has sent a screen shot and asks, "Where is the code that displays this?" Tapping Alt on a screen shot doesn't usually get you very far.

    Once you find the code that displays the dialog box or message box or whatever, you can then study the code to answer follow-up questions like "What are the conditions under which this dialog will appear?" or "Is there a setting to suppress this dialog?"

  • The Old New Thing

    My friend and his buddy invented the online shopping cart back in 1994


    Back in 1994 or so, my friend helped out his buddy who worked as the IT department for a local Seattle company known as Sub Pop Records. Here's what their Web site looked like back then. Oh, and in case you were wondering, when I said that his buddy worked as the IT department, I mean that the IT department consisted of one guy, namely him. And this wasn't even his real job. His main job was as their payroll guy; he just did their IT because he happened to know a little bit about computers. (If you asked him, he'd say that his main job was as a band member in Earth.)

    The mission was to make it possible for fans to buy records online. Nobody else was doing this at the time, so they had to invent it all by themselves. The natural metaphor for them was the shopping cart. You wandered through the virtual record store putting records in your basket, and then you went to check out.

    The trick here is how to keep track of the user as they wander through your store. This was 1994. Cookies hadn't been invented yet, or at least if they had been invented, support for them was very erratic, and you couldn't assume that every visitor to your site was using a browser that supported them.

    The solution was to encode the shopping cart state in the URL by making every link on the page include the session ID in the URL. It was crude but it got the job done.

    The site went online, and soon they were taking orders from excited fans around the world. The company loved it, because they probably got to charge full price for the records (rather than losing a cut to the distributor). And my friend told me the deep dark secret of his system: "We do okay if you ask for standard shipping, but the real money is when somebody is impatient and insists on overnight shipping. Overcharging for shipping is where the real money is."

    (Note: Statements about business models for a primitive online shopping site from 1994 are not necessarily accurate today.)

  • The Old New Thing

    News flash: Big houses cost more to maintain


    In 2005, we learned that big houses cost more to heat. In 2006, we learned that big houses cost more to cool.

    But then the research into big houses seems to have stalled.

    No worries. The research journal The Wall Street Journal recently released a paper concluding that big houses cost more to maintain.

  • The Old New Thing

    Deleting elements from a JavaScript array and closing up the gaps


    Today's Little Program is an exercise that solves the following problem:

    Given a JavaScript array a and an unsorted array indices (possibly containing duplicates), calculate the result of deleting all of the elements from the original array as specified by the indices. For example, suppose a = ["alice", "bob", "charles", "david", "eve"] and indices = [2, 4, 2, 0].

    a[0] = "alice";
    a[1] = "bob";
    a[2] = "charles";
    a[3] = "david";
    a[4] = "eve";
    a.length= 5;

    The indices specify that elements 2 (charles), 4 (eve), 2 (charles again, redundant), and 0 (alice) should be deleted from the array, leaving bob and david.

    Now, if you had to delete only one element from the array, it is pretty simple:

    a.splice(n, 1);

    The trick with removing multiple elements is that deleting one element shifts the indices, which can throw off future calculations. One solution is to remove the highest-indexed element first; in other words, operate on the indices in reverse sorted order.

    indices.sort(function(x, y) { return y - x; }).forEach(function(n) { a.splice(n, 1); });

    This technique does still suffer from the problem that if there are duplicates in the indices, extraneous elements get deleted by mistake.
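
If you want to stick with the splice approach anyway, one way around the duplicate problem is to remove the duplicates before splicing. The following is a minimal sketch rather than the approach I end up using; the helper name removeAtDescending is made up for illustration. It also passes an explicit comparator, because sort() with no arguments compares elements as strings.

```javascript
// Hypothetical helper: removes the elements of `a` at the given indices, in place.
function removeAtDescending(a, indices) {
  // A Set drops duplicate indices so that no position is spliced twice.
  var unique = Array.from(new Set(indices));
  // Numeric, descending order. Without the comparator, sort() compares
  // the indices as strings, which would put 10 before 9.
  unique.sort(function(x, y) { return y - x; });
  // Splicing from the highest index down keeps the lower indices valid.
  unique.forEach(function(n) { a.splice(n, 1); });
  return a;
}

var sample = ["alice", "bob", "charles", "david", "eve"];
removeAtDescending(sample, [2, 4, 2, 0]);
console.log(sample); // logs the array containing "bob" and "david"
```

The sketch assumes every index is a non-negative whole number; a negative index would make splice count from the end of the array.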

    Another approach is to reinterpret the problem by focusing not on the deletion but on the survivors: Produce the array consisting of elements whose indices are not in the list of indices to be deleted.

    a = a.filter(function(e, i) { return indices.indexOf(i) < 0; });

    The above approach works well if the list of indices to be deleted is short, but it gets quite expensive if the list is long.
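
If the list of indices is long, one possible way to keep the survivor approach affordable is to put the indices into a Set first, so that each membership test no longer scans the whole list. This is a sketch of a swapped-in technique, not the approach I take below, and the names withoutIndices and doomed are invented for the example.

```javascript
// Hypothetical helper: returns a new array without the elements at `indices`.
function withoutIndices(a, indices) {
  // Building the Set once means each lookup is roughly constant time,
  // instead of a linear search through `indices` for every element.
  var doomed = new Set(indices);
  return a.filter(function(e, i) { return !doomed.has(i); });
}

var survivors = withoutIndices(["alice", "bob", "charles", "david", "eve"],
                               [2, 4, 2, 0]);
console.log(survivors); // logs the array containing "bob" and "david"
```

Duplicates in the indices are harmless here, because the Set collapses them.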

    My approach is to use the fact that JavaScript arrays can be sparse. This is a side effect of the fact that JavaScript array indices are actually object properties; the only thing that makes arrays different from generic objects in a language-theoretic sense is the magic length property:

    • If a new property is added, and the property name is the stringification of a whole number that is greater than or equal to the current length, then the length is updated to the numeric value of the added property name, plus 1.
    • If the length property is modified programmatically, the new value must be a whole number, and all properties which are the stringification of a whole number greater than or equal to the new length are deleted.

    (See ECMA-262 section 15.4 and its subsections for nitpicky details.)
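
The two rules for the magic length property, and the fact that delete leaves length alone, can be seen in a few lines. This is just an illustrative sketch; the variable names are arbitrary.

```javascript
var arr = [];
arr[4] = "eve";           // adding property "4" bumps length to 4 + 1
console.log(arr.length);  // 5

arr.length = 2;           // shrinking length deletes properties "2" and above
console.log(4 in arr);    // false

var sparse = ["alice", "bob", "charles"];
delete sparse[1];            // removes property "1" but does not touch length
console.log(sparse.length);  // 3
console.log(1 in sparse);    // false
```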

    The first step, then, is to delete all the indices that need to be deleted.

    indices.forEach(function(n) { delete a[n]; });

    When applied to our sample data, this leaves

    a[1] = "bob";
    a[3] = "david";
    a.length= 5;

    which gets printed in a rather goofy way: a = [, "bob", , "david", ].

    The next step is to close up the gaps. We take advantage of the fact that the Array enumeration methods operate on indices 0 through length - 1 and that they skip missing elements. Therefore, I can simply apply a dummy filter:

    a = a.filter(function() { return true; });
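
Putting the two steps together gives something like the following sketch. The wrapper name deleteAndCompact is mine, chosen only for the example.

```javascript
// Hypothetical wrapper around the two steps: delete, then compact.
function deleteAndCompact(a, indices) {
  // Step 1: punch holes. Deleting the same index twice is harmless, and
  // the order does not matter because delete never shifts elements.
  indices.forEach(function(n) { delete a[n]; });
  // Step 2: filter skips the holes, so an always-true predicate copies
  // only the surviving elements into a new, dense array.
  return a.filter(function() { return true; });
}

var input = ["alice", "bob", "charles", "david", "eve"];
var result = deleteAndCompact(input, [2, 4, 2, 0]);
console.log(result);        // logs the array containing "bob" and "david"
console.log(result.length); // 2
console.log(input.length);  // 5, the original array keeps its holes
```

Note that the original array is left sparse, with its length unchanged; only the array returned by filter is compacted.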

    Exercise: What is the difference (aside from performance) between a.push(1); and a = a.concat(1);? How is this question relevant to today's exercise?

  • The Old New Thing

    The scope of the C# checked/unchecked keyword is static, not dynamic


    C# has operators checked and unchecked to control the behavior of the language in the face of integer overflow. There are also checked and unchecked statements which apply the behavior to blocks of statements rather than single expressions.

    int x;
    x = checked(a + b); // evaluate with overflow checking
    x = unchecked(a + b); // evaluate without overflow checking
    checked {
     x = a + b; // evaluate with overflow checking
    }
    unchecked {
     x = a + b; // evaluate without overflow checking
    }

    Why, then, doesn't this code below raise an overflow exception?

    class Program {
     static int Multiply(int a, int b) { return a * b; }
     static int Overflow() { return Multiply(int.MaxValue, 2); }
     public static void Main() {
      checked {
       Overflow();
      }
     }
    }

    (Mini-exercise: Why couldn't I have just written static int Overflow() { return int.MaxValue * 2; }?)

    The answer is that the scope of the checked or unchecked keyword is static, not dynamic. Whether a particular arithmetic operation is checked or unchecked is determined at compile time, not at run time. Since the multiplication in the Multiply function is not explicitly marked checked or unchecked, it uses the overflow context implied by your compiler options. Assuming you've left it at the default of unchecked, this means that there is no overflow checking in the Multiply function, even if you call it from a checked context. Because once you call the Multiply function, you have left the checked context.

    The C# language specification addresses this issue not once, not twice, but three times! (But it seems that some people miss it, possibly because there is too much documentation.)

    First, there is an explicit list of operations which are controlled by the checked or unchecked keyword:

    • The predefined ++ and -- unary operators, when the operand is of an integral type.
    • The predefined - unary operator, when the operand is of an integral type.
    • The predefined +, -, *, and / binary operators, when both operands are of integral types.
    • Explicit numeric conversions from one integral type to another integral type, or from float or double to an integral type.

    That's all. Note that function calls are not on the list.

    Now, that may have been a bit too subtle (documentation by omission), so the language specification goes ahead and calls it out.

    The checked and unchecked operators only affect the overflow checking context for those operations that are textually contained within the "(" and ")" tokens. The operators have no effect on function members that are invoked as a result of evaluating the contained expression.

    And then, in case you still didn't get it, the language specification even includes an example:

    class Test
    {
       static int Multiply(int x, int y) {
          return x * y;
       }
       static int F() {
          return checked(Multiply(1000000, 1000000));
       }
    }

    The use of checked in F does not affect the evaluation of x * y in Multiply, so x * y is evaluated in the default overflow checking context.

    (I wrote my example before consulting the language specification. That we both chose to use multiplication overflow is just a coincidence.)

    Even though the language specification says it three times, in three different ways, there are still people who are under the mistaken impression that the scope of the checked keyword is dynamic.

    Another thing you may have noticed is that the checked and unchecked keywords apply only to the built-in arithmetic operations on integers. They do not apply to overloaded operators or to operators on custom classes.

    Which makes sense if you think about it, because in order to define an overloaded operator or an operator on a custom class, you need to write the implementation as a separate function, in which case you have already left the scope of the checked and unchecked keywords.

    And now we are leaving the scope of CLR Week. You can remove your hands from your ears now.

  • The Old New Thing

    Customers not getting the widgets they paid for if they click too fast -or- In C#, the += operator is not merely not guaranteed to be atomic, it is guaranteed not to be atomic

    In the C# language, operation/assignment operators such as += are explicitly not atomic. But you already knew this, at least for properties.

    Recall that properties are syntactic sugar for method calls. A property declaration

    string Name { get { ... } set { ... } }

    is internally converted to the equivalent of

    string get_Name() { ... }
    void set_Name(string value) { ... }

    Accessing a property is similarly transformed.

    // o.Name = "fred";
    o.set_Name("fred");
    // x = o.Name;
    x = o.get_Name();

    Note that the only operations you can provide for properties are get and set. There is no way of customizing any other operations, like +=. Therefore, if you write

    o.Name += ", Jr.";

    the compiler has no choice but to convert it to

    o.set_Name(o.get_Name() + ", Jr.");

    If all you have is a hammer, everything needs to be converted to a nail.

    Since the read and write are explicitly decoupled, there is naturally a race condition here. The underlying property may change value in between the time you read the old value and the time you write the new value.

    But there are extra subtleties here. Let's dig in.

    The rules for operators like += are spelled out in Section 14.3.2: Compound Assignment:

    [T]he operation is evaluated as x = x op y, except that x is evaluated only once.

    (There is some discussion of what "evaluated only once" means, but that's not important here.)

    The subtleties lurking in that one sentence are revealed when you see how that sentence interacts with other rules in the language.

    Now, you might say, "Sure, it's not atomic, but my program is single-threaded, so this should never affect me."

    Actually, you can get bitten by this even in single-threaded programs. Let's try it:

    class Program
    {
     static int x = 0;
     static int f()
     {
      x = x + 10;
      return 1;
     }
     public static void Main()
     {
      x += f();
      System.Console.WriteLine(x);
     }
    }

    What does this program print?

    You might naïvely think that it prints 11 because x is incremented by 1 by Main and incremented by 10 in f.

    But it actually prints 1.

    What happened here?

    Recall that C# uses strict left-to-right evaluation order. Therefore, the order of operations in the evaluation of x += f() is

    1. Rewrite as x = x + f().
    2. Evaluate both sides of the = operator, left to right.
      a. Left hand side of assignment: Find the variable x.
      b. Right hand side of assignment:
        i. Evaluate both sides of the + operator, left to right.
          1. Evaluate x.
          2. Evaluate f().
        ii. Add together the results of steps 2b(i)1 and 2b(i)2.
    3. Take the result of step 2b(ii) and assign it to the variable x found in step 2a.

    The thing to notice is that a lot of things can happen between step 2b(i)1 (evaluating the old value of x), and step 3 (assigning the final result to x). Specifically, we shoved a whole function call in there: f().

    In our case, the function f() also modifies x. That modification takes place after we already captured the value of x in step 2b(i)1. When we get around to adding the values in step 2b(ii), we don't realize that the values are out of date.

    Let's step through this evaluation in our example.

    1. Rewrite as x = x + f().
    2. Evaluate both sides of the = operator, left to right.
      a. Left hand side of assignment: Find the variable x.
      b. Right hand side of assignment:
        i. Evaluate both sides of the + operator, left to right.
          1. Evaluate x. The result is 0.
          2. Evaluate f(). The result is 1. It also happens that x is modified as a side-effect.
        ii. Add together the results of steps 2b(i)1 and 2b(i)2. In this case, 0 + 1 = 1.
    3. Take the result of step 2b and assign it to the variable x found in step 2a. In this case, assign 1 to x.

    The modification to x that took place in f was clobbered by the assignment operation that completed the += sequence. And this behavior is not just in some weird "undefined behavior" corner of the language specification. The language specification explicitly requires this behavior.

    Now, you might say, "Okay, I see your point, but this is clearly an unrealistic example, because nobody would write code this weird."

    Maybe you don't intentionally write code this weird, but you can do it accidentally. And this is particularly true if you are using the new await keyword, because an await means, "Hey, like, put my function on hold and do other stuff for a while. When the thing I'm awaiting is ready, then resume execution of my function." And that "do other stuff for a while" might change x.

    Suppose that you have a button in your application called Buy More. When the user clicks it, they can buy more widgets. Let's assume that the BuyMoreAsync function returns the number of items bought. (If the user cancels the purchase, it returns zero.)

    // Suppose the user starts with 100 widgets.
    async void BuyMoreButton_OnClick()
    {
     TotalWidgets += await BuyMoreAsync();
     Inventory.Text = string.Format("You have {0} widgets.",
                                    TotalWidgets);
    }
    async Task<int> BuyMoreAsync()
    {
     int quantity = QuickPurchase.IsChecked ? 1
                                            : await GetQuantityAsync();
     if (quantity != 0) {
      if (await DebitAccountAsync(quantity * PricePerWidget)) {
       return quantity;
      }
     }
     return 0; // user bought no items
    }

    You receive a bug report that you track back to the fact that Total­Widgets does not match the number of widgets purchased. It affects only people who checked the quick purchase box, and only people purchasing from overseas.

    Here's what is going on.

    The user clicks the Buy More button, and they have Quick Purchase enabled. The Buy­More­Async function tries to debit the account for the price of one widget.

    While waiting for the server to process the transaction, the user gets impatient and clicks Buy More a second time. This triggers a second task to debit the account for the price of one widget.

    Okay, so you now have two tasks running, each processing one of the clicks. In theory, the worst case is that the user accidentally buys two widgets, but in practice...

    The first Debit­Account­Async task completes, and Buy­More­Async returns 1, which is then added to the value of Total­Widgets at the time the button was clicked, as we discussed above. At the time the button was clicked the first time, the number of widgets was 100, so the total number of widgets is now 101.

    The second Debit­Account­Async task completes, and Buy­More­Async returns 1, which is then added to the value of Total­Widgets at the time the button was clicked, as we discussed above. When the button was clicked the second time, the number of widgets was still 100. We set the total widget count to 100 + 1 = 101.

    Result: The user paid for two widgets but got only one.

    The fix for this is to explicitly move waiting for the purchase to complete outside of the compound assignment.

     int quantity = await BuyMoreAsync();
     TotalWidgets += quantity;

    Now, the await is outside the compound assignment so that the value of Total­Widgets is not captured prematurely. When the purchase completes, we update Total­Widgets without interruption from any async operations.

    (You probably also should fix the program so it disables the Buy More button while a transaction is in progress, to avoid the "impatient user ends up making an accidental double purchase" problem. The above fix merely gets rid of the "user pays for two items and gets only one" problem.)

    Like closing around the loop control variable, this is the sort of subtle change that should be well-commented so that somebody doesn't "fix" it in a well-intentioned but misguided attempt to remove unnecessary variables. The purpose of the variable is not to break an expression into two but rather to force a particular order of evaluation: You want to finish the purchase operation before starting to update the widget count.

  • The Old New Thing

    Keep your eye on the code page: C# edition (the mysterious third code page)


    A customer was having trouble manipulating the console from a C# program:

    We found that C# can read only ASCII data from the console. If we try to read non-ASCII data, we get garbage.

    using System;
    using System.Text;
    using System.Runtime.InteropServices;
    class Program
    {
      struct COORD
      {
        public short X;
        public short Y;
      }
      [DllImport("kernel32.dll", SetLastError=true)]
      static extern IntPtr GetStdHandle(int nStdHandle);
      const int STD_OUTPUT_HANDLE = -11;
      [DllImport("kernel32.dll", SetLastError=true)]
      static extern bool ReadConsoleOutputCharacter(
        IntPtr hConsoleOutput,
        [Out] StringBuilder lpCharacter,
        uint nLength,
        COORD dwReadCoord,
        out uint lpNumberOfCharsRead);
      public static void Main()
      {
        // Write a string to a fixed position
        // (assumed sample text; any string containing Å and ö will do)
        Console.Clear();
        Console.WriteLine("\u00C5ngstr\u00f6m");
        // Read it back
        COORD coord  = new COORD { X = 0, Y = 0 };
        StringBuilder sb = new StringBuilder(8);
        uint nRead = 0;
        ReadConsoleOutputCharacter(GetStdHandle(STD_OUTPUT_HANDLE),
                                   sb, (uint)sb.Capacity, coord, out nRead);
        // Trim off any unused excess.
        sb.Remove((int)nRead, sb.Length - (int)nRead);
        // Show what we read
        Console.WriteLine(sb);
      }
    }

    Observe that this program is unable to read the Å and ö characters. They come back as garbage.

    Although there are three code pages that have special treatment in Windows, the CLR gives access to only two of them via Dll­Import.

    • CharSet.Ansi = CP_ACP
    • CharSet.Unicode = Unicode (which in Windows means UTF-16LE unless otherwise indicated).

    Unfortunately, the console traditionally uses the OEM code page.

    Since the Dll­Import did not specify a character set, the CLR defaults (unfortunately) to Char­Set.Ansi. Result: The Read­Console­Output­Character function stores its results in CP_OEM, the CLR treats the buffer as if it were CP_ACP, and the result is confusion.

    The narrow-minded fix is to try to fix the mojibake by taking the misconverted Unicode string, converting it to bytes via the ANSI code page, then converting the bytes to Unicode via the OEM code page.

    The better fix is simply to avoid the 8-bit code page issues entirely and say you want to use Unicode.

      [DllImport("kernel32.dll", SetLastError=true, CharSet=CharSet.Unicode)]
      static extern bool ReadConsoleOutputCharacter(...);
  • The Old New Thing

    Keep your eye on the code page: C# edition (warning about DllImport)


    Often, we receive problem reports from customers who failed to keep their eye on the code page.

    Does the SH­Get­File­Info function support files with non-ASCII characters in their names? We find that the function either fails outright or returns question marks when asked to provide information for files with non-ASCII characters in their name.

    using System;
    using System.Runtime.InteropServices;
    class Program
    {
     static void Main(string[] args)
     {
      string fileName = "BgṍRồ.txt";
      Console.WriteLine("File exists? {0}", System.IO.File.Exists(fileName));
      // assumes extensions are hidden
      string expected = "BgṍRồ";
      Test(fileName, SHGFI_DISPLAYNAME, expected);
      Test(fileName, SHGFI_DISPLAYNAME | SHGFI_USEFILEATTRIBUTES, expected);
     }
     static void Test(string fileName, uint flags, string expected)
     {
      var actual = GetNameViaSHGFI(fileName, flags);
      Console.WriteLine("{0} == {1} ? {2}", actual, expected, actual == expected);
     }
     static string GetNameViaSHGFI(string fileName, uint flags)
     {
      SHFILEINFO sfi = new SHFILEINFO();
      if (SHGetFileInfo(fileName, 0, ref sfi, Marshal.SizeOf(sfi),
                        flags) != IntPtr.Zero) {
       return sfi.szDisplayName;
      } else {
       return null;
      }
     }
     [StructLayout(LayoutKind.Sequential)]
     struct SHFILEINFO {
      public IntPtr hIcon;
      public int iIcon;
      public uint dwAttributes;
      [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
      public string szDisplayName;
      [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 80)]
      public string szTypeName;
     }
     const uint SHGFI_USEFILEATTRIBUTES = 0x10;
     const uint SHGFI_DISPLAYNAME = 0x0200;
     [DllImport("shell32.dll")]
     static extern IntPtr SHGetFileInfo(
        string path, uint fileAttributes, ref SHFILEINFO info, int cbSize,
        uint flags);
    }
    // Output:
    // File exists? True
    //  == Bg?R? ? False
    // Bg?R? == Bg?R? ? False

    If we ask for the display name, the function fails even though the file does exist. If we also pass the SHGFI_USE­FILE­ATTRIBUTES flag to force the system to act as if the file existed, then it returns the file name but with question marks where the non-ASCII characters should be.

    The SH­Get­File­Info function supports non-ASCII characters just fine, provided you call the version that supports non-ASCII characters!

    The customer here fell into the trap of not keeping their eye on the code page. It goes back to an unfortunate choice of defaults in the System.Runtime.Interop­Services namespace: At the time the CLR was originally being developed, Windows operating systems derived from Windows 95 were still in common use, so the CLR folks decided to default to Char­Set.Ansi. This made sense back in the day, since it meant that your program ran the same on Windows 98 as it did in Windows NT. In the passage of time, the Windows 95 series of operating systems became obsolete, so the need to be compatible with it gradually disappeared. But too late. The rules were already set, and the default of Char­Set.Ansi could not be changed.

    The solution is to specify Char­Set.Unicode explicitly in the Struct­Layout and Dll­Import attributes.

    FxCop catches this error, flagging it as Specify­Marshaling­For­PInvoke­String­Arguments. The error explanation talks about the security risks of unmapped characters, which is all well and good, but it is looking too much at the specific issue and not so much at the big picture. As a result, people may ignore the issue because it is flagged as a complicated security issue, and they will think, "Eh, this is just my unit test, I'm not concerned about security here." However, the big picture is

    This is almost certainly an oversight on your part. You didn't really mean to disable Unicode support here.

    Change the lines

     [StructLayout(LayoutKind.Sequential)]
     [DllImport("shell32.dll")]

    to

     [StructLayout(LayoutKind.Sequential, CharSet=CharSet.Unicode)]
     [DllImport("shell32.dll", CharSet=CharSet.Unicode)]

    and re-run the program. This time, it prints

    File exists? True
    Bg?R? == Bg?R? ? True
    Bg?R? == Bg?R? ? True

    Note that you have to do the string comparison in the program because the console itself has a troubled history with Unicode. At this point, I will simply cue a Michael Kaplan rant and link to an article explaining how to ask nicely.

  • The Old New Thing

    Revival of the Daleks: Act One, Scene One


    In 2009, a group of volunteers on a routine cleanup of a pond in Hampshire, England discovered a Dalek.

    (Later in the episode, the story may introduce a scientist who is thawing out a 30,000-year-old virus.)

  • The Old New Thing

    If you want to set a thread's apartment model via Thread.CurrentThread.ApartmentState, you need to act quickly


    Welcome to CLR Week 2014. Don't worry, it'll be over in a few days.

    A customer wanted to know why their FolderBrowserDialog was displaying the infamous "Current thread must be set to single thread apartment (STA) mode before OLE calls can be made" error.

    private void btnBrowseFolder_Click(object sender, System.EventArgs e)
    {
      Thread.CurrentThread.ApartmentState = ApartmentState.STA;
      FolderBrowserDialog fbd = new FolderBrowserDialog {
        RootFolder = System.Environment.SpecialFolder.MyComputer,
        ShowNewFolderButton = true,
        Description = "Select the awesome folder..."
      };
      DialogResult dr = fbd.ShowDialog();
    }

    "Even though we set the Apartment­State to STA, the apartment state is still MTA. Curiously, if we put the above code in a standalone test program, it works fine."

    The problem is that the customer is changing the apartment state too late.

    On the first call to unmanaged code, the runtime calls Co­Initialize­Ex to initialize the COM apartment as either an MTA or an STA apartment. You can control the type of apartment created by setting the System.Threading.ApartmentState property on the thread to MTA, STA, or Unknown.

    Notice that the value you specify in Current­Thread.Apartment­State is consulted at the point the runtime initializes the COM apartment (which occurs on the first call to unmanaged code). If you change it after the COM apartment has been initialized, you're revising the blueprints of a house after it has been built.

    The standard way to avoid this problem is to attach the [STAThread] attribute to your Main function, or if you need to set the apartment model of a thread you created yourself, call the Thread.Set­Apartment­State method before the thread starts.
