• The Old New Thing

    Dispelling the myths, rumors, and innuendo surrounding the QueryPerformanceCounter function


    The Query­Performance­Counter function has been the subject of much rumor and innuendo. In response to all the confusion, the kernel folks put together a page which tries to settle the controversy once and for all. It discusses the history of QPC over the ages, the problems it had on earlier versions of Windows or older firmware (which is probably where a lot of the myths started), its interaction with hypervisors, offers guidance on how to use it and its alternatives, and includes a very nice Q&A.

  • The Old New Thing

    Why does the OpenThread function behave differently when the target thread belongs to another process?


    A customer discovered strange behavior in the Open­Thread function and wondered whether it was expected.

    We use the Open­Thread function to obtain a thread handle with THREAD_QUE­RY_LIM­IT­ED_IN­FOR­MA­TION, passing in a valid thread ID. We later pass this handle to Get­Exit­Code­Thread to get the thread exit code. We have found that if the thread in question belongs to another process, the Open­Thread call succeeds only if the thread is still running (has not yet exited). On the other hand, if the thread belongs to our own process, then the call always succeeds regardless of whether the thread is running or not. Is this expected behavior? And can we assume that if Open­Thread fails with ERROR_INVALID_PARAMETER, then it means that the target thread has already exited?

    The Open­Thread function fails if you pass it an invalid thread ID. Thread IDs go invalid when the corresponding thread object is destroyed, and thread objects are destroyed when the thread exits and there are no open handles to the thread. Once a thread object is destroyed, its thread ID becomes invalid and may be re-used by a future thread.

    Whether the thread belongs to the same process or a different process does not play a rôle in this determination.

    My guess is that the reason the call succeeds if the target thread belongs to the same process, even if the target thread has already exited, is something much more mundane: They have a thread handle leak in their application.

    The customer never wrote back after receiving this explanation, so we'll never know whether my guess was correct.

    Bonus chatter: If you aren't sure whether you are passing a valid thread ID to Open­Thread, then you most likely already have a bug. Since thread IDs can be reused, if you haven't taken other steps to ensure that the thread you want still exists, then it's possible that the thread you want has already exited, the corresponding thread object has been destroyed, and the thread ID has been reused by some other thread. Your Open­Thread call will now succeed, but it will refer to some totally unrelated thread. Your program will most likely get very confused at this point.

  • The Old New Thing

    Some reasons not to do anything scary in your DllMain, part 3


    In the same week, the shell team was asked to investigate two failures.

    The first one was a deadlock in Explorer. The participating threads look like this:

    • Thread 1 called Free­Library on a shell extension as part of normal Co­Free­Unused­Libraries processing. That DLL called Ole­Uninitialize from its Dll­Main function. This thread blocked because the COM lock was held by thread 2.
    • Thread 2 called Co­Create­Instance, and COM tried to load the DLL which handles the object, but the thread blocked because the loader lock was held by thread 1.

    The shell extension caused this problem because it ignored the rule against calling shell and COM functions from the Dll­Main entry point. The Dll­Main documentation specifically calls these out as examples of functions that should not be called.

    The authors of this shell extension may never have caught this problem in their internal testing (or if they did they didn't understand what it meant) because hitting this deadlock requires that a race window be hit: The shell extension DLL needs to be unloaded on one thread at the exact same moment another thread is inside the COM global lock trying to load another DLL.

    Meanwhile, another failure was traced back to a DLL calling Co­Initialize from their Dll­Main. This extra COM initialization count means that when the thread called Co­Uninitialize thinking that it was uninitializing COM, it actually merely decremented the count to 1. The code then proceeded to do things that are not allowed in a single-threaded apartment, believing that it had already torn down the apartment. But the secret Co­Initialize performed by the shell extension violated that assumption. Result: A thread that stopped responding to messages.

    The authors of both of these shell extensions seemed to be calling Co­Uninitialize/Ole­Uninitialize in order to cancel out a Co­Initialize/Ole­Initialize which they performed in their DLL_PROCESS_ATTACH. This is fundamentally unsound not only because of the general rule of not calling COM functions inside Dll­Main but also because OLE initialization is a per-thread state, whereas the thread that gets the DLL_PROCESS_DETACH notification is not necessarily the one that received the DLL_PROCESS_ATTACH notification.

    It so happens that in the second case, the DLL in question was a shell copy hook, and the hang was occurring not in Explorer but in an application which was using SH­File­Operation to delete some files. We could at least advise the application authors to pass the FOFX_NO­COPY­HOOKS flag to IFile­Operation::Set­Operation­Flags to prevent copy hooks from being loaded.

    Previous articles in this series: Part 1, Part 2.

  • The Old New Thing

    If you're looking for the code that displays a particular dialog box, the most direct way to find it is to look for the dialog box


    Suppose you are working in a large or unfamiliar code base and you want to know where the code is that displays a particular dialog box or message box or something. Probably the most direct way of figuring this out is to look for the strings.

    Say there is a message box that asks for user confirmation. "Are you sure you want to frobulate the flux capacitor?" Search for that string in your source code. It will probably be in a resource file.

    resource.rc:IDS_CONFIRM­FROBULATE "Are you sure you want to frobulate the flux capacitor?"

    Great, now you have the string ID for that message. You can perform a second search for that ID.

    resource.h:#define IDS_CONFIRM­FROBULATE 1024
    resource.rc:IDS_CONFIRM­FROBULATE "Are you sure you want to frobulate the flux capacitor?"
    maintenance.cpp:   strPrompt.LoadString(IDS_CONFIRM­FROBULATE);

    If the thing you are searching for is a dialog box or menu item, then be aware that there may be an accelerator in the string, so a straight grep won't find it.

    No matches for "Enter the new name of the frobulator:"

    For a dialog box, you can tap the Alt key to make the accelerator show up, so you can search for the right string. For a menu, you invoke the menu via the keyboard. Or in either case, you can disable the Hide underlined letters for keyboard navigation setting.

    resource.rc:  LTEXT "Enter the ne&w name of the frobulator:",

    I tend to be lazy and instead of using any of those tricks to make the underlines show up, I just search for a shorter string and hope that the accelerator isn't in it.

    resource.rc:  LTEXT "Enter the ne&w name of the frobulator:",
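    Another option, if your search tool understands regular expressions, is to let the pattern itself tolerate an accelerator. What follows is only a minimal JavaScript sketch: the helper name acceleratorTolerantPattern is made up for illustration, and it assumes the only difference between the on-screen text and the resource text is an optional & in front of some character.

```javascript
// Hypothetical helper: build a regular expression that matches the
// on-screen text even when the resource string carries an "&"
// accelerator marker in front of any character.
function acceleratorTolerantPattern(screenText) {
  const pieces = Array.from(screenText, function (ch) {
    // Escape regex metacharacters so the text is matched literally.
    const literal = ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    return "&?" + literal;
  });
  return new RegExp(pieces.join(""));
}

const pattern = acceleratorTolerantPattern("Enter the new name of the frobulator:");
const resourceLine = '  LTEXT "Enter the ne&w name of the frobulator:",';
console.log(pattern.test(resourceLine)); // true
```

    This trades a longer pattern for not having to guess which part of the string is free of accelerators.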

    "But Raymond, hitting the Alt key is just a quick tap on the keyboard. Surely you can't be that lazy!"

    Right. If the dialog box were right in front of me, then I could tap the Alt key and be done. But usually, when I am investigating this sort of thing, it's because somebody has sent a screen shot and asks, "Where is the code that displays this?" Tapping Alt on a screen shot doesn't usually get you very far.

    Once you find the code that displays the dialog box or message box or whatever, you can then study the code to answer follow-up questions like "What are the conditions under which this dialog will appear?" or "Is there a setting to suppress this dialog?"

  • The Old New Thing

    My friend and his buddy invented the online shopping cart back in 1994


    Back in 1994 or so, my friend helped out his buddy who worked as the IT department for a local Seattle company known as Sub Pop Records. Here's what their Web site looked like back then. Oh, and in case you were wondering, when I said that his buddy worked as the IT department, I mean that the IT department consisted of one guy, namely him. And this wasn't even his real job. His main job was as their payroll guy; he just did their IT because he happened to know a little bit about computers. (If you asked him, he'd say that his main job was as a band member in Earth.)

    The mission was to make it possible for fans to buy records online. Nobody else was doing this at the time, so they had to invent it all by themselves. The natural metaphor for them was the shopping cart. You wandered through the virtual record store putting records in your basket, and then you went to check out.

    The trick here is how to keep track of the user as they wander through your store. This was 1994. Cookies hadn't been invented yet, or at least if they had been invented, support for them was very erratic, and you couldn't assume that every visitor to your site was using a browser that supported them.

    The solution was to encode the shopping cart state in the URL by making every link on the page include the session ID in the URL. It was crude but it got the job done.
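    For the curious, here is roughly what "put the session ID in every link" amounts to, written as a modern JavaScript sketch rather than anything resembling the 1994 code. The function name addSessionToLink, the query parameter name session, and the host store.example are all assumptions made up for the illustration.

```javascript
// Hypothetical sketch: carry a session ID on every link instead of
// relying on cookies. The parameter name "session" is an assumption.
function addSessionToLink(href, sessionId, base) {
  const url = new URL(href, base);          // resolve relative links
  url.searchParams.set("session", sessionId);
  return url.toString();
}

console.log(addSessionToLink("/catalog/singles", "abc123", "http://store.example/"));
// http://store.example/catalog/singles?session=abc123
```

    Every page the server generates would run each of its links through something like this, so the session follows the visitor from page to page.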

    The site went online, and soon they were taking orders from excited fans around the world. The company loved it, because they probably got to charge full price for the records (rather than losing a cut to the distributor). And my friend told me the deep dark secret of his system: "We do okay if you ask for standard shipping, but the real money is when somebody is impatient and insists on overnight shipping. Overcharging for shipping is where the real money is."

    (Note: Statements about business models for a primitive online shopping site from 1994 are not necessarily accurate today.)

  • The Old New Thing

    News flash: Big houses cost more to maintain


    In 2005, we learned that big houses cost more to heat. In 2006, we learned that big houses cost more to cool.

    But then the research into big houses seems to have stalled.

    No worries. The research journal The Wall Street Journal recently released a paper concluding that big houses cost more to maintain.

  • The Old New Thing

    Deleting elements from a JavaScript array and closing up the gaps


    Today's Little Program is an exercise that solves the following problem:

    Given a JavaScript array a and an unsorted array indices (possibly containing duplicates), calculate the result of deleting all of the elements from the original array as specified by the indices. For example, suppose a = ["alice", "bob", "charles", "david", "eve"] and indices = [2, 4, 2, 0].

    a[0] = "alice";
    a[1] = "bob";
    a[2] = "charles";
    a[3] = "david";
    a[4] = "eve";
    a.length = 5;

    The indices specify that elements 2 (charles), 4 (eve), 2 (charles again, redundant), and 0 (alice) should be deleted from the array, leaving bob and david.

    Now, if you had to delete only one element from the array, it is pretty simple:

    a.splice(n, 1);
    

    The trick with removing multiple elements is that deleting one element shifts the indices, which can throw off future calculations. One solution is to remove the highest-indexed element first; in other words, operate on the indices in reverse sorted order.

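    // Sort numerically, highest index first (the default sort compares elements as strings).
    indices.sort(function(x, y) { return y - x; }).forEach(function(n) { a.splice(n, 1); });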
    

    This technique does still suffer from the problem that if there are duplicates in the indices, extraneous elements get deleted by mistake.
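    One way around the duplicate problem is to throw out the duplicates before splicing. This is a minimal sketch, assuming a Set is available (ES2015 or later); the function name deleteBySplice is invented for this illustration. The comparator makes the sort numeric, since the default sort compares elements as strings.

```javascript
// Sketch: remove duplicate indices first, then splice from the
// highest index down so earlier deletions do not shift later ones.
function deleteBySplice(a, indices) {
  const unique = Array.from(new Set(indices));
  unique.sort(function (x, y) { return y - x; }); // numeric, descending
  unique.forEach(function (n) { a.splice(n, 1); });
  return a; // the array is modified in place and returned for convenience
}

const names = ["alice", "bob", "charles", "david", "eve"];
console.log(deleteBySplice(names, [2, 4, 2, 0])); // [ 'bob', 'david' ]
```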

    Another approach is to reinterpret the problem by focusing not on the deletion but on the survivors: Produce the array consisting of elements whose indices are not in the list of indices to be deleted.

    a = a.filter(function(e, i) { return indices.indexOf(i) < 0; });
    

    The above approach works well if the list of indices to be deleted is short, but it gets quite expensive if the list is long.
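    If the list of indices is long, one way to take the sting out of the repeated searching is to put the indices into a Set first, so that each membership test is a lookup rather than a scan. This is a sketch under the assumption that Set is available (ES2015 or later); the names deleteByFilter and doomed are mine.

```javascript
// Sketch: keep only the elements whose index is not marked for deletion.
function deleteByFilter(a, indices) {
  const doomed = new Set(indices); // duplicates collapse automatically
  return a.filter(function (e, i) { return !doomed.has(i); });
}

const names = ["alice", "bob", "charles", "david", "eve"];
console.log(deleteByFilter(names, [2, 4, 2, 0])); // [ 'bob', 'david' ]
```

    Unlike the splice version, this returns a new array and leaves the original untouched.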

    My approach is to use the fact that JavaScript arrays can be sparse. This is a side effect of the fact that JavaScript array indices are actually object properties; the only thing that makes arrays different from generic objects in a language-theoretic sense is the magic length property:

    • If a new property is added, and the property name is the stringification of a whole number that is greater than or equal to the current length, then the length is updated to the numeric value of the added property name, plus 1.
    • If the length property is modified programmatically, the new value must be a whole number, and all properties which are the stringification of a whole number greater than or equal to the new length are deleted.

    (See ECMA-262 sections 15.4, 15.4.5.1, and 15.4.5.2 for nitpicky details.)
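    Both rules are easy to watch in action. The snippet below is just an illustration with throwaway variable names.

```javascript
const a = ["alice", "bob"];

// Rule 1: adding a property at index 4 (>= the current length of 2)
// bumps the length to 4 + 1.
a[4] = "eve";
console.log(a.length); // 5
console.log(2 in a);   // false: indices 2 and 3 are holes, not values

// Rule 2: shrinking the length deletes every index >= the new length.
a.length = 1;
console.log(4 in a);   // false
console.log(a);        // [ 'alice' ]
```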

    The first step, then, is to delete all the indices that need to be deleted.

    indices.forEach(function(n) { delete a[n]; });
    

    When applied to our sample data, this leaves

    a[1] = "bob";
    a[3] = "david";
    a.length = 5;

    which gets printed in a rather goofy way: a = [, "bob", , "david", ].

    The next step is to close up the gaps. We take advantage of the fact that the Array enumeration methods operate on indices 0 through length - 1 and that they skip missing elements. Therefore, I can simply apply a dummy filter:

    a = a.filter(function() { return true; });
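    Putting the two steps together gives something like the following sketch. The wrapper function deleteIndices and the decision to work on a copy are my own additions for the sake of a self-contained example. Note that the dummy filter closes up every hole, including any that were already in the array, but it keeps elements that were explicitly set to undefined.

```javascript
// Sketch: punch holes at the doomed indices, then let filter()
// skip the holes while it builds the compacted result.
function deleteIndices(a, indices) {
  const copy = a.slice(); // work on a copy so the caller's array survives
  indices.forEach(function (n) { delete copy[n]; });
  return copy.filter(function () { return true; });
}

console.log(deleteIndices(["alice", "bob", "charles", "david", "eve"], [2, 4, 2, 0]));
// [ 'bob', 'david' ]

// An explicit undefined is a real element, not a hole, so it stays.
console.log(deleteIndices([undefined, "x"], []).length); // 2
```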
    

    Exercise: What is the difference (aside from performance) between a.push(1); and a = a.concat(1);? How is this question relevant to today's exercise?

  • The Old New Thing

    The scope of the C# checked/unchecked keyword is static, not dynamic


    C# has operators checked and unchecked to control the behavior of the language in the face of integer overflow. There are also checked and unchecked statements which apply the behavior to blocks of statements rather than single expressions.

    int x;
    
    x = checked(a + b); // evaluate with overflow checking
    x = unchecked(a + b); // evaluate without overflow checking
    
    checked {
     x = a + b; // evaluate with overflow checking
    }
    
    unchecked {
     x = a + b; // evaluate without overflow checking
    }
    

    Why, then, doesn't this code below raise an overflow exception?

    class Program {
     static int Multiply(int a, int b) { return a * b; }
     static int Overflow() { return Multiply(int.MaxValue, 2); }
    
     public static void Main() {
      System.Console.WriteLine(checked(Overflow()));
      checked {
        System.Console.WriteLine(Overflow());
      }
     }
    }
    

    (Mini-exercise: Why couldn't I have just written static int Overflow() { return int.MaxValue * 2; }?)

    The answer is that the scope of the checked or unchecked keyword is static, not dynamic. Whether a particular arithmetic operation is checked or unchecked is determined at compile time, not at run time. Since the multiplication in the Multiply function is not explicitly marked checked or unchecked, it uses the overflow context implied by your compiler options. Assuming you've left it at the default of unchecked, this means that there is no overflow checking in the Multiply function, even if you call it from a checked context. Because once you call the Multiply function, you have left the checked context.

    The C# language specification addresses this issue not once, not twice, but three times! (But it seems that some people miss it, possibly because there is too much documentation.)

    First, there is an explicit list of operations which are controlled by the checked or unchecked keyword:

    • The predefined ++ and -- unary operators, when the operand is of an integral type.
    • The predefined - unary operator, when the operand is of an integral type.
    • The predefined +, -, *, and / binary operators, when both operands are of integral types.
    • Explicit numeric conversions from one integral type to another integral type, or from float or double to an integral type.

    That's all. Note that function calls are not on the list.

    Now, that may have been a bit too subtle (documentation by omission), so the language specification goes ahead and calls it out.

    The checked and unchecked operators only affect the overflow checking context for those operations that are textually contained within the "(" and ")" tokens. The operators have no effect on function members that are invoked as a result of evaluating the contained expression.

    And then, in case you still didn't get it, the language specification even includes an example:

    class Test
    {
       static int Multiply(int x, int y) {
          return x * y;
       }
       static int F() {
          return checked(Multiply(1000000, 1000000));
       }
    }
    

    The use of checked in F does not affect the evaluation of x * y in Multiply, so x * y is evaluated in the default overflow checking context.

    (I wrote my example before consulting the language specification. That we both chose to use multiplication overflow is just a coincidence.)

    Even though the language specification says it three times, in three different ways, there are still people who are under the mistaken impression that the scope of the checked keyword is dynamic.

    Another thing you may have noticed is that the checked and unchecked keywords apply only to the built-in arithmetic operations on integers. They do not apply to overloaded operators or to operators on custom classes.

    Which makes sense if you think about it, because in order to define an overloaded operator or an operator on a custom class, you need to write the implementation as a separate function, in which case you have already left the scope of the checked and unchecked keywords.

    And now we are leaving the scope of CLR Week. You can remove your hands from your ears now.

  • The Old New Thing

    Customers not getting the widgets they paid for if they click too fast -or- In C#, the += operator is not merely not guaranteed to be atomic, it is guaranteed not to be atomic


    In the C# language, operation/assignment operators such as += are explicitly not atomic. But you already knew this, at least for properties.

    Recall that properties are syntactic sugar for method calls. A property declaration

    string Name { get { ... } set { ... } }
    

    is internally converted to the equivalent of

    string get_Name() { ... }
    void set_Name(string value) { ... }
    

    Accessing a property is similarly transformed.

    // o.Name = "fred";
    o.set_Name("fred");
    
    // x = o.Name;
    x = o.get_Name();
    

    Note that the only operations you can provide for properties are get and set. There is no way of customizing any other operations, like +=. Therefore, if you write

    o.Name += ", Jr.";
    

    the compiler has no choice but to convert it to

    o.set_Name(o.get_Name() + ", Jr.");
    

    If all you have is a hammer, everything needs to be converted to a nail.

    Since the read and write are explicitly decoupled, there is naturally a race condition here. The underlying property may change value in between the time you read the old value and the time you write the new value.

    But there are extra subtleties here. Let's dig in.

    The rules for operators like += are spelled out in Section 14.3.2: Compound Assignment:

    [T]he operation is evaluated as x = x op y, except that x is evaluated only once.

    (There is some discussion of what "evaluated only once" means, but that's not important here.)

    The subtleties lurking in that one sentence are revealed when you see how that sentence interacts with other rules in the language.

    Now, you might say, "Sure, it's not atomic, but my program is single-threaded, so this should never affect me."

    Actually, you can get bitten by this even in single-threaded programs. Let's try it:

    class Program
    {
     static int x = 0;
    
     static int f()
     {
      x = x + 10;
      return 1;
     }
    
     public static void Main()
     {
      x += f();
      System.Console.WriteLine(x);
     }
    }
    

    What does this program print?

    You might naïvely think that it prints 11 because x is incremented by 1 by Main and incremented by 10 in f.

    But it actually prints 1.

    What happened here?

    Recall that C# uses strict left-to-right evaluation order. Therefore, the order of operations in the evaluation of x += f() is

    1. Rewrite as x = x + f().
    2. Evaluate both sides of the = operator, left to right.
      a. Left hand side of assignment: Find the variable x.
      b. Right hand side of assignment:
        i. Evaluate both sides of the + operator, left to right.
          1. Evaluate x.
          2. Evaluate f().
        ii. Add together the results of steps 2b(i)1 and 2b(i)2.
    3. Take the result of step 2b(ii) and assign it to the variable x found in step 2a.

    The thing to notice is that a lot of things can happen between step 2b(i)1 (evaluating the old value of x), and step 3 (assigning the final result to x). Specifically, we shoved a whole function call in there: f().

    In our case, the function f() also modifies x. That modification takes place after we already captured the value of x in step 2b(i)1. When we get around to adding the values in step 2b(ii), we don't realize that the values are out of date.

    Let's step through this evaluation in our example.

    1. Rewrite as x = x + f().
    2. Evaluate both sides of the = operator, left to right.
      a. Left hand side of assignment: Find the variable x.
      b. Right hand side of assignment:
        i. Evaluate both sides of the + operator, left to right.
          1. Evaluate x. The result is 0.
          2. Evaluate f(). The result is 1. It also happens that x is modified as a side-effect.
        ii. Add together the results of steps 2b(i)1 and 2b(i)2. In this case, 0 + 1 = 1.
    3. Take the result of step 2b(ii) and assign it to the variable x found in step 2a. In this case, assign 1 to x.

    The modification to x that took place in f was clobbered by the assignment operation that completed the += sequence. And this behavior is not just in some weird "undefined behavior" corner of the language specification. The language specification explicitly requires this behavior.
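    As an aside, JavaScript evaluates compound assignment in the same left-to-right fashion, so the clobbering is easy to reproduce there too. The snippet below is a JavaScript analog of the program above, offered as an illustration rather than as anything the C# specification says.

```javascript
let x = 0;

function f() {
  x = x + 10; // side effect: modifies x while the += is in flight
  return 1;
}

// The old value of x (0) is read before f() runs,
// so the assignment stores 0 + 1 and the +10 is lost.
x += f();
console.log(x); // 1
```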

    Now, you might say, "Okay, I see your point, but this is clearly an unrealistic example, because nobody would write code this weird."

    Maybe you don't intentionally write code this weird, but you can do it accidentally. And this is particularly true if you are using the new await keyword, because an await means, "Hey, like, put my function on hold and do other stuff for a while. When the thing I'm awaiting is ready, then resume execution of my function." And that "do other stuff for a while" might change x.

    Suppose that you have a button in your application called Buy More. When the user clicks it, they can buy more widgets. Let's assume that the Buy­More­Async function returns the number of items bought. (If the user cancels the purchase it returns zero.)

    // Suppose the user starts with 100 widgets.
    
    async void BuyMoreButton_OnClick()
    {
     TotalWidgets += await BuyMoreAsync();
    
     Inventory.Text = string.Format("You have {0} widgets.",
                                    TotalWidgets);
    }
    
    async Task<int> BuyMoreAsync()
    {
     int quantity = QuickPurchase.IsChecked ? 1
                                            : await GetQuantityAsync();
     if (quantity != 0) {
      if (await DebitAccountAsync(quantity * PricePerWidget)) {
       return quantity;
      }
     }
     return 0; // user bought no items
    }
    

    You receive a bug report that you track back to the fact that Total­Widgets does not match the number of widgets purchased. It affects only people who checked the quick purchase box, and only people purchasing from overseas.

    Here's what is going on.

    The user clicks the Buy More button, and they have Quick Purchase enabled. The Buy­More­Async function tries to debit the account for the price of one widget.

    While waiting for the server to process the transaction, the user gets impatient and clicks Buy More a second time. This triggers a second task to debit the account for the price of one widget.

    Okay, so you now have two tasks running, each processing one of the clicks. In theory, the worst case is that the user accidentally buys two widgets, but in practice...

    The first Debit­Account­Async task completes, and Buy­More­Async returns 1, which is then added to the value of Total­Widgets at the time the button was clicked, as we discussed above. At the time the button was clicked the first time, the number of widgets was 100, so the total number of widgets is now 101.

    The second Debit­Account­Async task completes, and Buy­More­Async returns 1, which is then added to the value of Total­Widgets at the time the button was clicked, as we discussed above. When the button was clicked the second time, the number of widgets was still 100. We set the total widget count to 100 + 1 = 101.

    Result: The user paid for two widgets but got only one.

    The fix for this is to explicitly move waiting for the purchase to complete outside of the compound assignment.

     int quantity = await BuyMoreAsync();
     TotalWidgets += quantity;
    

    Now, the await is outside the compound assignment so that the value of Total­Widgets is not captured prematurely. When the purchase completes, we update Total­Widgets without interruption from any async operations.
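    The same before-and-after can be simulated in JavaScript, whose await interacts with compound assignment in the same way. Everything in this sketch is a stand-in: buyMoreAsync fakes a quick purchase of one widget with a short timer, and the store object plays the part of the widget count.

```javascript
// Stand-in for the real purchase: always "buys" one widget after a delay.
function buyMoreAsync() {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(1); }, 10);
  });
}

// Buggy: the old total is read before the await suspends the function.
async function buggyClick(store) {
  store.totalWidgets += await buyMoreAsync();
}

// Fixed: wait first, then do the read-add-write without interruption.
async function fixedClick(store) {
  const quantity = await buyMoreAsync();
  store.totalWidgets += quantity;
}

// Simulate an impatient user clicking twice before either purchase completes.
async function twoQuickClicks(onClick) {
  const store = { totalWidgets: 100 };
  await Promise.all([onClick(store), onClick(store)]);
  return store.totalWidgets;
}

twoQuickClicks(buggyClick).then(function (n) { console.log("buggy:", n); }); // buggy: 101
twoQuickClicks(fixedClick).then(function (n) { console.log("fixed:", n); }); // fixed: 102
```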

    (You should probably also fix the program so that it disables the Buy More button while a transaction is in progress, to avoid the "impatient user ends up making an accidental double purchase" problem. The above fix merely gets rid of the "user pays for two items and gets only one" problem.)

    Like closing around the loop control variable, this is the sort of subtle change that should be well-commented so that somebody doesn't "fix" it in a well-intentioned but misguided attempt to remove unnecessary variables. The purpose of the variable is not to break an expression into two but rather to force a particular order of evaluation: You want to finish the purchase operation before starting to update the widget count.

  • The Old New Thing

    Keep your eye on the code page: C# edition (the mysterious third code page)


    A customer was having trouble manipulating the console from a C# program:

    We found that C# can read only ASCII data from the console. If we try to read non-ASCII data, we get garbage.

    using System;
    using System.Text;
    using System.Runtime.InteropServices;
    
    class Program
    {
      [StructLayout(LayoutKind.Sequential)]
      struct COORD
      {
        public short X;
        public short Y;
      }
    
      [DllImport("kernel32.dll", SetLastError=true)]
      static extern IntPtr GetStdHandle(int nStdHandle);
    
      const int STD_OUTPUT_HANDLE = -11;
    
      [DllImport("kernel32.dll", SetLastError=true)]
      static extern bool ReadConsoleOutputCharacter(
        IntPtr hConsoleOutput,
        [Out] StringBuilder lpCharacter,
        uint nLength,
        COORD dwReadCoord,
        out uint lpNumberOfCharsRead);
    
      public static void Main()
      {
        // Write a string to a fixed position
        System.Console.Clear();
        System.Console.WriteLine("\u00C5ngstr\u00f6m");
    
        // Read it back
        COORD coord  = new COORD { X = 0, Y = 0 };
        StringBuilder sb = new StringBuilder(8);
        uint nRead = 0;
        ReadConsoleOutputCharacter(GetStdHandle(STD_OUTPUT_HANDLE),
                                   sb, (uint)sb.Capacity, coord, out nRead);
        // Trim off any unused excess.
        sb.Remove((int)nRead, sb.Length - (int)nRead);
    
        // Show what we read
        System.Console.WriteLine(sb);
      }
    }
    

    Observe that this program is unable to read the Å and ö characters. They come back as garbage.

    Although there are three code pages that have special treatment in Windows, the CLR gives access to only two of them via Dll­Import.

    • CharSet.Ansi = CP_ACP
    • CharSet.Unicode = Unicode (which in Windows means UTF-16LE unless otherwise indicated).

    Unfortunately, the console traditionally uses the OEM code page.

    Since the Dll­Import did not specify a character set, the CLR defaults (unfortunately) to Char­Set.Ansi. Result: The Read­Console­Output­Character function stores its results in CP_OEMCP, the CLR treats the buffer as if it were CP_ACP, and the result is confusion.

    The narrow-minded fix is to try to fix the mojibake by taking the misconverted Unicode string, converting it to bytes via the ANSI code page, then converting the bytes to Unicode via the OEM code page.
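    To see why that round trip undoes the damage, here is a toy model in JavaScript. It assumes the OEM code page is 437 and the ANSI code page is 1252 (a common pairing on US-English systems), and the two tables are hand-made excerpts covering only the two bytes this example needs. The helper names decode and encode are mine.

```javascript
// Hand-made excerpts of two 8-bit code pages (assumption: only the
// bytes needed for this demonstration are listed).
const OEM_437 = new Map([[0x8F, "\u00C5"], [0x94, "\u00F6"]]);   // Å, ö
const ANSI_1252 = new Map([[0x8F, "\u008F"], [0x94, "\u201D"]]); // control char, right double quote

// Bytes below 0x80 are plain ASCII in both code pages.
function decode(bytes, table) {
  return bytes.map(function (b) {
    return b < 0x80 ? String.fromCharCode(b) : table.get(b);
  }).join("");
}

function encode(text, table) {
  const reverse = new Map(Array.from(table, function (pair) { return [pair[1], pair[0]]; }));
  return Array.from(text, function (ch) {
    const code = ch.charCodeAt(0);
    return code < 0x80 ? code : reverse.get(ch);
  });
}

const consoleBytes = encode("\u00C5ngstr\u00F6m", OEM_437);   // what the console hands back
const garbled = decode(consoleBytes, ANSI_1252);              // misread as ANSI
const repaired = decode(encode(garbled, ANSI_1252), OEM_437); // back to bytes, reread as OEM
console.log(repaired === "\u00C5ngstr\u00F6m"); // true
```

    The repair works only as long as every byte survives the trip through the wrong code page, which is one more reason to regard it as the narrow-minded fix.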

    The better fix is simply to avoid the 8-bit code page issues entirely and say you want to use Unicode.

      [DllImport("kernel32.dll", SetLastError=true, CharSet=CharSet.Unicode)]
      static extern bool ReadConsoleOutputCharacter(...);
    