• The Old New Thing

    Once you go input-idle, your application is deemed ready to receive DDE messages


    Feel free to stop using DDE.

    There was one customer who confessed that they were still using DDE, and they asked for help debugging a DDE problem. They found that sometimes, when their application was launched for DDE, it never received the WM_DDE_INITIATE message. Instead, the Shell­Execute function returned ERROR_DDE_FAIL. If launched from Explorer, the error message shown to the user was "There was a problem sending the command to the program."

    It took a long time to figure out what was going on, and there were a number of dead ends, but I'll cut to the chase: The problem was that one of the features they added to their program included code that ran during process startup, and that code pumped messages as part of its initialization. Message pumping was not expected there, which is why it took so long to isolate.

    The Wait­For­Input­Idle function was created for DDE. It's how a DDE client determines that the DDE server is ready to accept commands. And as soon as any thread in your process goes input-idle, the entire process is declared to be input-idle, and you end up becoming eligible to receive DDE messages, even if you're not really ready for them.

    In the case of this program, the accidental message pump caused the application to be considered ready to accept DDE commands even though the main DDE server hadn't gotten off the ground yet. The initiation message went to the splash screen, and the splash screen said, "Why are you bothering me with stupid DDE messages? I'm just a splash screen!"

    TL;DR: If your application includes a DDE server, make sure not to pump messages until your DDE server is ready to receive commands.

  • The Old New Thing

    What happened to the Shut Down menu in classic Task Manager?

    The great thing about open comments is that anybody can use them to introduce their favorite gripe as long as it shares at least four letters of the alphabet in common with the putative topic of the base article.

    xpclient "asks" why the Shut Down menu was removed from Task Manager. I put the word "asks" in quotation marks, because it's really a complaint disguised as a question. As in "Why do you guys suck?"

    The first thing to understand is that classic Task Manager has been in a state of sustained engineering since Windows 2000. In other words, the component is there, but there is no serious interest in improving it. (That's why it wasn't updated to call Enable­Theme­Dialog­Texture on its pages.) It's not like there's a Task Manager Team of five people permanently dedicated to making Task Manager as awesome as possible for every release of Windows. Rather, the responsibility for maintaining Task Manager is sort of tacked onto somebody whose primary responsibilities are for other parts of the system.

    There are a lot of Windows components in this state of "internal sustained engineering." The infamous "Install font" dialog, for example. The responsibility for maintaining these legacy components is spread out among the product team so that on average, teams are responsible both for cool, exciting things and some not-so-cool, legacy things.

    (On the other hand, according to xpclient, an app must be serving its users really well if it hasn't changed much, so I guess that Install font dialog is the best dialog box in all of Windows at serving its users, seeing as it hasn't changed since 1995.)

    The engineering budget for these components in internal sustained engineering is kept to a minimum, both because there is no intention of adding new features, and also because the components are so old that there is unlikely to be any significant work necessary in the future.

    Every so often, some work becomes necessary, and given that the engineering interest and budget are both very low, the simplest way out when faced with a complicated problem in a rarely-used feature is simply to remove the rarely-used feature.

    And that's what happened to the Shut Down menu. (Note that it's two words "Shut down" since it is being used as a verb, not a noun.) Given the changes to power management in Windows Vista, the algorithm used by Task Manager was no longer accurate. And instead of keeping Task Manager updated with every change, the Shutdown user interface design team agreed to give the Task Manager engineering team a break and say, "Y'know, the Shut Down menu on Task Manager is rarely-used, so we'll let you guys off the hook on this one, so you don't keep getting weekly requests from us to change the way Shut Down works."

    I remember, back in the days of Windows XP, seeing the giant spreadsheet used by the person responsible for overall design of the Shutdown user interface. It tracked the gazillion group policies, user settings, and system configurations which all affect how shutting down is presented to the user. Removing the column for Task Manager from the spreadsheet probably was met with a huge sigh of relief, not just from the Task Manager engineering team, but also from the person responsible for the spreadsheet.

    Remember, engineering is about trade-offs. If you decide to spend more effort making Task Manager awesome, you lose the ability to expend that effort on something else. (And given that you are expending effort in a code base that is relatively old and not fresh in the minds of the people who would be making those changes, you also increase the likelihood that you're going to introduce a bug along the way.)

  • The Old New Thing

    10 is the new 6


    While it may no longer be true that everything at Microsoft is built using various flavors of Visual C++ 5.0, 6.0, and 7.0, there is still a kernel of truth in it: A lot of customers are still using Visual C++ 6.0.

    That's why the unofficial slogan for Visual C++ 2010 was 10 is the new 6. Everybody on the team got a T-shirt with the slogan (because you don't have a product until you have a T-shirt).

  • The Old New Thing

    Who would ever write a multi-threaded GUI program?


    During the development of Windows 95, the user interface team discovered that a component provided by another team didn't work well under multi-threaded conditions. It was documented that the Initialize function had to be the first call made by a thread into the component.

    The user interface team discovered that if one thread called Initialize, and then used the component, then everything worked great. But if a second thread called Initialize, the component crashed whenever the second thread tried to use it.

    The user interface team reported this bug back to the team that provided the component, and some time later, an updated version of the component was delivered.

    Technically, the bug was fixed. When the second thread called Initialize, the function now failed with ERROR_NOT_SUPPORTED.

    The user interface team went back to the team that provided the component. "It's nice that your component detects that it is being used by a multi-threaded client and fails the second thread's attempt to initialize it. But given that design, how can a multi-threaded client use your component?"

    The other team's reply was, "It doesn't matter. Nobody writes multi-threaded GUI programs."

    The user interface team had to politely reply, "Um, we are. The next version of Windows will be built on a multi-threaded shell."

    The other team said, "Oh, um, we weren't really expecting that. Hang on, we'll get back to you."

    The idea that somebody might write a multi-threaded program that used their component caught them by surprise, and they had to come up with a new design of their component that supported multiple threads in a clean way. It was a lot of work, but they came through, and Windows 95 could continue with its multi-threaded shell.

  • The Old New Thing

    Enumerating bit strings with a specific number of bits set (binomial coefficients strike again)


    Today's Little Program prints all bit strings of length n subject to the requirement that the string have exactly k 1-bits. For example, all bit strings of length 4 with 2 bits set are 0011, 0101, 0110, 1001, 1010, and 1100. Let's write BitString(n, k) to mean all the bit strings of length n with exactly k bits set.

    Let's look at the last bit of a typical member of BitString(n, k). If it is a zero, then removing it leaves a string one bit shorter but with the same number of bits set. Conversely every BitString(n − 1, k) string can be extended to a BitString(n, k) string by adding a zero to the end.

    If the last bit is a one, then removing it leaves a bit string of length n − 1 with exactly k − 1 bits set, and conversely every bit string of length n − 1 with exactly k − 1 bits set can be extended to a bit string of length n with exactly k bits set by adding a one to the end.

    Therefore, our algorithm goes like this:

    • Handle base cases.
    • Otherwise,
      • Recursively call BitString(n − 1, k) and add a 0 to the end.
      • Recursively call BitString(n − 1, k − 1) and add a 1 to the end.

    This algorithm may look awfully familiar. The overall recursive structure is the same as enumerating subsets with binomial coefficients; we just decorate the results differently.

    That's because this problem is the same as the problem of enumerating subsets. You can think of the set bits as selecting which elements get put into the subset.

    It's not surprising, therefore, that the resulting code is identical except for how we report the results. (Instead of generating arrays, we generate strings.)

    function repeatString(s, n) {
     return new Array(n+1).join(s);
    }

    function BitString(n, k, f) {
     if (k == 0) { f(repeatString("0", n)); return; }
     if (n == 0) { return; }
     BitString(n-1, k, function(s) { f(s + "0"); });
     BitString(n-1, k-1, function(s) { f(s + "1"); });
    }

    If k is zero, then that means we are looking for strings of length n that contain no bits set at all. There is exactly one of those, namely the string consisting of n zeroes.

    If k is nonzero but n is zero, then that means we want strings of length zero with some bits set. That's not possible, so we return without generating anything.

    Finally, we handle the recursive case by generating the shorter strings and tacking on the appropriate digits.
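The same recursion carries over directly to other languages. Here is a sketch of my own C++ translation (not code from the original article) that accumulates the chosen bits in a suffix and collects the finished strings into a vector instead of invoking a callback; it produces the strings in the same order as the JavaScript version:

```cpp
#include <string>
#include <vector>

// Append to "out" every bit string of length n with exactly k bits set.
// "suffix" holds the bits already chosen for the tail of the string.
void BitStrings(int n, int k, const std::string& suffix,
                std::vector<std::string>& out)
{
    if (k == 0) { out.push_back(std::string(n, '0') + suffix); return; }
    if (n == 0) return; // cannot set k > 0 bits in an empty string
    BitStrings(n - 1, k,     "0" + suffix, out); // last bit is a zero
    BitStrings(n - 1, k - 1, "1" + suffix, out); // last bit is a one
}

// Convenience wrapper that returns all the strings at once.
std::vector<std::string> BitStrings(int n, int k)
{
    std::vector<std::string> out;
    BitStrings(n, k, "", out);
    return out;
}
```

For example, `BitStrings(4, 2)` produces the six strings from the introduction, starting with "1100" (the all-ones prefix comes out first because the recursion explores the "last bit is zero" branch to its base case before backtracking).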

    Since this is the same algorithm as subset generation, we should be able to write one in terms of the other:

    function BitString(n, k, f) {
     Subsets(n, k, function(s) {
      var str = "";
      for (var i = 1; i <= n; i++) {
       str += s.indexOf(i) >= 0 ? "1" : "0";
      }
      f(str);
     });
    }
  • The Old New Thing

    Non-classical processor behavior: How doing something can be faster than not doing it


    Consider the following program:

    #include <windows.h>
    #include <stdlib.h>
    #include <stdio.h>

    int array[10000];

    int countthem(int boundary)
    {
     int count = 0;
     for (int i = 0; i < 10000; i++) {
      if (array[i] < boundary) count++;
     }
     return count;
    }

    int __cdecl wmain(int, wchar_t **)
    {
     for (int i = 0; i < 10000; i++) array[i] = rand() % 10;

     for (int boundary = 0; boundary <= 10; boundary++) {
      LARGE_INTEGER liStart, liEnd;
      QueryPerformanceCounter(&liStart);
      int count = 0;
      for (int iterations = 0; iterations < 100; iterations++) {
       count += countthem(boundary);
      }
      QueryPerformanceCounter(&liEnd);
      printf("count=%7d, time = %I64d\n",
             count, liEnd.QuadPart - liStart.QuadPart);
     }
     return 0;
    }

    The program generates a lot of random integers in the range 0..9 and then counts how many are less than 0, less than 1, less than 2, and so on. It also prints how long the operation took in QPC units. We don't really care how big a QPC unit is; we're just interested in the relative values. (We print the number of items found merely to verify that the result is close to the expected value of boundary * 100000.)

    Here are the results:

    boundary    count   time
           0        0   1869
           1   100000   5482
           2   200800   8152
           3   300200  10180
           4   403100  11982
           5   497400  12092
           6   602900  11029
           7   700700   9235
           8   797500   7051
           9   902500   4537
          10  1000000   1864

    To the untrained eye, this chart is strange. Here's the naïve analysis:

    When the boundary is zero, there is no incrementing at all, so the entire running time is just loop overhead. You can think of this as our control group. We can subtract 1869 from the running time of every column to remove the loop overhead costs. What remains is the cost of running count increment instructions.

    The cost of a single increment operation is highly variable. At low boundary values, it is around 0.03 time units per increment. But at high boundary values, the cost drops to one tenth that.

    What's even weirder is that once the count crosses 600,000, each addition of another 100,000 increment operations makes the code run faster, with the extreme case when the boundary value reaches 10, where we run faster than if we hadn't done any incrementing at all!

    How can the running time of an increment instruction be negative?

    The explanation for all this is that CPUs are more complicated than the naïve analysis realizes. We saw earlier that modern CPUs contain all sorts of hidden variables. Today's hidden variable is the branch predictor.

    Executing a single CPU instruction takes multiple steps, and modern CPUs kick off multiple instructions in parallel, with each instruction at a different stage of execution, a technique known as pipelining.

    Conditional branch instructions are bad for pipelining. Think about it: When a conditional branch instruction enters the pipeline, the CPU doesn't know whether the condition will be true when the instruction reaches the end of the pipeline. Therefore, it doesn't know what instruction to feed into the pipeline next.

    Now, it could just sit there and let the pipeline sit idle until the branch/no-branch decision is made, at which point it now knows which instruction to feed into the pipeline next. But that wastes a lot of pipeline capacity, because it will take time for those new instructions to make it all the way through the pipeline and start doing productive work.

    To avoid wasting time, the processor has an internal branch predictor which remembers the recent history of which conditional branches were taken and which were not taken. The fanciness of the branch predictor varies. Some processors merely assume that a branch will go the same way that it did the last time it was encountered. Others keep complicated branch history and try to infer patterns (such as "the branch is taken every other time").

    When a conditional branch is encountered, the branch predictor tells the processor which instructions to feed into the pipeline. If the branch prediction turns out to be correct, then we win! Execution continues without a pipeline stall.

    But if the branch prediction turns out to be incorrect, then we lose! All of the instructions that were fed into the pipeline need to be recalled and their effects undone, and the processor has to go find the correct instructions and start feeding them into the pipeline.

    Let's look at our little program again. When the boundary is 0, the result of the comparison is always false. Similarly, when the boundary is 10, the result is always true. In those cases, the branch predictor can reach 100% accuracy.

    The worst case is when the boundary is 5. In that case, half of the time the comparison is true and half of the time the comparison is false. And since we have random data, fancy historical analysis doesn't help any. The predictor is going to be wrong half the time.

    Here's a tweak to the program: Change the line

         if (array[i] < boundary) count++;

    to

         count += (array[i] < boundary) ? 1 : 0;

    This time, the results look like this:

    boundary    count   time
           0        0   2932
           1   100000   2931
           2   200800   2941
           3   300200   2931
           4   403100   2932
           5   497400   2932
           6   602900   2932
           7   700700   2999
           8   797500   2931
           9   902500   2932
          10  1000000   2931

    The execution time is now independent of the boundary value. That's because the optimizer was able to remove the branch from the ternary expression:

    ; on entry to the loop, ebx = boundary
        mov edx, offset array ; start at the beginning of the array
    $LL3:
        xor ecx, ecx    ; start with zero
        cmp [edx], ebx  ; compare array[i] with boundary
        setl cl         ; if less than boundary, then set cl = 1
        add eax, ecx    ; accumulate result in eax
        add edx, 4      ; advance to the next element
        cmp edx, offset array + 40000
        jl $LL3         ; loop until end of array

    Since there are no branching decisions in the inner loop aside from the loop counter, there is no need for a branch predictor to decide which way the comparison goes. The same code executes either way.
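You can observe the same branch-elimination idea at the source level without dropping to assembly. The sketch below is mine, not the article's code: it implements both versions of the counting loop side by side. They always produce the same answer; the difference is that the second adds the 0-or-1 result of the comparison directly, giving the compiler a natural opening to emit branch-free setcc-style code.

```cpp
#include <cstdlib>

int array_data[10000];

// Branchy version: the conditional increment becomes a conditional jump,
// which the branch predictor must guess on every iteration.
int count_branchy(int boundary)
{
    int count = 0;
    for (int i = 0; i < 10000; i++) {
        if (array_data[i] < boundary) count++;
    }
    return count;
}

// Branch-free version: the comparison result (0 or 1) is added directly,
// so there is nothing for the branch predictor to guess.
int count_branchless(int boundary)
{
    int count = 0;
    for (int i = 0; i < 10000; i++) {
        count += (array_data[i] < boundary) ? 1 : 0;
    }
    return count;
}

// Fill the array with random digits 0..9, as in the article's program.
void fill_array(unsigned seed)
{
    std::srand(seed);
    for (int i = 0; i < 10000; i++) array_data[i] = std::rand() % 10;
}
```

With optimizations enabled, timing the two functions over random data reproduces the shape of the two charts above: the branchy version is fast at boundaries 0 and 10 and slow near 5, while the branchless version is flat.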

    Exercise: Why are the counts exactly the same for both runs, even though the dataset is random?

  • The Old New Thing

    Can we continue to call FindNextFile() with the same handle that returned an error last time?


    A customer wanted to know whether it was okay to call Find­Next­File with the same handle that returned an error last time. In other words, consider the following sequence of events:

    1. h = Find­First­File(...); succeeds
    2. Find­Next­File(h, ...); fails
    3. Find­Next­File(h, ...); ??? profit ???

    The customer elaborated:

    Suppose that the directory contains four files, A, B, C, and D. We expect the following:

    • Find­First­File returns A
    • Find­Next­File returns B
    • Find­Next­File fails (C is selected but an error occurred)
    • Find­Next­File returns D ← is this expected?

    After Find­Next­File returns an error, can we continue to search with the same handle? Or should we close the handle and get a new one from Find­First­File? If it depends on the type of error that occurred, the customer would like to know more details about when the search can be continued and when it cannot.

    We asked the customer what problem they're encountering that is causing them to ask this strange question. The customer replied, "Sometimes we get the error ERROR_FILE_CORRUPT or ERROR_INVALID_FUNCTION, but we don't know what end-user configurations are causing those error codes. We would like to know whether we can continue to use Find­Next­File in these two cases."

    The assumption "C is selected but an error occurred" is already wrong. The error might not have selected C. The error may have occurred before C was selected. (For example, maybe the network cable was unplugged, so the server doesn't even know that we tried to select C.) Or the error may result in C and D both being skipped. Since an error occurred, any of these things may have happened.

    There is no value in trying to continue to use a find handle that is in an error state because you cannot reason about it. Maybe it got stuck in a permanent error state (the user removed the USB drive). Maybe it is a transient error state (somebody finds the network cable and plugs it back in). It's like asking, "If something is broken, can I expect it to continue working?"

    Even closing the handle and restarting the enumeration may not succeed. For example, as long as the drive or network cable remains unplugged, your enumeration will fail. And it might be a repeatable error state due to drive corruption which causes enumerations always to fail at the same place.

    (My guess is that ERROR_FILE_CORRUPT is the case of drive corruption, and ERROR_INVALID_FUNCTION is some sort of device driver error state, perhaps because the device was unplugged.)

    You should just accept that you cannot enumerate the contents and proceed accordingly.

  • The Old New Thing

    Why do network servers sometimes show up as a Recycle Bin Location?


    A customer wanted to know why some of their users showed network servers among the locations shown in the Recycle Bin property sheet.

    Answer: Because those users are using Folder Redirection.

    In particular, if you redirect the My Documents folder, then a network Recycle Bin location is created to hold the files deleted from My Documents.

    The Recycle Bin folder in the user interface shows the union of all the recycled files in the individual Recycle Bin locations.

  • The Old New Thing

    Microspeak: Brownbag


    Remember, Microspeak is not merely for jargon exclusive to Microsoft, but it's jargon that you need to know.

    The term brownbag (always one word, accent on the first syllable) refers to a presentation given during lunch. The attendees are expected to bring their lunch to the meeting room and eat while they listen to the presentation.

    A brownbag could be a one-off presentation, or it could be a regular event. The speaker could be an invited guest, or the presenters may come from within the team. In general, the purpose of a brownbag is to familiarize the audience with a new concept or to share information with the rest of the team. Sometimes attendance is optional, sometimes attendance is mandatory, and sometimes attendance is optional but strongly recommended, which puts it in the murky category of mandatory optional.

    You can learn more about each team's plans in brownbags that we will kick off the week of 2/17 and continue regularly through the month.
    Are you going to the brownbag? I'm heading to the cafeteria, want to come along?

    It is common for the slides accompanying a brownbag to be placed on a Web site for future reference. Sometimes the presentation is recorded as well.

    The term brownbag is sometimes extended to mean any presentation which introduces a group of people to a new concept, whether it occurs at lunch or not.

    Virtual brownbag on widget coloring.

    That's the (redacted) subject of a message I sent out to our team. The message described the process you have to go through in order to get a widget coloring certificate. It could have been a brownbag but I was too lazy to book a room for it, so I created a virtual brownbag.

    Due to scheduling conflicts, we will have to move the presentation to Friday at noon. We apologize for the last-minute change. This is now really a brownbag, so grab your lunch in the cafeteria and join us for a great talk and discussion!

    The above is another example of how the term brownbag was applied to something that, at least originally, was not a lunch meeting.

  • The Old New Thing

    Improving the performance of CF_HDROP by providing file attribute information


    The CF_HDROP clipboard format is still quite popular, despite the limitation that it can represent only files in the file system. You can't use it to represent virtual content, for example.

    For all of you still using CF_HDROP, you can improve the performance of drag/drop operations by adding a little more information to your data object.

    Observe that the CF_HDROP clipboard format is just a list of paths. Some drop targets care about whether the paths refer to directories or to files, and since CF_HDROP does not provide this information, the drop targets are forced to access the disk to get the answer. (This can be expensive for network locations.)

    To help this case, you can add a CFSTR_FILE_ATTRIBUTES_ARRAY to your data object. This contains the file attribute information for the items in your CF_HDROP, thereby saving the drop target the cost of having to go find them.

    Take our tiny drag-drop sample and make the following changes:

    class CTinyDataObject : public IDataObject
    {
      ...
      enum {
        // DATA_TEXT,
        DATA_HDROP,
        DATA_INVALID = -1,
      };
      ...
    };

    CTinyDataObject::CTinyDataObject() : m_cRef(1)
    {
    }

    struct {
     DROPFILES df;
     TCHAR szFile[ARRAYSIZE(TEXT("C:\\Something.txt\0"))];
    } const c_hdrop = {
        { sizeof(DROPFILES),            // pFiles: list starts after the header
          { 0, 0 },                     // pt (unused)
          FALSE,                        // fNC
          sizeof(TCHAR) == sizeof(WCHAR), // fUnicode
        },
        TEXT("C:\\Something.txt\0"),
    };

    HRESULT CTinyDataObject::GetData(FORMATETC *pfe, STGMEDIUM *pmed)
    {
      ZeroMemory(pmed, sizeof(*pmed));
      switch (GetDataIndex(pfe)) {
      case DATA_HDROP:
        pmed->tymed = TYMED_HGLOBAL;
        return CreateHGlobalFromBlob(&c_hdrop, sizeof(c_hdrop),
                                  GMEM_MOVEABLE, &pmed->hGlobal);
      }
      return DV_E_FORMATETC;
    }

    Okay, let's look at what we did here.

    First, we make our data object report a CF_HDROP. We then declare a static DROP­FILES structure which we use for all of our drag-drop operations. (Of course, in real life, you would generate it dynamically, but this is just a Little Program.)

    That's our basic program that drags a file.

    Note that you are much better off letting the shell create the data object, since that data object will contain much richer information (and this entire article would not be needed). Here's a sample program which uses the Get­UI­Object­Of­File function to do this in just a few lines. It's much shorter than having to cook up this CTiny­Data­Object class. I'm doing it this way on the assumption that your program is deeply invested in the less flexible CF_HDROP format, so changing from CF_HDROP to some other format would be impractical.

    Okay, so that's the program we're starting from. Let's add support for precomputed attributes.

    class CTinyDataObject : public IDataObject
    {
      ...
      enum {
        // DATA_TEXT,
        DATA_HDROP,
        DATA_ATTRIBUTES,
        DATA_INVALID = -1,
      };
      ...
    };

    CTinyDataObject::CTinyDataObject() : m_cRef(1)
    {
    }

    FILE_ATTRIBUTES_ARRAY const c_attr1 = {
     1, // cItems
     FILE_ATTRIBUTE_ARCHIVE, // OR of attributes
     FILE_ATTRIBUTE_ARCHIVE, // AND of attributes
     { FILE_ATTRIBUTE_ARCHIVE }, // the file attributes
    };

    HRESULT CTinyDataObject::GetData(FORMATETC *pfe, STGMEDIUM *pmed)
    {
      ZeroMemory(pmed, sizeof(*pmed));
      switch (GetDataIndex(pfe)) {
      case DATA_HDROP:
        pmed->tymed = TYMED_HGLOBAL;
        return CreateHGlobalFromBlob(&c_hdrop, sizeof(c_hdrop),
                                  GMEM_MOVEABLE, &pmed->hGlobal);
      case DATA_ATTRIBUTES:
        pmed->tymed = TYMED_HGLOBAL;
        return CreateHGlobalFromBlob(&c_attr1, sizeof(c_attr1),
                                  GMEM_MOVEABLE, &pmed->hGlobal);
      }
      return DV_E_FORMATETC;
    }

    Okay, let's look at what we did here.

    We added a new data format, CFSTR_FILE_ATTRIBUTES_ARRAY, and we created a static copy of the FILE_ATTRIBUTES_ARRAY variable-length structure that contains the attributes of our one file. Of course, in a real program, you would generate the structure dynamically. Note that I use a sneaky trick here: Since the FILE_ATTRIBUTES_ARRAY ends with an array of length 1, and I happen to need exactly one item, I can just declare the structure as-is and fill in the one slot. (If I had more than one item, then I would have needed more typing.)

    To make things easier for the consumers of the FILE_ATTRIBUTES_ARRAY, the structure also asks you to report the logical OR and logical AND of all the file attributes. This allows quick answers to questions like "Is everything in this CF_HDROP a file?" or "Is anything in this CF_HDROP write-protected?" Since we have only one file, the calculation of these OR and AND values is nearly trivial.
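Generating the aggregates dynamically is a simple fold: start the OR accumulator at zero and the AND accumulator at all-ones, then mix in each file's attributes. Here's a sketch of mine in plain C++, not code from the article; the constants carry the documented values of the corresponding Win32 FILE_ATTRIBUTE_* flags so the snippet stays self-contained:

```cpp
#include <cstdint>
#include <vector>

// Values match the documented Win32 FILE_ATTRIBUTE_* constants.
const uint32_t kAttrReadOnly  = 0x00000001; // FILE_ATTRIBUTE_READONLY
const uint32_t kAttrDirectory = 0x00000010; // FILE_ATTRIBUTE_DIRECTORY
const uint32_t kAttrArchive   = 0x00000020; // FILE_ATTRIBUTE_ARCHIVE

struct AttributeSummary {
    uint32_t orOfAll;   // answers "is anything in the list ...?"
    uint32_t andOfAll;  // answers "is everything in the list ...?"
};

// Fold the per-item attributes into the two aggregate values that
// a FILE_ATTRIBUTES_ARRAY carries alongside the raw attribute array.
AttributeSummary Summarize(const std::vector<uint32_t>& attrs)
{
    AttributeSummary s = { 0, ~0u }; // OR starts empty, AND starts full
    for (uint32_t a : attrs) {
        s.orOfAll  |= a;
        s.andOfAll &= a;
    }
    return s;
}
```

For a list containing one archive file and one read-only archive file, the OR has the read-only bit set (something is write-protected) while the AND does not (not everything is), and "is everything a file?" is just a check that the directory bit is clear in the OR.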

    Okay, so there isn't much benefit to adding file attributes to a drag of a single file from the local hard drive, since the local hard drive is pretty fast, and the file attributes may very well be cached. But if you've placed thousands of files from a network drive onto the clipboard, this shortcut can save a lot of time. (That was in fact the customer problem that inspired this Little Program.)
