• The Old New Thing

    What is the default version of a header file?


    The general rule with Windows header files is that if you don't specify which version of the header file you want, you get the latest version. For example, if you have the Windows XP Platform SDK header files and you #include <windows.h>, you're going to get the Windows XP function prototypes, the Windows XP structures, the Windows XP flags, all that stuff. And unless you're careful, the program you get as a result will most likely run only on Windows XP.

    If you call a function that is new for Windows XP, then your program won't run on earlier versions of Windows because the import can't be resolved.†

    If you use a structure that changed for Windows XP, then your program won't run on earlier versions of Windows because the structure size will be wrong.

    Even if the structure size didn't change, using a flag that was introduced in Windows XP will create difficulties for your program when run on earlier versions of Windows because those earlier versions don't support the flag you're passing. Depending on how the function in question was written, it may ignore the "flag from the future" or it may reject it as invalid.
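
    If you do want to call an XP-only function opportunistically while still running on older versions of Windows, the usual trick is to resolve it at run time instead of linking to it, so no import entry is created. Here's a minimal sketch of that approach; SetDllDirectoryW is used purely as a stand-in for whichever newer export you need.

    #include <windows.h>
    
    typedef BOOL (WINAPI *SETDLLDIRECTORYW)(LPCWSTR);
    
    void CallNewFunctionIfAvailable(void)
    {
     // Resolve the export at run time; the program still loads on
     // versions of Windows where the function doesn't exist.
     SETDLLDIRECTORYW pfnSetDllDirectoryW = (SETDLLDIRECTORYW)
         GetProcAddress(GetModuleHandle(TEXT("kernel32.dll")),
                        "SetDllDirectoryW");
     if (pfnSetDllDirectoryW) {
      pfnSetDllDirectoryW(L""); // the function exists; safe to call
     } else {
      // Older version of Windows; fall back or do without.
     }
    }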

    If you want your program to run on older versions of Windows, you have a few options. First, you can explicitly "downgrade" your header file by defining an appropriate symbol or symbols before including the windows.h header file.

    #define WINVER         0x0400
    #define _WIN32_WINNT   0x0400
    #define _WIN32_WINDOWS 0x0400
    #define _WIN32_IE      0x0400
    
    #include <windows.h>
    #include <commctrl.h>
    #include <shlobj.h>
    ...
    

    Oh yuck, now we have the messy world of "So what's the difference between _WIN32_WINNT, _WIN32_WINDOWS, _WIN32_IE, and WINVER?" We'll pick up this topic next time, but you're not going to like the answer.

    Nitpicker's corner

    †That statement is from the operating system's‡ point of view. You can of course use techniques like Visual Studio linker's delay-load feature to avoid creating an import dependency, but that's outside the operating system.‡

    ‡s/operating system/Windows operating system/

  • The Old New Thing

    VirtualLock only locks your memory into the working set


    When you lock memory with VirtualLock it locks the memory into your process's working set. It doesn't mean that the memory will never be paged out. It just means that the memory won't be paged out as long as there is a thread executing in your process, because a process's working set need be present in memory only when the process is actually executing.

    (Earlier versions of the MSDN documentation used to say this more clearly. At some point, the text changed to say that it locked the memory physically rather than locking it into the working set. I don't know who changed it, but it was a step backwards.)

    The working set is the set of pages that the memory manager will keep in memory while your program is running, because it is the set of pages that the memory manager predicts your program accesses frequently, so keeping those pages in memory when your program is executing keeps your program running without taking a ridiculous number of page faults. (Of course, if the prediction is wrong, then you get a ridiculous number of page faults anyway.)

    Now look at the contrapositive: If all the threads in your process are blocked, then the working set rules do not apply since the working set is needed only when your process is executing, which it isn't. If all your threads are blocked, then the entire working set is eligible for being paged out.

    I've seen people use VirtualLock expecting that it prevents the memory from being written to the page file. But as you see from the discussion above, there is no such guarantee. (Besides, if the user hibernates the computer, all the pages that aren't in the page file are going to get written to the hibernation file, so they'll end up on disk one way or another.)

    Even if you've magically managed to prevent the data from being written to the page file, you're still vulnerable to another process calling ReadProcessMemory to suck the data out of your address space.

    If you really want to lock memory, you can grant your process the SeLockMemoryPrivilege privilege and use the AWE functions to allocate non-pageable memory. Mind you, this is generally considered to be an anti-social thing to do on a paging system. The AWE functions were designed for large database programs that want to manage their paging manually. And they still won't prevent somebody from using ReadProcessMemory to suck the data out of your address space.

    If you have relatively small chunks of sensitive data, the solution I've seen recommended is to use CryptProtectData and CryptUnprotectData. The encryption keys used by these functions are generated pseudo-randomly at boot time and are kept in kernel mode. (Therefore, nobody can ReadProcessMemory them, and they won't get captured by a user-mode crash dump.) Indeed, this is the mechanism that many components in Windows Server 2003 use to reduce the exposure of sensitive information in the page file and hibernation file.
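
    For illustration, here's a minimal sketch of round-tripping a small secret through these functions (error handling trimmed; the description string is arbitrary, and the caller frees the output blobs with LocalFree):

    #include <windows.h>
    #include <wincrypt.h> // link with crypt32.lib
    
    // Protect a small secret so the plaintext doesn't linger in pageable memory.
    BOOL ProtectSecret(const BYTE *pbSecret, DWORD cbSecret, DATA_BLOB *pOut)
    {
     DATA_BLOB in = { cbSecret, (BYTE *)pbSecret };
     // The keys live in kernel mode; we get back opaque bytes.
     return CryptProtectData(&in, L"sample secret", NULL, NULL, NULL, 0, pOut);
    }
    
    // Recover the plaintext just long enough to use it.
    BOOL UnprotectSecret(DATA_BLOB *pIn, DATA_BLOB *pOut)
    {
     return CryptUnprotectData(pIn, NULL, NULL, NULL, NULL, 0, pOut);
    }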

    Follow-up: I've been informed by the memory manager folks that the working set interpretation was overly conservative and that in practice, the memory that has been virtually locked won't be written to the pagefile. Of course, the other concerns still apply, so you still have to worry about the hibernation file and another process sucking the data out via ReadProcessMemory.

  • The Old New Thing

    Do not overload the E_NOINTERFACE error


    One of the more subtle ways people mess up IUnknown::QueryInterface is returning E_NOINTERFACE when the problem wasn't actually an unsupported interface. The E_NOINTERFACE return value has very specific meaning. Do not use it as your generic "gosh, something went wrong" error. (Use an appropriate error such as E_OUTOFMEMORY or E_ACCESSDENIED.)
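
    To make the rule concrete, here's a minimal sketch of a QueryInterface that plays by the rules; the Widget class, IWidget interface, and IID_IWidget are hypothetical:

    HRESULT Widget::QueryInterface(REFIID riid, void **ppv)
    {
     *ppv = NULL;
     if (riid == IID_IUnknown || riid == IID_IWidget) {
      *ppv = static_cast<IWidget *>(this);
      AddRef();
      return S_OK; // and it must keep succeeding for this object
     }
     return E_NOINTERFACE; // means only "no such interface", nothing else
    }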

    Recall that the rules for IUnknown::QueryInterface are that (in the absence of catastrophic errors such as E_OUTOFMEMORY) if a request for a particular interface succeeds, then it must always succeed in the future for that object. Similarly, if a request fails with E_NOINTERFACE, then it must always fail in the future for that object.

    These rules exist for a reason.

    In the case where COM needs to create a proxy for your object (for example, to marshal the object into a different apartment), the COM infrastructure does a lot of interface caching (and negative caching) for performance reasons. For example, if a request for an interface fails, COM remembers this so that future requests for that interface are failed immediately rather than being marshalled to the original object only to have the request fail anyway. Requests for unsupported interfaces are very common in COM, and optimizing that case yields significant performance improvements.

    If you start returning E_NOINTERFACE for problems other than "The object doesn't support this interface", COM will assume that the object really doesn't support the interface and may not ask for it again even if you do. This in turn leads to very strange bugs that defy debugging: You are at a call to IUnknown::QueryInterface, you set a breakpoint on your object's implementation of IUnknown::QueryInterface to see what the problem is, you step over the call and get E_NOINTERFACE back without your breakpoint ever being hit. Why? Because at some point in the past, you said you didn't support the interface, and COM remembered this and "saved you the trouble" of having to respond to a question you already answered. The COM folks tell me that they and their comrades in product support end up spending hours debugging customers' problems like "When my computer is under load, sometimes I start getting E_NOINTERFACE for interfaces I definitely support."

    Save yourself and the COM folks several hours of frustration. Don't return E_NOINTERFACE unless you really mean it.

  • The Old New Thing

    What determines which programs show up on the front page of the Windows XP Start menu?


    The principle is that programs you've run most often recently are the ones that show up on the front page of the Start menu. At least, that's what we started with, but it turns out that some fine-tuning was needed in order to get the experience to be more "natural".

    The basic rule is that each time you launch a program, it "earns a point", and the longer you don't launch a program, the more points it loses. The Start menu then shows the programs that have the most points. That's about all I'm going to say about the mechanics of point-winning for a variety of reasons.

    • The details of the method by which the programs on the Start menu are selected are still patent-pending.†
    • The details are also extraordinarily boring. I get drowsy every time I have to explain them.
    • Exposing the details would just encourage people to "game the system" in order to improve the placement of their programs on the Start menu.
    • Formally documenting the details would implicitly grant permission for people to rely on those details. This would prevent the Start menu designers from making adjustments to the rules (for example, to combat people who "game the system") or to scrap the process entirely and replace it with a whole new method.
    • It's not this part of the Start menu selection algorithm that puzzles people.

    After the basic rule is applied, the fine-tuning and detail-following kick in. Those are the parts that are puzzling to most people. The next several entries will go into many of the subtleties and fine-tuning behind the Start menu's list of frequently-used programs.

    Now, you may wonder about all these subtleties and whether they're really necessary, but it is these little fine-tuning steps that make the final result more useful and feel natural.

    Please hold off your questions until the (two-week!) series is complete, because I suspect a later entry will answer them. (This series is an expansion upon the TechNet column on the same topic. If you've read the TechNet article, then a lot of this series will be review.)

    Pre-emptive snarky comment

    †"Software patents suck!" It's irrelevant what your or my opinion of software patents is. So long as they are legal, they will exist, and you and I will just have to deal with it. If you want a change, write to your congressman. Support candidates whose position on software patents is compatible with yours. Complaining to me accomplishes nothing.‡ It's sad that I have to write this, but any time somebody writes the word "patent" the comments degenerate into flamology about patents. I have a few future entries about patents; the response to this article will determine whether they stay on the schedule or quietly vanish like the stories about Bob.

    ‡Well, it does accomplish something: It gives me another reason to stop blogging.*

    *Along with the people who keep bugging me about using daggers instead of asterisks. Hint: Asterisks already mean something in computer programming.

  • The Old New Thing

    When does an object become available for garbage collection?


    As we saw last time, garbage collection is a method for simulating an infinite amount of memory in a finite amount of memory. This simulation is performed by reclaiming memory once the environment can determine that the program wouldn't notice that the memory was reclaimed. There are a variety of mechanisms for determining this. In a basic tracing collector, this determination is made by taking the objects which the program has definite references to, then tracing references from those objects, continuing transitively until all accessible objects are found. But what looks like a definite reference in your code may not actually be a definite reference in the virtual machine: Just because a variable is in scope doesn't mean that it is live.

    class SomeClass {
     ...
     string SomeMethod(string s, bool reformulate)
     {
      OtherClass o = new OtherClass(s);
      string result = Frob(o);
      if (reformulate) Reformulate();
      return result;
     }
    }
    

    For the purpose of this discussion, assume that the Frob method does not retain a reference to the object o passed as a parameter. When does the OtherClass object o become eligible for collection? A naïve answer would be that it becomes eligible for collection at the closing-brace of the SomeMethod method, since that's when the last reference (in the variable o) goes out of scope.

    A less naïve answer would be that it becomes eligible for collection after the return value from Frob is stored to the local variable result, because that's the last line of code which uses the variable o.

    A closer study would show that it becomes eligible for collection even sooner: Once the call to Frob returns, the variable o is no longer accessed, so the object could be collected even before the result of the call to Frob is stored into the local variable result. Optimizing compilers have known this for quite some time, and there is a strong likelihood that the variables o and result will occupy the same memory since their lifetimes do not overlap. Under such conditions, the code generation for the statement could very well be something like this:

      mov ecx, esi        ; load "this" pointer into ecx register
      mov edx, [ebp-8]    ; load parameter ("o") into edx register
      call SomeClass.Frob ; call method
      mov [ebp-8], eax    ; re-use memory for "o" as "result"
    

    But this closer study wasn't close enough. The OtherClass object o becomes eligible for collection even before the call to Frob returns! It is certainly eligible for collection at the point of the ret instruction which ends the Frob function: At that point, Frob has finished using the object and won't access it again. Although somewhat of a technicality, it does illustrate that

    An object in a block of code can become eligible for collection during execution of a function it called.

    But let's dig deeper. Suppose that Frob looked like this:

    string Frob(OtherClass o)
    {
     string result = FrobColor(o.GetEffectiveColor());
     return result;
    }
    

    When does the OtherClass object become eligible for collection? We saw above that it is certainly eligible for collection as soon as FrobColor returns, because the Frob method doesn't use o any more after that point. But in fact it is eligible for collection when the call to GetEffectiveColor returns—even before the FrobColor method is called—because the Frob method doesn't use it once it gets the effective color. This illustrates that

    A parameter to a method can become eligible for collection while the method is still executing.

    But wait, is that the earliest the OtherClass object becomes eligible for collection? Suppose that the OtherClass.GetEffectiveColor method went like this:

    Color GetEffectiveColor()
    {
     Color color = this.Color;
     for (OtherClass o = this.Parent; o != null; o = o.Parent) {
      color = BlendColors(color, o.Color);
     }
     return color;
    }
    

    Notice that the method doesn't access any members from its this pointer after the assignment o = this.Parent. As soon as the method retrieves the object's parent, the object isn't used any more.

      push ebp                    ; establish stack frame
      mov ebp, esp
      push esi
      push edi
      mov esi, ecx                ; enregister "this"
      mov edi, [ecx].color        ; color = this.Color (inlined)
      jmp looptest
    loop:
      mov ecx, edi                ; load first parameter ("color")
      mov edx, [esi].color        ; load second parameter ("o.Color")
      call OtherClass.BlendColors ; BlendColors(color, o.Color)
      mov edi, eax
    looptest:
      mov esi, [esi].parent       ; o = this.Parent (or o.Parent) (inlined)
                                  ; "this" is now eligible for collection
      test esi, esi               ; if o == null
      jnz loop                    ; then restart loop
      mov eax, edi                ; return value
      pop edi
      pop esi
      pop ebp
      ret
    

    The last thing we ever do with the OtherClass object (represented in the GetEffectiveColor function by the keyword this) is fetch its parent. As soon as that's done (indicated at the point of the comment, when the line is reached for the first time), the object becomes eligible for collection. This illustrates the perhaps-surprising result that

    An object can become eligible for collection during execution of a method on that very object.

    In other words, it is possible for a method to have its this object collected out from under it!

    A crazy way of thinking of when an object becomes eligible for collection is that it happens once memory corruption in the object would have no effect on the program. (Or, if the object has a finalizer, that memory corruption would affect only the finalizer.) Because if memory corruption would have no effect, then that means you never use the values any more, which means that the memory may as well have been reclaimed out from under you for all you know.

    A weird real-world analogy: The garbage collector can collect your diving board as soon as the diver touches it for the last time—even if the diver is still in the air!

    A customer encountered the CallGCKeepAliveWhenUsingNativeResources FxCop rule and didn't understand how it was possible for the GC to collect an object while one of its methods was still running. "Isn't this part of the root set?"

    Asking whether any particular value is or is not part of the root set is confusing the definition of garbage collection with its implementation. "Don't think of GC as tracing roots. Think of GC as removing things you aren't using any more."

    The customer responded, "Yes, I understand conceptually that it becomes eligible for collection, but how does the garbage collector know that? How does it know that the this object is not used any more? Isn't that determined by tracing from the root set?"

    Remember, the GC is in cahoots with the JIT. The JIT might decide to "help out" the GC by reusing the stack slot which previously held an object reference, leaving no reference on the stack and therefore no reference in the root set. Even if the reference is left on the stack, the JIT can leave some metadata behind that tells the GC, "If you see the instruction pointer in this range, then ignore the reference in this slot since it's a dead variable," similar to how in unmanaged code on non-x86 platforms, metadata is used to guide structured exception handling. (And besides, the this parameter isn't even passed on the stack in the first place.)

    The customer replied, "Gotcha. Very cool."

    Another customer asked, "Is there a way to get a reference to the instance being called for each frame in the stack? (Static methods excepted, of course.)" A different customer asked roughly the same question, but in a different context: "I want my method to walk up the stack, and if its caller is OtherClass.Foo, I want to get the this object for OtherClass.Foo so I can query additional properties from it." You now know enough to answer these questions yourself.

    Bonus: A different customer asked, "The StackFrame object lets me get the method that is executing in the stack frame, but how do I get the parameters passed to that method?"

  • The Old New Thing

    Why do operating system files still adhere to the old 8.3 naming convention?


    Commenter Brian Reiter asks a duplicate of a question that was already submitted to the Suggestion Box: Darren asks why operating system† files still (for the most part) adhere to the old 8.3 naming convention.

    There are a few reasons I can think of. I'm not saying that these are the reasons; I'm just brainstorming.

    First, of course, the name of a DLL cannot change once it has been chosen, because that would break programs which linked to that DLL by its old name. Windows 95 did not require the system volume and user profile volume to support long file names, although that was certainly the case by default. Companies which used roaming profiles or redirected folders may have had a heavy investment in servers which did not support long file names. Therefore, all system files on Windows 95 had to conform to the 8.3 naming convention.

    I believe that Windows NT permitted the system volume to be a short-file-names-only FAT partition as late as Windows 2000. Therefore, any DLL that existed in the Windows 2000 era had to conform to the 8.3 naming convention.

    Starting in Windows XP, long file names became mandatory, and a few system files such as shellstyle.dll waded tentatively into the long file name world. (The .NET Framework folks jumped in with both feet with their managed DLLs, but notice that their unmanaged DLLs like mscoree.dll still conform to 8.3.) But the waters in this world can be treacherous for operating system components.

    First of all, you have to worry about the automatically-generated short name. Suppose the operating system setup program is copying the shellstyle.dll file, but there is already a file called shellstuff.dll. The short name for shellstuff.dll will probably be SHELLS~1.DLL, and therefore the short name for shellstyle.dll will likely be SHELLS~2.DLL. Now, this may not be a big deal, except that some programs like to hard-code a file's short name. (There are a lot of programs that assume that the Program Files directory is C:\PROGRA~1, for example.)
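
    If you're curious which short name the file system actually assigned to a file, GetShortPathName will tell you. A quick sketch (the path is just an example):

    #include <stdio.h>
    #include <windows.h>
    
    int __cdecl main(int argc, char **argv)
    {
     char szShort[MAX_PATH];
     // Ask for the 8.3 alias that was auto-generated for this file.
     if (GetShortPathName("C:\\Windows\\System32\\shellstyle.dll",
                          szShort, MAX_PATH)) {
      printf("%s\n", szShort);
     }
     return 0;
    }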

    Furthermore, you can create confusion if the same DLL is loaded by both its short and long names, since the loader treats them as distinct:

    #include <stdio.h>
    #include <windows.h>
    
    int __cdecl main(int argc, char **argv)
    {
     printf("%p\n", LoadLibrary("SHELLS~1.DLL"));
     printf("%p\n", LoadLibrary("SHELLSTYLE.DLL"));
     return 0;
    }
    

    If you run this program, you will get something like this:

    6F2C0000
    00340000
    

    Even though the two paths refer to the same DLL, the loader treats them as different, and you end up with two copies of the same DLL loaded into memory. Now things get confusing, since you now have two sets of global variables, and if two components both use SHELLSTYLE.DLL but one used the short name and the other the long name, things get exciting when those two components try to talk about what they think is the same thing.

    It's like that time when I was a child and our family took a trip to Disneyland. Our parents put my brother and me on the gondola ride, and upon arrival at the other end, we were to go to the Autopia ride which was right next door. The plan was that our parents would meet us at the exit to Autopia. When my brother and I exited Autopia, we expected our parents to be waiting there for us, but they were nowhere to be seen. Sticking to the plan, we waited patiently for our parents to arrive. We sat there for what seemed like two hours (but which was probably much less), until eventually we decided that my brother would stay put and I would go looking around, at which point it didn't take long for me to find my father, who was walking around looking for us.

    What went wrong? Well, the problem was that the map of Disneyland showed Autopia, but what the map didn't say was that there were two Autopia rides (and therefore two Autopia exits) right next to each other. My brother and I were waiting by one exit, and our parents were waiting by the other. Each of us thought the other party was simply late.

    Similarly, if a DLL goes by multiple names, you can end up with two copies of it loaded into the process, with different components talking about different copies, unaware that they are talking about different things.

    And one final reason I can think of for sticking with 8.3 file names for operating system DLLs is simply, "Well, that's the way we've always done it. All the problems with 8.3 names are well-understood and under control. If we switched to long file names, we'd end up discovering a whole new set of problems. Why mess with something that works if it isn't broken?"

    Better the devil you know.

    Exercise: Why is it okay for the .NET Framework to use long file names for their managed DLLs?

    Nitpicker's Corner

    †s/operating system/Windows operating system/. Apparently nothing is obvious from context any more.

  • The Old New Thing

    Polling by sleeping versus polling by waiting with a timeout


    Commenter Francois Boucher asks whether it's better to write a background worker thread that polls with Sleep() and a flag, or one that polls by waiting for an event with a timeout:

    // method A
    while (fKeepGoing) {
     .. a little work ..
     Sleep(50);
    }
    
    // method B
    do {
     .. a little work ..
    } while (WaitForSingleObject(hEvent, 50) == WAIT_TIMEOUT);
    

    "Which scenario is better? The first one uses only 1 handle for the thread. The second one will use 2. But is the first scenario wasting more thread time? Is it worth using the event (kernel object)?"

    Yeah, whatever.

    I don't quite understand why you want to pause every so often. Why not just do the work at low priority? When there are more important things to do, your background thread will stop doing whatever it's doing. When there is an available CPU, then your background thread will do its thing as fast as it can (until something with higher priority arrives).
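
    In code, the low-priority alternative might look like this minimal sketch; BackgroundWork is a hypothetical thread procedure:

    #include <windows.h>
    
    DWORD CALLBACK BackgroundWork(void *pv); // hypothetical worker
    
    void StartBackgroundWork(void)
    {
     HANDLE hThread = CreateThread(NULL, 0, BackgroundWork, NULL,
                                   CREATE_SUSPENDED, NULL);
     if (hThread) {
      // Let the scheduler do the throttling instead of a Sleep() loop.
      SetThreadPriority(hThread, THREAD_PRIORITY_LOWEST);
      ResumeThread(hThread);
      CloseHandle(hThread);
     }
    }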

    The only thing I can think of is that by adding the pauses, your program won't look like it's consuming 100% CPU while the background processing is going on. Which is true, but then again, that's not much consolation. "Wow, with these changes, my spell check takes only 10% of the CPU. But then again, it also takes 10 times as long." Is that an improvement? You made your customer wait ten times as long for the document to be spell checked. That's ten times less battery life for your laptop.

    And generally speaking, polling should be avoided because it carries negative consequences for system performance. So we're basically asking, "Which is better, hammering with a screwdriver or hammering with pliers?"

    But let's say for the sake of argument that this "back off periodically" polling loop is actually the right design, and the only argument is which of the above two methods is "better" in terms of the criteria listed above (handles, thread time).

    It still doesn't matter.

    Method A has one fewer handle, but one more flag. So the total number of things you have to keep track of is the same either way.

    "Oh, but I save the overhead of an additional handle."

    Dude, you're already burning a thread. A single handle to an event is noise compared to the cost of a thread.

    "But I save the cost of validating the handle and following the reference to the underlying kernel object."

    Dude, you're about to go to sleep for 50 milliseconds. Saving a few thousand clocks is noise compared to 50 milliseconds.

    The flag method does have some other problems, none of which are deal-breaking, but are things to bear in mind.

    First, that flag that you're checking. There's no synchronization on that variable, so the background thread may run a smidge longer than necessary because the change hasn't yet propagated to the CPU running the loop. Similarly, the sleep loop does take a little longer to notice that it should stop working. If the fKeepGoing flag is set to FALSE during the sleep, the sleep will still run to completion before the loop finds out.
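
    If you want the handoff to be explicit rather than relying on eventual propagation, one sketch is to make the flag a LONG and read it with an interlocked operation (g_fKeepGoing is assumed to be shared with the controlling thread):

    #include <windows.h>
    
    volatile LONG g_fKeepGoing = 1; // controlling thread sets this to 0
    
    DWORD CALLBACK PollingWorker(void *pv)
    {
     // An interlocked read makes the cross-thread visibility explicit.
     while (InterlockedCompareExchange(&g_fKeepGoing, 0, 0) != 0) {
      // .. a little work ..
      Sleep(50);
     }
     return 0;
    }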

    In the grand scheme of things, however, the extra 50 to 100 milliseconds are probably not a big deal. The background thread is a little slow to shut down, that's all. The user will probably not even notice that the CPU meter was higher than normal for an additional tenth of a second. After all, the typical human reaction time is 100 milliseconds anyway.

    I'm assuming that the code that signals the background thread doesn't sit around waiting for the background thread to clean up. If it does, then this 100ms delay may start to combine with other delays to turn into something the user may start to notice. A tenth of a second here, a tenth of a second there, soon you may find yourself accumulating a half second's delay, and that's a delay the human brain can perceive.

  • The Old New Thing

    How do I mark a shortcut file as requiring elevation?


    Specifying whether elevation is required is typically something that is the responsibility of the program. This is done by adding a requestedExecutionLevel element to your manifest. (Bart De Smet shows you how. Calvin Hsia does the same for your Visual FoxPro programs.) But if the program you're running doesn't have such a manifest—maybe it's an old program that you don't have any control over—you can create a shortcut to the program and mark the shortcut as requiring elevation.

    To do this, you set the SLDF_RUNAS_USER flag in the shortcut attributes. Here's a skeleton program that sets the flag on the shortcut whose path is passed on the command line. For expository purposes, I've skimped on the error reporting, and just to shake things up, I've used ATL smart pointers.

    #include <windows.h>
    #include <shlobj.h>
    #include <atlbase.h>
    #include <stdio.h>   // for wprintf
    
    void MarkShortcutRunAs(LPCWSTR pszShortcut)
    {
     CComPtr<IPersistFile> sppf;
     if (FAILED(sppf.CoCreateInstance(CLSID_ShellLink))) return;
     if (FAILED(sppf->Load(pszShortcut, STGM_READWRITE))) return;
     CComQIPtr<IShellLinkDataList> spdl(sppf);
     if (!spdl) return;
     DWORD dwFlags;
     if (FAILED(spdl->GetFlags(&dwFlags))) return;
     dwFlags |= SLDF_RUNAS_USER;
     if (FAILED(spdl->SetFlags(dwFlags))) return;
     if (FAILED(sppf->Save(NULL, TRUE))) return;
     wprintf(L"Succeeded\n");
    }
    
    int __cdecl wmain(int argc, wchar_t *argv[])
    {
     if (argc == 2 && SUCCEEDED(CoInitialize(NULL))) {
      MarkShortcutRunAs(argv[1]);
      CoUninitialize();
     }
     return 0;
    }
    

    There's not really much to this program. It creates a shell link object (CLSID_ShellLink) and asks it to load from the file whose path is given on the command line. It then uses IShellLinkDataList::GetFlags and IShellLinkDataList::SetFlags to fetch the old flags and set new flags that include SLDF_RUNAS_USER. Once that's done, it saves the result back out.

    The hard part was knowing that the SLDF_RUNAS_USER flag existed in the first place.

    (I fear that most people will read this article and say, "Awesome! My program requires elevation, and this is how I can mark my Start menu shortcut to prompt for elevation. Thanks, Raymond!" These people will have completely ignored the opening paragraph, which explains that that is the wrong thing to do.)

  • The Old New Thing

    How to insert a large number of items into a treeview efficiently


    Just a quick tip today.

    If you need to insert a large number of items into a treeview, like tens of thousands, then it's much more efficient to insert them "backwards". (I'm ignoring for now the usability question of having a treeview that large in the first place.) In other words, instead of

    for (i = 0; i < array.Count(); i++) {
     TVINSERTSTRUCT tvis;
     tvis.hParent = hParentNode;
     tvis.hInsertAfter = TVI_LAST;
     tvis.item.mask = TVIF_TEXT;
     tvis.item.pszText = array[i].Text();
     TreeView_InsertItem(hwndTreeView, &tvis);
    }
    

    do it this way:

    for (i = array.Count() - 1; i >= 0; i--) {
     TVINSERTSTRUCT tvis;
     tvis.hParent = hParentNode;
     tvis.hInsertAfter = TVI_FIRST;
     tvis.item.mask = TVIF_TEXT;
     tvis.item.pszText = array[i].Text();
     TreeView_InsertItem(hwndTreeView, &tvis);
    }
    

    Why is backwards-insertion faster?

    It has to do with the annoying hInsertAfter parameter. To validate that the hInsertAfter parameter is valid, the treeview needs to verify that the hInsertAfter is a valid child of the hParent, and this is done by walking the parent's children looking for a match. The sooner you find the match, the faster the validation completes. (And if you pass TVI_LAST, then the treeview needs to walk to the end of the child list.)

    You'd think that you could verify the parent/child relationship by just doing a TreeView_GetParent(hInsertAfter), but that turns out not to be strict enough, because hInsertAfter might itself not be a valid parameter. If hInsertAfter is a bogus value, then you may crash when you try to read its Parent property. That's if you're lucky. If you're not lucky, then the random memory that hInsertAfter points to might look just enough like a valid HTREEITEM that you end up inserting the new node after a completely bogus node, and now the treeview has become corrupted.

    Sure, you get the same problem if you pass a garbage HTREEITEM to TreeView_GetParent, but in that case, it's just garbage in, garbage out. Nothing gets corrupted; the application just gets a garbage result. But in the case of TreeView_InsertItem, the treeview is going to update its internal data structures based on the garbage you passed in, and that means that the treeview winds up corrupted.

    To ensure that the value passed in is valid, the treeview checks it against the list of valid values for hInsertAfter. And therefore, you get better performance if the valid value you pass is the one that it checks first.

    (As it happens, a lot of programs pass garbage for hInsertAfter, so this defensive validation step is absolutely necessary.)

    You might say that the treeview could have a one-off optimization for the special TVI_LAST value by remembering the last child so it can be located quickly. The question is whether the complexity of adding that optimization is worth the cost: Any tree rearranging function would have to update the cached value, and if you miss a spot, then the treeview ends up corrupted. That's a pretty high-risk optimization you're suggesting there.

    And think about it this way: You're worrying about a tree where a single node has tens of thousands of children, something which (I am no longer ignoring) is a horrible user interface. That's like a list box with tens of thousands of items, or a dialog box with tens of thousands of checkboxes. You'll probably want to consider a better way of presenting the information than in a tree view that goes on for hundreds of screens. The treeview isn't optimized for this case because you don't optimize for the case where somebody is mis-using your system.

  • The Old New Thing

    Which windows appear in the Alt+Tab list?


    Commenter Phil Quirk wants to know what the rules are for determining which windows appear in the Alt+Tab list. It's actually pretty simple although hardly anything you'd be able to guess on your own. Note: The details of this algorithm are an implementation detail. It can change at any time, so don't rely on it. In fact, it already changed with Flip and Flip3D; I'm just talking about the Classic Alt+Tab window here.

    For each visible window, walk up its owner chain until you find the root owner. Then walk back down the visible last active popup chain until you find a visible window. If you're back to where you started, then put the window in the Alt+Tab list. In pseudo-code:

    BOOL IsAltTabWindow(HWND hwnd)
    {
     // Start at the root owner
     HWND hwndWalk = GetAncestor(hwnd, GA_ROOTOWNER);
    
     // See if we are the last active visible popup
     HWND hwndTry;
     while ((hwndTry = GetLastActivePopup(hwndWalk)) != hwndWalk) {
      if (IsWindowVisible(hwndTry)) break;
      hwndWalk = hwndTry;
     }
     return hwndWalk == hwnd;
    }
    

    The purpose of this algorithm is to select the most meaningful representative window from each cluster of windows related by ownership. (Notice that the algorithm doesn't care whether the owned window is modal or non-modal.)

    At least that's the simple rule if you're not playing crazy window style games. The WS_EX_TOOLWINDOW and WS_EX_APPWINDOW extended styles were created so people can play games and put their window in the Alt+Tab list or take it out even if the simple rule would normally have decided otherwise. This is one of those "Okay, if you think you're smarter than Windows, here's your chance to prove it" options. Personally, I would avoid them since it makes your window behave differently from the rest of the windows in the system.

    A window with the WS_EX_TOOLWINDOW extended style is treated as if it weren't visible, even if it is. A window with the WS_EX_APPWINDOW extended style is treated as if it has no owner, even if it does.
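
    For completeness (and not as an endorsement), the mechanics look something like this sketch; hiding the window around the style change helps the shell notice it:

    #include <windows.h>
    
    // Opt a window out of the Alt+Tab list via WS_EX_TOOLWINDOW.
    void RemoveFromAltTab(HWND hwnd)
    {
     ShowWindow(hwnd, SW_HIDE);
     LONG exStyle = GetWindowLong(hwnd, GWL_EXSTYLE);
     exStyle = (exStyle | WS_EX_TOOLWINDOW) & ~WS_EX_APPWINDOW;
     SetWindowLong(hwnd, GWL_EXSTYLE, exStyle);
     ShowWindow(hwnd, SW_SHOW);
    }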

    Once you start adding these extended styles, you enter the world of "I'm trying to work around the rules" and the result is typically even worse confusion than what you had without them.

    I'm not sure what the original commenter is getting at. The window hierarchy described in the suggestion (which doesn't make it so much a suggestion as it is a request for me to debug their problem) says that window C is modal on both windows A and B, which doesn't make sense to me, since a window has only one owner.

    The algorithm for choosing the Alt+Tab representative from each cluster of windows may not be the best, but it's what we have. I wouldn't be surprised if the details are tweaked from time to time. No, wait, let me rephrase that. I know that the details are tweaked from time to time. The spirit of the operation is preserved (to show the windows the user can switch to, using the most "natural" candidate for each cluster of windows), but the specific details may be fine-tuned as the concept of "naturalness" is refined.
