• The Old New Thing

    Do not overload the E_NOINTERFACE error

    • 26 Comments

    One of the more subtle ways people mess up IUnknown::QueryInterface is returning E_NOINTERFACE when the problem wasn't actually an unsupported interface. The E_NOINTERFACE return value has very specific meaning. Do not use it as your generic "gosh, something went wrong" error. (Use an appropriate error such as E_OUTOFMEMORY or E_ACCESSDENIED.)

    Recall that the rules for IUnknown::QueryInterface are that (in the absence of catastrophic errors such as E_OUTOFMEMORY) if a request for a particular interface succeeds, then it must always succeed in the future for that object. Similarly, if a request fails with E_NOINTERFACE, then it must always fail in the future for that object.

    These rules exist for a reason.

    In the case where COM needs to create a proxy for your object (for example, to marshal the object into a different apartment), the COM infrastructure does a lot of interface caching (and negative caching) for performance reasons. For example, if a request for an interface fails, COM remembers this so that future requests for that interface are failed immediately rather than being marshalled to the original object only to have the request fail anyway. Requests for unsupported interfaces are very common in COM, and optimizing that case yields significant performance improvements.

    If you start returning E_NOINTERFACE for problems other than "The object doesn't support this interface", COM will assume that the object really doesn't support the interface and may not ask for it again even if you do. This in turn leads to very strange bugs that defy debugging: You are at a call to IUnknown::QueryInterface, you set a breakpoint on your object's implementation of IUnknown::QueryInterface to see what the problem is, you step over the call and get E_NOINTERFACE back without your breakpoint ever hitting. Why? Because at some point in the past, you said you didn't support the interface, and COM remembered this and "saved you the trouble" of having to respond to a question you already answered. The COM folks tell me that they and their comrades in product support end up spending hours debugging customer's problems like "When my computer is under load, sometimes I start getting E_NOINTERFACE for interfaces I definitely support."

    Save yourself and the COM folks several hours of frustration. Don't return E_NOINTERFACE unless you really mean it.

  • The Old New Thing

    Reading the output of a command into a batch file variable

    • 28 Comments

    It's Day Two of Batch File Week. Don't worry, it'll be over in a few days.

    There is no obvious way to read the output of a command into a batch file variable. In unix-style shells, this is done via backquoting.

    x=`somecommand`
    

    The Windows command processor does not have direct backquoting, but you can fake it by abusing the FOR command. Here's the evolution:

    The /F flag to the FOR command says that it should open the file you pass in parentheses and set the loop variable to the contents of each line.

    for /f %%i in (words.txt) do echo [%%i]
    

    The loop variable in the FOR command takes one percent sign if you are executing it directly from the command prompt, but two percent signs if you are executing it from a batch file. I'm going to assume you're writing a batch file, so if you want to practice from the command line, remember to collapse the double percent signs to singles.

    I'm cheating here because I know that words.txt contains one word per line. By default, the FOR command sets the loop variable to the first word of each line. If you want to capture the entire line, you need to change the delimiter.

    for /f "delims=" %%i in (names.txt) do echo [%%i]
    

    There are other options for capturing say the first and third word or whatever. See the FOR /? online help for details.

    Now, parsing files is not what we want, but it's closer. You can put the file name in single quotes to say "Instead of opening this file and reading the contents, I want you to run this command and read the contents." For example, suppose you have a program called printappdir which outputs a directory, and you want a batch file that changes to that directory.

    for /f "delims=" %%i in ('printappdir') do cd "%%i"
    

    We ask the FOR command to run the printappdir program and execute the command cd "%%i" for each line of output. Since the program has only one line of output, the loop executes only once, and the result is that the directory is changed to the path that the printappdir program prints.

    If you want to capture the output into a variable, just update the action:

    for /f %%i in ('printappdir') do set RESULT=%%i
    echo The directory is %RESULT%
    

    If the command has multiple lines of output, then this will end up saving only the last line, since previous lines get overwritten by subsequent iterations.

    But what if the line you want to save isn't the last line? Or what if you don't want the entire line?

    If the command has multiple lines of output and you're interested only in a particular one, you can filter it in the FOR command itself...

    for /f "tokens=1-2,14" %%i in ('ipconfig') do ^
        if "%%i %%j"=="IPv4 Address." set IPADDR=%%k
    

    The above command asked to execute the ipconfig command and extract the first, second, and fourteenth words into loop variable starting with %i. In other words, %i gets the first word, %j gets the second word, and %k gets the fourteenth word. (Exercise: What if you want to extract more than 26 words?)

    The loop then checks each line to see if it begins with "IPv4 Address.", and if so, it saves the fourteenth word (the IP address itself) into the IPADDR variable.

    How did I know that the IP address was the fourteenth word? I counted!

       IPv4 Address. . . . . . . . . . . : 192.168.1.1
       ---- -------- - - - - - - - - - - - -----------
         1      2    3 4 5 6 7 8 9  11  13      14
                                  10  12
    

    That's also why my test includes the period after Address: The first dot comes right after the word Address without an intervening space, so it's considered part of the second "word".

    Somebody thought having the eye-catching dots would look pretty, but didn't think about how it makes parsing a real pain in the butt. (Note also that the above script works only for US-English systems, since the phrase IPv4 Address will change based on your current language.)

    Instead of doing the searching yourself, you can have another program do the filtering, which is important if the parsing you want is beyond the command prompt's abilities.

    for /f "tokens=14" %%i in ('ipconfig ^| findstr /C:"IPv4 Address"') do ^
      set IPADDR=%%i
    

    This alternate version makes the findstr program do the heavy lifting, and then saves the fourteenth word. (But this version will get fooled by the line Autoconfiguration IPv4 Address.)

    Yes I know that you can do this in PowerShell

    foreach ($i in Get-WmiObject Win32_NetworkAdapterConfiguration) {
      if ($i.IPaddress) { $i.IPaddress[0] }
    }
    

    You're kind of missing the point of Batch File Week.

  • The Old New Thing

    VirtualLock only locks your memory into the working set

    • 60 Comments

    When you lock memory with VirtualLock it locks the memory into your process's working set. It doesn't mean that the memory will never be paged out. It just means that the memory won't be paged out as long as there is a thread executing in your process, because a process's working set need be present in memory only when the process is actually executing.

    (Earlier versions of the MSDN documentation used to say this more clearly. At some point, the text changed to say that it locked the memory physically rather than locking it into the working set. I don't know who changed it, but it was a step backwards.)

    The working set is the set of pages that the memory manager will keep in memory while your program is running, because it is the set of pages that the memory manager predicts your program accesses frequently, so keeping those pages in memory when your program is executing keeps your program running without taking a ridiculous number of page faults. (Of course, if the prediction is wrong, then you get a ridiculous number of page faults anyway.)

    Now look at the contrapositive: If all the threads in your process are blocked, then the working set rules do not apply since the working set is needed only when your process is executing, which it isn't. If all your threads are blocked, then the entire working set is eligible for being paged out.

    I've seen people use VirtualLock expecting that it prevents the memory from being written to the page file. But as you see from the discussion above, there is no such guarantee. (Besides, if the user hibernates the computer, all the pages that aren't in the page file are going to get written to the hibernation file, so they'll end up on disk one way or another.)

    Even if you've magically managed to prevent the data from being written to the page file, you're still vulnerable to another process calling ReadProcessMemory to suck the data out of your address space.

    If you really want to lock memory, you can grant your process the SeLockMemoryPrivilege privilege and use the AWE functions to allocate non-pageable memory. Mind you, this is generally considered to be an anti-social thing to do on a paging system. The AWE functions were designed for large database programs that want to manage their paging manually. And they still won't prevent somebody from using ReadProcessMemory to suck the data out of your address space.

    If you have relatively small chunks of sensitive data, the solution I've seen recommended is to use CryptProtectData and CryptUnprotectData. The encryption keys used by these functions are generated pseudo-randomly at boot time and are kept in kernel mode. (Therefore, nobody can ReadProcessMemory them, and they won't get captured by a user-mode crash dump.) Indeed, this is the mechanism that many components in Windows 2003 Server to reduce the exposure of sensitive information in the page file and hibernation file.

    Follow-up: I've been informed by the memory manager folks that the working set interpretation was overly conservative and that in practice, the memory that has been virtually locked won't be written to the pagefile. Of course, the other concerns still apply, so you still have to worry about the hibernation file and another process sucking the data out via ReadProcessMemory.

  • The Old New Thing

    Why does Windows not recognize my USB device as the same device if I plug it into a different port?

    • 66 Comments

    You may have noticed that if you take a USB device and plug it into your computer, Windows recognizes it and configures it. Then if you unplug it and replug it into a different USB port, Windows gets a bout of amnesia and thinks that it's a completely different device instead of using the settings that applied when you plugged it in last time. Why is that?

    The USB device people explained that this happens when the device lacks a USB serial number.

    Serial numbers are optional on USB devices. If the device has one, then Windows recognizes the device no matter which USB port you plug it into. But if it doesn't have a serial number, then Windows treats each appearance on a different USB port as if it were a new device.

    (I remember that one major manufacturer of USB devices didn't quite understand how serial numbers worked. They gave all of their devices serial numbers, that's great, but they all got the same serial number. Exciting things happened if you plugged two of their devices into a computer at the same time.)

    But why does Windows treat it as a different device if it lacks a serial number and shows up on a different port? Why can't it just say, "Oh, there you are, over there on another port."

    Because that creates random behavior once you plug in two such devices. Depending on the order in which the devices get enumerated by Plug and Play, the two sets of settings would get assigned seemingly randomly at each boot. Today the settings match up one way, but tomorrow when the devices are enumerated in the other order, the settings are swapped. (You get similarly baffling behavior if you plug in the devices in different order.)

    In other words: Things suck because (1) things were already in bad shape—this would not have been a problem if the device had a proper serial number—and (2) once you're in this bad state, the alternative sucks more. The USB stack is just trying to make the best of a bad situation without making it any worse.

  • The Old New Thing

    What determines which programs show up on the front page of the Windows XP Start menu?

    • 57 Comments

    The principle is that programs you've run most often recently are the ones that show up on the front page of the Start menu. At least, that's what we started with, but it turns out that some fine-tuning was needed in order to get the experience to be more "natural".

    The basic rule is that each time you launch a program, it "earns a point", and the longer you don't launch a program, the more points it loses. The Start menu then shows the programs that have the most points. That's about all I'm going to say about the mechanics of point-winning for a variety of reasons.

    • The details of the method by which the programs on the Start menu are selected are still patent-pending.†
    • The details are also extraordinarily boring. I get drowsy every time I have to explain them.
    • Exposing the details would just encourage people to "game the system" in order to improve the placement of their programs on the Start menu.
    • Formally documenting the details would implicitly grant permission for people to rely on those details. This would prevent the Start menu designers from making adjustments to the rules (for example, to combat people who "game the system") or to scrap the process entirely and replace it with a whole new method.
    • It's not this part of the Start menu selection algorithm that puzzles people.

    After the basic rule is applied, the fine-tuning and detail-following kick in. Those are the parts that are puzzling to most people. The next several entries will go into many of the subtleties and fine-tuning behind the Start menu's list of frequently-used programs.

    Now, you may wonder about all these subtleties and whether they're really necessary, but it is these little fine-tuning steps that make the final result more useful and feel natural.

    Please hold off your questions until the (two-week!) series is complete, because I suspect a later entry will answer them. (This series is an expansion upon the TechNet column on the same topic. If you've read the TechNet article, then a lot of this series will be review.)

    Pre-emptive snarky comment

    †"Software patents suck!" It's irrelevant what your or my opinion of software patents is. So long as they are legal, they will exist, and you and I will just have to deal with it. If you want a change, write to your congressman. Support candidates whose position on software patents is compatible with yours. Complaining to me accomplishes nothing.‡ It's sad that I have to write this, but any time somebody writes the word "patent" the comments degenerate into flamology about patents. I have a few future entries about patents; the response to this article will determine whether they stay on the schedule or quietly vanish like the stories about Bob.

    ‡Well, it does accomplish something: It gives me another reason to stop blogging.*

    *Along with the people who keep bugging me about using daggers instead of asterisks. Hint: Asterisks already mean something in computer programming.

  • The Old New Thing

    Why do operating system files still adhere to the old 8.3 naming convention?

    • 44 Comments

    Commenter Brian Reiter asks a duplicate of a question that was already submitted to the Suggestion Box: Darren asks why operating system† files still (for the most part) adhere to the old 8.3 naming convention.

    There are a few reasons I can think of. I'm not saying that these are the reasons; I'm just brainstorming.

    First, of course, the name of a DLL cannot change once it has been chosen, because that would break programs which linked to that DLL by its old name. Windows 95 did not require the system volume and user profile volume to support long file names, although that was certainly the case by default. Companies which used roaming profiles or redirected folders may have had a heavy investment in servers which did not support long file names. Therefore, all system files on Windows 95 had to conform to the 8.3 naming convention.

    I believe that Windows NT permitted the system volume to be a short-file-names-only FAT partition as late as Windows 2000. Therefore, any DLL that existed in the Windows 2000 era had to conform to the 8.3 naming convention.

    Starting in Windows XP, long file names became mandatory, and a few system files such as shellstyle.dll waded tentatively into the long file name world. (The .NET Framework folks jumped in with both feet with their managed DLLs, but notice that their unmanaged DLLs like mscoree.dll still conform to 8.3.) But the waters in this world can be treacherous for operating system components.

    First of all, you have to worry about the automatically-generated short name. Suppose the operating system setup program is copying the shellstyle.dll file, but there is already a file called shellstuff.dll. The short name for shellstuff.dll will probably be SHELLS~1.DLL, and therefore the short name for shellstyle.dll will likely be SHELLS~2.DLL. Now, this may not be a big deal, except that some programs like to hard-code a file's short name. (There are a lot of programs that assume that the Program Files directory is C:\PROGRA~1, for example.)

    Furthermore, you can create confusion if the same DLL is loaded by both its short and long names, since the loader treats them as distinct:

    #include <stdio.h>
    #include <windows.h>
    
    int __cdecl main(int argc, char **argv)
    {
     printf("%p\n", LoadLibrary("SHELLS~1.DLL"));
     printf("%p\n", LoadLibrary("SHELLSTYLE.DLL"));
     return 0;
    }
    

    If you run this program, you will get something like this:

    6F2C0000
    00340000
    

    Even though the two paths refer to the same DLL, the loader treats them as different, and you end up with two copies of the same DLL loaded into memory. Now things get confusing, since you now have two sets of global variables, and if two components both use SHELLSTYLE.DLL but one used the short name and the other the long name, things get exciting when those two components try to talk about what they think is the same thing.

    It's like that time when I was a child and our family took a trip to Disneyland. Our parents put my brother and me on the gondola ride, and upon arrival at the other end, we were to go to the Autopia ride which was right next door. The plan was that our parents would meet us at the exit to Autopia. When my brother and I exited Autopia, we expected our parents to be waiting there for us, but they were nowhere to be seen. Sticking to the plan, we waited patiently for our parents to arrive. We sat there for what seemed like two hours (but which was probably much less), until eventually we decided that my brother would stay put and I would go looking around, at which point it didn't take long for me to find my father, who was walking around looking for us.

    What went wrong? Well, the problem was that the map of Disneyland showed Autopia, but what the map didn't say was that there were two Autopia rides (and therefore two Autopia exits) right next to each other. My brother and I were waiting by one exit, and our parents were waiting by the other. Each of us thought the other party was simply late.

    Similarly, if a DLL goes by multiple names, you can end up with two copies of it loaded into the process, with different components talking about different copies, unaware that they are talking about different things.

    And one final reason I can think of for sticking with 8.3 file names for operating system DLLs is simply, "Well, that's the way we've always done it. All the problems with 8.3 names are well-understood and under control. If we switched to long file names, we'd end up discovering a whole new set of problems. Why mess with something that works if it isn't broken?"

    Better the devil you know.

    Exercise: Why is it okay for the .NET Framework to use long file names for their managed DLLs?

    Nitpicker's Corner

    †s/operating system/Windows operating system/. Apparently nothing is obvious from context any more.

  • The Old New Thing

    When does an object become available for garbage collection?

    • 13 Comments

    As we saw last time, garbage collection is a method for simulating an infinite amount of memory in a finite amount of memory. This simulation is performed by reclaiming memory once the environment can determine that the program wouldn't notice that the memory was reclaimed. There are a variety of mechanism for determining this. In a basic tracing collector, this determination is made by taking the objects which the program has definite references to, then tracing references from those objects, contining transitively until all accessible objects are found. But what looks like a definite reference in your code may not actually be a definite reference in the virtual machine: Just because a variable is in scope doesn't mean that it is live.

    class SomeClass {
     ...
     string SomeMethod(string s, bool reformulate)
     {
      OtherClass o = new OtherClass(s);
      string result = Frob(o);
      if (reformulate) Reformulate();
      return result;
     }
    }
    

    For the purpose of this discussion, assume that the Frob method does not retain a reference to the object o passed as a parameter. When does the OtherClass object o become eligible for collection? A naïve answer would be that it becomes eligible for collection at the closing-brace of the SomeMethod method, since that's when the last reference (in the variable o) goes out of scope.

    A less naïve answer would be that it become eligible for collection after the return value from Frob is stored to the local variable result, because that's the last line of code which uses the variable o.

    A closer study would show that it becomes eligible for collection even sooner: Once the call to Frob returns, the variable o is no longer accessed, so the object could be collected even before the result of the call to Frob is stored into the local variable result. Optimizing compilers have known this for quite some time, and there is a strong likelihood that the variables o and result will occupy the same memory since their lifetimes do not overlap. Under such conditions, the code generation for the statement could very well be something like this:

      mov ecx, esi        ; load "this" pointer into ecx register
      mov edx, [ebp-8]    ; load parameter ("o") into edx register
      call SomeClass.Frob ; call method
      mov [ebp-8], eax    ; re-use memory for "o" as "result"
    

    But this closer study wasn't close enough. The OtherClass object o becomes eligible for collection even before the call to Frob returns! It is certainly eligible for collection at the point of the ret instruction which ends the Frob function: At that point, the Frob has finished using the object and won't access it again. Although somewhat of a technicality, it does illustrate that

    An object in a block of code can become eligible for collection during execution of a function it called.

    But let's dig deeper. Suppose that Frob looked like this:

    string Frob(OtherClass o)
    {
     string result = FrobColor(o.GetEffectiveColor());
    }
    

    When does the OtherClass object become eligible for collection? We saw above that it is certainly eligible for collection as soon as FrobColor returns, because the Frob method doesn't use o any more after that point. But in fact it is eligible for collection when the call to GetEffectiveColor returns—even before the FrobColor method is called—because the Frob method doesn't use it once it gets the effective color. This illustrates that

    A parameter to a method can become eligible for collection while the method is still executing.

    But wait, is that the earliest the OtherClass object becomes eligible for collection? Suppose that the OtherClass.GetEffectiveColor method went like this:

    Color GetEffectiveColor()
    {
     Color color = this.Color;
     for (OtherClass o = this.Parent; o != null; o = o.Parent) {
      color = BlendColors(color, o.Color);
     }
     return color;
    }
    

    Notice that the method doesn't access any members from its this pointer after the assignment o = this.Parent. As soon as the method retrieves the object's parent, the object isn't used any more.

      push ebp                    ; establish stack frame
      mov ebp, esp
      push esi
      push edi
      mov esi, ecx                ; enregister "this"
      mov edi, [ecx].color        ; color = this.Color // inlined
      jmp looptest
    loop:
      mov ecx, edi                ; load first parameter ("color")
      mov edx, [esi].color        ; load second parameter ("o.Color")
      call OtherClass.BlendColors ; BlendColors(color, o.Color)
      mov edi, eax
    looptest:
      mov esi, [esi].parent       ; o = this.Parent (or o.Parent) // inlined
      // "this" is now eligible for collection
      test esi, esi               ; if o == null
      jnz loop                    ; then rsetart loop
      mov eax, edi                ; return value
      pop edi
      pop esi
      pop ebp
      ret
    

    The last thing we ever do with the Other­Class object (presented in the Get­Effective­Color function by the keyword this) is fetch its parent. As soon that's done (indicated at the point of the comment, when the line is reached for the first time), the object becomes eligible for collection. This illustrates the perhaps-surprising result that

    An object can become eligible for collection during execution of a method on that very object.

    In other words, it is possible for a method to have its this object collected out from under it!

    A crazy way of thinking of when an object becomes eligible for collection is that it happens once memory corruption in the object would have no effect on the program. (Or, if the object has a finalizer, that memory corruption would affect only the finalizer.) Because if memory corruption would have no effect, then that means you never use the values any more, which means that the memory may as well have been reclaimed out from under you for all you know.

    A weird real-world analogy: The garbage collector can collect your diving board as soon as the diver touches it for the last time—even if the diver is still in the air!

    A customer encountered the Call­GC­Keep­Alive­When­Using­Native­Resources FxCop rule and didn't understand how it was possible for the GC to collect an object while one of its methods was still running. "Isn't this part of the root set?"

    Asking whether any particular value is or is not part of the root set is confusing the definition of garbage collection with its implementation. "Don't think of GC as tracing roots. Think of GC as removing things you aren't using any more."

    The customer responded, "Yes, I understand conceptually that it becomes eligible for collection, but how does the garbage collector know that? How does it know that the this object is not used any more? Isn't that determined by tracing from the root set?"

    Remember, the GC is in cahoots with the JIT. The JIT might decide to "help out" the GC by reusing the stack slot which previously held an object reference, leaving no reference on the stack and therefore no reference in the root set. Even if the reference is left on the stack, the JIT can leave some metadata behind that tells the GC, "If you see the instruction pointer in this range, then ignore the reference in this slot since it's a dead variable," similar to how in unmanaged code on non-x86 platforms, metadata is used to guide structured exception handling. (And besides, the this parameter isn't even passed on the stack in the first place.)

    The customer replied, "Gotcha. Very cool."

    Another customer asked, "Is there a way to get a reference to the instance being called for each frame in the stack? (Static methods excepted, of course.)" A different customer asked roughly the same question, but in a different context: "I want my method to walk up the stack, and if its caller is OtherClass.Foo, I want to get the this object for OtherClass.Foo so I can query additional properties from it." You now know enough to answer these questions yourself.

    Bonus: A different customer asked, "The Stack­Frame object lets me get the method that is executing in the stack frame, but how do I get the parameters passed to that method?"

  • The Old New Thing

    Polling by sleeping versus polling by waiting with a timeout

    • 34 Comments

    Commenter Francois Boucher asks it's better to write a background worker thread that polls with Sleep() and a flag, or polls by waiting for an event with a timeout?

    // method A
    while (fKeepGoing) {
     .. a little work ..
     Sleep(50);
    }
    
    // method B
    do {
     .. a little work ..
    } while (WaitForSingleObject(hEvent, 50) == WAIT_TIMEOUT);
    

    "Which scenario is better? The first one uses only 1 handle for the thread. The second one will use 2. But is the first scenario wasting more thread time? Is it worth using the event (kernel object)?"

    Yeah, whatever.

    I don't quite understand why you want to pause every so often. Why not just do the work at low priority? When there are more important things to do, your background thread will stop doing whatever it's doing. When there is an available CPU, then your background thread will do its thing as fast as it can (until something with higher priority arrives).

    The only thing I can think of is that by adding the pauses, your program won't look like it's consuming 100% CPU while the background processing is going on. Which is true, but then again, that's not much consolation. "Wow, with these changes, my spell check takes only 10% of the CPU. But then again, it also takes 10 times as long." Is that an improvement? You made your customer wait ten times as long for the document to be spell checked. That's ten times less battery life for your laptop.

    And generally speaking, polling should be avoided because it carries negative consequences for system performance. So we're basically asking, "Which is better, hammering with a screwdriver or hammering with pliers?"

    But let's say for the sake of argument that this "back off periodically" polling loop is actually the right design, and the only argument is which of the above two methods is "better" in terms of the criteria listed above (handles, thread time).

    It still doesn't matter.

    Method A has one fewer handle, but one more flag. So the total number of things you have to keep track of is the same either way.

    "Oh, but I save the overhead of an additional handle."

    Dude, you're already burning a thread. A single handle to an event is noise compared to the cost of a thread.

    "But I save the cost of validating the handle and following the reference to the underlying kernel object."

    Dude, you're about to go to sleep for 50 milliseconds. Saving a few thousand clocks is noise compared to 50 milliseconds.

    The flag method does have some other problems, none of which are deal-breaking, but are things to bear in mind.

    First, that flag that you're checking. There's no synchronization on that variable, so the background thread may run a smidge longer than necessary because the change hasn't yet propagated to the CPU running the loop. Similarly, the sleep loop does take a little longer to notice that it should stop working. If the fKeepGoing flag is set to FALSE during the sleep, the sleep will still run to completion before the loop finds out.

    In the grand scheme of things, however, the extra 50 to 100 milliseconds are probably not a big deal. The background thread is a little slow to shut down, that's all. The user will probably not even notice that the CPU meter was higher than normal for an additional tenth of a second. After all, the typical human reaction time is 100 milliseconds anyway.

    I'm assuming that the code that signals the background thread doesn't sit around waiting for the background thread to clean up. If it does, then this 100ms delay may start to combine with other delays to turn into something the user may start to notice. A tenth of a second here, a tenth of a second there, soon you may find yourself accumulating a half second's delay, and that's a delay the human brain can perceive.

  • The Old New Thing

    How do I mark a shortcut file as requiring elevation?

    • 25 Comments

    Specifying whether elevation is required is typically something that is the responsibility of the program. This is done by adding a requestedExecutionLevel element to your manifest. (Bart De Smet shows you how. Calvin Hsia does the same for your Visual FoxPro programs.) But if the program you're running doesn't have such a manifest—maybe it's an old program that you don't have any control over—you can create a shortcut to the program and mark the shortcut as requiring elevation.

    To do this, you set the SLDF_RUNAS_USER flag in the shortcut attributes. Here's a skeleton program that sets the flag on the shortcut whose path is passed on the command line. For expository purposes, I've skimped on the error reporting, and just to shake things up, I've used ATL smart pointers.

    #include <windows.h>
    #include <shlobj.h>
    #include <atlbase.h>
    
    void MarkShortcutRunAs(LPCWSTR pszShortcut)
    {
     CComPtr<IPersistFile> sppf;
     if (FAILED(sppf.CoCreateInstance(CLSID_ShellLink))) return;
     if (FAILED(sppf->Load(pszShortcut, STGM_READWRITE))) return;
     CComQIPtr<IShellLinkDataList> spdl(sppf);
     if (!spdl) return;
     DWORD dwFlags;
     if (FAILED(spdl->GetFlags(&dwFlags))) return;
     dwFlags |= SLDF_RUNAS_USER;
     if (FAILED(spdl->SetFlags(dwFlags))) return;
     if (FAILED(sppf->Save(NULL, TRUE))) return;
     wprintf(L"Succeeded\n");
    }
    
    int __cdecl wmain(int argc, wchar_t *argv[])
    {
     if (argc == 2 && SUCCEEDED(CoInitialize(NULL))) {
      MarkShortcutRunAs(argv[1]);
      CoUninitialize();
     }
     return 0;
    }
    

    There's not really much to this program. It creates a shell link object (CLSID_ShellLink) and asks it to load from the file whose path is given on the command line. It then uses IShellLinkDataList::GetFlags and IShellLinkDataList::SetFlags to fetch the old flags and set new flags that include SLDF_RUNAS_USER. Once that's done, it saves the result back out.

    The hard part was knowing that the SLDF_RUNAS_USER flag existed in the first place.

    (I fear that most people will read this article and say, "Awesome! My program requires elevation, and this is how I can mark my Start menu shortcut to prompt for elevation. Thanks, Raymond!" These people will have completely ignored the opening paragraph, which explains that that is the wrong thing to do.)

  • The Old New Thing

    How to insert a large number of items into a treeview efficiently

    • 18 Comments

    Just a quick tip today.

    If you need to insert a large number of items into a treeview, like tens of thousands, then it's much more efficient to insert them "backwards". (I'm ignoring for now the usability question of having a treeview that large in the first place.) In other words, instead of

    for (i = 0; i < array.Count(); i++) {
     TVINSERTSTRUCT tvis;
     tvis.hParent = hParentNode;
     tvis.hInsertAfter = TVIF_LAST;
     tvis.item.mask = TVIF_TEXT;
     item.item.pszText = array[i].Text();
     TreeView_InsertItem(hwndTreeView, &tvis);
    }
    
    do it this way:
    for (i = array.Count() - 1; i >= 0; i--) {
     TVINSERTSTRUCT tvis;
     tvis.hParent = hParentNode;
     tvis.hInsertAfter = TVIF_FIRST;
     tvis.item.mask = TVIF_TEXT;
     item.item.pszText = array[i].Text();
     TreeView_InsertItem(hwndTreeView, &tvis);
    }
    

    Why is backwards-insertion faster?

    It has to do with the annoying hInsert­After parameter. To validate that the hInsert­After parameter is valid, the treeview needs to verify that the hInsert­After is a valid child of the hParent, and this is done by walking the parent's children looking for a match. The sooner you find the match, the faster the validation completes. (And if you pass TVI_LAST, then the treeview needs to walk to the end of the child list.)

    You'd think that you could verify the parent/child relationship by just doing a Tree­View_Get­Parent(hInsert­After), but that turns out not to be strict enough, because hInsert­After might itself not be a valid parameter. If hInsert­After is a bogus value, then you may crash when you try to read its Parent property. That's if you're lucky. If you're not lucky, then the random memory that hInsert­After points to might look just enough like a valid HTREEITEM that you end up inserting the new node after a completely bogus node, and now the treeview has become corrupted.

    Sure, you got the same problem if you passed a garbage HTREEITEM to Tree­View_Get­Parent, but in that case, it's just garbage in garbage out. Nothing gets corrupted; the application just gets a garbage result. But in the case of Tree­View_Insert­Item, the treeview is going to update its internal data structures based on the garbage you passed in, and that means that the treeview winds up corrupted.

    To ensure that the value passed in is valid, the treeview checks it against the list of valid values for hInsert­After. And therefore, you get better performance if the valid value you pass is the one that it checks first.

    (As it happens, a lot of programs pass garbage for hInsert­After, so this defensive validation step is absolutely necessary.)

    You might say that the treeview could have a one-off optimization for the special TVI_LAST value by remembering the last child so it can be located quickly. The question is whether the complexity of adding that optimization is worth the cost: Any tree rearranging function would have to update the cached value, and if you miss a spot, then the treeview ends up corrupted. That's a pretty high-risk optimization you're suggesting there.

    And think about it this way: You're worrying about a tree where a single node has tens of thousands of children, something which (I am no longer ignoring) is a horrible user interface. That's like a list box with tens of thousands of items, or a dialog box with tens of thousands of checkboxes. You'll probably want to consider a better way of presenting the information than in a tree view that goes on for hundreds of screens. The treeview isn't optimized for this case because you don't optimize for the case where somebody is mis-using your system.

Page 4 of 441 (4,407 items) «23456»