  • The Old New Thing

    Everybody thinks about garbage collection the wrong way

    Welcome to CLR Week 2010. This year, CLR Week is going to be more philosophical than usual.

    When you ask somebody what garbage collection is, the answer you get is probably going to be something along the lines of "Garbage collection is when the operating environment automatically reclaims memory that is no longer being used by the program. It does this by tracing memory starting from roots to identify which objects are accessible."

    This description confuses the mechanism with the goal. It's like saying the job of a firefighter is "driving a red truck and spraying water." That's a description of what a firefighter does, but it misses the point of the job (namely, putting out fires and, more generally, fire safety).

    Garbage collection is simulating a computer with an infinite amount of memory. The rest is mechanism. And naturally, the mechanism is "reclaiming memory that the program wouldn't notice went missing." It's one giant application of the as-if rule.¹

    Now, with this view of the true definition of garbage collection, one result immediately follows:

    If the amount of RAM available to the runtime is greater than the amount of memory required by a program, then a memory manager which employs the null garbage collector (which never collects anything) is a valid memory manager.

    This is true because the memory manager can just allocate more RAM whenever the program needs it, and by assumption, this allocation will always succeed. A computer with more RAM than the memory requirements of a program has effectively infinite RAM, and therefore no simulation is needed.

    Sure, the statement may be obvious, but it's also useful, because the null garbage collector is both very easy to analyze and very different from garbage collectors you're more accustomed to seeing. You can therefore use it to produce results like this:

    A correctly-written program cannot assume that finalizers will ever run at any point prior to program termination.

    The proof of this is simple: Run the program on a machine with more RAM than the amount of memory required by the program. Under these circumstances, the null garbage collector is a valid garbage collector, and the null garbage collector never runs finalizers since it never collects anything.

    Garbage collection simulates infinite memory, but there are things you can do even if you have infinite memory that have visible effects on other programs (and possibly even on your program). If you open a file in exclusive mode, then the file will not be accessible to other programs (or even to other parts of your own program) until you close it. A connection that you open to a SQL server consumes resources in the server until you close it. Have too many of these connections outstanding, and you may run into a connection limit which blocks further connections. If you don't explicitly close these resources, then when your program is run on a machine with "infinite" memory, those resources will accumulate and never be released.
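
    This point is independent of the language or the runtime. As a rough Win32 C illustration (the file name example.dat is made up), a file opened with sharing disabled stays locked until somebody explicitly closes the handle; extra memory does nothing to release it:

    #include <stdio.h>
    #include <windows.h>
    
    int __cdecl main(void)
    {
     // Open the file for exclusive access (share mode = 0).
     HANDLE h = CreateFile(TEXT("example.dat"), GENERIC_READ, 0, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
     if (h == INVALID_HANDLE_VALUE) return 1;
    
     // While h is open, this second open fails with ERROR_SHARING_VIOLATION,
     // and so does any attempt by another program to open or delete the file.
     HANDLE h2 = CreateFile(TEXT("example.dat"), GENERIC_READ, 0, NULL,
                            OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
     printf("second open %s\n", h2 == INVALID_HANDLE_VALUE ? "failed" : "succeeded");
    
     // Only an explicit close releases the lock.
     CloseHandle(h);
     return 0;
    }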

    What this means for you: Your programs cannot rely on finalizers keeping things tidy. Finalizers are a safety net, not a primary means for resource reclamation. When you are finished with a resource, you need to release it by calling Close or Disconnect or whatever cleanup method is available on the object. (The IDisposable interface codifies this convention.)

    Furthermore, it turns out that not only can a correctly-written program not assume that finalizers will run during the execution of a program, it cannot even assume that finalizers will run when the program terminates: Although the .NET Framework will try to run them all, a bad finalizer will cause the .NET Framework to give up and abandon running finalizers. This can happen through no fault of your own: There might be a handle to a network resource that the finalizer is trying to release, but network connectivity problems result in the operation taking longer than two seconds, at which point the .NET Framework will just terminate the process. Therefore, the above result can be strengthened in the specific case of the .NET Framework:

    A correctly-written program cannot assume that finalizers will ever run.

    Armed with this knowledge, you can solve this customer's problem. (Confusing terminology is preserved from the original.)

    I have a class that uses XmlDocument. After the class is out of scope, I want to delete the file, but I get the exception System.IO.Exception: The process cannot access the file 'C:\path\to\file.xml' because it is being used by another process. Once the program exits, then the lock goes away. Is there any way to avoid locking the file?

    This follow-up might or might not help:

    A colleague suggested setting the XmlDocument variables to null when we're done with them, but shouldn't leaving the class scope have the same behavior?

    Bonus chatter: Finalizers are weird, since they operate "behind the GC." There are also lots of classes which operate "at the GC level", such as WeakReference, GCHandle, and of course System.GC itself. Using these classes properly requires understanding how they interact with the GC. We'll see more on this later.

    Unrelated reading: Precedence vs. Associativity Vs. Order.

    Footnote

    ¹ Note that by definition, the simulation extends only to garbage-collected resources. If your program allocates external resources, those external resources remain subject to whatever rules apply to them.

  • The Old New Thing

    On 64-bit Windows, 32-bit programs run in an emulation layer, and if you don't like that, then don't use the emulator

    On 64-bit Windows, 32-bit programs run in an emulation layer. This emulation layer simulates the x86 architecture, virtualizing the CPU, the file system, the registry, the environment variables, the system information functions, all that stuff. If a 32-bit program tries to look at the system, it will see a 32-bit system. For example, if the program calls the GetSystemInfo function to see what processor is running, it will be told that it's running on a 32-bit processor, with a 32-bit address space, in a world with a 32-bit sky and 32-bit birds in the 32-bit trees.

    And that's the point of the emulation: To keep the 32-bit program happy by simulating a 32-bit execution environment.

    Commenter Koro is writing an installer in the form of a 32-bit program that detects that it's running on a 64-bit system and wants to copy files (and presumably set registry entries and do other installery things) into the 64-bit directories, but the emulation layer redirects the operations into the 32-bit locations. The question is "What is the way of finding the x64 Program Files directory from a 32-bit application?"

    The answer is "It is better to work with the system than against it." If you're a 32-bit program, then you're going to be fighting against the emulator each time you try to interact with the outside world. Instead, just recompile your installer as a 64-bit program. Have the 32-bit installer detect that it's running on a 64-bit system and launch the 64-bit installer instead. The 64-bit installer will not run in the 32-bit emulation layer, so when it tries to copy a file or update a registry key, it will see the real 64-bit file system and the real 64-bit registry.
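
    A rough sketch of that hand-off (the 64-bit installer name setup64.exe is made up, and error handling is omitted): the 32-bit stub asks whether it is running under the WOW64 emulator and, if so, launches the 64-bit installer and gets out of the way.

    #include <windows.h>
    
    int __cdecl main(void)
    {
     BOOL isWow64 = FALSE;
     // A 32-bit process on 64-bit Windows runs under WOW64, so this reports TRUE.
     IsWow64Process(GetCurrentProcess(), &isWow64);
    
     if (isWow64) {
      // Hand off to the native 64-bit installer instead of fighting the emulator.
      STARTUPINFO si = { sizeof(si) };
      PROCESS_INFORMATION pi;
      if (CreateProcess(TEXT("setup64.exe"), NULL, NULL, NULL, FALSE, 0,
                        NULL, NULL, &si, &pi)) {
       WaitForSingleObject(pi.hProcess, INFINITE);
       CloseHandle(pi.hThread);
       CloseHandle(pi.hProcess);
      }
      return 0;
     }
    
     // Genuinely 32-bit Windows: do the 32-bit installation here.
     return 0;
    }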

  • The Old New Thing

    Does Windows have a limit of 2000 threads per process?

    Often I see people asking why they can't create more than around 2000 threads in a process. The reason is not that there is any particular limit inherent in Windows. Rather, the programmer failed to take into account the amount of address space each thread uses.

    A thread consists of some memory in kernel mode (kernel stacks and object management), some memory in user mode (the thread environment block, thread-local storage, that sort of thing), plus its stack. (Or stacks if you're on an Itanium system.)

    Usually, the limiting factor is the stack size.

    #include <stdio.h>
    #include <windows.h>
    
    DWORD CALLBACK ThreadProc(void*)
    {
     // Each thread just sleeps forever so that it stays alive.
     Sleep(INFINITE);
     return 0;
    }
    
    int __cdecl main(int argc, const char* argv[])
    {
     int i;
     // Keep creating threads until CreateThread fails.
     for (i = 0; i < 100000; i++) {
      DWORD id;
      HANDLE h = CreateThread(NULL, 0, ThreadProc, NULL, 0, &id);
      if (!h) break;
      CloseHandle(h);
     }
     printf("Created %d threads\n", i);
     return 0;
    }
    

    This program will typically print a value around 2000 for the number of threads.

    Why does it give up at around 2000?

    Because the default stack size assigned by the linker is 1MB, and 2000 stacks times 1MB per stack equals around 2GB, which is how much address space is available to user-mode programs.

    You can try to squeeze more threads into your process by reducing your stack size, which can be done either by tweaking linker options or by manually overriding the stack size passed to the CreateThread function, as described in MSDN.

      HANDLE h = CreateThread(NULL, 4096, ThreadProc, NULL,
                   STACK_SIZE_PARAM_IS_A_RESERVATION, &id);
    

    With this change, I was able to squeak in around 13000 threads. While that's certainly better than 2000, it's short of the naive expectation of 500,000 threads. (A thread is using 4KB of stack in 2GB address space.) But you're forgetting the other overhead. Address space allocation granularity is 64KB, so each thread's stack occupies 64KB of address space even if only 4KB of it is used; 2GB divided into 64KB slots gives you only around 32,000 of them. Plus of course you don't have free rein over all 2GB of the address space; there are system DLLs and other things occupying it.

    But the real question that is raised whenever somebody asks, "What's the maximum number of threads that a process can create?" is "Why are you creating so many threads that this even becomes an issue?"

    The "one thread per client" model is well-known not to scale beyond a dozen clients or so. If you're going to be handling more than that many clients simultaneously, you should move to a model where instead of dedicating a thread to a client, you instead allocate an object. (Someday I'll muse on the duality between threads and objects.) Windows provides I/O completion ports and a thread pool to help you convert from a thread-based model to a work-item-based model.

    Note that fibers do not help much here, because a fiber has a stack, and it is the address space required by the stack that is the limiting factor nearly all of the time.

  • The Old New Thing

    Why are INI files deprecated in favor of the registry?

    Welcome, Slashdot readers. Remember, this Web site is for entertainment purposes only.

    Why are INI files deprecated in favor of the registry? There were many problems with INI files.

    • INI files don't support Unicode. Even though there are Unicode versions of the private profile functions, they end up just writing ANSI text to the INI file. (There is a wacked out way you can create a Unicode INI file, but you have to step outside the API in order to do it.) This wasn't an issue in 16-bit Windows since 16-bit Windows didn't support Unicode either!
    • INI file security is not granular enough. Since it's just a file, any permissions you set are at the file level, not the key level. You can't say, "Anybody can modify this section, but that section can be modified only by administrators." This wasn't an issue in 16-bit Windows since 16-bit Windows didn't do security.
    • Multiple writers to an INI file can result in data loss. Consider two threads that are trying to update an INI file. If they are running simultaneously, you can get this:
      Thread 1                 Thread 2
      Read INI file
                               Read INI file
      Write INI file + X
                               Write INI file + Y
      Notice that thread 2's update to the INI file accidentally deleted the change made by thread 1. This wasn't a problem in 16-bit Windows since 16-bit Windows was co-operatively multi-tasked. As long as you didn't yield the CPU between the read and the write, you were safe because nobody else could run until you yielded.
    • INI files can suffer a denial of service. A program can open an INI file in exclusive mode and lock out everybody else. This is bad if the INI file was being used to hold security information, since it prevents anybody from seeing what those security settings are. This was also a problem in 16-bit Windows, but since there was no security in 16-bit Windows, a program that wanted to launch a denial of service attack on an INI file could just delete it!
    • INI files contain only strings. If you wanted to store binary data, you had to encode it somehow as a string.
    • Parsing an INI file is comparatively slow. Each time you read or write a value in an INI file, the file has to be loaded into memory and parsed. If you write three strings to an INI file, that INI file got loaded and parsed three times and got written out to disk three times. In 16-bit Windows, three consecutive INI file operations would result in only one parse and one write, because the operating system was co-operatively multi-tasked. When you accessed an INI file, it was parsed into memory and cached. The cache was flushed when you finally yielded CPU to another process.
    • Many programs open INI files and read them directly. This means that the INI file format is locked and cannot be extended. Even if you wanted to add security to INI files, you can't. What's more, many programs that parsed INI files were buggy, so in practice you couldn't store a string longer than about 70 characters in an INI file or you'd cause some other program to crash.
    • INI files are limited to 32KB in size.
    • The default location for INI files was the Windows directory! This definitely was bad for Windows NT since only administrators have write permission there.
    • INI files contain only two levels of structure. An INI file consists of sections, and each section consists of strings. You can't put sections inside other sections.
    • [Added 9am] Central administration of INI files is difficult. Since they can be anywhere in the system, a network administrator can't write a script that asks, "Is everybody using the latest version of Firefox?" They also can't deploy scripts that say "Set everybody's Firefox settings to XYZ and deny write access so they can't change them."

    The registry tried to address these concerns. You might argue whether these were valid concerns to begin with, but the Windows NT folks sure thought they were.
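
    To make the contrast concrete, here is a small sketch (the INI path, registry key, and value names are invented for illustration) that stores the same setting both ways; the profile API can only write a string into an ANSI text file, while the registry call stores a typed DWORD under a key that can carry its own access control.

    #include <windows.h>
    
    int __cdecl main(void)
    {
     // INI file: everything is a string, written as ANSI text.
     WritePrivateProfileString(TEXT("Settings"), TEXT("RetryCount"), TEXT("3"),
                               TEXT("C:\\example\\myapp.ini"));
    
     // Registry: the value keeps its type, and the key can be secured per-key.
     HKEY key;
     if (RegCreateKeyEx(HKEY_CURRENT_USER, TEXT("Software\\Example\\MyApp"),
                        0, NULL, 0, KEY_SET_VALUE, NULL, &key, NULL) == ERROR_SUCCESS) {
      DWORD retryCount = 3;
      RegSetValueEx(key, TEXT("RetryCount"), 0, REG_DWORD,
                    (const BYTE*)&retryCount, sizeof(retryCount));
      RegCloseKey(key);
     }
     return 0;
    }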

    Commenter TC notes that the pendulum has swung back to text configuration files, but this time, they're XML. This reopens many of the problems that INI files had, but you have the major advantage that nobody writes to XML configuration files; they only read from them. XML configuration files are not used to store user settings; they just contain information about the program itself. Let's look at those issues again.

    • XML files support Unicode.
    • XML file security is not granular enough. But since the XML configuration file is read-only, the primary objection is sidestepped. (But if you want only administrators to have permission to read specific parts of the XML, then you're in trouble.)
    • Since XML configuration files are read-only, you don't have to worry about multiple writers.
    • XML configuration files can suffer a denial of service. You can still open them exclusively and lock out other processes.
    • XML files contain only strings. If you want to store binary data, you have to encode it somehow.
    • Parsing an XML file is comparatively slow. But since they're read-only, you can safely cache the parsed result, so you only need to parse once.
    • Programs parse XML files manually, but the XML format is already locked, so you couldn't extend it anyway even if you wanted to. Hopefully, those programs use a standard-conforming XML parser instead of rolling their own, but I wouldn't be surprised if people wrote their own custom XML parser that chokes on, say, processing instructions or strings longer than 70 characters.
    • XML files do not have a size limit.
    • XML files do not have a default location.
    • XML files have complex structure. Elements can contain other elements.

    XML manages to sidestep many of the problems that INI files have, but only if you promise only to read from them (and only if everybody agrees to use a standard-conforming parser), and if you don't require security granularity beyond the file level. Once you write to them, then a lot of the INI file problems return.

  • The Old New Thing

    Why aren't console windows themed on Windows XP?

    Commenter Andrej Budja asks why cmd.exe is not themed in Windows XP. (This question was repeated by Serge Wautier, proving that nobody checks whether their suggestion has already been submitted before adding their own. It was also asked by a commenter who goes by the name "S", and then repeated again just a few hours later, which proves again that nobody reads the comments either.) Knowledge Base article 306509 explains that this behavior exists because the command prompt window (like all console windows) is run under the Client/Server Runtime Subsystem (CSRSS), and CSRSS cannot be themed.

    But why can't CSRSS be themed?

    CSRSS runs as a system service, so any code that runs as part of CSRSS creates potential for mass havoc. The slightest mis-step could crash CSRSS, and with it the entire system. The CSRSS team decided that they didn't want to take the risk of allowing the theme code to run in their process, so they disabled theming for console windows. (There's also an architectural reason why CSRSS cannot use the theming services: CSRSS runs as a subsystem, and the user interface theming services assume that they're running as part of a Win32 program.)

    In Windows Vista, the window frame is drawn by the desktop window manager, which means that your console windows on Vista get the glassy frame just like other windows. But if you take a closer look, you will see that CSRSS itself doesn't use themed windows: Notice that the scroll bars retain the classic look.

    The window manager giveth and the window manager taketh away, for at the same time console windows gained the glassy frame, they also lost drag and drop. You used to be able to drag a file out of Explorer and drop it onto a command prompt, but if you try that in Windows Vista, nothing happens. This is a consequence of tighter security around the delivery of messages from a process running at lower integrity to one running at a higher integrity level (see UIPI). Since CSRSS is a system process, it runs at a very high integrity level and won't let any random program (like Explorer) send it messages, such as the ones used to mediate OLE drag and drop. You'll see the same thing if you log on as a restricted administrator and then kick off an elevated copy of Notepad. You won't be able to drag a file out of Explorer and drop it onto Notepad, for the same reason.

  • The Old New Thing

    Why does Windows not recognize my USB device as the same device if I plug it into a different port?

    You may have noticed that if you take a USB device and plug it into your computer, Windows recognizes it and configures it. Then if you unplug it and replug it into a different USB port, Windows gets a bout of amnesia and thinks that it's a completely different device instead of using the settings that applied when you plugged it in last time. Why is that?

    The USB device people explained that this happens when the device lacks a USB serial number.

    Serial numbers are optional on USB devices. If the device has one, then Windows recognizes the device no matter which USB port you plug it into. But if it doesn't have a serial number, then Windows treats each appearance on a different USB port as if it were a new device.

    (I remember that one major manufacturer of USB devices didn't quite understand how serial numbers worked. They gave all of their devices serial numbers, that's great, but they all got the same serial number. Exciting things happened if you plugged two of their devices into a computer at the same time.)

    But why does Windows treat it as a different device if it lacks a serial number and shows up on a different port? Why can't it just say, "Oh, there you are, over there on another port"?

    Because that creates random behavior once you plug in two such devices. Depending on the order in which the devices get enumerated by Plug and Play, the two sets of settings would get assigned seemingly randomly at each boot. Today the settings match up one way, but tomorrow when the devices are enumerated in the other order, the settings are swapped. (You get similarly baffling behavior if you plug in the devices in different order.)

    In other words: Things suck because (1) things were already in bad shape—this would not have been a problem if the device had a proper serial number—and (2) once you're in this bad state, the alternative sucks more. The USB stack is just trying to make the best of a bad situation without making it any worse.

  • The Old New Thing

    No matter where you put an advanced setting, somebody will tell you that you are an idiot

    There are advanced settings in Windows, settings which normal users not only shouldn't be messing with, but which they arguably shouldn't even know about, because that would give them just enough knowledge to be dangerous. And no matter where you put that advanced setting, somebody will tell you that you are an idiot.

    Here they are on an approximate scale. If you dig through the comments on this blog, you can probably find every single position represented somewhere.

    1. It's okay if the setting is hidden behind a registry key. I know how to set it myself.
    2. I don't want to mess with the registry. Put the setting in a configuration file that I pass to the installer.
    3. I don't want to write a configuration file. The program should have an Advanced button that calls up a dialog which lets the user change the advanced setting.
    4. Every setting must be exposed in the user interface.
    5. Every setting must be exposed in the user interface by default. Don't make me call up the extended context menu.
    6. The first time the user does X, show the user a dialog asking whether they want to change the advanced setting.

    If you implement level N, people will demand that you implement level N+1. It doesn't stop until you reach the last step, which is aggressively user-hostile. (And then there will also be people who complain that you went too far.)

    From a technical standpoint, each of the above steps is about ten to a hundred times harder than the previous one. If you put it in a configuration file, you have to write code to load a parser and extract the value. If you want an Advanced button, now you have to write a dialog box (which is already a lot of work), consult with the usability and user assistance teams to come up with the correct wording for the setting, write help text, provide guidance to the translators, and now, since it is exposed in the user interface, you need to write automated tests and add the setting to the test matrices. It's a huge amount of work to add a dialog box, work that could be spent on something that benefits a much larger set of customers in a more direct manner.

    That's why most advanced settings hang out at level 1 or, in the case of configuring program installation, level 2. If you're so much of a geek that you want to change these advanced settings, it probably won't kill you to fire up a text editor and write a little configuration file.

    Sidebar

    Joel's count of "fifteen ways to shut down Windows" is a bit disingenuous, since he's counting six hardware affordances: "Four FN+key combinations... an on-off button... you can close the lid." Okay, fine, Joel, we'll play it your way. Your proposal to narrow it down to one "Bye" button still leaves seven ways to shut down Windows.

    And then people will ask how to log off.

  • The Old New Thing

    We can't cut that; it's our last feature

    Many years ago, I was asked to help a customer with a problem they were having. I don't remember the details, and they aren't important to the story anyway, but as I was investigating one of their crashes, I started to wonder why they were even doing it.

    I expressed my concerns to the customer liaison. "Why are they writing this code in the first place? The performance will be terrible, and it'll never work exactly the way they want it to."

    The customer liaison confided, "Yeah, I thought the same thing. But this is a feature they're adding to the next version of their product. The product is so far behind schedule, they've been cutting features like mad to get back on track. But they can't cut this feature. It's the last one left!"

  • The Old New Thing

    What does an invalid handle exception in LeaveCriticalSection mean?

    Internally, a critical section is a bunch of counters and flags, and possibly an event. (Note that the internal structure of a critical section is subject to change at any time—in fact, it changed between Windows XP and Windows 2003. The information provided here is therefore intended for troubleshooting and debugging purposes and not for production use.) As long as there is no contention, the counters and flags are sufficient because nobody has had to wait for the critical section (and therefore nobody had to be woken up when the critical section became available).

    If a thread needs to be blocked because the critical section it wants is already owned by another thread, the kernel creates an event for the critical section (if there isn't one already) and waits on it. When the owner of the critical section finally releases it, the event is signaled, thereby alerting all the waiters that the critical section is now available and they should try to enter it again. (If there is more than one waiter, then only one will actually enter the critical section and the others will return to the wait loop.)

    If you get an invalid handle exception in LeaveCriticalSection, it means that the critical section code thought that there were other threads waiting for the critical section to become available, so it tried to signal the event, but the event handle was no good.

    Now you get to use your brain to come up with reasons why this might be.

    One possibility is that the critical section has been corrupted, and the memory that normally holds the event handle has been overwritten with some other value that happens not to be a valid handle.

    Another possibility is that some other piece of code passed an uninitialized variable to the CloseHandle function and ended up closing the critical section's handle by mistake. This can also happen if some other piece of code has a double-close bug, and the handle (now closed) just happened to be reused as the critical section's event handle. When the buggy code closes the handle the second time by mistake, it ends up closing the critical section's handle instead.

    Of course, the problem might be that the critical section is not valid because it was never initialized in the first place. The values in the fields are just uninitialized garbage, and when you try to leave this uninitialized critical section, that garbage gets used as an event handle, raising the invalid handle exception.

    Then again, the problem might be that the critical section is not valid because it has already been destroyed. For example, one thread might have code that goes like this:

    EnterCriticalSection(&cs);
    ... do stuff...
    LeaveCriticalSection(&cs);
    

    While that thread is busy doing stuff, another thread calls DeleteCriticalSection(&cs). This destroys the critical section while another thread was still using it. Eventually that thread finishes doing its stuff and calls LeaveCriticalSection, which raises the invalid handle exception because the DeleteCriticalSection already closed the handle.

    All of these are possible reasons for an invalid handle exception in LeaveCriticalSection. To determine which one you're running into will require more debugging, but at least now you know what to be looking for.

    Postscript: One of my colleagues from the kernel team points out that the Locks and Handles checks in Application Verifier are great for debugging issues like this.

  • The Old New Thing

    Quick overview of how processes exit on Windows XP

    Exiting is one of the scariest moments in the lifetime of a process. (Sort of how landing is one of the scariest moments of air travel.)

    Many of the details of how processes exit are left unspecified in Win32, so different Win32 implementations can follow different mechanisms. For example, Win32s, Windows 95, and Windows NT all shut down processes differently. (I wouldn't be surprised if Windows CE uses yet another different mechanism.) Therefore, bear in mind that what I write in this mini-series is implementation detail and can change at any time without warning. I'm writing about it because these details can highlight bugs lurking in your code. In particular, I'm going to discuss the way processes exit on Windows XP.

    I should say up front that I do not agree with many steps in the way processes exit on Windows XP. The purpose of this mini-series is not to justify the way processes exit but merely to fill you in on some of the behind-the-scenes activities so you are better armed when you have to investigate a mysterious crash or hang during exit. (Note that I just refer to it as the way processes exit on Windows XP rather than saying that it is how process exit is designed. As one of my colleagues put it, "Using the word design to describe this is like using the term swimming pool to refer to a puddle in your garden.")

    When your program calls ExitProcess a whole lot of machinery springs into action. First, all the threads in the process (except the one calling ExitProcess) are forcibly terminated. This dates back to the old-fashioned theory on how processes should exit: Under the old-fashioned theory, when your process decides that it's time to exit, it should already have cleaned up all its threads. The termination of threads, therefore, is just a safety net to catch the stuff you may have missed. It doesn't even wait two seconds first.

    Now, we're not talking happy termination like ExitThread; that's not possible since the thread could be in the middle of doing something. Injecting a call to ExitThread would result in DLL_THREAD_DETACH notifications being sent at times the thread was not prepared for. Nope, these threads are terminated in the style of TerminateThread: Just yank the rug out from under it. Buh-bye. This is an ex-thread.

    Well, that was a pretty drastic move, now, wasn't it. And all this after the scary warnings in MSDN that TerminateThread is a bad function that should be avoided!

    Wait, it gets worse.

    Some of those threads that got forcibly terminated may have owned critical sections, mutexes, home-grown synchronization primitives (such as spin-locks), all those things that the one remaining thread might need access to during its DLL_PROCESS_DETACH handling. Well, mutexes are sort of covered; if you try to enter that mutex, you'll get the mysterious WAIT_ABANDONED return code which tells you that "Uh-oh, things are kind of messed up."
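
    Here's a rough sketch of what the surviving thread sees (TerminateThread stands in for what process exit does to the other threads, and the timing is deliberately simplistic): the wait on the mutex still succeeds, but the return code warns you that the previous owner died while holding it.

    #include <stdio.h>
    #include <windows.h>
    
    HANDLE g_mutex;
    
    DWORD CALLBACK DoomedThread(void*)
    {
     WaitForSingleObject(g_mutex, INFINITE); // acquire the mutex...
     Sleep(INFINITE);                        // ...and never release it
     return 0;
    }
    
    int __cdecl main(void)
    {
     DWORD id;
     g_mutex = CreateMutex(NULL, FALSE, NULL);
     HANDLE h = CreateThread(NULL, 0, DoomedThread, NULL, 0, &id);
     Sleep(100);            // let the thread acquire the mutex (sketch, not robust)
     TerminateThread(h, 0); // simulate the forcible termination described above
    
     if (WaitForSingleObject(g_mutex, INFINITE) == WAIT_ABANDONED) {
      // We own the mutex now, but whatever it protects may be inconsistent.
      printf("mutex was abandoned by its previous owner\n");
     }
     CloseHandle(h);
     return 0;
    }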

    What about critical sections? There is no "Uh-oh" return value for critical sections; EnterCriticalSection doesn't have a return value. Instead, the kernel just says "Open season on critical sections!" I get the mental image of all the gates in a parking garage just opening up and letting anybody in and out. [See correction.]

    As for the home-grown stuff, well, you're on your own.

    This means that if your code happened to have owned a critical section at the time somebody called ExitProcess, the data structure the critical section is protecting has a good chance of being in an inconsistent state. (After all, if it were consistent, you probably would have exited the critical section! Well, assuming you entered the critical section because you were updating the structure as opposed to reading it.) Your DLL_PROCESS_DETACH code runs, enters the critical section, and it succeeds because "all the gates are up". Now your DLL_PROCESS_DETACH code starts behaving erratically because the values in that data structure are inconsistent.

    Oh dear, now you have a pretty ugly mess on your hands.

    And if your thread was terminated while it owned a spin-lock or some other home-grown synchronization object, your DLL_PROCESS_DETACH will most likely simply hang indefinitely waiting patiently for that terminated thread to release the spin-lock (which it never will do).

    But wait, it gets worse. That critical section might have been the one that protects the process heap! If one of the threads that got terminated happened to be in the middle of a heap function like HeapAlloc or LocalFree, then the process heap may very well be inconsistent. If your DLL_PROCESS_DETACH tries to allocate or free memory, it may crash due to a corrupted heap.

    Moral of the story: If you're getting a DLL_PROCESS_DETACH due to process termination,† don't try anything clever. Just return without doing anything and let the normal process clean-up happen. The kernel will close all your open handles to kernel objects. Any memory you allocated will be freed automatically when the process's address space is torn down. Just let the process die a quiet death.

    Note that if you were a good boy and cleaned up all the threads in the process before calling ExitProcess, then you've escaped all this craziness, since there is nothing to clean up.

    Note also that if you're getting a DLL_PROCESS_DETACH due to dynamic unloading, then you do need to clean up your kernel objects and allocated memory because the process is going to continue running. But on the other hand, in the case of dynamic unloading, no other threads should be executing code in your DLL anyway (since you're about to be unloaded), so—assuming you coded up your DLL correctly—none of your critical sections should be held and your data structures should be consistent.
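
    Putting the termination case and the dynamic-unload case together, a minimal DllMain sketch (FreeMyResources is a placeholder): for DLL_PROCESS_DETACH, the lpvReserved parameter is non-NULL when the process is terminating and NULL for a dynamic unload, so the termination case can simply bail out.

    #include <windows.h>
    
    // Placeholder for whatever resources this DLL would normally release.
    static void FreeMyResources(void) { /* close handles, free memory, ... */ }
    
    BOOL WINAPI DllMain(HINSTANCE hinst, DWORD reason, void* reserved)
    {
     if (reason == DLL_PROCESS_DETACH) {
      if (reserved != NULL) {
       // Process termination: other threads are already gone, locks are
       // unreliable, and the heap may be corrupted. Do nothing; the kernel
       // will clean up handles and memory when the process goes away.
       return TRUE;
      }
      // Dynamic unload via FreeLibrary: the process keeps running, so this
      // DLL really does need to release what it allocated.
      FreeMyResources();
     }
     return TRUE;
    }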

    Hang on, this disaster isn't over yet. Even though the kernel went around terminating all but one thread in the process, that doesn't mean that the creation of new threads is blocked. If somebody calls CreateThread in their DLL_PROCESS_DETACH (as crazy as it sounds), the thread will indeed be created and start running! But remember, "all the gates are up", so your critical sections are just window dressing to make you feel good.

    (The ability to create threads after process termination has begun is not a mistake; it's intentional and necessary. Thread injection is how the debugger breaks into a process. If thread injection were not permitted, you wouldn't be able to debug process termination!)

    Next time, we'll see how the way process termination takes place on Windows XP caused not one but two problems.

    Footnotes

    †Everybody reading this article should already know how to determine whether this is the case. I'm assuming you're smart. Don't disappoint me.
