July, 2005

  • The Old New Thing

    Does Windows have a limit of 2000 threads per process?

    Often I see people asking why they can't create more than around 2000 threads in a process. The reason is not that there is any particular limit inherent in Windows. Rather, the programmer failed to take into account the amount of address space each thread uses.

    A thread consists of some memory in kernel mode (kernel stacks and object management), some memory in user mode (the thread environment block, thread-local storage, that sort of thing), plus its stack. (Or stacks if you're on an Itanium system.)

    Usually, the limiting factor is the stack size.

    #include <stdio.h>
    #include <windows.h>
    
    DWORD CALLBACK ThreadProc(void*)
    {
     Sleep(INFINITE);
     return 0;
    }
    
    int __cdecl main(int argc, const char* argv[])
    {
     int i;
     for (i = 0; i < 100000; i++) {
      DWORD id;
      HANDLE h = CreateThread(NULL, 0, ThreadProc, NULL, 0, &id);
      if (!h) break;
      CloseHandle(h);
     }
     printf("Created %d threads\n", i);
     return 0;
    }
    

    This program will typically print a value around 2000 for the number of threads.

    Why does it give up at around 2000?

    Because the default stack size assigned by the linker is 1MB, and 2000 stacks times 1MB per stack equals around 2GB, which is how much address space is available to user-mode programs.

    You can try to squeeze more threads into your process by reducing your stack size, which can be done either by tweaking linker options or by manually overriding the stack size passed to the CreateThread function, as described in MSDN.

      HANDLE h = CreateThread(NULL, 4096, ThreadProc, NULL,
                   STACK_SIZE_PARAM_IS_A_RESERVATION, &id);
    

    With this change, I was able to squeak in around 13000 threads. While that's certainly better than 2000, it's short of the naive expectation of 500,000 threads. (Each thread uses 4KB of stack, and 2GB of address space divided by 4KB is roughly 500,000.) But that expectation forgets the other overhead. Address space allocation granularity is 64KB, so each thread's stack occupies 64KB of address space even if only 4KB of it is used. Plus of course you don't have free rein over all 2GB of the address space; there are system DLLs and other things occupying it.
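    The arithmetic behind those numbers can be sketched as follows. This is a back-of-the-envelope model, not a measurement: the function names are made up for illustration, and it assumes thread stacks are the only consumer of address space, which the paragraph above says is not true in practice.

```cpp
#include <cassert>
#include <cstdint>

// Round a requested stack reservation up to the allocation granularity
// (64KB in the article), since address space is handed out in those units.
// Hypothetical helper; not a Windows API.
constexpr std::uint64_t RoundUpToGranularity(std::uint64_t bytes,
                                             std::uint64_t granularity)
{
    return (bytes + granularity - 1) / granularity * granularity;
}

// Upper bound on the thread count if stacks were the only thing occupying
// the address space. Real processes land lower because DLLs, heaps, and
// other per-thread memory also take up room.
inline std::uint64_t MaxThreadsByStack(std::uint64_t addressSpace,
                                       std::uint64_t stackReserve,
                                       std::uint64_t granularity)
{
    assert(granularity != 0 && stackReserve != 0);
    return addressSpace / RoundUpToGranularity(stackReserve, granularity);
}
```

    With 2GB of address space, a 1MB reservation gives a bound of 2048 threads, and a 4KB reservation rounded up to 64KB gives 32768. The observed 13000 sits below that second bound, which fits the point that the stack is not the only thing occupying the address space.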

    But the real question that is raised whenever somebody asks, "What's the maximum number of threads that a process can create?" is "Why are you creating so many threads that this even becomes an issue?"

    The "one thread per client" model is well-known not to scale beyond a dozen clients or so. If you're going to be handling more than that many clients simultaneously, you should move to a model where instead of dedicating a thread to a client, you instead allocate an object. (Someday I'll muse on the duality between threads and objects.) Windows provides I/O completion ports and a thread pool to help you convert from a thread-based model to a work-item-based model.

    Note that fibers do not help much here, because a fiber has a stack, and it is the address space required by the stack that is the limiting factor nearly all of the time.

  • The Old New Thing

    What's the difference between My Documents and Application Data?

    The most important difference between My Documents and Application Data is that My Documents is where users store their files, whereas Application Data is where programs store their files.

    In other words, if you put something in CSIDL_MYDOCUMENTS (My Documents), you should expect the user to be renaming it, moving it, deleting it, emailing it to their friends, all the sorts of things users do with their files. Therefore, files that go there should be things that users will recognize as "their stuff". Documents they've created, music they've downloaded, that sort of thing.

    On the other hand, if you put something in CSIDL_APPDATA (Application Data), the user is less likely to be messing with it. This is where you put your program's supporting data that isn't really something you want the user messing with, but which should still be associated with the user. High score tables, program settings, customizations, spell check exceptions...

    There is another directory called CSIDL_LOCAL_APPDATA (Local Settings\Application Data) which acts like CSIDL_APPDATA, except that it does not get copied if the user profile roams. (The "Local Settings" branch is not copied as part of the roaming user profile.) Think of it as a per-user-per-machine storage location. Caches and similar non-essential data should be kept here, especially if they are large. Other examples of non-roaming per-user data are your %TEMP% and Temporary Internet Files directories.

  • The Old New Thing

    What are SYSTEM_FONT and DEFAULT_GUI_FONT?

    Among the things you can get with the GetStockObject function are two fonts called SYSTEM_FONT and DEFAULT_GUI_FONT. What are they?

    They are fonts nobody uses any more.

    Back in the old days of Windows 2.0, the font used for dialog boxes was a bitmap font called System. This is the font that SYSTEM_FONT retrieves, and it is still the default dialog box font for compatibility reasons. Of course, nobody nowadays would ever use such an ugly font for their dialog boxes. (Among other things, it's a bitmap font and therefore does not look good at high resolutions, nor can it be anti-aliased.)

    DEFAULT_GUI_FONT has an even less illustrious history. It was created during Windows 95 development in the hopes of becoming the new default GUI font, but by July 1994, Windows itself stopped using it in favor of the various fonts returned by the SystemParametersInfo function. Its existence is now vestigial.

    One major gotcha with SYSTEM_FONT and DEFAULT_GUI_FONT is that on a typical US-English machine, they map to bitmap fonts that do not support ClearType.

  • The Old New Thing

    The apocryphal history of file system tunnelling

    One of the file system features you may find yourself surprised by is tunneling, wherein the creation timestamp and short/long names of a file are taken from a file that existed in the directory previously. In other words, if you delete some file "File with long name.txt" and then create a new file with the same name, that new file will have the same short name and the same creation time as the original file. You can read this KB article for details on what operations are sensitive to tunneling.

    Why does tunneling exist at all?

    When you use a program to edit an existing file, then save it, you expect the original creation timestamp to be preserved, since you're editing a file, not creating a new one. But internally, many programs save a file by performing a combination of save, delete, and rename operations (such as the ones listed in the linked article), and without tunneling, the creation time of the file would seem to change even though from the end user's point of view, no file got created.
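    The creation-time half of that behavior can be illustrated with a toy model. This is only a sketch under loud assumptions: ToyDirectory, its methods, the integer time unit, and the expiry window are all invented for illustration, and it says nothing about how the real file system stores or expires its tunneling data.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: the class, the time unit, and the expiry window are
// assumptions made for this sketch, not the file system's implementation.
class ToyDirectory {
public:
    explicit ToyDirectory(long tunnelWindow) : tunnelWindow_(tunnelWindow) {}

    // Creates a file. If a file of the same name was deleted recently
    // enough, the new file inherits the old creation time.
    void Create(const std::string& name, long now) {
        long created = now;
        auto it = tunnel_.find(name);
        if (it != tunnel_.end()) {
            if (now - it->second.deletedAt <= tunnelWindow_) {
                created = it->second.creationTime;
            }
            tunnel_.erase(it);
        }
        creationTime_[name] = created;
    }

    // Deletes a file and remembers its creation time for later reuse.
    void Delete(const std::string& name, long now) {
        auto it = creationTime_.find(name);
        if (it == creationTime_.end()) return;
        tunnel_[name] = Remembered{it->second, now};
        creationTime_.erase(it);
    }

    // Throws std::out_of_range if the file does not exist.
    long CreationTime(const std::string& name) const {
        return creationTime_.at(name);
    }

private:
    struct Remembered { long creationTime; long deletedAt; };
    long tunnelWindow_;
    std::map<std::string, long> creationTime_;
    std::map<std::string, Remembered> tunnel_;
};
```

    In this model, a delete followed quickly by a create of the same name (the "save by delete and recreate" pattern) keeps the original creation time, while a create that happens after the window has elapsed gets a fresh one.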

    As another example of the importance of tunneling, consider that file "File with long name.txt", whose short name is, say, "FILEWI~1.TXT". You load this file into a program that is not long-filename-aware and save it. It deletes the old "FILEWI~1.TXT" and creates a new one with the same name. Without tunneling, the associated long name of the file would be lost. Instead of a friendly long name, the file would end up with a corrupted-looking name full of squiggly marks. Not good.

    But where did the name "tunneling" come from?

    From quantum mechanics.

    Consider the following analogy: You have two holes in the ground, and a particle is in the first hole (A) and doesn't have enough energy to get out. It only has enough energy to get as high as the dotted line.

             
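    [Figure: two holes in the ground, labeled A and B, side by side; a dotted line below the top of the barrier between them marks how high the particle's energy can take it.]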

    You get distracted for a little while, maybe watch the Super Bowl halftime show, and when you come back, the particle somehow is now in hole B. This is impossible in classical mechanics, but thanks to the wacky world of quantum mechanics, it is not only possible, but actually happens. The phenomenon is known as tunneling, because it's as if the particle "dug a tunnel" between the two holes, thereby allowing it to get from one hole to another without ever going above the dotted line.

    In the case of file system tunneling, it is information that appears to violate the laws of classical mechanics. The information was destroyed (by deleting or renaming the file), yet somehow managed to reconstruct itself on the other side of a temporal barrier.

    The developer who was responsible for implementing tunneling on Windows 95 got kind of carried away with the quantum mechanics analogy: The fragments of information about recently-deleted or recently-renamed files are kept in data structures called "quarks".

  • The Old New Thing

    Where did the names of the computer Hearts opponents come from?

    A Windows 95 story in commemoration of the tenth anniversary of its release to manufacturing (RTM).

    Danny Glasser explains where the names for the computer opponents in the game Hearts came from.

    I didn't myself know where the names came from, but Danny's explanation of the source of the Windows 95 names brought back memories of the child of one of our co-workers, whose name I will not reveal but you can certainly narrow it down to one of three. He/she was exceedingly well-behaved and definitely helped to make those long hours slightly more tolerable. I remember once we heard the receptionist's voice come over the public address system, which was itself quite a shock because nobody ever uses the public address system. The message was, "Will X please come to the receptionist's desk. Your son/daughter is here."

    Space Cadet JimH picks up the story and explains how he went about writing the computer player logic. (And no, the computer players don't cheat.)

  • The Old New Thing

    What is the difference between WM_DESTROY and WM_NCDESTROY?

    There are two window messages closely associated with window destruction, the WM_DESTROY message and the WM_NCDESTROY message. What's the difference?

    The difference is that the WM_DESTROY message is sent at the start of the window destruction sequence, whereas the WM_NCDESTROY message is sent at the end. This is an important distinction when you have child windows. If you have a parent window with a child window, then the message traffic (in the absence of weirdness) will go like this:

    hwnd = parent, uMsg = WM_DESTROY
    hwnd = child, uMsg = WM_DESTROY
    hwnd = child, uMsg = WM_NCDESTROY
    hwnd = parent, uMsg = WM_NCDESTROY
    

    Notice that the parent receives the WM_DESTROY before the child windows are destroyed, and it receives the WM_NCDESTROY message after they have been destroyed.

    Having two destruction messages, one sent top-down and the other bottom-up, means that you can perform clean-up appropriate to a particular model when handling the corresponding message. If there is something that must be cleaned up top-down, then you can use the WM_DESTROY message, for example.
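    The top-down and bottom-up ordering can be modeled as a recursive walk over a window tree. This is a simulation, not window manager code: ToyWindow and DestroyToyWindow are hypothetical names, and the sketch ignores the "weirdness" mentioned above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a window and its children; not a real HWND.
struct ToyWindow {
    std::string name;
    std::vector<ToyWindow> children;
};

// Appends "name:MESSAGE" entries in the order described in the article:
// the window gets WM_DESTROY first, then each child is destroyed
// completely, and only then does the window get WM_NCDESTROY.
inline void DestroyToyWindow(const ToyWindow& w, std::vector<std::string>& log)
{
    log.push_back(w.name + ":WM_DESTROY");
    for (const ToyWindow& child : w.children) {
        DestroyToyWindow(child, log);
    }
    log.push_back(w.name + ":WM_NCDESTROY");
}
```

    For a parent with a single child, the log comes out in the same order as the message trace shown earlier: the parent's WM_DESTROY, the child's WM_DESTROY, the child's WM_NCDESTROY, and the parent's WM_NCDESTROY.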

    The WM_NCDESTROY message is the last message your window will receive (in the absence of weirdness), and it is therefore the best place to do "final cleanup". This is why our new scratch program waits until WM_NCDESTROY to destroy its instance variables.

    These two destruction messages are paired with the analogous WM_CREATE and WM_NCCREATE messages. Just as WM_NCDESTROY is the last message your window receives, the WM_NCCREATE message is the first message, so that's a good place to create your instance variables. Note also that if you cause the WM_NCCREATE message to return failure, then all you will get is WM_NCDESTROY; there will be no WM_DESTROY since you never got the corresponding WM_CREATE.

    What's this "absence of weirdness" I keep alluding to? We'll look at that next time.

    [Typos corrected, 9:30am]

  • The Old New Thing

    Why does FindFirstFile find short names?

    The FindFirstFile function matches both the short and long names. This can produce somewhat surprising results. For example, if you ask for "*.htm", this also gives you the file "x.html" since its short name is "X~1.HTM".
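    The effect can be reproduced with a toy matcher that tests a pattern against both names of an entry. This is a simplified sketch: WildcardMatch, ToyEntry, and MatchesEitherName are hypothetical names, and the matcher handles only '*' and '?' case-insensitively, without the extra quirks of the real Windows wildcard rules.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Case-insensitive match of `name` against `pattern`, where '*' matches any
// run of characters (including none) and '?' matches exactly one character.
inline bool WildcardMatch(const std::string& pattern, const std::string& name,
                          std::size_t p = 0, std::size_t n = 0)
{
    if (p == pattern.size()) return n == name.size();
    if (pattern[p] == '*') {
        // Let '*' absorb zero or more characters and try the rest.
        for (std::size_t skip = n; skip <= name.size(); ++skip) {
            if (WildcardMatch(pattern, name, p + 1, skip)) return true;
        }
        return false;
    }
    if (n == name.size()) return false;
    const bool same = pattern[p] == '?' ||
        std::toupper(static_cast<unsigned char>(pattern[p])) ==
        std::toupper(static_cast<unsigned char>(name[n]));
    return same && WildcardMatch(pattern, name, p + 1, n + 1);
}

// Hypothetical directory entry carrying both of its names.
struct ToyEntry {
    std::string longName;
    std::string shortName;
};

// An entry is reported if either of its names matches the pattern.
inline bool MatchesEitherName(const std::string& pattern, const ToyEntry& e)
{
    return WildcardMatch(pattern, e.longName) ||
           WildcardMatch(pattern, e.shortName);
}
```

    For the entry { "x.html", "X~1.HTM" }, the pattern "*.htm" does not match the long name but does match the short name, so the entry is reported.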

    Why does it bother matching short names? Shouldn't it match only long names? After all, only old 16-bit programs use short names.

    But that's the problem: 16-bit programs use short names.

    Through a process known as generic thunks, a 16-bit program can load a 32-bit DLL and call into it. Windows 95 and the Windows 16-bit emulation layer in Windows NT rely heavily on generic thunks so that they don't have to write two versions of everything. Instead, the 16-bit version just thunks up to the 32-bit version.

    Note, however, that this would mean that 32-bit DLLs would see two different views of the file system, depending on whether they are hosted from a 16-bit process or a 32-bit process.

    The suggestion "Then make the FindFirstFile function check to see who its caller is and change its behavior accordingly" doesn't fly because you can't trust the return address.

    Even if this problem were solved, you would still have the problem of 16/32 interop across the process boundary.

    For example, suppose a 16-bit program calls WinExec("notepad X~1.HTM"). The 32-bit Notepad program had better open the file X~1.HTM even though it's a short name. What's more, a common way to get properties of a file such as its last access time is to call FindFirstFile with the file name, since the WIN32_FIND_DATA structure returns that information as part of the find data. (Note: GetFileAttributesEx is a better choice, but that function is comparatively new.) If the FindFirstFile function did not work for short file names, then the above trick would fail for short names passed across the 16/32 boundary.

    As another example, suppose the DLL saves the file name in a location external to the process, say a configuration file, the registry, or a shared memory block. If a 16-bit program calls into this DLL, it would pass short names, whereas if a 32-bit program calls into the DLL, it would pass long names. If the file system functions returned only long names for 32-bit programs, then the copy of the DLL running in a 32-bit program would not be able to read the data written by the DLL running in a 16-bit program.

  • The Old New Thing

    What's the point of DeferWindowPos?

    The purpose of the DeferWindowPos function is to move multiple child windows at one go. This reduces somewhat the amount of repainting that goes on when windows move around.

    Take that DC brush sample from a few months ago and make the following changes:

    HWND g_hwndChildren[2];
    
    BOOL
    OnCreate(HWND hwnd, LPCREATESTRUCT lpcs)
    {
     const static COLORREF s_rgclr[2] =
        { RGB(255,0,0), RGB(0,255,0) };
     for (int i = 0; i < 2; i++) {
      g_hwndChildren[i] = CreateWindow(TEXT("static"), NULL,
            WS_VISIBLE | WS_CHILD, 0, 0, 0, 0,
            hwnd, (HMENU)IntToPtr(s_rgclr[i]), g_hinst, 0);
      if (!g_hwndChildren[i]) return FALSE;
     }
     return TRUE;
    }
    

    Notice that I'm using the control ID to hold the desired color. We retrieve it when choosing our background color.

    HBRUSH OnCtlColor(HWND hwnd, HDC hdc, HWND hwndChild, int type)
    {
      Sleep(500);
      SetDCBrushColor(hdc, (COLORREF)GetDlgCtrlID(hwndChild));
      return GetStockBrush(DC_BRUSH);
    }
    
        HANDLE_MSG(hwnd, WM_CTLCOLORSTATIC, OnCtlColor);
    

    I threw in a half-second sleep. This will make the painting a little easier to see.

    void
    OnSize(HWND hwnd, UINT state, int cx, int cy)
    {
      int cxHalf = cx/2;
      SetWindowPos(g_hwndChildren[0],
                   NULL, 0, 0, cxHalf, cy,
                   SWP_NOZORDER | SWP_NOOWNERZORDER | SWP_NOACTIVATE);
      SetWindowPos(g_hwndChildren[1],
                   NULL, cxHalf, 0, cx-cxHalf, cy,
                   SWP_NOZORDER | SWP_NOOWNERZORDER | SWP_NOACTIVATE);
    }
    

    We place the two child windows side by side in our client area. For our first pass, we'll use the SetWindowPos function to position the windows.

    Compile and run this program, and once it's up, click the maximize box. Observe carefully which parts of the green rectangle get repainted.

    Now let's change our positioning code to use the DeferWindowPos function. The usage pattern for the deferred window positioning functions is as follows:

    HDWP hdwp = BeginDeferWindowPos(n);
    if (hdwp) hdwp = DeferWindowPos(hdwp, ...); // 1
    if (hdwp) hdwp = DeferWindowPos(hdwp, ...); // 2
    if (hdwp) hdwp = DeferWindowPos(hdwp, ...); // 3
    ...
    if (hdwp) hdwp = DeferWindowPos(hdwp, ...); // n
    if (hdwp) EndDeferWindowPos(hdwp);
    

    There are some key points here.

    • The value you pass to the BeginDeferWindowPos function is the number of windows you intend to move. It's okay if you get this value wrong, but getting it right will reduce the number of internal reallocations.
    • The return value from DeferWindowPos is stored back into the hdwp because the return value is not necessarily the same as the value originally passed in. If the deferral bookkeeping needs to perform a reallocation, the DeferWindowPos function returns a handle to the new defer information; the old defer information is no longer valid. What's more, if the deferral fails, the old defer information is destroyed. This is different from the realloc function, which leaves the original object unchanged if the reallocation fails. The pattern p = realloc(p, ...) leaks the original block when the reallocation fails, but the pattern hdwp = DeferWindowPos(hdwp, ...) does not leak.

    That second point is important. Many people get it wrong.
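    For contrast, here is the realloc side of that comparison as a minimal sketch. GrowBuffer is a hypothetical helper, not a library function; it shows why the result of realloc has to go into a temporary first, whereas the result of DeferWindowPos can be assigned straight back because the old handle is already gone on failure.

```cpp
#include <cassert>
#include <cstdlib>

// Grows *buffer to newSize bytes. On failure the original block is left
// intact and still owned by the caller, so nothing leaks.
// Writing `*buffer = realloc(*buffer, newSize)` directly would overwrite
// the only pointer to the original block with NULL on failure.
inline bool GrowBuffer(void** buffer, std::size_t newSize)
{
    assert(buffer != nullptr && newSize > 0);
    void* grown = std::realloc(*buffer, newSize);
    if (grown == nullptr) {
        return false; // *buffer is unchanged; the caller must still free it
    }
    *buffer = grown;
    return true;
}
```

    A caller that gets false back still holds a valid pointer and is responsible for freeing it.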

    Okay, now that you're all probably scared of this function, let's change our repositioning code to take advantage of deferred window positioning. It's really not that hard at all. (Save these changes to a new file, though. We'll want to run the old and new versions side by side.)

    void
    OnSize(HWND hwnd, UINT state, int cx, int cy)
    {
      HDWP hdwp = BeginDeferWindowPos(2);
      int cxHalf = cx/2;
      if (hdwp) hdwp = DeferWindowPos(hdwp, g_hwndChildren[0],
                   NULL, 0, 0, cxHalf, cy,
                   SWP_NOZORDER | SWP_NOOWNERZORDER | SWP_NOACTIVATE);
      if (hdwp) hdwp = DeferWindowPos(hdwp, g_hwndChildren[1],
                   NULL, cxHalf, 0, cx-cxHalf, cy,
                   SWP_NOZORDER | SWP_NOOWNERZORDER | SWP_NOACTIVATE);
      if (hdwp) EndDeferWindowPos(hdwp);
    }
    

    Compile and run this program, and again, once it's up, maximize the window and observe which regions repaint. Observe that there is slightly less repainting in the new version compared to the old version.

  • The Old New Thing

    If InitCommonControls doesn't do anything, why do you have to call it?

    One of the problems beginners run into when they start using shell common controls is that they forget to call the InitCommonControls function. But if you were to disassemble the InitCommonControls function itself, you would see that it, like the FlushInstructionCache function, doesn't actually do anything.

    Then why do you need to call it?

    As with FlushInstructionCache, what's important is not what it performs, but just the fact that you called it.

    Recall that merely listing a lib file in your dependencies doesn't actually cause your program to be bound to the corresponding DLL. You have to call a function in that DLL in order for there to be an import entry for that DLL. And InitCommonControls is that function.

    Without the InitCommonControls function, a program that wants to use the shell common controls library would otherwise have no reference to COMCTL32.DLL in its import table. This means that when the program loads, COMCTL32.DLL is not loaded and therefore is not initialized. Which means that it doesn't register its window classes. Which means that your call to the CreateWindow function fails because the window class has not been registered.

    That's why you have to call a function that does nothing. It's for your own good.

    (Of course, there's the new InitCommonControlsEx function that lets you specify which classes you would like to be registered. Only the classic Windows 95 classes are registered when COMCTL32.DLL loads. For everything else you have to ask for it explicitly.)

  • The Old New Thing

    Using script to query information from Internet Explorer windows

    Some time ago, we used C++ to query information from the ShellWindows object and found it straightforward but cumbersome.

    This is rather clumsy from C++ because the ShellWindows object was designed for use by a scripting language like JScript or Visual Basic.

    Let's use one of the languages the ShellWindows object was designed for to enumerate all the open shell windows. Run it with the command line cscript sample.js.

    var shellWindows = new ActiveXObject("Shell.Application").Windows();
    for (var i = 0; i < shellWindows.Count; i++) {
      var w = shellWindows.Item(i);
      WScript.StdOut.WriteLine(w.LocationName + "=" + w.LocationURL);
    }
    

    Well, that was quite a bit shorter, wasn't it!
