July, 2005

  • The Old New Thing

    What struck me about life in the Republic

    • 41 Comments

    When people asked me for my reaction to the most recent Star Wars movie, I replied that what struck me most was that the Republic doesn't appear to have any building codes. There are these platforms several hundred meters above the ground with no railings. For example, Padmé Amidala's fancy apartment has a front porch far above the ground. Consider: You're carrying a load of packages to the car, the kids are running around, you turn around to yell at one of them, miss a step, and over the rim you go. How many people fall to their deaths in that galaxy?

  • The Old New Thing

    The apocryphal history of file system tunnelling

    • 34 Comments

    One of the file system features you may find yourself surprised by is tunneling, wherein the creation timestamp and short/long names of a file are taken from a file that existed in the directory previously. In other words, if you delete some file "File with long name.txt" and then create a new file with the same name, that new file will have the same short name and the same creation time as the original file. You can read this KB article for details on what operations are sensitive to tunneling.

    Why does tunneling exist at all?

    When you use a program to edit an existing file, then save it, you expect the original creation timestamp to be preserved, since you're editing a file, not creating a new one. But internally, many programs save a file by performing a combination of save, delete, and rename operations (such as the ones listed in the linked article), and without tunneling, the creation time of the file would seem to change even though from the end user's point of view, no file got created.

    As another example of the importance of tunneling, consider that file "File with long name.txt", whose short name is, say, "FILEWI~1.TXT". You load this file into a program that is not long-filename-aware and save it. It deletes the old "FILEWI~1.TXT" and creates a new one with the same name. Without tunneling, the associated long name of the file would be lost. Instead of a friendly long name, the file would be left with this thing with squiggly marks. Not good.
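
    To make the mechanism concrete, here is a minimal sketch of a toy in-memory "directory" that tunnels the creation time. It is a model for illustration only, not how any real file system is written: the Directory class, its methods, and the 15-second window are all assumptions, and the short/long name half of tunneling is left out.

```cpp
#include <cassert>
#include <map>
#include <string>

// Assumed value for this sketch: how long the metadata of a removed
// name stays available for reuse.
constexpr long kTunnelWindowSeconds = 15;

// Hypothetical toy model of a single directory.
class Directory {
public:
    // Create a file at time `now` (seconds on an arbitrary clock).
    void Create(const std::string& name, long now) {
        long creationTime = now;
        auto it = tunnelCache_.find(name);
        if (it != tunnelCache_.end()) {
            // The name was removed recently: let the old creation
            // time "tunnel" through to the new file.
            if (now - it->second.removedAt <= kTunnelWindowSeconds) {
                creationTime = it->second.creationTime;
            }
            tunnelCache_.erase(it);
        }
        files_[name] = creationTime;
    }

    // Delete a file, remembering its metadata for a short while.
    void Delete(const std::string& name, long now) {
        auto it = files_.find(name);
        if (it == files_.end()) return;
        tunnelCache_[name] = Remembered{it->second, now};
        files_.erase(it);
    }

    // Creation time of an existing file.
    long CreationTime(const std::string& name) const {
        auto it = files_.find(name);
        assert(it != files_.end());
        return it->second;
    }

private:
    struct Remembered {
        long creationTime;
        long removedAt;
    };
    std::map<std::string, long> files_;             // name -> creation time
    std::map<std::string, Remembered> tunnelCache_; // recently removed names
};
```

    In this model, a delete followed within the window by a create of the same name yields a file with the original creation time, which is what the save-delete-rename dance of an editor relies on. A create that arrives after the window has lapsed gets a fresh timestamp.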

    But where did the name "tunneling" come from?

    From quantum mechanics.

    Consider the following analogy: You have two holes in the ground, and a particle is in the first hole (A) and doesn't have enough energy to get out. It only has enough energy to get as high as the dotted line.

             
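      [Figure: two holes, A and B, side by side; a dotted line marks the highest point the particle can reach, below the top of the barrier between them.]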

    You get distracted for a little while, maybe watch the Super Bowl halftime show, and when you come back, the particle somehow is now in hole B. This is impossible in classical mechanics, but thanks to the wacky world of quantum mechanics, it is not only possible, but actually happens. The phenomenon is known as tunneling, because it's as if the particle "dug a tunnel" between the two holes, thereby allowing it to get from one hole to another without ever going above the dotted line.

    In the case of file system tunneling, it is information that appears to violate the laws of classical mechanics. The information was destroyed (by deleting or renaming the file), yet somehow managed to reconstruct itself on the other side of a temporal barrier.

    The developer who was responsible for implementing tunneling on Windows 95 got kind of carried away with the quantum mechanics analogy: The fragments of information about recently-deleted or recently-renamed files are kept in data structures called "quarks".

  • The Old New Thing

    Does Windows have a limit of 2000 threads per process?

    • 30 Comments

    Often I see people asking why they can't create more than around 2000 threads in a process. The reason is not that there is any particular limit inherent in Windows. Rather, the programmer failed to take into account the amount of address space each thread uses.

    A thread consists of some memory in kernel mode (kernel stacks and object management), some memory in user mode (the thread environment block, thread-local storage, that sort of thing), plus its stack. (Or stacks if you're on an Itanium system.)

    Usually, the limiting factor is the stack size.

    #include <stdio.h>
    #include <windows.h>
    
    // Each thread parks forever so that its stack stays reserved.
    DWORD CALLBACK ThreadProc(void*)
    {
     Sleep(INFINITE);
     return 0;
    }
    
    int __cdecl main(int argc, const char* argv[])
    {
     int i;
     // Keep creating threads until CreateThread fails.
     for (i = 0; i < 100000; i++) {
      DWORD id;
      HANDLE h = CreateThread(NULL, 0, ThreadProc, NULL, 0, &id);
      if (!h) break;
      CloseHandle(h); // closing the handle does not end the thread
     }
     printf("Created %d threads\n", i);
     return 0;
    }
    

    This program will typically print a value around 2000 for the number of threads.

    Why does it give up at around 2000?

    Because the default stack size assigned by the linker is 1MB, and 2000 stacks times 1MB per stack equals around 2GB, which is how much address space is available to user-mode programs.

    You can try to squeeze more threads into your process by reducing your stack size, which can be done either by tweaking linker options or by manually overriding the stack size passed to the CreateThread function as described in MSDN.

      HANDLE h = CreateThread(NULL, 4096, ThreadProc, NULL,
                   STACK_SIZE_PARAM_IS_A_RESERVATION, &id);
    

    With this change, I was able to squeak in around 13000 threads. While that's certainly better than 2000, it's short of the naive expectation of 500,000 threads. (A thread is using 4KB of stack in 2GB address space.) But you're forgetting the other overhead. Address space allocation granularity is 64KB, so each thread's stack occupies 64KB of address space even if only 4KB of it is used. Plus of course you don't have free rein over all 2GB of the address space; there are system DLLs and other things occupying it.
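
    The arithmetic behind these numbers can be sketched in a few lines of portable C++. The helper MaxThreadsByStack is a name made up for this illustration, and it assumes that thread stacks are the only thing consuming address space, which they are not. Treat the results as upper bounds rather than predictions.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kKB = 1024;
constexpr std::uint64_t kMB = 1024 * kKB;
constexpr std::uint64_t kGB = 1024 * kMB;

// Hypothetical helper: upper bound on the number of threads if stacks
// were the only consumers of address space. Each reservation is rounded
// up to a multiple of the allocation granularity.
inline std::uint64_t MaxThreadsByStack(std::uint64_t addressSpace,
                                       std::uint64_t stackReservation,
                                       std::uint64_t granularity)
{
    assert(granularity > 0);
    const std::uint64_t rounded =
        ((stackReservation + granularity - 1) / granularity) * granularity;
    assert(rounded > 0);
    return addressSpace / rounded;
}
```

    With a 2GB address space, a 1MB reservation gives 2048, which matches the "around 2000" observation. A 4KB reservation at 4KB granularity gives the naive 524288, but at the real 64KB granularity the bound drops to 32768. The measured figure of around 13000 falls below even that bound because of the other things occupying the address space.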

    But the real question that is raised whenever somebody asks, "What's the maximum number of threads that a process can create?" is "Why are you creating so many threads that this even becomes an issue?"

    The "one thread per client" model is well-known not to scale beyond a dozen clients or so. If you're going to be handling more than that many clients simultaneously, you should move to a model where instead of dedicating a thread to a client, you instead allocate an object. (Someday I'll muse on the duality between threads and objects.) Windows provides I/O completion ports and a thread pool to help you convert from a thread-based model to a work-item-based model.
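
    Here is a minimal sketch of what "allocate an object instead of a thread" can look like. Everything in it is hypothetical (the Client stages, the WorkItem type, the ServeAll function), and to stay small and portable it drains the queue on the calling thread. In a real server, a small pool of threads or an I/O completion port would be pulling the work items.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical per-client state: an object, not a thread with a stack.
struct Client {
    enum class Stage { Reading, Processing, Replying, Done };
    Stage stage = Stage::Reading;
};

// A work item names the client whose next step should run.
struct WorkItem {
    std::size_t clientIndex;
};

// Advance one client by one step. Returns true if more work remains.
inline bool RunOneStep(Client& c)
{
    switch (c.stage) {
    case Client::Stage::Reading:    c.stage = Client::Stage::Processing; return true;
    case Client::Stage::Processing: c.stage = Client::Stage::Replying;   return true;
    case Client::Stage::Replying:   c.stage = Client::Stage::Done;       return false;
    case Client::Stage::Done:       return false;
    }
    return false;
}

// Serve every client to completion. Returns the number of work items run.
inline std::size_t ServeAll(std::vector<Client>& clients)
{
    std::deque<WorkItem> queue;
    for (std::size_t i = 0; i < clients.size(); ++i) {
        queue.push_back(WorkItem{i});
    }
    std::size_t executed = 0;
    while (!queue.empty()) {
        WorkItem item = queue.front();
        queue.pop_front();
        assert(item.clientIndex < clients.size());
        ++executed;
        // A client that still has work goes to the back of the queue,
        // so no client monopolizes the worker.
        if (RunOneStep(clients[item.clientIndex])) {
            queue.push_back(item);
        }
    }
    return executed;
}
```

    The cost of a client here is the size of a Client object, not a stack reservation, so ten thousand simultaneous clients is unremarkable.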

    Note that fibers do not help much here, because a fiber has a stack, and it is the address space required by the stack that is the limiting factor nearly all of the time.

  • The Old New Thing

    What is this "web site" thing you are talking about?

    • 29 Comments

    One reaction I've seen when people learn about all the compatibility work done in the Windows 95 kernel is to say,

    Why not add code to the installer wizard [alas, page is now 404] which checks to see if you're installing SimCity and, if so, informs you of a known design flaw, then asks you to visit Electronic Arts' webpage for a patch?

    Let's ignore the issue of the "installer wizard"; most people do not go through the Add and Remove Programs control panel to install programs, so any changes to that control panel wouldn't have helped anyway.

    But what about detecting that you're running SimCity and telling you to get a patch from Electronic Arts' web site?

    Remember, this was 1993. Almost nobody had web sites. The big thing was the "Information Superhighway". (Remember that? I don't think it ever got built; the Internet sort of stole its thunder.) If you told somebody, "Go to Electronic Arts' web site and download a patch", you'd get a blank stare. What's a "web site"? How do I access that from Prodigy? I don't have a modem. Can you mail me their web site?

    In Windows XP, when Windows detects that you're running a program with which it is fundamentally incompatible, you do get a pop-up window directing you to the company's web site. But that's because it's now 2005 and even hermits living in caves have email addresses.

    In 1993, things were a little different.

    (Heck, even by 1995, most people did not have Internet access, and those few that did used modems. Requiring users to obtain Internet access in order to set the computer clock via NTP would have been rather presumptuous.)

  • The Old New Thing

    I hope you weren't using those undocumented critical section fields

    • 25 Comments

    I hope you weren't using those undocumented critical section fields, because in Windows Server 2003 Service Pack 1, they've changed.

    Mike Dodd tells an interesting story of a vendor who used reserved fields and then complained when the system started using them!

  • The Old New Thing

    What's the point of DeferWindowPos?

    • 23 Comments

    The purpose of the DeferWindowPos function is to move multiple child windows at one go. This reduces somewhat the amount of repainting that goes on when windows move around.

    Take that DC brush sample from a few months ago and make the following changes:

    HWND g_hwndChildren[2];
    
    BOOL
    OnCreate(HWND hwnd, LPCREATESTRUCT lpcs)
    {
     const static COLORREF s_rgclr[2] =
        { RGB(255,0,0), RGB(0,255,0) };
     for (int i = 0; i < 2; i++) {
      g_hwndChildren[i] = CreateWindow(TEXT("static"), NULL,
            WS_VISIBLE | WS_CHILD, 0, 0, 0, 0,
            hwnd, (HMENU)IntToPtr(s_rgclr[i]), g_hinst, 0);
      if (!g_hwndChildren[i]) return FALSE;
     }
     return TRUE;
    }
    

    Notice that I'm using the control ID to hold the desired color. We retrieve it when choosing our background color.

    HBRUSH OnCtlColor(HWND hwnd, HDC hdc, HWND hwndChild, int type)
    {
      Sleep(500);
      SetDCBrushColor(hdc, (COLORREF)GetDlgCtrlID(hwndChild));
      return GetStockBrush(DC_BRUSH);
    }
    
        HANDLE_MSG(hwnd, WM_CTLCOLORSTATIC, OnCtlColor);
    

    I threw in a half-second sleep. This will make the painting a little easier to see.

    void
    OnSize(HWND hwnd, UINT state, int cx, int cy)
    {
      int cxHalf = cx/2;
      SetWindowPos(g_hwndChildren[0],
                   NULL, 0, 0, cxHalf, cy,
                   SWP_NOZORDER | SWP_NOOWNERZORDER | SWP_NOACTIVATE);
      SetWindowPos(g_hwndChildren[1],
                   NULL, cxHalf, 0, cx-cxHalf, cy,
                   SWP_NOZORDER | SWP_NOOWNERZORDER | SWP_NOACTIVATE);
    }
    

    We place the two child windows side by side in our client area. For our first pass, we'll use the SetWindowPos function to position the windows.

    Compile and run this program, and once it's up, click the maximize box. Observe carefully which parts of the green rectangle get repainted.

    Now let's change our positioning code to use the DeferWindowPos function. The usage pattern for the deferred window positioning functions is as follows:

    HDWP hdwp = BeginDeferWindowPos(n);
    if (hdwp) hdwp = DeferWindowPos(hdwp, ...); // 1 [fixed 7/7]
    if (hdwp) hdwp = DeferWindowPos(hdwp, ...); // 2
    if (hdwp) hdwp = DeferWindowPos(hdwp, ...); // 3
    ...
    if (hdwp) hdwp = DeferWindowPos(hdwp, ...); // n
    if (hdwp) EndDeferWindowPos(hdwp);
    

    There are some key points here.

    • The value you pass to the BeginDeferWindowPos function is the number of windows you intend to move. It's okay if you get this value wrong, but getting it right will reduce the number of internal reallocations.
    • The return value from DeferWindowPos is stored back into the hdwp because the return value is not necessarily the same as the value originally passed in. If the deferral bookkeeping needs to perform a reallocation, the DeferWindowPos function returns a handle to the new defer information; the old defer information is no longer valid. What's more, if the deferral fails, the old defer information is destroyed. This is different from the realloc function which leaves the original object unchanged if the reallocation fails. The pattern p = realloc(p, ...) is a memory leak, but the pattern hdwp = DeferWindowPos(hdwp, ...) is not.

    That second point is important. Many people get it wrong.
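
    For contrast, here is a sketch of the realloc side of that comparison, in portable C++. The helper GrowIntBuffer is invented for this illustration. Because realloc leaves the original block alive on failure, the result has to go into a temporary first. DeferWindowPos needs no such temporary, since on failure the old defer information is already gone.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: grow *buffer so it can hold newCount ints.
// On failure, returns false and leaves *buffer untouched; the caller
// still owns the original block and must eventually free it.
inline bool GrowIntBuffer(int** buffer, std::size_t newCount)
{
    assert(buffer != nullptr);
    assert(newCount > 0);
    // Do not write "*buffer = realloc(*buffer, ...)": if realloc fails,
    // that overwrites the only pointer to the still-allocated block.
    void* grown = std::realloc(*buffer, newCount * sizeof(int));
    if (!grown) return false;
    *buffer = static_cast<int*>(grown);
    return true;
}
```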

    Okay, now that you're all probably scared of this function, let's change our repositioning code to take advantage of deferred window positioning. It's really not that hard at all. (Save these changes to a new file, though. We'll want to run the old and new versions side by side.)

    void
    OnSize(HWND hwnd, UINT state, int cx, int cy)
    {
      HDWP hdwp = BeginDeferWindowPos(2);
      int cxHalf = cx/2;
      if (hdwp) hdwp = DeferWindowPos(hdwp, g_hwndChildren[0],
                   NULL, 0, 0, cxHalf, cy,
                   SWP_NOZORDER | SWP_NOOWNERZORDER | SWP_NOACTIVATE);
      if (hdwp) hdwp = DeferWindowPos(hdwp, g_hwndChildren[1],
                   NULL, cxHalf, 0, cx-cxHalf, cy,
                   SWP_NOZORDER | SWP_NOOWNERZORDER | SWP_NOACTIVATE);
      if (hdwp) EndDeferWindowPos(hdwp);
    }
    

    Compile and run this program, and again, once it's up, maximize the window and observe which regions repaint. Observe that there is slightly less repainting in the new version compared to the old version.

  • The Old New Thing

    When Marketing edits your PDC talk description

    • 23 Comments

    A few years ago, I told a story of how Marketing messed up a bunch of PDC slides by "helpfully" expanding acronyms... into the wrong phrases. Today I got to see Marketing's handiwork again, as they edited my talk description. (Oh, and psst, Marketing folks, you might want to link to the full list of PDC sessions from your Conference Tracks and Sessions page. Unless, of course, y'know, you don't want people to know about it.)

    For one thing, they stuck my name into the description of the talk, thereby drawing attention to me rather than putting the focus on the actual talk topic. Because I'm not there to be me. I'm there to give a talk. If I were just there to be me, the title would be "Raymond Chen reads the newspaper for an hour while listening to music on his headphones."

    (That's why I don't do interviews. Interviews are about the interviewee, and I don't want to talk about me. People should care about the technology, not the people behind it.)

    They also trimmed my topic list but stopped before the punch line.

    ... asynchronous input queues, the hazards of attaching thread input, and other tricks and traps ...

    The punch line was "... and how it happens without your knowledge." After all, you don't care about the fine details of a feature you don't use. The point is that it's happening behind your back so you'd better know about it because you're using it whether you realize it or not.

    They also took out the reference to finger puppets.

  • The Old New Thing

    What are SYSTEM_FONT and DEFAULT_GUI_FONT?

    • 22 Comments

    Among the things you can get with the GetStockObject function are two fonts called SYSTEM_FONT and DEFAULT_GUI_FONT. What are they?

    They are fonts nobody uses any more.

    Back in the old days of Windows 2.0, the font used for dialog boxes was a bitmap font called System. This is the font that SYSTEM_FONT retrieves, and it is still the default dialog box font for compatibility reasons. Of course, nobody nowadays would ever use such an ugly font for their dialog boxes. (Among other things, it's a bitmap font and therefore does not look good at high resolutions, nor can it be anti-aliased.)

    DEFAULT_GUI_FONT has an even less illustrious history. It was created during Windows 95 development in the hopes of becoming the new default GUI font, but by July 1994, Windows itself stopped using it in favor of the various fonts returned by the SystemParametersInfo function. Its existence is now vestigial.

    One major gotcha with SYSTEM_FONT and DEFAULT_GUI_FONT is that on a typical US-English machine, they map to bitmap fonts that do not support ClearType.

  • The Old New Thing

    What's the difference between My Documents and Application Data?

    • 21 Comments

    The most important difference between My Documents and Application Data is that My Documents is where users store their files, whereas Application Data is where programs store their files.

    In other words, if you put something in CSIDL_MYDOCUMENTS (My Documents), you should expect the user to be renaming it, moving it, deleting it, emailing it to their friends, all the sorts of things users do with their files. Therefore, files that go there should be things that users will recognize as "their stuff". Documents they've created, music they've downloaded, that sort of thing.

    On the other hand, if you put something in CSIDL_APPDATA (Application Data), the user is less likely to be messing with it. This is where you put your program's supporting data that isn't really something you want the user messing with, but which should still be associated with the user. High score tables, program settings, customizations, spell check exceptions...

    There is another directory called CSIDL_LOCAL_APPDATA (Local Settings\Application Data) which acts like CSIDL_APPDATA, except that it does not get copied if the user profile roams. (The "Local Settings" branch is not copied as part of the roaming user profile.) Think of it as a per-user-per-machine storage location. Caches and similar non-essential data should be kept here, especially if they are large. Other examples of non-roaming per-user data are your %TEMP% and Temporary Internet Files directories.
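
    The decision rule can be written down as a small lookup. This is only a sketch: the DataKind categories and the CsidlNameFor function are made up for illustration, and it returns the name of the CSIDL constant as text so that it compiles anywhere. A real Windows program would pass the corresponding constant to a shell function such as SHGetFolderPath to obtain the actual directory.

```cpp
#include <cassert>
#include <string>

// Hypothetical classification of the data a program wants to store.
enum class DataKind {
    UserDocument,    // things users recognize as "their stuff"
    RoamingSetting,  // per-user program data that should follow the user
    LocalCache       // per-user, per-machine data; large or non-essential
};

// Name of the CSIDL constant that suits each kind of data.
inline std::string CsidlNameFor(DataKind kind)
{
    switch (kind) {
    case DataKind::UserDocument:   return "CSIDL_MYDOCUMENTS";
    case DataKind::RoamingSetting: return "CSIDL_APPDATA";
    case DataKind::LocalCache:     return "CSIDL_LOCAL_APPDATA";
    }
    assert(false && "unknown DataKind");
    return std::string();
}
```

    A high score table would be a RoamingSetting, for example, while a thumbnail cache would be a LocalCache.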

  • The Old New Thing

    Watching the game of "Telephone" play out on the Internet

    • 21 Comments

    Let's see if I can get this straight.

    First, Chris Pirillo says (timecode 37:59) he's not entirely pleased with the word "podcast" in Episode 11 of This Week in Tech. The Seattle-PI then reports that the sentiment is shared with "several Microsoft employees" who have coined the word "blogcast" to replace it. Next, c|net picks up the story and says that the word "podcast" is a "faux-pas" on Microsoft campus. [Typo fixed: 9am]

    In this manner, a remark by someone who isn't even a Microsoft employee becomes, through rumor, speculation, and wild extrapolation, a word-ban at Microsoft.

    Pretty neat trick.
