March, 2006

  • The Old New Thing

    Das Buch der verrückten Experimente


    The Annals of Improbable Research tipped me off to Reto Schneider's Das Buch der verrückten Experimente (The Book of Weird Experiments in English), a collection of descriptions of one hundred scientific experiments throughout the course of history. As you might expect from the title, the experiments are all somewhat strange, yet nevertheless fascinating.

    For example, how do cats always manage to land on their feet? Newton's laws would say that it's impossible:

    The problem is that a falling cat has nothing to push against. Each turn that it makes with its forequarters causes its hindquarters to turn in the opposite direction. A half-clockwise turn in front means a half-counterclockwise turn behind. Theoretically, the cat should land all twisted up, which obviously is not the case.

    The web site includes excerpts in German and English, including links to videos of cold research, feathers dropped onto the surface of the moon, and of course, falling cats.

    Despite the tease of English-language excerpts, the book itself is available only in German. Unfortunately, my German is not strong enough for me to read this book with much success. (I can only just barely read Harry Potter und der Stein der Weisen without the aid of a dictionary.) But I'll at least add it to my list of "books I might be able to read someday".

    [Updated 2pm to correct the English URL.]

  • The Old New Thing

    On the fuzzy definition of a "Unicode application"


    Commenter mpz wondered why the IME cannot detect whether it is sending characters to a Unicode or non-Unicode application and generate the appropriate character accordingly.

    But what exactly is a Unicode application? Actually, let me turn the question around: What is a non-Unicode application?

    Suppose you write a program and don't #define UNICODE, so you'd think you have a non-Unicode application. But your program uses a control provided by another library, and the authors of that library defined UNICODE. The controls created by that library are therefore Unicode, aren't they? Now you type that frustrating character to a control created by that library. Should it generate a U+00A5 or a U+005C?

    To know the answer to that question requires psychic powers. If the control takes the character and uses it exclusively internally, then presumably the IME should generate U+00A5. But if the control takes the character and returns it back to your program (say the control is a fancy edit control), then presumably the IME should generate U+005C. How does it know? It's not going to do some sort of analysis of the code in the helper library to decide what it's going to do with that character. Even human beings with access to the source code may have difficulty deciding whether the character will ever get converted to the CP_ACP code page in the future. Indeed, if the decision is based on the user's future actions, then you will need to invoke some sort of clairvoyance (and relinquishing of free will) to get the correct answer.

    Note that this helper library might be in the form of a static library, in which case your application is really neither Unicode nor ANSI, but rather a mix of the two. Parts of it are Unicode and parts are ANSI. What's a poor IME to do?

  • The Old New Thing

    Top ten things to do to make your application a Vista application


    On MSDN, there's a series of articles on the top ten things to do to make your application a Vista application. The series began last December, and just this month, they covered a topic dear to my heart: Application compatibility.

    [Update 2pm: If you have feedback about these articles, posting that feedback here won't accomplish much since I am not the author of those articles.]

  • The Old New Thing

    Controlling resource consumption by meting out work items


    At the PDC, one person came to talk to me for advice on a resource management problem they were having. To simplify, their system generated dozens of work items, each of which required significant resource consumption. For the sake of illustration, let's say that each of the work items was a single-threaded computationally-intensive operation that required 180MB of memory. The target machine was, say, a quad-processor machine with 1GB of RAM. (This was a standalone system, so it could assume that no other programs were running on the computer.)

    Their first design involved creating a thread for each work item and letting them all fight it out for resources. This didn't work out so great because all of the work items would be fighting over the four CPUs and requiring several times the available system memory, resulting in thrashing both the scheduler (more runnable threads than CPUs) and the memory manager (working set larger than available memory). The result was horrible.

    The second design was to serialize the work items. Run one work item, then when it completes, run the next work item, and so on until all the work items were complete. This ran much better because only one work item was active at a time, so it could monopolize the CPU and memory without interference from other work items. However, this didn't scale because the performance of the program didn't improve after adding processors or memory. Since only one work item ran at a time, all of the extra CPUs and memory simply sat idle.

    The solution to this problem is to make sure you maximize one of the limiting resources. Here, we have two limiting resources, CPU and memory. Since each work item required an entire CPU, running more work items simultaneously than available CPUs would result in scheduler thrashing, so the first cap on the number of work items was determined by the number of CPUs. Next, since each work item required 180MB of memory, you could run five of them in a 1GB machine before you started thrashing the memory manager. Therefore, this work item list will saturate the CPUs first, at four work items.
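    The arithmetic above can be sketched as a small helper. This is only an illustration: the function name and parameters are hypothetical, and it assumes, as the example does, that every work item needs one full CPU and a fixed amount of memory.

```c
#include <assert.h>

/* Hypothetical helper: how many work items can run at once without
   oversubscribing either limiting resource. Assumes each item occupies
   one whole CPU and per_item_mb megabytes of memory. */
unsigned max_concurrent_items(unsigned cpu_count,
                              unsigned total_mb,
                              unsigned per_item_mb)
{
    assert(per_item_mb > 0);

    /* Cap imposed by memory: how many items fit before paging starts. */
    unsigned by_memory = total_mb / per_item_mb;

    /* The tighter of the two caps wins. */
    return cpu_count < by_memory ? cpu_count : by_memory;
}
```

    With the numbers from the example (4 CPUs, 1024MB, 180MB per item) the memory cap is 5 and the CPU cap is 4, so the result is 4. On an eight-processor machine with the same memory, memory would become the limit instead.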

    Once you figure out how many work items you can run at once, it's a simple matter of running only that many. A master scheduler thread maintained a list of work to be done and fired off the first four, then waited for them to complete, using the WaitForMultipleObjects function and passing bWaitAll = FALSE so that it was woken up as soon as any work item completed. (This was not a GUI application so it didn't need to worry about pumping messages.) As soon as one of the work items completed, a new one was started. In this manner, there were always four work items in progress, taking maximum advantage of the resources available. (Because our preliminary mathematics showed that running five work items simultaneously would cause the scheduler to thrash.)
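    The dispatch policy can be modeled without any threads at all. The sketch below is a simulation, not the actual scheduler: the function name, the fixed slot limit, and the idea of describing each work item by a known duration are all assumptions made for illustration. Picking the earliest finish time stands in for a WaitForMultipleObjects call with bWaitAll = FALSE.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 64  /* arbitrary limit for this sketch */

/* Simulates a master scheduler that keeps up to `cap` work items in
   flight and starts the next pending item as soon as any running item
   completes. Returns the time at which the last item finishes. */
unsigned simulate_makespan(const unsigned *durations, size_t count, size_t cap)
{
    unsigned finish[MAX_SLOTS];  /* finish times of the running items */
    size_t running = 0;          /* number of items currently running */
    size_t next = 0;             /* index of the next pending item */
    unsigned now = 0;

    assert(cap > 0 && cap <= MAX_SLOTS);

    while (next < count || running > 0) {
        /* Fill every free slot with pending work. */
        while (running < cap && next < count) {
            finish[running++] = now + durations[next++];
        }

        /* "Wait" for whichever running item completes first. */
        size_t first = 0;
        for (size_t i = 1; i < running; i++) {
            if (finish[i] < finish[first]) {
                first = i;
            }
        }
        now = finish[first];

        /* Retire the completed item, freeing its slot. */
        finish[first] = finish[--running];
    }
    return now;
}
```

    Running the same list of durations with a cap of 1 reproduces the serialized second design, which makes it easy to compare the two policies on paper before writing any threading code.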

    In real life, some of the work items were really child processes, and there were dependencies among the work items, but those complications could be accommodated in the basic design.

  • The Old New Thing

    The social skills of a thermonuclear device, part 2


    I guess I'm living up to my reputation of having the social skills of a thermonuclear device:

    From: <name withheld>

    It'd be awful swell of you to add my blog to your blogroll. You don't know me, and I know you only by your superlative writings, but I'm a big fan.

    http://<link withheld>

    P.S. Would you like your home remortgaged? Are you in need of prescription drugs? Can you help me free up my Nigerian wealth?

    My reply:

    hey there - i actually am extremely selective about who goes on my blogroll. in fact I'm working on making it smaller, not bigger. sorry.

    And the response:

    From: <name withheld>


    Maybe the author of this cartoon was researching how to finish the story...

  • The Old New Thing

    A thread waiting on a synchronization object could be caught napping


    If you have a synchronization object, say a semaphore, and two threads waiting on the semaphore, and you then release two semaphore tokens with a single call to ReleaseSemaphore, you would expect that each of the waiting threads would be woken, each obtaining one token. And in fact, that's what happens, most of the time.

    Recall in our discussion of why the PulseEvent function is fundamentally flawed that a thread in a wait state could be momentarily woken to service a kernel APC and therefore miss out on the pulse. The same thing might happen to the thread waiting for the semaphore. If the ReleaseSemaphore happens to occur while a thread has been taken out of the wait state to service a kernel APC, it will not claim the token immediately but rather will attempt to claim the token when the kernel APC completes and the thread is about to be returned to the wait state.

    Normally this is not a problem, because the token will still be there waiting for the thread. But if you have multiple threads all competing for the token, there is a small probability that in the time it took the thread to service the kernel APC, the other thread which was also waiting for a token not only got the first token, but managed to complete whatever work was associated with the token and issue a new WaitForSingleObject which claims the second token! The first thread was caught napping and missed out on both tokens.

    Fortunately, the cases where you have this sort of rampant competition for semaphore tokens are typically producer/consumer scenarios where it doesn't really matter who consumes the token, as long as somebody does.

    Exercise: If there are multiple threads waiting on an auto-reset event and the event is signalled, can you predict which thread will be released?

  • The Old New Thing

    Betsy's interview tip: Wear pants


    Last year, our retiring Blog Queen Betsy Aoki reminded us to wear pants.

  • The Old New Thing

    Why does the size of a combo box include the size of the drop-down?


    Many people are surprised to discover that when you create a combo box (either in code via CreateWindow or indirectly via a dialog box template), the size you specify describes the size of the combo box including the drop-down list box, even though the drop-down list box is not visible on the screen. For example, if you say that you want the combo box to be 200 pixels tall, it will be visible on the screen as a 20-pixel-tall (say) edit control, and when the user drops the list box by clicking on the drop-down arrow, the list box will be 180 pixels tall.

    This has the unfortunate consequence that if you fail to take this into account and specify in your dialog box template that you wanted a 20-pixel-tall combo box, you will end up with a combo box whose drop-down listbox is zero pixels tall! That's not a very useful combo box.

    Those who have switched to version 6 of the common controls library may have noticed that the version 6 combo box detects this common mistake and "auto-repairs" it: It recognizes that the height passed by the creator of the combo box is absurdly small and automatically enlarges it to something more reasonable. This was done at the request of the user interface designers who were fed up with struggling with program after program that set their combo box heights too small and ended up showing uselessly-small combo box drop-down list boxes. Imagine, for example, a "Choose your state" combo box where the drop-down shows only two states at a time! (The change was not made to pre-version 6 combo boxes for compatibility reasons.)
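    The sizing rule can be written down as a one-line calculation. The helper below is a hypothetical sketch, not part of any Windows API: it assumes you already know the height of the edit portion and of one list item, and it ignores any border the list box adds.

```c
#include <assert.h>

/* Hypothetical helper: the total height to specify for a drop-down
   combo box so that the dropped list shows `visible_items` rows.
   The height passed at creation covers the edit portion plus the
   drop-down list, not just the part visible when closed. */
int combo_total_height(int edit_height, int item_height, int visible_items)
{
    assert(edit_height >= 0 && item_height >= 0 && visible_items >= 0);
    return edit_height + item_height * visible_items;
}
```

    For the example in the text, a 20-pixel edit portion with ten 18-pixel rows gives 200 pixels. Passing zero visible items gives back just the edit height, which is exactly the "zero pixels tall" drop-down mistake.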

    But I still haven't answered the question, "Why does the size of a combo box include the size of the drop-down?" The reason is that the original combo box did not have a drop-down. Originally, a combo box was just an edit control and a list box glued together. (You can still see this "old-timey-style combo box" in Notepad's Font dialog.) You can think of the original combo box as a modern combo box where the drop-down was pinned open. Under this original design, it was reasonable for the size of the combo box to include both the edit control and the list box, since that's how much space it took up.

    When the "drop-down" style of combo box was invented, the designers wanted to make the transition from "old-timey combo box" to "slick new drop-down combo box" as easy as possible, so the sizing behavior was retained so that code and dialog boxes wouldn't have to change to take advantage of the new drop-down style combo box aside from changing to the CBS_DROPDOWN or CBS_DROPDOWNLIST style in the parameters to CreateWindow or in the dialog template.

    And that's why the size of a combo box includes the size of the drop-down. It's a chain of backwards compatibility going all the way back to the old-timey days before combo boxes learned how to drop down.

  • The Old New Thing

    If you ask for a window caption, you also get a border


    Some people may have noticed that the WS_CAPTION style is defined as the combination of WS_BORDER and WS_DLGFRAME:

    #define WS_CAPTION          0x00C00000L     /* WS_BORDER | WS_DLGFRAME  */
    #define WS_BORDER           0x00800000L
    #define WS_DLGFRAME         0x00400000L

    Since WS_CAPTION includes WS_BORDER, it is impossible to get a caption without a border.
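    The bit arithmetic can be checked directly. This sketch redefines the three constants with the values quoted above so that it compiles on its own; the two predicate functions are hypothetical helpers and look only at the style bits, not at how the window manager actually draws anything.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the definitions quoted above. */
#define WS_CAPTION  0x00C00000L
#define WS_BORDER   0x00800000L
#define WS_DLGFRAME 0x00400000L

/* Compile-time check that the caption style is exactly the two bits combined. */
static_assert(WS_CAPTION == (WS_BORDER | WS_DLGFRAME),
              "WS_CAPTION must equal WS_BORDER | WS_DLGFRAME");

/* Hypothetical helper: true only if both caption bits are set. */
bool has_caption(long style)
{
    return (style & WS_CAPTION) == WS_CAPTION;
}

/* Hypothetical helper: true if the border bit is set. */
bool has_border(long style)
{
    return (style & WS_BORDER) != 0;
}
```

    Clearing WS_BORDER from a style that contains WS_CAPTION leaves only WS_DLGFRAME, so has_caption returns false for the result.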

    Sometimes people don't quite get this and keep asking the question over and over again, and I have to keep trying to explain the laws of logic in different ways until one of them finally sinks in.

    "I noticed that if I set the WS_CAPTION style, I get a window with a title bar and a border. I don't want the border. How do I get rid of the border? I tried all sorts of combinations of window styles but none of them get me what I want."

    "If you look at the definition, WS_CAPTION includes WS_BORDER, so you can't get a caption without a border."

    "But I see other controls that don't have a border. Static controls on a dialog box, for example, don't have a border, so obviously it's possible to remove the border. How do I do that?"

    "They don't have borders, but then again, they don't have captions either. Caption implies border."

    "But I want a window with a caption and no border. What window styles do I need to use to get that? Do I have to implement it some other way?"

    "Caption implies border. Contrapositive: No border implies no caption. If you don't like that, you'll have to take it up with Russell and Whitehead."

    Of course, you can just choose to leave the system entirely and use none of the styles at all and just paint a custom caption. What you get isn't a real caption, though with enough work you can make it look and act like one. Or at least, make it look and act like one up to the present time. If you have the power of clairvoyance, you might be able to make it look and act like a caption in future versions of the operating system, but I suspect your psychic powers are not quite powerful enough to pull that off.

  • The Old New Thing

    Reading the fine print, episode 3: What's in the bottle?


    Caught out by the FDA.

    I happened to be in the bug spray section of the store when I spotted a bottle of mosquito repellent that proudly proclaimed "100% DEET".

    But the FDA-mandated labelling tells a different story:

    Active ingredients
    N, N diethyl-m-toluamide      
    Other isomers  

    Similarly, foods labeled "zero fat" are actually allowed to contain up to a half gram of fat. (Well, up to but not including.) This is a definition of "zero" with which I had previously been unfamiliar.

    (Episode 1, Episode 2.)
