September, 2007

  • The Old New Thing

    Playing the hippie poetry game for four cents per line

    • 5 Comments

    The party game goes by many names. Hippie poetry, Beat poetry, Dada poetry. To play, have a group of people sit in a circle and give each person a piece of paper and a writing implement. To start, each person writes a single line of poetry and hands it to the person to his or her left or right. (The direction isn't important, as long as it's consistent.)

    At each round, you add one line to the growing poem, then fold over the top of the page so that only the line you added is visible. Pass the paper to the left (or right), and repeat. Popular stopping conditions are when the paper is full or when the page returns to the person who started the poem. Once the poem is complete, the paper is unfolded and everyone takes turns reading the results. They are usually quite absurd.

    That was a long and ultimately unsatisfying set-up for the fact that you can make money playing this game. And you do it via the Internet from your home. The catch: You get paid four cents per line. Oh, and the poem is about sex.

  • The Old New Thing

    What happens if you pass a source length greater than the actual string length?

    • 26 Comments

    Many functions accept a source string that consists of both a pointer and a length. And if you pass a length that is greater than the length of the string, the result depends on the function itself.

    Some of those functions, when given a string and a length, will stop either when the length is exhausted or when a null terminator is reached, whichever comes first. For example, if you pass a cchSrc greater than the length of the string to the StringCchCopyN function, it will stop at the null terminator.

    On the other hand, many other functions (particularly those in the NLS family) will cheerfully operate past a null character if you ask them to. The idea here is that since you passed an explicit size, you're consciously operating on a buffer which might contain embedded null characters. Because, after all, if you passed an explicit source size, you really meant it, right? (Maybe you're operating on a BSTR, which supports embedded nulls; to get the size of a BSTR you must use a function like SysStringLen.) For example, if you call CharUpperBuff(psz, 20), then the function really will convert to uppercase 20 TCHARs starting at psz. If there happens to be a null character at psz[10], the function will convert the null to uppercase and continue converting the next ten TCHARs as well.
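
    The two behaviors can be sketched in portable C. This is only an illustration: copy_n_stop_at_nul and upper_exact_count are hypothetical stand-ins written for this example, not the Windows functions named above, and they handle plain char rather than TCHAR.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Style 1 (hypothetical helper): stop when the count is exhausted or a
   null terminator is reached, whichever comes first. dst must have room
   for cchSrc + 1 chars. Returns the number of chars copied, not counting
   the terminator that is appended. */
size_t copy_n_stop_at_nul(char *dst, const char *src, size_t cchSrc)
{
    size_t i;
    assert(dst != NULL && src != NULL);
    for (i = 0; i < cchSrc && src[i] != '\0'; i++) {
        dst[i] = src[i];
    }
    dst[i] = '\0';
    return i;
}

/* Style 2 (hypothetical helper): trust the caller's count and process
   exactly cch chars, embedded nulls included. Returns cch. */
size_t upper_exact_count(char *buf, size_t cch)
{
    size_t i;
    assert(buf != NULL);
    for (i = 0; i < cch; i++) {
        buf[i] = (char)toupper((unsigned char)buf[i]);
    }
    return cch;
}
```

    Given the eight-char buffer 'a', 'b', null, 'c', 'd', null, null, null, the first helper called with a count of 8 copies only "ab", while the second called with a count of 5 walks straight over the null at index 2 and uppercases 'c' and 'd' as well.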

    I've seen programs crash because they thought that functions like CharUpperBuff and MultiByteToWideChar stopped when they encountered a null. For example, somebody might write

    // buggy code - see discussion
    void someFunction(char *pszFile)
    {
     CharUpperBuff(pszFile, MAX_PATH);
     ... do something with pszFile ...
    }
    
    void Caller()
    {
     char buffer[80];
     sprintf(buffer, "file%d", get_fileNumber());
     someFunction(buffer);
    }
    

    The intent here was for someFunction to convert the string to uppercase before operating on it, up to MAX_PATH characters' worth, but instead what happens is that the MAX_PATH characters starting at pszFile are converted, even though the actual buffer is shorter! As a result, since MAX_PATH is 260, MAX_PATH − 80 = 180 characters beyond the end of buffer are also converted. And since that's a stack buffer, those bytes are likely to include the return address. Result: Crash-o-rama. Things get even more interesting if the short buffer had been allocated on the heap instead. Then instead of corrupting your return address (which you would probably notice as soon as the function returned), you corrupt the heap, which typically doesn't manifest itself in a crash until long after the offending function has left the scene of the crime.
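
    One way to avoid the overrun is to make sure the count never exceeds the actual string length. Here is a minimal portable sketch of that idea; upper_until_nul is a hypothetical helper invented for this example. In real Windows code the same effect could presumably be had by passing the string's actual length (for example, from lstrlen) to CharUpperBuff, or by calling CharUpper on the null-terminated string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: uppercase a null-terminated string in place.
   Touches at most cchMax chars and never goes past the terminator, so an
   overly generous cchMax (such as MAX_PATH for an 80-char buffer) is
   harmless as long as the string itself is terminated. Returns the
   number of chars converted. */
size_t upper_until_nul(char *psz, size_t cchMax)
{
    size_t i;
    assert(psz != NULL);
    for (i = 0; i < cchMax && psz[i] != '\0'; i++) {
        psz[i] = (char)toupper((unsigned char)psz[i]);
    }
    return i;
}
```

    With an 80-char buffer holding "file42", calling upper_until_nul(buffer, 260) converts six chars and leaves everything after the terminator alone.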

    Critique, then, this replacement function:

    //  buggy code - do not use
    int invariant_strnicmp(char *s1, char *s2, size_t n)
    {
     // [Update: 9:30am - typo fixed]
     return CompareStringA(LOCALE_INVARIANT, NORM_IGNORECASE,
                           s1, n, s2, n) - CSTR_EQUAL;
    }
    

    (Michael Kaplan has one answer different from the one I was looking for.)

  • The Old New Thing

    Japanese street fashion reaches Finland

    • 15 Comments

    On the home page of Martin Frid, one of the hosts of NHK's Japanese news in Swedish, I found these photos of Japanese street fashion... in Finland! (Don't worry, the web site is in English.) Now you don't need to travel to Japan. Whether that's a positive development, I'm not entirely sure...

  • The Old New Thing

    Why is my delay-rendered format being rendered too soon?

    • 5 Comments

    Here's a customer question:

    I've put data on the clipboard as delay-rendered, but I'm getting a WM_RENDERFORMAT request for my CF_HDROP for many operations even though nobody actually looks at the files. Operations such as right-clicking a blank space on the desktop or opening the Edit menu. I don't want to render the data until the user hits Paste because generating the data requires me to download the file from a Web server.

    The CF_HDROP format is a list of file names, and at the time the format is generated, the files must already exist. That's because the whole purpose of CF_HDROP is to describe files that already exist.

    These simple operations cause a request for CF_HDROP because, as a simple list of file names, it is expected to be a fast format. The data object merely has to provide a list of things that already exist; it doesn't have to go make them. It's interesting that the customer wants to delay generating the CF_HDROP format until the user selects Paste, because the shell is asking for CF_HDROP specifically to see whether it should enable the Paste command in the first place!

    That you shouldn't generate dynamic data in response to CF_HDROP is also clear when you think about the lifetime issues. If you're going to generate the files on the fly, when do you know that it's safe to delete them? If the user drops the file onto Internet Explorer or Firefox, the Web browser is going to view the file as a Web page. You get no notification when the user closes the Web browser, and therefore you don't know when it's safe to delete the file. The CF_HDROP format describes files that already exist independent of the data object.

    What is the correct thing to do if you want to delay-render a virtual file? Use the FileGroupDescriptor clipboard format. That's what it's for: Delay-rendering of virtual file contents.

    (I'm assuming an advanced audience that knows how to use a FileGroupDescriptor. There will be a remedial course in the use of the FileGroupDescriptor sometime next year.)

  • The Old New Thing

    Nearly everybody has a $500 flashlight

    • 30 Comments

    When I described using a laptop computer as an impromptu flashlight, that triggered a lot of comments from people using their cell phone or PDA as a flashlight. One nickname I've heard for this phenomenon is "the $500 flashlight", $500 being the price of a PDA or SmartPhone at the time the term was coined. (I was amused to find that even Scott Adams does this.)

    A few years ago, on a trip to Tikal, I took part in a pre-dawn hike into the park in order to observe the sunrise from atop Temple IV. For many people (including me), this was an impulse tour rather than a planned option, so we hadn't packed flashlights in our gear. There were a lot of $500 flashlights on that hike.

  • The Old New Thing

    What do I do with per-user data when I uninstall?

    • 42 Comments

    If the user chooses to uninstall your program, what do you do with the data your program kept in HKEY_CURRENT_USER, and other parts of the user profile? Should you enumerate all the profiles on the machine and clean them up?

    No. Let the data go.

    First, messing with the profiles of users that aren't logged on can result in data corruption, as we saw when we looked at roaming user profiles. Users don't like it when their data get corrupted.

    Second, the users might want to keep that local data, especially if they are uninstalling your program only temporarily. For example, maybe your program's installation got corrupted and the user is doing an uninstall/reinstall in an attempt to repair it. Or the user might be uninstalling your program only because they're going to be reinstalling a newer version of your program. If you deleted all the users' saved games, say, they are going to be kind of miffed that they won't be able to finish off the big boss that they've been working weeks to defeat.

    Now, if you want, you can clean up the per-user data for the current user (after asking for permission), but you definitely should not be messing with the profiles of other users.

    (Remember, this is my personal recommendation, not an official recommendation of Microsoft Corporation. I know that had I not included this explicit disclaimer, somebody would have written an article somewhere saying that "Microsoft says to...")

  • The Old New Thing

    Another type of misplaced apology: Apologizing for not knowing the penalty

    • 37 Comments

    You may remember this story from a few years ago. A college student printed his own bar codes (for inexpensive items), placed them over the bar code for expensive items, then went through the register and ended up paying $4.99 for a $149.99 iPod, for example. Ironically, he would have gotten off lighter if he had merely shoplifted the items, because manufacturing fake bar codes brings the crime to the level of forgery, a felony.

    But what really struck me was the nature of the apology. He didn't say, "I should not have done it," or even the unconvincing "I didn't know it was wrong." He said, "I did this not knowing of the serious penalty that lies behind it." In other words, "I only break the law when the penalty is mild!"

  • The Old New Thing

    The code page on the server is not necessarily the code page on the client

    • 10 Comments

    It's not enough to choose a code page. You have to choose the right code page.

    We have a system that reformats and reinstalls a network client computer each time it boots up. The client connects to the server to obtain a loader program, and the loader program then connects to the server to download the actual operating system. If anything goes wrong, the server sends an error message to the client, which is displayed on the screen while it's still in character mode. (No Unicode available here.)

    Initially, we used FormatMessageA to generate the error message, but somebody told us we should use FormatMessageW followed by WideCharToMultiByte(CP_OEM). I'm not sure whether this is a valid suggestion, because the client hasn't yet installed Unicode support, so it is capable of displaying only 8-bit text, and using CP_OEM will use the OEM code page on the server, which doesn't necessarily match the OEM code page on the client.

    What is the correct way of generating the error message string?

    Now, mind you, the argument against using CP_OEM is the same argument against using FormatMessageA! In neither case are you sure that the code page on the server matches the code page on the client. If CP_OEM is wrong, then so too is FormatMessageA (which uses CP_ACP).

    The correct solution is to use FormatMessageW followed by WideCharToMultiByte(x), where x is the OEM code page of the client. You need to get this information from the client to the server somehow so that the server knows what character set the client is going to use for displaying strings.

    There's really nothing deep going on here. If you're going to display an 8-bit string, you need to use the same code page when generating the string as you will use when displaying it. Keep your eye on the code page.
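
    To see what a mismatch looks like, here is a toy sketch in portable C. It is not the Windows conversion API: toy_encode and toy_decode are hypothetical helpers, and the table covers only three characters, using their byte values in code pages 437 and 1252. A zero byte in the table means "not present in that code page".

```c
#include <assert.h>
#include <stddef.h>

/* Toy mapping for illustration only: three characters and their byte
   values in OEM code page 437 and ANSI code page 1252 (0 = absent). */
typedef struct {
    unsigned codepoint;    /* Unicode code point */
    unsigned char cp437;   /* byte in code page 437 */
    unsigned char cp1252;  /* byte in code page 1252 */
} ToyEntry;

static const ToyEntry toy_table[] = {
    { 0x00E9, 0x82, 0xE9 },  /* Latin small letter e with acute */
    { 0x0398, 0xE9, 0x00 },  /* Greek capital letter theta */
    { 0x201A, 0x00, 0x82 },  /* single low-9 quotation mark */
};

#define TOY_COUNT (sizeof toy_table / sizeof toy_table[0])

static unsigned char toy_byte(const ToyEntry *e, int code_page)
{
    return code_page == 437 ? e->cp437 : e->cp1252;
}

/* Hypothetical helper: code point to byte, '?' if unmappable. */
unsigned char toy_encode(unsigned codepoint, int code_page)
{
    size_t i;
    assert(code_page == 437 || code_page == 1252);
    for (i = 0; i < TOY_COUNT; i++) {
        if (toy_table[i].codepoint == codepoint) {
            unsigned char b = toy_byte(&toy_table[i], code_page);
            return b != 0 ? b : (unsigned char)'?';
        }
    }
    return (unsigned char)'?';
}

/* Hypothetical helper: byte to code point, U+FFFD if unknown. */
unsigned toy_decode(unsigned char byte, int code_page)
{
    size_t i;
    assert(code_page == 437 || code_page == 1252);
    for (i = 0; i < TOY_COUNT; i++) {
        unsigned char b = toy_byte(&toy_table[i], code_page);
        if (b != 0 && b == byte) {
            return toy_table[i].codepoint;
        }
    }
    return 0xFFFD;
}
```

    If the server encodes an e-acute with code page 1252, it sends the byte 0xE9, and a client displaying with code page 437 shows a Greek theta instead. Encoding with 437 in the first place sends 0x82, which the client decodes back to e-acute.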

  • The Old New Thing

    Snatching defeat from the jaws of victory now more popular than vice versa

    • 15 Comments

    If you go to your favorite search engine and search for the phrase "defeat from the jaws of victory", you'll find that it turns up several times more hits than the phrase "victory from the jaws of defeat". I just find it oddly amusing that the joke has become more popular than the phrase it came from.

  • The Old New Thing

    Why isn't QuickEdit on by default in console windows?

    • 46 Comments

    In the properties of console windows, you can turn on QuickEdit mode, which allows the mouse to be used to select text without having to go explicitly into Mark mode. (In a sense, the console window is permanently in Mark mode.) Why isn't this on by default? It's so useful!

    Somebody thought the same thing and changed the default in one of the earlier versions of Windows (my dim recollection was that it was Windows 2000) without telling anyone, especially not the manager responsible for the feature itself.

    The change was slipped in late in the game and made it into the released product.

    And then all the complaints came streaming in.

    Since the change wasn't part of any of the betas or release candidates, customers never got a chance to register their objections before the product hit the streets.

    Why did customers object anyway? Because it breaks console programs that want to use the mouse. A console program that calls ReadConsoleInput can receive input records related to mouse activity. A full-screen character mode editor like, say, emacs or vi or something more Windows-centric like the M editor that came with the Programmer's Workbench will use the mouse to select text within the editor itself. Or they might hook up a context menu to right-clicks. At any rate, turning QuickEdit on by default means that the mouse stops working in all of these programs.

    A hotfix had to be issued to change the QuickEdit default back to "off". If it means that much to you, you can turn it on yourself.
