• The Old New Thing

    2007 year-end link clearance


    A few random links that I've collected over the last six months.

    And then the obligatory plug for my column in TechNet Magazine, which, despite the fact that Microsoft's name is on the magazine cover, does not establish the official Microsoft position on anything.

  • The Old New Thing

    Bonus material for The Old New Thing (the book) is now available for download


    I've just been informed by my publisher that the bonus chapters from my book are now available for download. Click on "Sample Chapters". Sorry they're late.

    The source code for the programs in the book can be downloaded from the "Source Code" link. And on a more embarrassing note, there's that "Errata" link, too.

  • The Old New Thing

    But who's going to set up their own email server?


    Many many years ago, back in the days when Microsoft's email address had exclamation points, an internal tool was developed to permit Microsoft employees to view and update their Benefits information from the comfort of their very own offices. Welcome to the paperless office!

    One of my friends noticed an odd sentence in the instructions for using the tool: "Before running the program, make sure you are logged onto your email server."

    "That's strange," my friend thought. "Why does it matter that you're logged onto your email server? This tool doesn't use email."

    Since my friend happened at the time to be a tester for Microsoft's email product, he tried a little experiment. He created a brand new email server on one of his test machines and created an account on it called billg. He then signed onto that email server and then ran the tool.

    Welcome, Bill Gates. Here are your current Benefits selections...

    "Uh-oh," my friend thought. "This is a pretty bad security hole." The tool apparently performed authentication by asking your email server, "Hey, who are you logged in as?" The answer that came back was assumed to be an accurate representation of the user who is running the tool. The back-end server itself was not secured at all; it relied on the client application to do the security checks.

    My friend sent email to the vice president of Human Resources informing him of this problem. "You need to shut down this tool immediately. I have found a security hole that allows anybody to see anybody else's Benefits information."

    The response from the vice president of Human Resources was calm and reassuring. "My developers tell me that the tool is secure. Just enjoy the convenience of updating your Benefits information electronically."

    Frustrated by this, my friend decided to create another account on his test email server, namely one corresponding to the vice president of Human Resources. He then sent the vice president another email message.

    "Please reconsider your previous decision. Your base salary is $xxx and your wife's name is Yyyy. Would you like me to remind you one week before your son's tenth birthday? It's coming up next month."

    A reply was quickly received. "We're looking into this."

    Shortly thereafter, the tool was taken offline "for maintenance."
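    The flaw in this story comes down to who does the checking. Here is a minimal sketch in Python of the difference between a back end that believes the client and one that verifies a credential itself. Everything in it (the `BENEFITS` table, the function names, the HMAC-signed token) is hypothetical and chosen only for illustration; nothing is known about how the real tool was built or fixed.

```python
import hashlib
import hmac
import secrets

# Hypothetical data store standing in for the Benefits back end.
BENEFITS = {"billg": "plan A", "alice": "plan B"}

# Secret known only to the server; used to sign session tokens.
SERVER_KEY = secrets.token_bytes(32)


def insecure_lookup(claimed_user: str) -> str:
    """Flawed design: the server believes whatever name the client reports."""
    return BENEFITS[claimed_user]


def issue_token(user: str) -> str:
    """Issued by the server only after it has checked the user's credentials itself."""
    return hmac.new(SERVER_KEY, user.encode(), hashlib.sha256).hexdigest()


def secure_lookup(claimed_user: str, token: str) -> str:
    """Safer design: the server verifies a token that only it could have produced."""
    expected = issue_token(claimed_user)
    if not hmac.compare_digest(expected, token):
        raise PermissionError("identity could not be verified")
    return BENEFITS[claimed_user]
```

    With the first function, anybody who can make the client say "billg" gets Bill's record. With the second, a made-up name is useless without a token that the server itself handed out.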

    Bonus reading: JenK shares her experience with the same incident.

  • The Old New Thing

    What's the row of numbers on the copyright page of books?


    On the copyright page of a book (typically the back of the title page), you'll find a row of numbers. Something like this:

    Printed in the United States of America
    10   9   8   7   6   5   4   3   2   1

    As Dave Taylor explains, the smallest number tells you which printing of the book you have. For example, if you see "10 9 8 7 6 5 4" then you have a fourth printing. Dave doesn't explain why printers use this convention, however.

    I forget where I learned this; I think I read it in one of Don Knuth's books. It has to do with how books are historically made. Each page of a book is converted to a metal plate which is used to make impressions. If another printing run is necessary, you load the plates back onto the printing machine and off you go. But how do you indicate that this is a second printing? It would be expensive to burn a brand new plate just to change the word "first" to "second" on the copyright page. Instead, you pre-load all the printing numbers onto your master, and each time you start a new printing run, you scratch off the lowest number.
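    The convention is simple enough to sketch in a few lines of Python. The function names here are invented for illustration: one reads the printing from the number line, and the other "scratches off" the lowest number for the next run.

```python
def printing_number(number_line: str) -> int:
    """Return the printing indicated by a copyright-page number line."""
    numbers = [int(token) for token in number_line.split()]
    if not numbers:
        raise ValueError("number line is empty")
    # The smallest number still present is the current printing.
    return min(numbers)


def next_printing(number_line: str) -> str:
    """Scratch off the lowest number to prepare the next printing run."""
    numbers = [int(token) for token in number_line.split()]
    if len(numbers) < 2:
        raise ValueError("no numbers left to scratch off")
    numbers.remove(min(numbers))
    return " ".join(str(n) for n in numbers)
```

    For example, `printing_number("10 9 8 7 6 5 4")` gives 4, matching the fourth-printing example above.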

    Even though a lot of book printing nowadays is done with computers rather than metal plates, the old method of indicating a printing is retained out of tradition.

  • The Old New Thing

    German adjectives really aren't that hard; they just look that way


    I may have scared a bunch of people with that chart of German adjective endings, but as several commenters noted, native speakers don't refer to the charts; they just say what comes naturally. (Well, except for Leo Petr, who claims that native Russian speakers actually study these charts in grade school.) Commenter Helga Waage noted that one quickly sees patterns in the charts that make them much easier to digest. And that's true. But I taught myself the German adjective endings a completely different way. If you're a student of German, you might find this helpful. If you're not, then you probably just want to skip the rest of this entry.

    As a side note, you have to make sure you put the columns in the right order. In many textbooks, the columns are ordered as "masculine, feminine, neuter, plural", but this fails to highlight the strong similarity between the masculine and neuter genders. From a grammatical standpoint, German neuter nouns are "90% masculine, 10% feminine"; therefore, it's more natural to put the neuter column between the masculine and feminine columns. I therefore prefer the order "masculine, neuter, feminine, plural", which as it so happens appears to be the order that Germans themselves use.

    I'm going to do away with the terms "strong", "weak", and "mixed". Instead, I'm going to reduce it to the question "How much work does the adjective have to do?" which breaks down into two inflections. In my mind, I don't have terms for these two inflections, but for the purpose of this discussion I'll call them "hardworking" and "lazy".

    We start with the lazy inflection, which is used when the definite article or a word that has the same ending as the definite article is present. The lazy inflection is simple: In the singular of the nominative and accusative cases (the "easy cases"), the ending is "-e". In the plural and in the genitive and dative cases (the "hard cases"), the ending is "-en".

          M     N     F     P
    Nom   -e    -e    -e    -en
    Acc  *-en*  -e    -e    -en
    Dat   -en   -en   -en   -en
    Gen   -en   -en   -en   -en

    There is only one exception to this general rule, which I highlighted in the table above. But even that exception is natural, because the masculine gender is the only one whose articles change between the nominative and the accusative, from "der" to "den" and "ein" to "einen", so you're already used to sticking an extra "-en" in the masculine accusative singular.
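    The lazy rule, including its one exception, fits in a short function. This is only a sketch of the rule as stated above; the function name and the short codes for case and gender are made up for the example.

```python
def lazy_ending(case: str, gender: str) -> str:
    """Adjective ending for the "lazy" inflection.

    case:   "nom", "acc", "dat", or "gen"
    gender: "m", "n", "f", or "p" (plural)
    """
    if case not in ("nom", "acc", "dat", "gen"):
        raise ValueError(f"unknown case: {case}")
    if gender not in ("m", "n", "f", "p"):
        raise ValueError(f"unknown gender: {gender}")

    easy_case = case in ("nom", "acc")
    singular = gender != "p"
    if easy_case and singular:
        # The one exception: masculine accusative singular takes "-en".
        if case == "acc" and gender == "m":
            return "-en"
        return "-e"
    # Plural, and the "hard" cases (dative and genitive), always take "-en".
    return "-en"
```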

    (By the way, I call the nominative and accusative the "easy" cases since most textbooks teach them within the first few weeks, which means that you've quickly become familiar with them and treat them as old friends. On the other hand, the dative and genitive are not usually introduced until second year, thereby making them "hard" due to their relative unfamiliarity.)

    The hardworking inflection is even easier than the lazy inflection. You use the hardworking inflection when there is no word that has the same ending as the definite article. In this case, the adjective must step up and take the ending itself. (I've included the definite article in the chart for reference.)

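              M         N         F         P
    Nom   der -er   das -es   die -e    die -e
    Acc   den -en   das -es   die -e    die -e
    Dat   dem -em   dem -em   der -er   den -en
    Gen                       der -er   der -er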

    Hey, wait, I left two boxes blank. What's going on here?

    Well, because in those two cases, even if there is nothing else to carry the ending of the definite article, the noun itself gets modified by adding "-s". For example, the genitive of the neuter noun "Wasser" (water) is "Wassers" (of water). The word that carries the ending of the definite article is the noun itself! That's why I leave the boxes blank: The scenario never occurs in German.
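    Here is a sketch of the hardworking rule in Python: the adjective borrows the ending of the definite article it is standing in for. The article table is ordinary textbook German rather than something taken from this entry, and the names are invented for the example. The two blank boxes come back as `None`.

```python
# Definite articles by case and gender (m, n, f, p = plural): standard German.
DEFINITE_ARTICLE = {
    "nom": {"m": "der", "n": "das", "f": "die", "p": "die"},
    "acc": {"m": "den", "n": "das", "f": "die", "p": "die"},
    "dat": {"m": "dem", "n": "dem", "f": "der", "p": "den"},
    "gen": {"m": "des", "n": "des", "f": "der", "p": "der"},
}

# The adjective ending that mirrors each definite article.
ENDING_FOR_ARTICLE = {
    "der": "-er",
    "das": "-es",
    "die": "-e",
    "den": "-en",
    "dem": "-em",
}


def hardworking_ending(case: str, gender: str):
    """Return the "hardworking" adjective ending, or None for a blank box."""
    article = DEFINITE_ARTICLE[case][gender]
    if article == "des":
        # Genitive masculine/neuter: the noun itself takes "-s",
        # so the hardworking scenario never occurs.
        return None
    return ENDING_FOR_ARTICLE[article]
```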

    It is those empty boxes, however, that always trip me up. When it comes time to decide what ending to put on the adjective, and I'm in one of those two boxes, the word with the ending of the definite article hasn't appeared yet so I think I'm in the "hardworking" case. And then when I get around to saying the "-s" at the end of "Wassers", I realize, "Oh, crap, there's that indicator. I should have used the lazy form." But it's too late, I already said the adjective with the wrong ending. I could go back and fix it, but that would interrupt the flow of the conversation, so I usually decide to let it slide and take the hit of sounding stupid. (Or, more precisely, sounding more stupid.) If you listen carefully, you may notice me pause for a fraction of a second just as I reach the "-s" and the realization dawns on me that I messed up again.

    If you compare my charts to the official charts with strong, weak and mixed inflections, you'll see that my "lazy" inflection matches the weak inflection exactly, and my "hardworking" inflection matches the "strong" inflection except for those empty boxes. (Because, under my rules, those empty boxes are lazy.) The mixed inflection matches the "lazy" inflection except in three places, which I count as "hardworking" because the indefinite article "ein" does not take an ending in exactly those three places.

    Anyway, so there's how I remember my German adjective endings. Mind you, I don't work through the details of these rules each time I have to decide on an ending. I just have to make the simple note of whether the definite article ending has already appeared (or in the case I always forget: will soon appear). If not, then I put it on the adjective.

  • The Old New Thing

    Raymond's comment policy


    Okay, I was hoping it wasn't going to be needed but it takes only one bad apple...

    Here are the ground rules.

    • I reserve the right to edit, delete, or ignore any comment.
    • If I edit your comment in any significant way, I promise to make that fact clear in the edit. (Exception: Broken links and typos will be repaired without comment.)

    Things that increase the likelihood that your comment will be edited or deleted:

    • Offensive or abusive language or disrespectful behavior. (Insults count as disrespectful behavior. Examples of insulting words/phrases: "moron", "designed without adult supervision".)
    • Misrepresentation. (I.e., claiming to be somebody you're not.) If you don't want to use your real name, that's fine, as long as your "handle" isn't offensive, abusive, or misrepresentative.
    • Comment spamming. This includes posting multiple comments in rapid succession, even if they are different. If you are prone to spurts of creativity, collect them into batches and post them as a single comment.
    • Comments that are off-topic, particularly when the discussion turns into a shouting match.
    • Comments that attempt to "out" a company/program/person. E.g., trying to guess the identity of a program whose name I did not reveal.
    • More generally, comments that identify a specific company, program, or person. You can talk about a program, but you have to call it "Program X" or something like that. The purpose is to discuss problems and solutions, not to assign blame and ridicule.
    • Comments that expose me to potential legal action. Examples: Disclosing any company's trade secrets, posting copyrighted source code, violating Microsoft's company guidelines.
    • Giving yourself a star. (This is another case of misrepresentation.)

    If a wave of comment spam is under way, I may choose to moderate or even close comments until the problem subsides.

    More rules may be added later, but I hope it won't be necessary.

    Disclaimer: All postings are provided "AS IS" with no warranties and confer no rights. Opinions expressed are those of the respective authors. More legal stuff here.

    [31 May 2004: Add exceptions for broken link repair.

    2 Dec 2004: Add disclaimer and exception for fixing typos.

    13 Dec 2004: Add remark on temporary closure of comments during spam attacks.

    15 Mar 2005: Add remark for off-topic comments.

    20 Mar 2007: No "outing".

    25 Dec 2008: No disrespectful behavior.

    12 March 2009: Examples of insults.

    8 July 2010: Self-starring.]

  • The Old New Thing

    What's with all those spam ping-bots?


    Last December, some people started to get annoyed by the pingback-bots, and others were confused by them. What's the deal with those pingback-bots?

    It's all about fooling the search engines in order to make money, taking advantage of friendly policies at domain registrars to make it less costly an undertaking.

    Step one: Register a bunch of domains with a domain registrar that includes a money-back guarantee.

    Step two: Set up fake blogs on each of those sites, with different keywords.

    Step three: Use a script to search the blogosphere for articles that contain keywords that match your site. (There appears to be a single script that 90% of the spam blogs use, since they all look exactly the same, and have the same bugs!)

    Step four: Create a bogus blog entry for each one that says something like "Hey, here's something interesting I found on the Internet" and then reprints the article in question. (You may notice that many of these sites mis-attribute the authorship; some of them even claim to have written the article themselves!)

    Step five: Host ads on the site.

    Step six: Just before the money-back guarantee period expires, look at each of your fake blogs to see which ones have made money from the ads and which ones haven't. Cancel the domain registrations of the ones that didn't make money.

    Most of these sites are in existence for only a few days, so trying to stop each individual site is a waste of effort; the site is going away soon anyway. The way to get the attention of the spammers is to hit them in the pocketbook.

    Go to the site and look at the ads. If they're using Google Ads, look for violations of the terms of service, such as having more than three sets of ads on a single page or hosting ads from other companies on the same page. Even if you can't find anything wrong, click the "Ads by Google" link.

    From the Google Ads page, click "Send Google your thoughts on the site or the ads you just saw," then "Also report a violation," and then say that you had a problem with "the website," and then say that "The site violates AdSense policies in other ways." Here is where you can write "Hosted more than three ad blocks" or "Also hosts ads from competing vendor." But always write "Contains no original content."

    The theory here is that once Google has determined that the site is violating AdSense policies, they will shut down the account, preventing them from getting any more money, which was the whole point of their scam in the first place.

    Now, I don't hold out much hope that this will work, since I've reported sites and found that even weeks later, the site is still up, happily serving up Google ads and pocketing the click-throughs. But maybe it's because they don't act until there is some critical mass of complaints.

    (I can find no way of reporting violations to the Yahoo Publisher Network.)

    Another category of these types of sites is just people who reprint blog articles (usually erroneously attributed) in order to improve the search engine ranking of the non-spam part of the site.

    Now, you may notice also that there is a "The site is hosting/distributing my copyrighted content" checkbox. That box is useless to me because I am not the copyright owner of the content of this blog. The content of this blog is owned by Microsoft Corporation. If you check that box, Google demands that you file a formal DMCA complaint, and I'm pretty sure our legal department is busy with plenty of more important things than chasing down people who rip off the content of some random employee's blog in order to generate ad revenue.

    Normally you don't see the spam pingbacks because I tend to delete them pretty quickly. If you're really clever, you might use the fact that the spam pingbacks linger for days at a time to determine that I'm out of the office.

    Sidebar: Here are some examples of spambots. Feel free to report them to the ad vendor, if they are hosting ads. And as I already noted above, some of these sites may already be down.

    Update: The victory over 247blogging was short-lived. Within a month, they moved to a new ad company whose terms of service have no problem with sites with no original content.

    One annoying consequence of all these content-scraping sites is that they end up ranking higher in Google than me, and I'm the one who wrote the article in the first place! For example, a Google search for Joshua Roman groupies on 17 February 2008 doesn't even show my blog article; instead, the top hits are

    1. A site which scraped my entry.
    2. Another page from the same site as #1 which also scraped my entry.
    3. A different site which scraped my entry.
    4. An article from this Web site but not the one that says Joshua Roman groupies in the title.
    5. Another misfire from this Web site.
    6. A third site which scraped my entry.
    7. A fourth site which scraped my entry.
    8. A fifth site which scraped my entry.
    9. An unrelated hit.
    10. Another unrelated hit.

    So there you go. The top ten search results contain five sites that scraped my entry and no links to the original! On the other hand, Live Search is not fooled and finds the right article as the top search result. Yahoo ranks my article as #1 and #3 (go figure), which is nice, but all but one of the remaining hits are for scrapers.

    A Google search for bands of Valentine minstrels is even worse. The first three hits are sites which scraped my article and there are no hits at all to this Web site in the top 100 search results, although nine scrapers rank in the top 100. Again, Live Search is not fooled and finds my article as its #1 hit. Yahoo also ranks my article at #1 although a scraper sneaks in at #2.

  • The Old New Thing

    Stupid Raymond talent: Screaming carrier


    Similar to Mike, I was able to scream (not whistle: scream) a 300 baud carrier tone. This skill proved useful when I was in college and the mainframe system was down. Instead of sitting around waiting for the system to come back, I just went about my regular business around campus. Every so often, I would go to a nearby campus phone (like a free public phone but it can only make calls to other locations on campus), dial the 300 baud dial-up number, and scream the carrier tone. If I got a response, that meant that the mainframe was back online and I should wrap up what I was doing and head back to the lab.

    Mind you, this skill isn't very useful nowadays.

    What stupid computer talent do you have?

  • The Old New Thing

    You can do anything at, anything at all


    Zombo has been around for many years, and yet I still find it hilarious (requires Flash).

    I just went back to check, and alas the introduction actually ends. But fortunately, they made it even cooler by having a text-only version. (Still requires sound.)

  • The Old New Thing

    User interface design for vending machines


    How hard can it be to design the user interface of a vending machine?

    You accept money, you have some buttons, the user pushes the button, they get their product and their change.

    At least in the United States, many vending machines arrange their product in rows and columns (close-up view). To select a product, you type the letter of the row and the number of the column. Could it be any simpler?

    Take a closer look at that vending machine design. Do you see the flaw?

    (Ignore the fact that the picture is a mock-up and repeats row C over and over again.)

    The columns are labelled 1 through 10. That means that if you want to buy product C10, you have to push the buttons "C" and "10". But in our modern keyboard-based world, there is no "10" key. Instead, people type "1" followed by "0".

    What happens if you type "C"+"1"+"0"? After you type the "1", product C1 drops. Then you realize that there is no "0" key. And you bought the wrong product.
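    To make the failure concrete, here is a small Python simulation. It assumes a hypothetical machine that drops a product the moment the keys entered so far name one; the keypad and product sets are invented to match the layout described above.

```python
def press_keys(presses, keypad, products):
    """Simulate a machine that drops a product as soon as the entry matches one.

    Returns (dispensed, missing): the products that dropped, and the keys
    the user looked for that do not exist on the keypad.
    """
    dispensed = []
    missing = []
    entered = ""
    for key in presses:
        if key not in keypad:
            # The user hunts for a key that is not there.
            missing.append(key)
            continue
        entered += key
        if entered in products:
            dispensed.append(entered)
            entered = ""
    return dispensed, missing


# Rows A-C, columns 1-10, with a single button labelled "10".
KEYPAD = {"A", "B", "C"} | {str(n) for n in range(1, 11)}
PRODUCTS = {row + str(col) for row in "ABC" for col in range(1, 11)}
```

    Calling `press_keys(["C", "1", "0"], KEYPAD, PRODUCTS)` drops C1 and reports "0" as a missing key, whereas `press_keys(["C", "10"], KEYPAD, PRODUCTS)` drops C10 as intended.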

    This is not a purely theoretical problem. I have seen this happen myself.

    How would you fix this?

    One solution is simply not to put so many items on a single row, considering that people have difficulty making decisions if given too many options. On the other hand, the vendor might not like that design, since their goal is to maximize the number of products.

    Another solution is to change the labels so that the number of button presses needed always matches the number of characters in the label. In other words, no buttons with two characters on them (like the "10" button).

    Switch the rows and columns, so that the products are labelled "1A" through "1J" across the top row, and "9A" through "9J" across the bottom. This assumes you don't have more than nine rows. (This won't work for super size vending machines - look at the buttons on that thing; they go up to "U"!)

    You can see another solution in that most recent vending machine: Instead of calling the tenth column "10", call it "0". Notice that they also removed rows "I" and "O" to avoid possible confusion with "1" and "0".
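    One way to compare these labelling schemes is to check them mechanically. The sketch below, with invented names, flags any product label that cannot be entered safely one character at a time: either a character on the label has no key of its own, or a shorter label is a prefix of it and would drop first.

```python
def confusing_labels(products, keypad):
    """Return product labels that cannot be typed safely character by character."""
    problems = set()
    for label in products:
        # Every character printed on the label must be a key of its own.
        if any(ch not in keypad for ch in label):
            problems.add(label)
        # No shorter label may be a prefix, or that product would drop first.
        if any(other != label and label.startswith(other) for other in products):
            problems.add(label)
    return sorted(problems)


ROWS = "ABC"

# Original design: columns 1-10 with a dedicated "10" button.
old_keypad = set(ROWS) | {str(n) for n in range(1, 11)}
old_products = {row + str(col) for row in ROWS for col in range(1, 11)}

# Revised design: the tenth column is labelled "0".
new_columns = [str(n) for n in range(1, 10)] + ["0"]
new_keypad = set(ROWS) | set(new_columns)
new_products = {row + col for row in ROWS for col in new_columns}
```

    Under these assumptions, `confusing_labels(old_products, old_keypad)` flags A10, B10, and C10, while the revised labels come back clean.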

    A colleague of mine pointed out that some vending machines use numeric codes for all items rather than a letter and a digit. For example, if the cookies are product number 23, you punch "2" "3". If you want the chewing gum (product code 71), you punch "7" "1". He poses the following question:

    What are some problems with having your products numbered from 1 to 99?

    Answers next time.
