• The Old New Thing

    Catholic baseball fans want to eat meat on opening day


    So it happens that Opening Day of the baseball season coincides with Good Friday, a day of "fasting and abstinence" according to Catholic tradition. (Then again, after Vatican II, the definition of "fasting and abstinence" weakened significantly. All that most people remember any more is "no meat".)

    Catholics in Boston have applied to the archdiocese for a special dispensation so they can have a hot dog at the game. The Church said "Nice try".

    But at least you can still order a beer.

  • The Old New Thing

    The car with no user-serviceable parts inside


    For the first time, a team of women is challenged to develop a car, and the car they come up with requires an oil change only every 50,000 kilometers and doesn't even have a hood, so you can't poke around the engine.

    To me, a car has no user-serviceable parts inside. The only times I have opened the hood are when somebody else said, "Hey, let me take a look at the engine of your car." (I have a Toyota Prius.) On my previous car, the only time I opened the hood was to check the oil.

    Sometimes the open-source folks ask, "Would you buy a car whose hood can't be opened?" It looks like a lot of people (including me) would respond, "Yes."

  • The Old New Thing

    Why is the line terminator CR+LF?

    This protocol dates back to the days of teletypewriters. CR stands for "carriage return" - the CR control character returned the print head ("carriage") to column 0 without advancing the paper. LF stands for "linefeed" - the LF control character advanced the paper one line without moving the print head. So if you wanted to return the print head to column zero (ready to print the next line) and advance the paper (so it prints on fresh paper), you needed both CR and LF.

    If you go to the various internet protocol documents, such as RFC 0821 (SMTP), RFC 1939 (POP), RFC 2060 (IMAP), or RFC 2616 (HTTP), you'll see that they all specify CR+LF as the line termination sequence. So the real question is not "Why do CP/M, MS-DOS, and Win32 use CR+LF as the line terminator?" but rather "Why did other people choose to differ from these standards documents and use some other line terminator?"

    Unix adopted plain LF as the line termination sequence. If you look at the stty options, you'll see that the onlcr option specifies whether a LF should be changed into CR+LF. If you get this setting wrong, you get stairstep text, where each line begins where the previous line left off. So even unix, when left in raw mode, requires CR+LF to terminate lines. The implicit CR before LF is a unix invention, probably as an economy, since it saves one byte per line.
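
    To see the stairstep effect for yourself, here is a minimal sketch of a teletype-style print head. The function name RenderOnTeletype is made up for illustration; the only behavior it assumes is the one described above, namely that CR returns to column 0 and LF advances one line.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a raw teletype: CR moves the print head to
// column 0, LF advances the paper one line, and printing a character
// moves the head one column to the right.
std::vector<std::string> RenderOnTeletype(const std::string& input)
{
    std::vector<std::string> page(1);
    std::size_t row = 0, column = 0;
    for (char ch : input) {
        if (ch == '\r') {
            column = 0;                      // carriage return only
        } else if (ch == '\n') {
            ++row;                           // line feed only
            if (row == page.size()) page.emplace_back();
        } else {
            std::string& line = page[row];
            if (line.size() <= column) line.resize(column + 1, ' ');
            line[column++] = ch;
        }
    }
    return page;
}
```

    Feeding it "ab\ncd\nef" gives three lines in which "cd" starts at column 2 and "ef" at column 4, whereas "ab\r\ncd" puts "cd" back at column 0.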

    The unix ancestry of the C language carried this convention into the C language standard, which requires only "\n" (which encodes LF) to terminate lines, putting the burden on the runtime libraries to convert raw file data into logical lines.
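
    The burden on the runtime library amounts to something like the following sketch. The function names are hypothetical, and a real C runtime's text mode has more details to worry about; this only shows the CR+LF to "\n" translation and its inverse.

```cpp
#include <string>

// Reading in "text mode" on a CR+LF platform: each CR+LF pair in the
// raw file data collapses to a single '\n'.
std::string CrLfToNewline(const std::string& raw)
{
    std::string text;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\r' && i + 1 < raw.size() && raw[i + 1] == '\n') {
            continue;  // drop the CR; the LF is copied on the next pass
        }
        text += raw[i];
    }
    return text;
}

// Writing in "text mode": each '\n' expands back to CR+LF.
std::string NewlineToCrLf(const std::string& text)
{
    std::string raw;
    for (char ch : text) {
        if (ch == '\n') raw += '\r';
        raw += ch;
    }
    return raw;
}
```

    A lone CR that is not followed by LF is passed through untouched in this sketch.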

    The C language also introduced the term "newline" to express the concept of "generic line terminator". I'm told that the ASCII committee changed the name of character 0x0A to "newline" around 1996, so the confusion level has been raised even higher.

    Here's another discussion of the subject, from a unix perspective.

  • The Old New Thing

    On a server, paging = death


    Chris Brumme's latest treatise contained the sentence "Servers must not page". That's because on a server, paging = death.

    I had occasion to meet somebody from another division who told me this little story: They had a server that went into thrashing death every 10 hours, like clockwork, and had to be rebooted. To mask the problem, the server was converted to a cluster, so what really happened was that the machines in the cluster took turns being rebooted. The clients never noticed anything, but the server administrators were really frustrated. ("Hey Clancy, looks like number 2 needs to be rebooted. She's sucking mud.") [Link repaired, 8am.]

    The reason for the server's death? Paging.

    There was a four-bytes-per-request memory leak in one of the programs running on the server. Eventually, all the leakage filled available RAM and the server was forced to page. Paging means slower response, but of course the requests for service kept coming in at the normal rate. So the longer you take to turn a request around, the more requests pile up, and then it takes even longer to turn around the new requests, so even more pile up, and so on. The problem snowballed until the machine just plain keeled over.
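
    Here is a deliberately simplified model of that snowball. Every number and name in it is hypothetical (I have no idea what the real server's request rate or RAM size was), and the real slowdown gets progressively worse rather than dropping to a fixed rate. Only the shape of the result matters: once the leak fills RAM, the backlog grows and never recovers.

```cpp
#include <vector>

// Each tick, `arrivals` requests come in. The server can finish
// `capacity` requests per tick while everything fits in RAM, but only
// `pagingCapacity` once the leaked bytes reach `ramBytes`.
// Returns the number of requests still waiting after each tick.
std::vector<long> SimulateBacklog(long arrivals, long capacity,
                                  long pagingCapacity, long ramBytes,
                                  long leakPerRequest, int ticks)
{
    std::vector<long> backlog;
    long queued = 0, leaked = 0;
    for (int t = 0; t < ticks; ++t) {
        queued += arrivals;
        long canServe = (leaked < ramBytes) ? capacity : pagingCapacity;
        long served = (queued < canServe) ? queued : canServe;
        queued -= served;
        leaked += served * leakPerRequest;  // every request leaks a little
        backlog.push_back(queued);
    }
    return backlog;
}
```

    With SimulateBacklog(10, 12, 5, 100, 4, 6) the backlog per tick is 0, 0, 0, 5, 10, 15: the server keeps up for three ticks, then the leak crosses the RAM limit and the queue grows by five requests every tick thereafter.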

    After much searching, the leak was identified and plugged. Now the servers chug along without a hitch.

    (And since the reason for the cluster was to cover for the constant crashes, I suspect they reduced the size of the cluster and saved a lot of money.)

  • The Old New Thing

    More on the AMD64 calling convention


    Josh Williams picks up the 64-bit ball with an even deeper discussion of the AMD64 (aka x64) calling convention and things that go wrong when you misdeclare your function prototypes.

  • The Old New Thing

    Why do text files end in Ctrl+Z?


    Actually, text files don't need to end in Ctrl+Z, but the convention persists in certain circles. (Though, fortunately, those circles are awfully small nowadays.)

    This story requires us to go back to CP/M, the operating system that MS-DOS envisioned itself as a successor to. (Since the 8086 envisioned itself as the successor to the 8080, it was natural that the operating system for the 8086 would view itself as the successor to the primary operating system on the 8080.)

    In CP/M, files were stored in "sectors" of 128 bytes each. If your file was 64 bytes long, it was stored in a full sector. The kicker was that the operating system tracked the size of the file as the number of sectors. So if your file was not an exact multiple of 128 bytes in size, you needed some way to specify where the "real" end-of-file was.

    That's where Ctrl+Z came in.

    By convention, the unused bytes at the end of the last sector were padded with Ctrl+Z characters. According to this convention, if you had a program that read from a file, it should stop when it reads a Ctrl+Z, since that meant that it was now reading the padding.
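
    In code, the convention looks roughly like this. It is a sketch with made-up helper names, not anything lifted from CP/M itself; the 128-byte sector size comes from the description above, and Ctrl+Z is ASCII 26 (0x1A).

```cpp
#include <string>

const std::size_t kSectorSize = 128;  // CP/M sector size
const char kCtrlZ = '\x1A';           // Ctrl+Z is ASCII 26

// Writer side: pad the text out to a whole number of sectors with
// Ctrl+Z characters before storing it.
std::string PadToSectors(std::string text)
{
    std::size_t remainder = text.size() % kSectorSize;
    if (remainder != 0) text.append(kSectorSize - remainder, kCtrlZ);
    return text;
}

// Reader side: the "real" contents stop at the first Ctrl+Z.
// If there is no Ctrl+Z, the whole buffer is real data.
std::string TrimAtCtrlZ(const std::string& stored)
{
    return stored.substr(0, stored.find(kCtrlZ));
}
```

    A 64-byte file comes out of PadToSectors as 128 bytes, and TrimAtCtrlZ recovers the original 64. It also shows the catch: the scheme only works for text, since binary data may legitimately contain a 0x1A byte.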

    To retain compatibility with CP/M, MS-DOS carried forward the Ctrl+Z convention. That way, when you transferred your files from your old CP/M machine to your new PC, they wouldn't have garbage at the end.

    Ctrl+Z hasn't been needed for years; MS-DOS records file sizes in bytes rather than sectors. But the convention lingers in the "COPY" command, for example.

  • The Old New Thing

    Still more creative uses for CAPTCHA


    I want to say up front that I think CAPTCHA is a stupid name. CAPTCHA stands for "Computer-Aided Process for Testing..." something something.

    Why do people feel the urge to create some strained cutesy acronym for their little invention?

    Anyway, it has already been noted how spammers are getting around these tests by harvesting a practically-free resource on the Internet: the desire to see pornography.

    Someone designed a software robot that would fill out a registration form and, when confronted with an image processing test, would post it on a free porn site. Visitors to the porn site would be asked to complete the test before they could view more pornography, and the software robot would use their answer to complete the e-mail registration.

    Ah, remember the days when you had to whisper the word "pornography"?

    Anyway, it looks like the virus-writers have also taken the two-edged sword and pointed it in the other direction. (Ah, another one of Raymond's tortured mixed metaphors.)

    As you may be aware, the latest trend in virus-detection-avoidance is to attach an encrypted ZIP file, since virus-checkers don't know how to decrypt them. To get the sucker to activate the payload, you put the password in the message body.

    Well, virus checkers figured this out rather quickly and scanned the message body to see if there's a password in the text.

    Now the virus-writers have upped the ante. The Bagle-N virus attaches an encrypted ZIP file and provides the password as an image, using the same trick as the anti-robot people.

    Fortunately, the image generator they use is pretty easy to do OCR on, since they don't make any attempt to fuzz the images.

    I predict the next step will be that the virus-writers send two messages to each victim. The first contains the payload, and the second contains the password. That way the virus-scanning software is completely helpless since the password to decrypt the ZIP file isn't even in the message being scanned!

    Once again, just goes to show that social engineering can beat out pretty much any technological security mechanism.

    (I think virus scanners are now starting to block any password-protected ZIP. But that won't stop the viruses for long. They'll just have a link to a ZIP file or something.)

  • The Old New Thing

    How do I convert a SID between binary and string forms?


    Of course, if you want to do this programmatically, you would use ConvertSidToStringSid and ConvertStringSidToSid, but often you're studying a memory dump or otherwise need to do the conversion manually.

    If you have a SID like S-a-b-c-d-e-f-g-...

    Then the bytes are

    a       (one byte: the revision)
    N       (one byte: number of dashes minus two)
    bbbbbb  (six bytes of "b" treated as a 48-bit number in big-endian format)
    cccc    (four bytes of "c" treated as a 32-bit number in little-endian format)
    dddd    (four bytes of "d" treated as a 32-bit number in little-endian format)
    eeee    (four bytes of "e" treated as a 32-bit number in little-endian format)
    ffff    (four bytes of "f" treated as a 32-bit number in little-endian format)

    So for example, if your SID is S-1-5-21-2127521184-1604012920-1887927527-72713, then your raw hex SID is

    010500000000000515000000A065CF7E784B9B5FE77C8770091C0100

    This breaks down as follows:

    01            (S-1, the revision)
    05            (seven dashes, seven minus two = 5)
    000000000005  (5 = 0x000000000005, big-endian)
    15000000      (21 = 0x00000015, little-endian)
    A065CF7E      (2127521184 = 0x7ECF65A0, little-endian)
    784B9B5F      (1604012920 = 0x5F9B4B78, little-endian)
    E77C8770      (1887927527 = 0x70877CE7, little-endian)
    091C0100      (72713 = 0x00011C09, little-endian)
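
    If you do this by hand often enough, you may want to automate it. Here is a sketch that follows the recipe above: one revision byte, one count byte, a 48-bit big-endian value, then 32-bit little-endian values. The helper names are my own invention, and it assumes every field is written in decimal.

```cpp
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Append `count` bytes of `value` as uppercase hex, either big-endian
// (most significant byte first) or little-endian.
static void AppendHex(std::string& out, std::uint64_t value,
                      int count, bool bigEndian)
{
    for (int i = 0; i < count; ++i) {
        int shift = 8 * (bigEndian ? count - 1 - i : i);
        char buf[3];
        std::snprintf(buf, sizeof(buf), "%02X",
                      static_cast<unsigned>((value >> shift) & 0xFF));
        out += buf;
    }
}

// Hypothetical helper: convert "S-a-b-c-d-..." to raw hex.
// Returns an empty string if the input is malformed.
std::string SidStringToHex(const std::string& sid)
{
    if (sid.compare(0, 2, "S-") != 0) return "";
    std::vector<std::uint64_t> fields;
    std::istringstream stream(sid.substr(2));
    std::string piece;
    while (std::getline(stream, piece, '-')) {
        if (piece.empty() ||
            piece.find_first_not_of("0123456789") != std::string::npos) {
            return "";
        }
        fields.push_back(std::stoull(piece));
    }
    if (fields.size() < 2) return "";

    std::string hex;
    AppendHex(hex, fields[0], 1, false);          // revision
    AppendHex(hex, fields.size() - 2, 1, false);  // dashes minus two
    AppendHex(hex, fields[1], 6, true);           // 48-bit, big-endian
    for (std::size_t i = 2; i < fields.size(); ++i) {
        AppendHex(hex, fields[i], 4, false);      // 32-bit, little-endian
    }
    return hex;
}
```

    Passing it S-1-5-21-2127521184-1604012920-1887927527-72713 returns 010500000000000515000000A065CF7E784B9B5FE77C8770091C0100.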

    Yeah, that's great, Raymond, but what do all those numbers mean?

    S-1-           version number (SID_REVISION)
    -5-            SECURITY_NT_AUTHORITY
    -21-           SECURITY_NT_NON_UNIQUE
    -...-...-...-  these identify the machine that issued the SID
    72713          unique user id on the machine

    Each machine generates a unique ID that it uses to stamp all the SIDs it creates (-...-...-...-). The last number is a "relative id (RID)" that represents a user created by that machine. There are a bunch of predefined RIDs; you can see them in the header file ntseapi.h, which is also where I got these names from. The system reserves RIDs up to 999, so the first non-builtin account gets assigned ID number 1000. The number 72713 means that this particular SID is the 71714th SID created by the issuer. (The machine that issued this SID is clearly a domain controller, responsible for creating the accounts of tens of thousands of users.)

    (Actually, I lied above when I said that this is the 71714th SID created by the issuer. Large servers can delegate SID creation to helpers, in which case SID issuance is no longer strictly consecutive.)

    Security isn't my area of expertise, so it's entirely possible (perhaps even likely) that I got something wrong up above. But it's mostly correct, I think.

  • The Old New Thing

    Senators are really good at stock-picking

    A Georgia State University study shows that U.S. senators have an uncanny knack for picking stocks that outpace the overall market. Professor Alan Ziobrowski's analysis of senators' financial disclosure data found that over a period of six years, the lawmakers outperformed the market by 12 percent.

    Professor Ziobrowski seems convinced that this is evidence of unethical behavior.

  • The Old New Thing

    What is the default security descriptor?


    Many Win32 functions that create objects, such as CreateFile and CreateMutex, take an optional LPSECURITY_ATTRIBUTES parameter, for which everybody just passes NULL, thereby obtaining the default security descriptor. But what is the default security descriptor?

    Of course, the place to start is MSDN, in the section titled Security Descriptors for New Objects.

    It says that the default DACL comes from inheritable ACEs (if the object belongs to a hierarchy, like the filesystem or the registry); otherwise, the default DACL comes from the primary or impersonation token of the creator.

    But what is the default primary token?

    Gosh, I don't know either. So let's write a program to find out.

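    #include <windows.h>
    #include <sddl.h> // ConvertSecurityDescriptorToStringSecurityDescriptor

    int WINAPI
    WinMain(HINSTANCE hinst, HINSTANCE hinstPrev,
            LPSTR lpCmdLine, int nShowCmd)
    {
     HANDLE Token;
     if (OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY, &Token)) {
      // First call fails but reports the buffer size we need
      DWORD RequiredSize = 0;
      GetTokenInformation(Token, TokenDefaultDacl, NULL, 0, &RequiredSize);
      TOKEN_DEFAULT_DACL* DefaultDacl =
          reinterpret_cast<TOKEN_DEFAULT_DACL*>(LocalAlloc(LPTR, RequiredSize));
      if (DefaultDacl) {
       SECURITY_DESCRIPTOR Sd;
       LPTSTR StringSd;
       // Wrap the default DACL in a security descriptor and stringify it
       if (GetTokenInformation(Token, TokenDefaultDacl, DefaultDacl,
                               RequiredSize, &RequiredSize) &&
           InitializeSecurityDescriptor(&Sd, SECURITY_DESCRIPTOR_REVISION) &&
           SetSecurityDescriptorDacl(&Sd, TRUE,
               DefaultDacl->DefaultDacl, FALSE) &&
           ConvertSecurityDescriptorToStringSecurityDescriptor(&Sd,
               SDDL_REVISION_1, DACL_SECURITY_INFORMATION, &StringSd, NULL)) {
        MessageBox(NULL, StringSd, TEXT("Result"), MB_OK);
        LocalFree(StringSd);
       }
       LocalFree(DefaultDacl);
      }
      CloseHandle(Token);
     }
     return 0;
    }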

    Okay, I admit it, the whole purpose of this entry is just so I can call the function ConvertSecurityDescriptorToStringSecurityDescriptor, quite possibly the longest function name in the Win32 API. And just for fun, I used the NT variable naming convention instead of Hungarian.

    If you run this program you'll get something like this:

    D:(A;;GA;;;S-...)(A;;GA;;;SY)

    Pull out our handy reference to the Security Descriptor String Format to decode this.

    • "D:" - This introduces the DACL.
    • "(A;;GA;;;S-...)" - "Allow" "Generic All" access to "S-...", which happens to be me. Every user by default has full access to their own process.
    • "(A;;GA;;;SY)" - "Allow" "Generic All" access to "Local System".
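
    If you would rather not decode these by eye, the splitting step can be sketched as follows. The struct and function names are hypothetical. The field order (type, flags, rights, object GUID, inherit object GUID, account SID) is the six-field ACE string layout from the Security Descriptor String Format; anything that doesn't have exactly six fields is skipped rather than diagnosed.

```cpp
#include <string>
#include <vector>

// The six semicolon-separated fields of an SDDL ACE string:
// type;flags;rights;object_guid;inherit_object_guid;account_sid
struct AceFields {
    std::string type, flags, rights, objectGuid, inheritGuid, account;
};

// Hypothetical helper: pull each "(...)" group out of a "D:..." string
// and split it on semicolons. No validation beyond the field count.
std::vector<AceFields> ParseDaclString(const std::string& sddl)
{
    std::vector<AceFields> aces;
    std::size_t open = sddl.find('(');
    while (open != std::string::npos) {
        std::size_t close = sddl.find(')', open);
        if (close == std::string::npos) break;
        std::string body = sddl.substr(open + 1, close - open - 1);

        std::vector<std::string> parts(1);
        for (char ch : body) {
            if (ch == ';') parts.emplace_back();
            else parts.back() += ch;
        }
        if (parts.size() == 6) {
            aces.push_back({parts[0], parts[1], parts[2],
                            parts[3], parts[4], parts[5]});
        }
        open = sddl.find('(', close);
    }
    return aces;
}
```

    For the string D:(A;;GA;;;S-...)(A;;GA;;;SY) this yields two entries, both with type "A" and rights "GA", and with accounts "S-..." and "SY" respectively.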

    Next time, I'll teach you how to decode that S-... thing.
