History

  • The Old New Thing

    Why is the line terminator CR+LF?

    • 40 Comments
    This protocol dates back to the days of teletypewriters. CR stands for "carriage return" - the CR control character returned the print head ("carriage") to column 0 without advancing the paper. LF stands for "linefeed" - the LF control character advanced the paper one line without moving the print head. So if you wanted to return the print head to column zero (ready to print the next line) and advance the paper (so it prints on fresh paper), you needed both CR and LF.

    If you go to the various internet protocol documents, such as RFC 0821 (SMTP), RFC 1939 (POP), RFC 2060 (IMAP), or RFC 2616 (HTTP), you'll see that they all specify CR+LF as the line termination sequence. So the real question is not "Why do CP/M, MS-DOS, and Win32 use CR+LF as the line terminator?" but rather "Why did other people choose to differ from these standards documents and use some other line terminator?"

    Unix adopted plain LF as the line termination sequence. If you look at the stty options, you'll see that the onlcr option specifies whether an LF should be changed into CR+LF. If you get this setting wrong, you get stairstep text, where

    each
        line
            begins
    
    where the previous line left off. So even unix, when left in raw mode, requires CR+LF to terminate lines. The implicit CR before LF is a unix invention, probably as an economy, since it saves one byte per line.

    The unix ancestry of the C language carried this convention into the C language standard, which requires only "\n" (which encodes LF) to terminate lines, putting the burden on the runtime libraries to convert raw file data into logical lines.
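
    As a small illustration of that runtime conversion, here is a minimal C sketch, assuming the Microsoft C runtime's text-mode behavior (the file names are just placeholders): a file opened in text mode has each '\n' expanded to CR+LF on output, while a file opened in binary mode receives the bytes exactly as written.

    #include <stdio.h>

    int main(void)
    {
        /* Text mode: the runtime expands '\n' to CR+LF (0x0D 0x0A) on write. */
        FILE *text = fopen("text.txt", "w");
        /* Binary mode: bytes pass through untouched, so lines end in a bare LF. */
        FILE *bin = fopen("bin.txt", "wb");
        if (text == NULL || bin == NULL) return 1;

        fputs("line one\nline two\n", text);
        fputs("line one\nline two\n", bin);

        fclose(text);
        fclose(bin);
        return 0;
    }

    (On unix, text and binary mode behave identically, which is exactly the convention described above.)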

    The C language also introduced the term "newline" to express the concept of "generic line terminator". I'm told that the ASCII committee changed the name of character 0x0A to "newline" around 1996, so the confusion level has been raised even higher.

    Here's another discussion of the subject, from a unix perspective.

  • The Old New Thing

    On a server, paging = death

    • 40 Comments

    Chris Brumme's latest treatise contained the sentence "Servers must not page". That's because on a server, paging = death.

    I had occasion to meet somebody from another division who told me this little story: They had a server that went into thrashing death every 10 hours, like clockwork, and had to be rebooted. To mask the problem, the server was converted to a cluster, so what really happened was that the machines in the cluster took turns being rebooted. The clients never noticed anything, but the server administrators were really frustrated. ("Hey Clancy, looks like number 2 needs to be rebooted. She's sucking mud.")

    The reason for the server's death? Paging.

    There was a four-bytes-per-request memory leak in one of the programs running on the server. Eventually, all the leakage filled available RAM and the server was forced to page. Paging means slower response, but of course the requests for service kept coming in at the normal rate. So the longer you take to turn a request around, the more requests pile up, and then it takes even longer to turn around the new requests, so even more pile up, and so on. The problem snowballed until the machine just plain keeled over.

    After much searching, the leak was identified and plugged. Now the servers chug along without a hitch.

    (And since the reason for the cluster was to cover for the constant crashes, I suspect they reduced the size of the cluster and saved a lot of money.)

  • The Old New Thing

    More on the AMD64 calling convention

    • 2 Comments

    Josh Williams picks up the 64-bit ball with an even deeper discussion of the AMD64 (aka x64) calling convention and things that go wrong when you misdeclare your function prototypes.

  • The Old New Thing

    Why do text files end in Ctrl+Z?

    • 34 Comments

    Actually, text files don't need to end in Ctrl+Z, but the convention persists in certain circles. (Though, fortunately, those circles are awfully small nowadays.)

    This story requires us to go back to CP/M, the operating system that MS-DOS envisioned itself as a successor to. (Since the 8086 envisioned itself as the successor to the 8080, it was natural that the operating system for the 8086 would view itself as the successor to the primary operating system on the 8080.)

    In CP/M, files were stored in "sectors" of 128 bytes each. If your file was 64 bytes long, it was stored in a full sector. The kicker was that the operating system tracked the size of the file as the number of sectors. So if your file was not an exact multiple of 128 bytes in size, you needed some way to specify where the "real" end-of-file was.

    That's where Ctrl+Z came in.

    By convention, the unused bytes at the end of the last sector were padded with Ctrl+Z characters. According to this convention, if you had a program that read from a file, it should stop when it reads a Ctrl+Z, since that meant that it was now reading the padding.
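
    As a sketch of how such a program might honor the convention (the function here is hypothetical, not any particular CP/M or MS-DOS utility), the reader simply stops at the first Ctrl+Z byte (0x1A):

    #include <stdio.h>

    /* Print a CP/M-style text file, treating the first Ctrl+Z (0x1A)
       as the logical end of the file and everything after it as padding. */
    void PrintTextFile(const char *name)
    {
        FILE *f = fopen(name, "rb");
        if (f == NULL) return;

        int ch;
        while ((ch = fgetc(f)) != EOF && ch != 0x1A) {
            putchar(ch);
        }

        fclose(f);
    }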

    To retain compatibility with CP/M, MS-DOS carried forward the Ctrl+Z convention. That way, when you transferred your files from your old CP/M machine to your new PC, they wouldn't have garbage at the end.

    Ctrl+Z hasn't been needed for years; MS-DOS records file sizes in bytes rather than sectors. But the convention lingers in the "COPY" command, for example.
  • The Old New Thing

    Why are dialog boxes initially created hidden?

    • 17 Comments

    You may not have noticed it until you looked closely, but dialog boxes are actually created hidden initially, even if you specify WS_VISIBLE in the template. The reason for this is historical.

    Rewind back to the old days (we're talking Windows 1.0): graphics cards are slow, CPUs are slow, and memory is slow. You can pick a menu option that displays a dialog and wait a second or two for the dialog to get loaded off the floppy disk. (Hard drives are for the rich kids.) And then you have to wait for the dialog box to paint.

    To save valuable seconds, dialog boxes are created initially hidden and all typeahead is processed while the dialog stays hidden. Only after the typeahead is finished is the dialog box finally shown. And if you typed far enough ahead and hit Enter, you might even have been able to finish the entire dialog box without it ever being shown! Now that's efficiency.

    Of course, nowadays, programs are stored on hard drives and you can't (normally) out-type a hard drive, so this optimization is largely wasted, but the behavior remains for compatibility reasons.

    Actually this behavior still serves a useful purpose: If the dialog were initially created visible, then the user would be able to see all the controls being created into it, and watch as WM_INITDIALOG ran (changing default values, hiding and showing controls, moving controls around...) This is both ugly and distracting. ("How come the box comes up checked, then suddenly unchecks itself before I can click on it?")
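
    For concreteness, here is a hypothetical dialog procedure sketch (the control IDs are made up for illustration) showing the sort of WM_INITDIALOG work that runs while the dialog is still hidden:

    #include <windows.h>

    #define IDC_OPTION   1001   /* hypothetical checkbox */
    #define IDC_ADVANCED 1002   /* hypothetical control hidden by default */

    INT_PTR CALLBACK SampleDlgProc(HWND hdlg, UINT msg, WPARAM wParam, LPARAM lParam)
    {
        UNREFERENCED_PARAMETER(lParam);
        switch (msg) {
        case WM_INITDIALOG:
            /* The template's default is "checked"; we uncheck it here.
               Because the dialog is not yet visible, the user never sees
               the box flip from checked to unchecked. */
            CheckDlgButton(hdlg, IDC_OPTION, BST_UNCHECKED);
            ShowWindow(GetDlgItem(hdlg, IDC_ADVANCED), SW_HIDE);
            return TRUE;
        case WM_COMMAND:
            if (LOWORD(wParam) == IDOK || LOWORD(wParam) == IDCANCEL) {
                EndDialog(hdlg, LOWORD(wParam));
                return TRUE;
            }
            break;
        }
        return FALSE;
    }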

  • The Old New Thing

    Why do operations on "byte" result in "int"?

    • 38 Comments
    (The following discussion applies equally to C/C++/C#, so I'll use C#, since I talk about it so rarely.)

    People complain that the following code elicits an error:

    byte b = 32;
    byte c = ~b;
    // error CS0029: Cannot implicitly convert type 'int' to 'byte'
    

    "The result of an operation on 'byte' should be another 'byte', not an 'int'," they claim.

    Be careful what you ask for. You might not like it.

    Suppose we lived in a fantasy world where operations on 'byte' resulted in 'byte'.

    byte b = 32;
    byte c = 240;
    int i = b + c; // what is i?
    

    In this fantasy world, the value of i would be 16! Why? Because the two operands to the + operator are both bytes, so the sum "b+c" is computed as a byte, which results in 16 due to integer overflow. (And, as I noted earlier, integer overflow is the new security attack vector.)

    Similarly,

    int j = -b;
    
    would result in j having the value 224 and not -32, for the same reason.

    Is that really what you want?

    Consider the following more subtle scenario:

    struct Results {
     byte Wins;
     byte Games;
    };
    
    bool WinningAverage(Results captain, Results cocaptain)
    {
     return (captain.Wins + cocaptain.Wins) >=
            (captain.Games + cocaptain.Games) / 2;
    }
    

    In our imaginary world, this code would return incorrect results once the total number of games played exceeded 255. To fix it, you would have to insert annoying int casts.

     return ((int)captain.Wins + cocaptain.Wins) >=
            ((int)captain.Games + cocaptain.Games) / 2;
    
    So no matter how you slice it, you're going to have to insert annoying casts. Better to have the language err on the side of safety (forcing you to insert the casts where you know that overflow is not an issue) than to err on the side of silence (where you may not notice the missing casts until your Payroll department asks you why their books don't add up at the end of the month).
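
    Since, as noted at the top, the discussion applies equally to C, here is a minimal C sketch (variable names are just for illustration) of the behavior we actually have: the byte-sized operands are promoted to int before the arithmetic, and getting a byte-sized result back requires an explicit truncation.

    #include <stdio.h>

    int main(void)
    {
        unsigned char b = 32, c = 240;

        int i = b + c;   /* operands promoted to int: 272, not 16 */
        int j = -b;      /* promoted before negation: -32, not 224 */

        /* To get a byte-sized result you must truncate explicitly.
           (C truncates silently with a cast; C# requires the cast.) */
        unsigned char k = (unsigned char)(b + c);   /* 16 */

        printf("%d %d %d\n", i, j, k);   /* prints: 272 -32 16 */
        return 0;
    }
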
  • The Old New Thing

    Defrauding the WHQL driver certification process

    • 81 Comments

    In a comment to one of my earlier entries, someone mentioned a driver that bluescreened under normal conditions, but once you enabled the Driver Verifier (to try to catch the driver doing whatever bad thing it was doing), the problem went away. Another commenter bemoaned that WHQL certification didn't seem to improve the quality of the drivers.

    Video drivers will do anything to outdo their competition. Everybody knows that they cheat benchmarks, for example. I remember one driver that ran the DirectX "3D Tunnel" demonstration program extremely fast, demonstrating how totally awesome their video card is. Except that if you renamed TUNNEL.EXE to FUNNEL.EXE, it ran slow again.

    There was another one that checked if you were printing a specific string used by a popular benchmark program. If so, then it only drew the string a quarter of the time and merely returned without doing anything the other three quarters of the time. Bingo! Their benchmark numbers just quadrupled.

    Anyway, similar shenanigans are not unheard of when submitting a driver to WHQL for certification. Some unscrupulous drivers will detect that they are being run by WHQL and disable various features so they pass certification. Of course, they also run dog slow in the WHQL lab, but that's okay, because WHQL is interested in whether the driver contains any bugs, not whether the driver has the fastest triangle fill rate in the industry.

    The most common cheat I've seen is a driver that checks for a secret "Enable Dubious Optimizations" switch in the registry or some other place external to the driver itself. The vendor takes the driver, puts it in an installer that does not turn the switch on, and submits it to WHQL. When WHQL runs the driver through all its tests, the driver is running in "safe but slow" mode and passes certification with flying colors.

    The vendor then takes that driver (now with the WHQL stamp of approval) and puts it inside an installer that enables the secret "Enable Dubious Optimizations" switch. Now the driver sees the switch enabled and performs all sorts of dubious optimizations, none of which were tested by WHQL.

  • The Old New Thing

    Why are HANDLE return values so inconsistent?

    • 21 Comments

    If you look at the various functions that return HANDLEs, you'll see that some of them return NULL (like CreateThread) and some of them return INVALID_HANDLE_VALUE (like CreateFile). You have to check the documentation to see what each particular function returns on failure.

    Why are the return values so inconsistent?

    The reasons, as you may suspect, are historical.

    The values were chosen to be compatible with 16-bit Windows. The 16-bit functions OpenFile, _lopen and _lcreat return -1 on failure, so the 32-bit CreateFile function returns INVALID_HANDLE_VALUE in order to facilitate porting code from Win16.

    (Armed with this, you can now answer the following trivia question: Why do I call CreateFile when I'm not actually creating a file? Shouldn't it be called OpenFile? Answer: Yes, OpenFile would have been a better name, but that name was already taken.)

    On the other hand, there are no Win16 equivalents for CreateThread or CreateMutex, so they return NULL.

    Since the precedent had now been set for inconsistent return values, whenever a new function got added, it was a bit of a toss-up whether the new function returned NULL or INVALID_HANDLE_VALUE.

    This inconsistency has multiple consequences.

    First, of course, you have to be careful to check the return values properly.

    Second, it means that if you write a generic handle-wrapping class, you have to be mindful of two possible "not a handle" values.

    Third, if you want to pre-initialize a HANDLE variable, you have to initialize it in a manner compatible with the function you intend to use. For example, the following code is wrong:

    HANDLE h = NULL;
    if (UseLogFile()) {
        h = CreateFile(...);
    }
    DoOtherStuff();
    if (h) {
       Log(h);
    }
    DoOtherStuff();
    if (h) {
        CloseHandle(h);
    }
    
    This code has two bugs. First, the return value from CreateFile is checked incorrectly. The code above checks for NULL instead of INVALID_HANDLE_VALUE. Second, the code initializes the h variable incorrectly. Here's the corrected version:

    HANDLE h = INVALID_HANDLE_VALUE;
    if (UseLogFile()) {
        h = CreateFile(...);
    }
    DoOtherStuff();
    if (h != INVALID_HANDLE_VALUE) {
       Log(h);
    }
    DoOtherStuff();
    if (h != INVALID_HANDLE_VALUE) {
        CloseHandle(h);
    }
    

    Fourth, you have to be particularly careful with the INVALID_HANDLE_VALUE value: By coincidence, the value INVALID_HANDLE_VALUE happens to be numerically equal to the pseudohandle returned by GetCurrentProcess(). Many kernel functions accept pseudohandles, so if you mess up and accidentally call, say, WaitForSingleObject on a failed INVALID_HANDLE_VALUE handle, you will actually end up waiting on your own process. This wait will, of course, never complete, because a process is signaled when it exits, so you end up waiting for yourself.
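
    Returning to the handle-wrapping point above: a wrapper has to treat both values as "not a handle." Here is a minimal sketch (these helper functions are hypothetical, not part of the Win32 API):

    #include <windows.h>

    /* Treat both failure values as "no handle". */
    BOOL IsUsableHandle(HANDLE h)
    {
        return h != NULL && h != INVALID_HANDLE_VALUE;
    }

    void CloseHandleIfUsable(HANDLE h)
    {
        if (IsUsableHandle(h)) {
            CloseHandle(h);
        }
    }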

  • The Old New Thing

    Why 16-bit DOS and Windows are still with us

    • 65 Comments
    Many people are calling for the abandonment of 16-bit DOS and 16-bit Windows compatibility subsystems. And trust me, when it comes time to pull the plug, I'll be fighting to be the one to throw the lever. (How's that for a mixed metaphor?)

    But that time is not yet here.

    You see, folks over in the Setup and Deployment group have gone and visited companies around the world, learned how they use Windows in their businesses, and one thing keeps showing up, as it relates to these compatibility subsystems:

    Companies still rely on them. Heavily.

    Every company has its own collection of Line of Business (LOB) applications. These are programs that the company uses for its day-to-day business, programs the company simply cannot live without. For example, at Microsoft two of our critical LOB applications are our defect tracking system and our source control system.

    And like Microsoft's defect tracking system and source control system, many of the LOB applications at major corporations are not commercially available software; they are internally developed software, tailored to the way that company works, and treated as trade secrets. At a financial services company, the trend analysis and prediction software is what makes the company different from all its competitors.

    The LOB application is the deal-breaker. If a Windows upgrade breaks a LOB application, it's game over. No upgrade. No company is going to lose a program that is critical to their business.

    And it happens that a lot of these LOB applications are 16-bit programs. Some are DOS. Some are 16-bit Windows programs written in some ancient version of Visual Basic.

    "Well, tell them to port the programs to Win32."

    Easier said than done.

    • Why would a company go to all the effort of porting a program when the current version still works fine? If it ain't broke, don't fix it.
    • The port would have to be debugged and field-tested in parallel with the existing system. The existing system is probably ten years old. All its quirks are well-understood. It survived that time in 1998 when there was a supply chain breakdown and when production finally got back online, they had to run at triple capacity for a month to catch up. The new system hasn't been stress-tested. Who knows whether it will handle these emergencies as well as the last system.
    • Converting it from a DOS program to a Windows program would incur massive retraining costs for its employees ("I have always used F4 to submit a purchase order. Now I have this toolbar with a bunch of strange pictures, and I have to learn what they all mean." Imagine if somebody took away your current editor and gave you a new one with different keybindings. "But the new one is better.")
    • Often the companies don't have the source code to the programs any more, so they couldn't port it if they wanted to. It may use a third-party VB control from a company that has since gone out of business. It may use a custom piece of hardware that they have only 16-bit drivers for. And even if they did have the source code, the author of the program may no longer work at the company. In the case of a missing driver, there may be nobody at the company qualified to write a 32-bit Windows driver. (I know one company that used foot-pedals to control their software.)

    Perhaps with a big enough carrot, these companies could be convinced to undertake the effort (and risk!) of porting (or in the case of lost source code and/or expertise, rewriting from scratch) their LOB applications.

    But it'll have to be a really big carrot.

    Real example: Just this past weekend I was visiting a friend who lived in a very nice, professionally-managed apartment complex. We had occasion to go to the office, and I caught a glimpse of their computer screen. The operating system was Windows XP. And the program they were running to do their apartment management? It was running in a DOS box.

  • The Old New Thing

    Why do timestamps change when I copy files to a floppy?

    • 17 Comments
    Floppy disks use the FAT filesystem, as do DOS-based and Windows 95-based operating systems. On the other hand, Windows NT-based systems (Windows 2000, XP, 2003, ...) tend to use the NTFS filesystem. (Although you can format a drive as FAT on Windows NT-based systems, it is not the default option.)

    The NTFS and FAT filesystems store times and dates differently. Note, for example, that FAT records last-write time only to two-second accuracy. So if you copy a file from NTFS to FAT, the last-write time can change by as much as two seconds.
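
    You can see the two-second granularity directly by round-tripping a timestamp through the FAT (DOS) date/time format. A minimal Win32 C sketch (purely illustrative, not how the copy itself is implemented):

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        FILETIME now, roundTripped;
        SYSTEMTIME before, after;
        WORD fatDate, fatTime;

        GetSystemTimeAsFileTime(&now);
        if (!FileTimeToDosDateTime(&now, &fatDate, &fatTime)) return 1;
        if (!DosDateTimeToFileTime(fatDate, fatTime, &roundTripped)) return 1;

        FileTimeToSystemTime(&now, &before);
        FileTimeToSystemTime(&roundTripped, &after);

        /* FAT stores seconds in two-second units, so the round-tripped
           time can differ from the original by up to two seconds. */
        printf("original: %02d.%03ds  after FAT round trip: %02d.%03ds\n",
               before.wSecond, before.wMilliseconds,
               after.wSecond, after.wMilliseconds);
        return 0;
    }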

    Why is FAT so much lamer than NTFS? Because FAT was invented in 1977, back before people were worried about such piddling things as time zones, much less Unicode. And it was still a major improvement over CP/M, which didn't have timestamps at all.

    It is also valuable to read and understand the consequences of FAT storing filetimes in local time, compared to NTFS storing filetimes in UTC. In addition to the Daylight Savings time problems, you also will notice that the timestamp will appear to change if you take a floppy across timezones. Create a file at, say, 9am Pacific time, on a floppy disk. Now move the floppy disk to Mountain time. The file was created at 10am Mountain time, but if you look at the disk it will still say 9am, which corresponds to 8am Pacific time. The file travelled backwards in time one hour. (In other words, the timestamp failed to change when it should.)
