July, 2011

  • The Old New Thing

    What is that horrible grinding noise coming from my floppy disk drive?

    • 41 Comments

    Wait, what's a floppy disk drive?

    For those youngsters out there, floppy disks are where we stored data before the invention of the USB earring. A single floppy disk could hold up to two seconds of CD-quality audio. This may not sound like a lot, but it was in fact pretty darned awesome, because CDs hadn't been invented yet either.

    Anyway, if you had a dodgy floppy disk (say, because you decided to fold it in half), you often heard a clattering sound from the floppy disk drive as it tried to salvage what data it could from the disk. What is that sound?

    That sound is recalibration.

    The floppy disk driver software kept getting errors back from the drive saying "I can't find any good data." The driver figures, "Hm, maybe the problem is that the drive head is not positioned where I think it is." You see, floppy drives do not report the actual head position; you have to infer it by taking the number of "move towards the center" commands you have issued and subtracting the number of "move towards the edge" commands. The actual location of the drive head could differ from your calculations due to an error in the driver, or it could just be that small physical errors have accumulated over time, resulting in a significant difference between the theoretical and actual positions. (In the same way that if you tell somebody to step forward ten steps, then backward ten steps, they probably won't end up exactly where they started.)

    To get the logical and physical positions back in sync, the driver does what it can to get the drive head to a known location. It tells the hardware, "Move the drive head one step toward the edge of the disk. Okay, take another step. One more time. Actually, 80 more times." Eventually, the drive head reaches the physical maximum position, and each time the driver tells the hardware to move the head one more step outward, it just bumps against the physical boundary of the drive hardware and makes a click sound. If you issue at least as many "one more step outward" commands as there are steps from the innermost point of the disk to the edge, then the theory is that at the end of the operation, the head is in fact at track zero. At that point, you can set your internal "where is the drive head?" variable to zero and restart the original operation, this time with greater confidence that the drive head is where you think it is.

    The amount of clattering depends on where the drive head was when the operation began. If the drive head were around track 40, then the first 40 requests to move one step closer to the edge would do exactly that, and the next 43 requests would make a clicking noise. On the other hand, if the drive head were closer to track zero already, then nearly all of the requests result in the drive head bumping against the physical boundary of the drive hardware, and you get a longer, noisier clicking or grinding sound.
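    The recalibration logic described above can be sketched in a few lines. (This is a hypothetical simulation for illustration, not actual driver code; the function name is made up, and the step count of 83 follows the article's example.)

```python
def recalibrate(actual_head_position, steps=83):
    """Issue `steps` step-outward commands; return (clicks, final_position).

    Each command moves the head one track toward track zero (the edge).
    Once the head is already at track zero, further commands just bump
    the head against the physical stop, producing a click.
    """
    head = actual_head_position
    clicks = 0
    for _ in range(steps):
        if head > 0:
            head -= 1    # head actually moves one track outward
        else:
            clicks += 1  # head bumps the physical stop: *click*
    return clicks, head

# Head believed lost, actually around track 40: 40 real moves, 43 clicks.
print(recalibrate(40))   # (43, 0)
```

    After the loop, the driver can safely set its "where is the drive head?" variable to zero, since the head is pinned against the track-zero stop no matter where it started.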

    You can hear the recalibration at the start of this performance.

    Bonus floppy drive music.

    Bonus reading: Tim Paterson, author of DOS, discusses all those floppy disk formats.

  • The Old New Thing

    Be careful when redirecting both a process's stdin and stdout to pipes, for you can easily deadlock

    • 18 Comments

    A common problem when people create a process and redirect both stdin and stdout to pipes is that they fail to keep the pipes flowing. Once a pipe clogs, the disturbance propagates backward until everything clogs up.

    Here is a common error, in pseudocode:

    // Create pipes for stdin and stdout
    CreatePipe(&hReadStdin, &hWriteStdin, NULL, 0);
    CreatePipe(&hReadStdout, &hWriteStdout, NULL, 0);
    
    // hook them up to the process we're about to create
    startup_info.hStdOutput = hWriteStdout;
    startup_info.hStdInput = hReadStdin;
    
    // create the process
    CreateProcess(...);
    
    // write something to the process's stdin
    WriteFile(hWriteStdin, ...);
    
    // read the result from the process's stdout
    ReadFile(hReadStdout, ...);
    

    You see code like this all over the place. I want to generate some input to a program and capture the output, so I pump the input as the process's stdin and read the output from the process's stdout. What could possibly go wrong?

    This problem is well-known to unix programmers, but it seems that the knowledge hasn't migrated to Win32 programmers. (Or .NET programmers, who also encounter this problem.)

    Recall how anonymous pipes work. (Actually, this description is oversimplified, but it gets the point across.) A pipe is a marketplace for a single commodity: Bytes in the pipe. If there is somebody selling bytes (Write­File), the seller waits until there is a buyer (Read­File). If there is somebody looking to buy bytes, then the buyer waits until there is a seller.

    In other words, when somebody writes to a pipe, the call to Write­File waits until somebody issues a Read­File. Conversely, when somebody reads from a pipe, the call to Read­File waits until somebody calls Write­File. When there is a matching read and write, the bytes are transferred from the writer's buffer to the reader's buffer. If the reader asks for fewer bytes than the writer provided, then the writer continues waiting until all the bytes have been read. (On the other hand, if the writer provides fewer bytes than the reader requested, the reader is given a partial read. Yes, there's asymmetry there.)
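    You can observe the partial-read asymmetry directly with an anonymous OS pipe. (A Python sketch; the small write completes immediately here thanks to the buffering discussed below, but the partial read behaves just as described.)

```python
import os

# The reader asks for more bytes than are available and simply
# gets a partial read rather than waiting for the rest.
r, w = os.pipe()
os.write(w, b"AB")      # writer offers two bytes
data = os.read(r, 10)   # reader asks for up to ten
print(data)             # b'AB' -- a partial read, not an error
os.close(r)
os.close(w)
```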

    Okay, so where's the deadlock in the above code fragment? We write some data into one pipe (connected to a process's stdin) and then read from another pipe (connected to a process's stdout). For example, the program might take some input, do some transformation on it, and print the result to stdout. Consider:

    You                                         Helper
    WriteFile(stdin, "AB")
    (waits for reader)
                                                ReadFile(stdin, ch)
                                                reads A
    (still waiting since not all data read)
                                                encounters errors
                                                WriteFile(stdout, "Error: Widget unavailable\r\n")
                                                (waits for reader)
    And now we're deadlocked. Your process is waiting for the helper process to finish reading all the data you wrote (specifically, waiting for it to read B), and the helper process is waiting for your process to finish reading the data it wrote to its stdout (specifically, waiting for you to read the error message).

    There's a feature of pipes that can mask this problem for a long time: Buffering.

    The pipe manager might decide that when somebody offers some bytes for sale, instead of making the writer wait for a reader to arrive, the pipe manager will be a market-maker and buy the bytes himself. The writer is then unblocked and permitted to continue execution. Meanwhile, when a reader finally arrives, the request is satisfied from the stash of bytes the pipe manager had previously bought. (But the pipe manager doesn't take a 10% cut.)

    Therefore, the error case above happens to work, because the buffering has masked the problem:

    You                                         Helper
    WriteFile(stdin, "AB")
    pipe manager accepts the write
    ReadFile(stdout, result)
    (waits for writer)
                                                ReadFile(stdin, ch)
                                                reads A
                                                encounters errors
                                                WriteFile(stdout, "Error: Widget unavailable\r\n")
    Read completes

    As long as the amount of unread data in the pipe is within the budget of the pipe manager, the deadlock is temporarily avoided. Of course, that just means it will show up later under harder-to-debug situations. (For example, if the program you are driving prints a prompt for each line of input, then the problem won't show up until you give the program a large input data set: For small data sets, all the prompts will fit in the pipe buffer, but once you hit the magic number, the program hangs because the pipe is waiting for you to drain all those prompts.)
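    You can measure the pipe manager's budget for yourself by putting a pipe's write end into non-blocking mode and counting how many bytes the system accepts before a blocking writer would have had to wait. (A Python sketch; the exact figure is OS-dependent, commonly 64KB on Linux, and has nothing to do with the Win32 pipe sizes your program will actually encounter.)

```python
import os

r, w = os.pipe()
os.set_blocking(w, False)   # a full pipe now raises instead of blocking
total = 0
try:
    while True:
        total += os.write(w, b"x" * 4096)
except BlockingIOError:
    pass                    # buffer full: a blocking writer would now wait

print("pipe buffered", total, "bytes")
os.close(r)
os.close(w)
```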

    To avoid this problem, your program needs to keep reading from stdout while it's writing to stdin, so that neither will block the other. The easiest way to do this is to perform the two operations on separate threads.
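    Here is one way the two-thread fix can look, sketched in Python with subprocess. (The upper-casing child program is a stand-in for whatever helper process you are actually driving; the input is made large enough that writing it all before reading anything would certainly deadlock.)

```python
import subprocess
import sys
import threading

# A writer thread feeds the child's stdin while the main thread drains
# its stdout, so neither pipe can clog and block the other side.
child = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stdout.write(sys.stdin.read().upper())"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)

def feed():
    child.stdin.write(b"a" * 1_000_000)  # far more than any pipe buffer
    child.stdin.close()                  # signal end of input

writer = threading.Thread(target=feed)
writer.start()
output = child.stdout.read()             # main thread keeps stdout draining
writer.join()
child.wait()
print(len(output))   # 1000000
```

    In Python, Popen.communicate() packages exactly this pattern for you; in .NET, Process.BeginOutputReadLine serves a similar purpose by draining stdout asynchronously.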

    Next time, another common problem with pipes.

    Exercise: A customer reported that this function would sometimes hang waiting for the process to exit. Discuss.

    int RunCommand(string command, string commandParams)
    {
     var info = new ProcessStartInfo(command, commandParams);
     info.UseShellExecute = false;
     info.RedirectStandardOutput = true;
     info.RedirectStandardError = true;
     var process = Process.Start(info);
     while (!process.HasExited) Thread.Sleep(1000);
     return process.ExitCode;
    }
    

    Exercise: Based on your answer to the previous exercise, the customer responds, "I added the following code, but the problem persists." Discuss.

    int RunCommand(string command, string commandParams)
    {
     var info = new ProcessStartInfo(command, commandParams);
     info.UseShellExecute = false;
     info.RedirectStandardOutput = true;
     info.RedirectStandardError = true;
     var process = Process.Start(info);
     var reader = process.StandardOutput;
     var results = new StringBuilder();
     string lineOut;
     while ((lineOut = reader.ReadLine()) != null) {
      results.AppendLine("STDOUT: " + lineOut);
     }
     reader = process.StandardError;
     while ((lineOut = reader.ReadLine()) != null) {
      results.AppendLine("STDERR: " + lineOut);
     }
     while (!process.HasExited) Thread.Sleep(1000);
     return process.ExitCode;
    }
    
  • The Old New Thing

    Hey, let's report errors only when nothing is at stake!

    • 37 Comments

    Only an idiot would have parameter validation, and only an idiot would not have it. In an attempt to resolve this paradox, commenter Gabe suggested, "When running for your QA department, it should crash immediately; when running for your customer, it should silently keep going." A similar opinion was expressed by commenter Koro and some others.

    This replaces one paradox with another. Under the new regime, your program reports errors only when nothing is at stake. "Report problems when running in test mode, and ignore problems when running on live data." Isn't this backwards? Shouldn't we be more sensitive to problems with live data than problems with test data? Who cares if test data gets corrupted? That's why it's test data. But live data—we should get really concerned when there's a problem with live data. Allowing execution to continue means that you're attempting to reason about a total breakdown of normal functioning.

    Now, if your program is mission-critical, you probably have some recovery code that attempts to reset your data structures to a "last known good" state or which attempts to salvage what information it can, like how those space probes have a safe mode. And that's great. But silently ignoring the condition means that your program is going to skip happily along, unaware that what it's doing is probably taking a bad situation and subtly making it slightly worse. Eventually, things will get so bad that something catastrophic happens, and when you go to debug the catastrophic failure, you'll have no idea how it got that way.

  • The Old New Thing

    How is it possible to run Wordpad by just typing its name even though it isn't on the PATH?

    • 29 Comments

    In a comment completely unrelated to the topic, Chris Capel asks how Wordpad manages to run when you type its name into the Run dialog even though the command prompt can't find it. In other words, the Run dialog manages to find Wordpad even though it's not on the PATH.

    Chris was unable to find anywhere I discussed this issue earlier, but it's there, just with Internet Explorer as the application instead of Wordpad.

    It's through the magic of App Paths.

    App Paths was introduced in Windows 95 to address the path pollution problem. Prior to the introduction of App Paths, typing the name of a program without a fully-qualified path resulted in a search along the path, and if it wasn't found, then that was the end of that. File not found. As a result, it became common practice for programs, as part of their installation, to edit the user's AUTOEXEC.BAT and add the application's installation directory to the path.

    This had a few problems.

    First of all, editing AUTOEXEC.BAT is decidedly nontrivial, since batch files can have control flow logic like IF and CALL and GOTO. Finding the right SET PATH=... or PATH ... command is an exercise in code coverage analysis, especially since MS-DOS 6 added multi-config support to CONFIG.SYS, so the value of the CONFIG environment variable is determined at runtime. If you wanted to avoid hanging your setup program, you would have to solve the Halting Problem. (You can't just stick a PATH ... command at the beginning because it might get wiped out by a later PATH command, and you can't just stick it at the end, because control might never reach the last line of the batch file.)

    And of course, very few uninstall programs would take the time to undo the edits the installer performed, and even if they tried, there's no guarantee that the undo would be successful, since the user (or another installer!) may have edited the AUTOEXEC.BAT file in the meantime.

    Even if you postulate the existence of the AUTOEXEC.BAT editing fairy who magically edits your AUTOEXEC.BAT for you, you still run into the PATH length limit. The maximum length of a command line was 128 characters in MS-DOS, and if each program added itself to the PATH, it wouldn't be long before the PATH reached its maximum length.

    Pre-emptive Yuhong Bao irrelevant detail that has no effect on the story: Windows 95 increased the maximum command line length, but the program being launched needed to know where to look for the "long command line". And that didn't help existing installers which were written against the old 128-character limit. Give them an AUTOEXEC.BAT with a line longer than 128 characters and you had a good chance that you'd hit a buffer overflow bug.

    On top of the difficulty of adding more directories to the PATH, there was the recognition that this was another case of using a global setting to solve a local problem. It seemed wasteful to add a directory to the path just so you could find one file. Each additional directory on the path slowed down path searching operations, even the ones unrelated to locating that one program.

    Enter App Paths. The idea here is that instead of adding your application directory to the path, you just create an entry under the App Paths key saying, "If somebody is looking to execute contoso.exe, I put it over here." Instead of adding an entire directory to the path, you just add a single file, and it's used only for application execution purposes, so it doesn't slow down other path search operations like loading DLLs.

    (Note that the old documentation on App Paths has been superseded by the new documentation linked above.)

    Now that there was a place to store information associated with a particular application, you may as well use it for other stuff as well. A secondary source of path pollution came from applications which added not only the application directory to the path, but also a helper directory where the application kept its DLLs. To address this, an additional Path value specified which directories your application wanted to be added to the path before it was executed. Over time, additional attributes were added to the App Paths key, such as the UseUrl value we saw some time ago.
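    For example, a hypothetical registration for contoso.exe might look like this as a .reg file. (The Contoso paths are made up for illustration; the key itself is the documented App Paths location, and the default value gives the full path to the executable while the optional Path value lists the directories to prepend before launch.)

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\App Paths\contoso.exe]
@="C:\\Program Files\\Contoso\\contoso.exe"
"Path"="C:\\Program Files\\Contoso;C:\\Program Files\\Contoso\\DLLs"
```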

    When you type the name of a program into the Run dialog (with no path), the Shell­Execute function checks if the name corresponds to an application registered under App Paths. If so, then it uses the registration information to launch the application. Hooray, applications can be run by just typing their name without requiring them to modify the global path.

    Note that this extra lookup is performed only by the ShellExecute family of functions, so if you use CreateProcess or SearchPath, you'll still get ERROR_FILE_NOT_FOUND.

    Now, the intent was that the registered full path to the application is the same as the registered short name, just with a full path in front. For example, wordpad.exe registers the full path of %ProgramFiles%\Windows NT\Accessories\WORDPAD.EXE. But there's no check that the two file names match. The Pbrush folks took advantage of this by registering an application path entry for pbrush.exe with a full path of %SystemRoot%\System32\mspaint.exe: That way, when somebody types pbrush into the Run dialog, they get redirected to mspaint.exe.

    Sneaky.

  • The Old New Thing

    The danger of making the chk build stricter is that nobody will run it

    • 31 Comments

    Our old pal Norman Diamond suggested that Windows should go beyond merely detecting dubious behavior on debug builds and should kill the application when it misbehaves.

    The thing is, if you make an operating system so strict that the slightest misstep results in lightning bolts from the sky, then nobody would run it.

    Back in the days of 16-bit Windows, as today, there were two builds, the so-called retail build, which had assertions disabled, and the so-called debug build, which had assertions enabled and broke into the debugger if an application did something suspicious. (This is similar to today's terms checked and free.)

    Now, the Windows development team is big on self-hosting. After all, if you are writing the operating system, you should be running it, too. What's more, it was common to self-host the debug version of the operating system, since that's the one with the extra checks and assertions that help you flush out the bugs.

    As it happens, the defect tracking system we used back in the day triggered a lot of these assertions. As I recall, refreshing a query resulted in about 50 parameter validation errors caught and reported by Windows. This made using the defect tracking system very cumbersome because you had to babysit the debugger and hit "i" (for ignore) 50 times each time you refreshed a query.

    (As I noted in my talk at Reflections|Projections 2009, the great thing about defect tracking systems is that you will hate every single one you use. Sure, the new defect tracking system may have some new features and be easier to use and run faster, but all that does is delay the point at which you begin hating it.)

    If Windows had taken the stance that the slightest error resulted in the death of the application, then it would have been impossible for a member of the Windows development team to run the defect tracking system program itself, because once it hit the first of those 50 parameter validation error reports, the program would have been killed, and the defect tracking system would have been rendered useless.

    Remember, don't change program semantics in the debug build. That just creates Heisenbugs.

    I remember that at one point the Windows team asked the people who supported the defect tracking system, "Hey, your program has a lot of problems that are being reported by the Windows debug build. Can you take a look at it?"

    The response from the defect tracking system support team was somewhat ironic: "Sorry, we don't support running the defect tracking system on a debug build of Windows. We found that the debug version of Windows breaks into the debugger too much."

  • The Old New Thing

    Some mailing lists come with a negative service level agreement, but that's okay, because everybody is in on the joke

    • 14 Comments

    As I noted some time ago, there's a mailing list devoted to chatting among people who work in a particular cluster of buildings. It's not a technical support mailing list, but people will often ask a technical question on the off chance that somebody can help, in the same way that you might ask your friends for some help with something.

    Of course, one consequence of this is that the quality of the responses is highly variable. While there's a good chance that somebody will help you with your problem, there's also a good chance that a technical question will receive a highly unhelpful response just for fun, in the same way your friend might respond to a question with a funny but unhelpful answer. (And there's also a good chance that a technical question will get both types of replies.) You don't complain about this because, well, that's what you sort of expect when you use this mailing list.

    An illustration of this principle comes from the following thread:

    When I do ABC, I get XYZ. How do I get DEF?

    A short while later, the same person replied to his own question.

    Nevermind. I found the right mailing list to ask this question.

    That didn't stop somebody from responding:

    This mailing list is the correct place to send all questions. You have to use a different mailing list to get answers, though.
  • The Old New Thing

    Why is secur32.dll called secur32.dll and not secure32.dll?

    • 22 Comments

    Many years ago, in a discussion of why you shouldn't name your DLL "security.dll", I dug a bit into the history behind the DLL. Here are some other useless tidbits about that file.

    Originally, there were two DLLs called security.dll. One was the 32-bit version and one was the 16-bit version. They could coexist because the 32-bit version was in the system32 directory and the 16-bit version was in the system directory.

    And then Windows 95 showed up and screwed up everything.

    Windows 95 did not have separate system32 and system directories. All the system files, both 16-bit and 32-bit, were lumped together in a single system directory. When the Security Support Provider Interface was ported to Windows 95, this created a problem, for it would require two files in the same directory to have the same name. Since the 16-bit version had seniority (because your Windows 95 installation may have been an upgrade install over Windows 3.1, which would have been the 16-bit version), it was the 32-bit version that had to be renamed.

    Okay, so why secur32.dll? Well, security32.dll was too long, since it exceeded the classic 8.3 limit, and Windows NT supported being run on FAT volumes (which necessarily did not support long file names, since long file names on FAT didn't exist until Windows 95). Okay, but why secur32.dll instead of secure32.dll, which still fits inside the 8.3 constraints?

    Nobody knows for sure any more; the person who chose the name left Microsoft a long time ago. Perhaps because secur32.dll looked better than securi32.dll. Maybe he couldn't count.

  • The Old New Thing

    Looking at the problem at the wrong level: Closing a process's stdin

    • 11 Comments

    A customer was having trouble manipulating the stdin stream that was given to a process.

    How do you simulate sending Ctrl+Z to a hidden console process programmatically?

    I am using Redirect­Standard­Input and want to send the console a Ctrl+Z. I've tried sending ASCII code 26, but that doesn't work. Generate­Console­Ctrl­Event supports Ctrl+C and Ctrl+Break but not Ctrl+Z.

    Here's what I'm doing, but it doesn't work:

    ProcessStartInfo info = new ProcessStartInfo(@"...");
    info.CreateNoWindow = true;
    info.RedirectStandardError = true;
    info.RedirectStandardOutput = true;
    info.RedirectStandardInput = true;
    info.UseShellExecute = false;
    Process p = Process.Start(info);
    // 0x1A is ASCII code of Ctrl+Z but it does not work
    p.StandardInput.WriteLine("\x1A");
    

    The customer was kind enough to do more than simply ask the question. The customer set up the scenario and even provided a code fragment that illustrates the problem. Which is good, because the original question was the wrong question.

    The customer asked about simulating typing Ctrl+Z to a console, but what they were actually doing was sending a character to stdin; they weren't sending it to a console. In fact, the way they created the process, there is no console at all.

    The customer confused stdin with consoles. It's true that Ctrl+Z is the convention used by console windows to indicate that stdin should be closed. But that is hardly any consolation when you took control of stdin yourself and are not using a console window to manage it.

    It's like saying, "Normally, when I want somebody to take my order, I pull into a parking space and turn on my headlights, and somebody will come out. But I can't get it to work."

    Um, that's because you pulled into your own driveway.

    Ctrl+Z is a convention used by console windows to indicate that stdin should be closed, but if you said "I am going to manage stdin myself," then you aren't using a console window, and that convention carries no weight. If you write a Ctrl+Z to the process's stdin, it will simply read a Ctrl+Z. But since you are managing stdin yourself, you can do it yourself: Just take the stream you set as the process's stdin and close it.
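    In Python terms, the fix looks like this. (The child program is a stand-in that reads its stdin to the end; closing the pipe is what tells it the input is finished.)

```python
import subprocess
import sys

# When you manage the child's stdin yourself, "end of input" is signaled
# by closing the pipe, not by writing a Ctrl+Z character into it.
child = subprocess.Popen(
    [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)

child.stdin.write(b"hello")
child.stdin.close()               # the moral equivalent of typing Ctrl+Z
result = child.stdout.read().strip()
child.wait()
print(result)   # b'5'
```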

    Exercise: Perhaps you can answer this related question from a different customer:

    I am trying to send a Ctrl+C (SIGINT) to a process.

    CurrentProcess = new Process();
    CurrentProcess.StartInfo.FileName = "foo.exe";
    CurrentProcess.StartInfo.UseShellExecute = false;
    CurrentProcess.StartInfo.RedirectStandardInput = true;
    StandardInputWriter = CurrentProcess.StandardInput;
    char c = '\u0003';
    StandardInputWriter.Write(c);
    StandardInputWriter.Flush();
    StandardInputWriter.Close();
    

    If I launch the process from a command prompt and type Ctrl+C, it flushes its output and terminates, but when I start it from within my application and send it a Ctrl+C via the code above, the process is still running. How do I send a Ctrl+C to a process?

  • The Old New Thing

    At least it'll be easy to write up the security violation report

    • 37 Comments

    Many years ago, Microsoft instituted a new security policy at the main campus: all employees must visibly wear their identification badge, even when working in their office. As is customary with nearly all new security policies, it was met with resistance.

    One of my colleagues was working late, and his concentration was interrupted by a member of the corporate security staff at his door.

    Sir, can I see your cardkey?

    My colleague was not in a good mood (I guess it was a nasty bug), so he curtly replied, "No. I'm busy."

    Sir, you have to show me your cardkey. It's part of the new security policy.

    "I told you, I'm busy."

    Sir, if you don't show me your cardkey, I will have to write you up.

    "Go ahead, if it'll get you out of my office."

    All right, then. What's your name?

    Without even looking from his screen, my colleague replied impatiently, "It's printed on the door."

    The policy was rescinded a few weeks later.

  • The Old New Thing

    Photoshop meme: Mark Reynolds casually eating sunflower seeds

    • 3 Comments

    July 7, 2011: David Ortiz hits a home run against the Baltimore Orioles. As he rounds third base, Orioles third baseman Mark Reynolds casually finishes off a package of sunflower seeds (photo 6 in the slide show).

    An Internet meme is born.

    Follow the hilarity.
