• The Old New Thing

    If you want to be notified when your app is uninstalled, you can do that from your uninstaller

    A customer had a rather strange request. "Is there a way to be notified when the user uninstalls any program from Programs and Features (formerly known as Add and Remove Programs)?"

    They didn't explain what they wanted to do this for, and we immediately got suspicious. It sounds like the customer is trying to do something user-hostile, like seeing that a user uninstalled a program and immediately reinstalling it. (Sort of the reverse of force-uninstalling all your competitors.)

    The customer failed to take into account that there are many ways of uninstalling an application that do not involve navigating to the Programs and Features control panel. Therefore, any solution that monitors the activities of Programs and Features may not actually solve the customer's problem.

    The customer liaison went back to the customer to get more information about their problem scenario, and the response was that the customer is developing something like an App Lending Library. The user goes to the Lending Library and installs an application. They want a way to figure out when the user uninstalls the application so that the software can be "checked back in" to the library (available for somebody else to use).

    The customer was asking a question far harder than the one they needed answered. They didn't need to be notified if the user uninstalled any application from the Programs and Features control panel. They merely needed to be notified if the user uninstalled one of their own applications from the Programs and Features control panel.

    And that is much easier to solve.

    After all, when an application is installed, it registers a command line to execute when the user clicks the Uninstall button. You can set that command line to do anything you want. For example, you can set it to

    Uninstall­String = "C:\Program Files\Contoso Lending Library\CheckIn.exe" ⟨identification⟩

    where ⟨identification⟩ is something that the Check­In program can use to know what program is being uninstalled, so that it can launch the real uninstaller and update the central database.
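    A minimal sketch of what such a check-in program might do, written in JavaScript purely for illustration. Everything here is hypothetical: the catalog lookup table, the runUninstaller and markReturned callbacks, and the decision to check the program back in only when the real uninstaller reports success are assumptions, not details the customer described.

```javascript
// Hypothetical core of the check-in program. The side effects are passed in
// as callbacks so the flow is easy to follow (and to test) in isolation.
//   identification - the token from the uninstall command line
//   catalog        - maps a token to the real uninstall command (assumed)
//   runUninstaller - runs a command line and returns its exit code (assumed)
//   markReturned   - tells the central database the copy is free (assumed)
function checkIn(identification, catalog, runUninstaller, markReturned) {
  if (!Object.prototype.hasOwnProperty.call(catalog, identification)) {
    throw new Error("Unknown program: " + identification);
  }
  const realCommand = catalog[identification];

  // Launch the real uninstaller first...
  const exitCode = runUninstaller(realCommand);

  // ...and check the program back in only if it reported success
  // (assuming exit code 0 means success).
  if (exitCode === 0) {
    markReturned(identification);
  }
  return exitCode;
}

// Example with stand-in callbacks that only record what they were asked to do.
const checkInLog = [];
const checkInResult = checkIn(
  "contoso-paint",
  { "contoso-paint": "\"C:\\Program Files\\Contoso Paint\\uninst.exe\"" },
  (command) => { checkInLog.push("run " + command); return 0; },
  (id) => { checkInLog.push("returned " + id); }
);
console.log(checkInResult, checkInLog);
```

    Running the real uninstaller before updating the database means a cancelled or failed uninstall does not free up a copy that is still installed.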

  • The Old New Thing

    Did the Windows 95 interface have a code name?

    Commenter kinokijuf wonders whether the Windows 95 interface had a code name.

    Nope.

    We called it "the new shell" while it was under preliminary development, and when it got enabled in the builds, we just called it "the shell."

    (Explorer originally was named Cabinet, unrelated to the container file format of the same name. This original name lingers in the window class: CabinetWClass.)

  • The Old New Thing

    Finding the shortest path to the ground while avoiding obstacles

    Today's Little Program solves the following problem:

    Consider a two-dimensional board, tall and narrow. Into the board are nailed a number of horizontal obstacles. Place a water faucet at the top of the board and turn it on. The water will dribble down, and when it hits an obstacle, some of the water will go left and some will go right. The goal is to find the shortest path to the ground from a given starting position, counting both horizontal and vertical distance traveled.

    In the above diagram, the water falls three units of distance until it encounters Obstacle 1, at which point some goes to the left and some goes to the right. The water that goes to the left travels three units of distance before it reaches the end of the obstacle, then falls three units and encounters Obstacle 2. Upon reaching Obstacle 2, the water can again choose to flow either left or right. The water that flows to the left falls to the ground; the water that flows to the right falls and encounters a third obstacle. From the third obstacle, the water can flow left or right, and either way it goes, it falls to the ground. On the other hand, the water that chose to flow to the right when it encountered Obstacle 1 would fall past Obstacle 2 (which is not in a position to intercept the water) and land directly on Obstacle 3.

    In the above scenario, there are five paths to the ground.

    • From Obstacle 1, flow left, then from Obstacle 2, flow left again. Total distance traveled: 17 units.
    • From Obstacle 1, flow left, then from Obstacle 2, flow right, then from Obstacle 3, flow left. Total distance traveled: 18 units.
    • From Obstacle 1, flow left, then from Obstacle 2, flow right, then from Obstacle 3, flow right. Total distance traveled: 20 units.
    • From Obstacle 1, flow right, then from Obstacle 3, flow left. Total distance traveled: 16 units.
    • From Obstacle 1, flow right, then from Obstacle 3, flow right. Total distance traveled: 14 units.

    In this case, the shortest path to the ground is the last path.

    There are many ways to attack this problem. The brute force solution would be to enumerate all the possible paths to the ground, then pick the shortest one.
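    As a rough illustration of the brute force approach, here is a sketch that enumerates every path recursively. It is not the program developed later in this article: the function name enumerateCosts is made up for this example, and it assumes the obstacles are supplied in order of decreasing altitude as plain objects with left, right, and y fields.

```javascript
// Returns the total cost of every possible path from (x, y) to the ground.
// Assumes obstacles are sorted by decreasing altitude; i is the index of the
// first obstacle that has not been passed yet.
function enumerateCosts(x, y, obstacles, i = 0, cost = 0) {
  for (; i < obstacles.length; i++) {
    const o = obstacles[i];
    // The droplet is intercepted only if it is strictly inside the obstacle.
    if (o.left < x && x < o.right) {
      const fall = y - o.y;
      return [
        // Flow left to the edge, then continue from there.
        ...enumerateCosts(o.left, o.y, obstacles, i + 1, cost + fall + (x - o.left)),
        // Flow right to the edge, then continue from there.
        ...enumerateCosts(o.right, o.y, obstacles, i + 1, cost + fall + (o.right - x)),
      ];
    }
  }
  // No remaining obstacle intercepts the droplet: fall to the ground.
  return [cost + y];
}

const bruteForceCosts = enumerateCosts(6, 12, [
  { left: 3, right: 7, y: 9 },
  { left: 1, right: 5, y: 6 },
  { left: 4, right: 8, y: 3 },
]);
console.log(bruteForceCosts);              // 17, 18, 20, 16, 14
console.log(Math.min(...bruteForceCosts)); // 14
```

    For the sample board this produces the five costs listed above, in the same order. The number of paths can double with every obstacle that is hit, which is the main drawback of enumerating them all.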

    A more clever solution would use a path-finding algorithm like A*, where the altitude above the ground is the heuristic.
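    A sketch of how A* might look for this problem. The names firstObstacleBelow and aStarCost are invented for this example, obstacles are assumed to be plain objects with left, right, and y fields, and the open list is a plain array scanned linearly rather than a real priority queue, to keep the sketch short. Altitude works as the heuristic because every remaining path must still descend at least that far, so it never overestimates the remaining cost.

```javascript
// The highest obstacle at or below altitude y that would intercept a droplet
// falling at horizontal position x, or null if it falls freely to the ground.
function firstObstacleBelow(x, y, obstacles) {
  let hit = null;
  for (const o of obstacles) {
    if (o.y <= y && o.left < x && x < o.right && (!hit || o.y > hit.y)) {
      hit = o;
    }
  }
  return hit;
}

function aStarCost(startX, startY, obstacles) {
  const open = [{ x: startX, y: startY, g: 0 }];
  const bestG = new Map(); // "x,y" -> cheapest cost found so far

  while (open.length > 0) {
    // Take the entry with the smallest f = g + h, where h is the altitude.
    let i = 0;
    for (let j = 1; j < open.length; j++) {
      if (open[j].g + open[j].y < open[i].g + open[i].y) i = j;
    }
    const node = open.splice(i, 1)[0];
    if (node.y === 0) return node.g; // reached the ground

    const o = firstObstacleBelow(node.x, node.y, obstacles);
    const successors = [];
    if (o) {
      const fall = node.y - o.y;
      successors.push({ x: o.left, y: o.y, g: node.g + fall + (node.x - o.left) });
      successors.push({ x: o.right, y: o.y, g: node.g + fall + (o.right - node.x) });
    } else {
      successors.push({ x: node.x, y: 0, g: node.g + node.y });
    }

    for (const next of successors) {
      const key = next.x + "," + next.y;
      // Keep only the cheaper of two paths to the same point.
      if (!bestG.has(key) || next.g < bestG.get(key)) {
        bestG.set(key, next.g);
        open.push(next);
      }
    }
  }
  return undefined; // not reached when the start is above the ground
}

const aStarResult = aStarCost(6, 12, [
  { left: 3, right: 7, y: 9 },
  { left: 1, right: 5, y: 6 },
  { left: 4, right: 8, y: 3 },
]);
console.log(aStarResult); // 14
```

    On the sample board, the search never expands the branch that flows left from Obstacle 1, because its estimated total is already higher than the cost of the winning path.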

    In both cases, you can add an optimization where once you discover two paths to the same point, you throw out the longer one. This may short-circuit future computations.

    But I'm going to use an incremental solution, since it has the advantage of incorporating the optimization as a convenient side-effect. Instead of studying individual drops of water, I'm going to study all of them at once. At each step in the algorithm, the data structures represent a horizontal cross-section of the above diagram, representing all possible droplet positions at a fixed altitude.

    In addition to collapsing redundant paths automatically, this algorithm has the nice property that it can be done as an on-line algorithm: You don't need to provide all the obstacles in advance, as long as the obstacles are provided in order of decreasing altitude.
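    To make the on-line property concrete, here is a stripped-down sketch that tracks only costs (not full steps) and can report the best answer so far after each obstacle arrives. The makeTracker helper and its method names are invented for this illustration; the article's own program follows later and is organized differently.

```javascript
// Tracks the cheapest cost to reach each x position at the current altitude.
function makeTracker(startX, startY) {
  let costs = new Map([[startX, 0]]); // x -> cheapest cost to reach (x, altitude)
  let altitude = startY;

  function keepCheapest(map, x, cost) {
    if (!map.has(x) || cost < map.get(x)) map.set(x, cost);
  }

  return {
    // Obstacles may arrive one at a time, but in decreasing altitude.
    addObstacle(left, right, y) {
      if (y > altitude) throw new Error("Obstacles must arrive in decreasing altitude");
      const next = new Map();
      for (const [x, cost] of costs) {
        const fallen = cost + (altitude - y);
        if (x <= left || x >= right) {
          keepCheapest(next, x, fallen); // missed the obstacle
        } else {
          keepCheapest(next, left, fallen + (x - left));   // flow left
          keepCheapest(next, right, fallen + (right - x)); // flow right
        }
      }
      costs = next;
      altitude = y;
    },
    // Best total cost if no further obstacles were to arrive.
    bestCostToGround() {
      return Math.min(...costs.values()) + altitude;
    },
  };
}

const tracker = makeTracker(6, 12);
const answers = [tracker.bestCostToGround()];
for (const [left, right, y] of [[3, 7, 9], [1, 5, 6], [4, 8, 3]]) {
  tracker.addObstacle(left, right, y);
  answers.push(tracker.bestCostToGround());
}
console.log(answers); // 12, 13, 13, 14
```

    Each call needs only the cross-section from the previous step, so nothing about later obstacles has to be known in advance.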

    Instead of presenting the raw code and discussing it later (as is my wont), I'll explain the code as we go via code comments. We'll see how well that works.

    I originally wrote the program in C# because I thought I would need one of the fancy collection classes provided by the BCL, but it turns out that I didn't need anything fancier than a hash table. After I wrote the original C# version, I translated it to JavaScript, which is what I present here.

    The inputs which correspond to the diagram above are

    • Initial X position = 6, Initial Y position = 12
    • Obstacle: Left = 3, Right = 7, height = 9
    • Obstacle: Left = 1, Right = 5, height = 6
    • Obstacle: Left = 4, Right = 8, height = 3

    And here's the program.

    function Obstacle(left, right, y) {
     this.left = left;
     this.right = right;
     this.y = y;
    }
    
    // A single step in a path, representing the cost to reach that point.
    function Step(x, y, cost) {
     this.x = x;
     this.y = y;
     this.cost = cost;
    }
    
    // Add a step to an existing step
    Step.prototype.to = function to(x, y) {
     var dx = Math.abs(this.x - x);
     var dy = Math.abs(this.y - y);
     return new Step(x, y, this.cost + dx + dy);
    }
    
    // Record a droplet position
    function addDroplet(l, step) {
     // If no previous droplet at this position or the new droplet
     // has a cheaper path, then remember this droplet.
     var existingStep = l[step.x];
     if (!existingStep || step.cost < existingStep.cost) {
      l[step.x] = step;
     }
    }
    
    // Takes an existing collection of locations and updates them to account
    // for a new obstacle. Obstacles must be added in decreasing altitude.
    // (Consecutive duplicate altitudes allowed.)
    function fallTo(oldLocations, obstacle) {
     var newLocations = {};
     for (var x in oldLocations) {
      var step = oldLocations[x];
    
      // fall to the obstacle's altitude
      step = step.to(step.x, obstacle.y);
        
      // If the falling object does not hit the obstacle,
      // then there is no horizontal displacement.
      if (step.x <= obstacle.left || step.x >= obstacle.right) {
       addDroplet(newLocations, step);
      } else {
       // The falling object hit the obstacle.
       // Split into two droplets, one that goes left
       // and one that goes right.
       addDroplet(newLocations, step.to(obstacle.left, obstacle.y));
       addDroplet(newLocations, step.to(obstacle.right, obstacle.y));
      }
     }
     return newLocations;
    }
    
    function printStep(step) {
     console.log("Cost = " + step.cost + ": " + step.x + "," + step.y);
    }
    
    // Debugging function
    function printLocations(l) {
     for (var x in l) printStep(l[x]);
    }
    
    function shortestPath(x, y, obstacles) {
     var l = {};
     l[x] = new Step(x, y, 0);
     printLocations(l);
    
     obstacles.forEach(function (obstacle) {
      l = fallTo(l, obstacle);
      console.log(["after", obstacle.left, obstacle.right, obstacle.y].join(" "));
      printLocations(l);
      console.log("===");
     });
    
     // Find the cheapest step.
     var best;
     for (x in l) {
      if (!best || l[x].cost < best.cost) best = l[x];
     }
    
     // Fall to the floor and print the result.
     printStep(best.to(best.x, 0));
    }
    
    shortestPath(6,12,[new Obstacle(3,7,9),
                       new Obstacle(1,5,6),
                       new Obstacle(4,8,3)]);
    

    This program finds the cost of the cheapest path to the floor, but it merely tells you the cost and not how the cost was determined. To include the winning path, we need to record the history of how the cost was determined. This is a standard technique in dynamic programming: In addition to remembering the best solution so far, you also remember how that solution was arrived at by remembering the previous step in the solution. You can then walk backward through all the previous steps to recover the full path.

    // A single step in a path, representing the cost to reach that point
    // and the previous step in the path.
    function Step(x, y, cost, previous) {
     this.x = x;
     this.y = y;
     this.cost = cost;
     this.previous = previous;
    }
    
    // Add a step to an existing step
    Step.prototype.to = function to(x, y) {
     var dx = Math.abs(this.x - x);
     var dy = Math.abs(this.y - y);
     // These next two tests are not strictly necessary. They are for style points.
     if (dx == 0 && dy == 0) {
      // no movement
      return this;
     } else if (dx == 0 && this.previous && this.previous.x == x) {
      // collapse consecutive vertical movements into one
      return new Step(x, y, this.cost + dx + dy, this.previous);
     } else {
      return new Step(x, y, this.cost + dx + dy, this);
     }
    }
    
    function printStep(firstStep) {
     // Walk the path backwards, then reverse it so we can print
     // the results forward.
     var path = [];
     for (var step = firstStep; step; step = step.previous) {
      path.push("(" + step.x + "," + step.y + ")");
     }
     path.reverse();
     console.log("Cost = " + firstStep.cost + ": " + path.join(" "));
    }
    

    Notice that we didn't change any of the program logic. All we did was improve our record-keeping so that the final result prints the full path from the starting point to the ending point.

  • The Old New Thing

    How do I obtain the computer manufacturer's name from C++?

    Some time ago, I gave a scripting solution to the problem of obtaining the computer manufacturer and model. But what if you want to do this from C++?

    I could translate the script into C++, or I could just point you to Creating a WMI Application Using C++ in MSDN. In particular, one of the WMI C++ Sample Applications does exactly what you want: Example: Creating a WMI Application. The only things you need to do are

    • change SELECT * FROM Win32_Process to SELECT * FROM Win32_ComputerSystem, and
    • change Name to Manufacturer, and then again to Model.
  • The Old New Thing

    When I send a WM_GETFONT message to a window, why don't I get a font?

    A customer reported that the WM_GET­FONT message was not working. Specifically, they sent the message to a window, and they can plainly see that the window is rendering with a particular font, yet the WM_GET­FONT message returns 0. Why isn't the window returning the correct font handle?

    The WM_SET­FONT and WM_GET­FONT messages are not mandatory. A window may choose to support them, or it may choose not to, or it may even choose to support one but not the other. (Though if it supports WM_SET­FONT, it probably ought to support WM_GET­FONT.)

    For example, our scroll bar program creates a custom font for the items in the list, but it does not implement the WM_SET­FONT or WM_GET­FONT messages. If you try to change the font via WM_SET­FONT, nothing happens. If you ask for the font via WM_GET­FONT, you get nothing back.

    A control might ignore your attempt to change the font if it already has its own notion of what font it should be using. Or maybe the control shows content in multiple fonts, so the concept of "the" font does not map well to the render model. (What would WM_GET­FONT on an HTML control return?) Or maybe the control doesn't use GDI fonts at all. (Maybe it uses Direct­Write.)

    That's one of the reasons why the rules for the WM_SET­FONT message are set up the way they are. Since there is no way to tell whether a window did anything in response to the WM_SET­FONT message, there would be no way to know whether responsibility for destroying the font should be transferred to the control or retained by the caller.

    Controls that are designed to be used in dialog boxes are the ones most likely to support the WM_SET­FONT message, since that's the message the dialog manager uses to tell each control the font specified in the dialog box template. The hope is that all of the controls will respect that font, so that the controls on the dialog box have a consistent appearance. But there's nothing preventing a control from saying, "Screw you. I'm drawing with OCR-A and there's nothing you can do to stop me."

  • The Old New Thing

    When will GetSystemWindowsDirectory return something different from GetWindowsDirectory?

    Most of the time, the Get­Windows­Directory function returns the Windows directory. However, as noted in the documentation for Get­System­Windows­Directory:

    With Terminal Services, the Get­System­Windows­Directory function retrieves the path of the system Windows directory, while the Get­Windows­Directory function retrieves the path of a Windows directory that is private for each user. On a single-user system, Get­System­Windows­Directory is the same as Get­Windows­Directory.

    What's going on here, and how do I test this scenario?

    When Terminal Services support was being added to Windows NT 4.0 in the mid 1990's, the Terminal Services team discovered that a lot of applications assumed that the computer was used by only one person, and that that person was a local administrator. This was the most common system configuration at the time, so a lot of applications simply assumed that it was the only system configuration.

    On the other hand, a Terminal Server machine can have a large number of users, including multiple users connected simultaneously, and if the Terminal Services team took no special action, you would have found that most applications didn't work. The situation "most applications didn't work" tends not to bode well for adoption of your technology.

    Their solution was to create a whole bunch of compatibility behaviors and disable them if the application says, "Hey, I understand that Terminal Server machines are different from your average consumer machine, and I know what I'm doing." One of those compatibility behaviors is to make the Get­Windows­Directory function return a private writable directory rather than the real Windows directory, because old applications assumed that the Windows directory was writable, and they often dumped their private configuration data there.

    The signal to disable compatibility behaviors is the IMAGE_DLLCHARACTER­ISTICS_TERMINAL_SERVER_AWARE flag in the image attributes of the primary executable. You tell the linker that you want this flag to be set by passing the /TSAWARE:YES parameter on the command line. (At some point, the Visual Studio folks made /TSAWARE:YES the default for all new projects, so you are probably getting this flag set on your files without even realizing it. You can force it off by going to Configuration Properties, and under Linker, then System, change the "Terminal Server" setting to "Not Terminal Server Aware".)

    Note that only the flag state on the primary executable has any effect. Setting the flag on a DLL has no effect. (This adds to the collection of flags that are meaningful only on the primary executable.)
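    If you are not sure whether your executable ended up with the flag, you can inspect the header directly. The following is a minimal sketch, not an official tool: the function name isTerminalServerAware is made up, and it assumes a well-formed PE image already loaded into a byte array. It relies on the documented PE layout, in which the DllCharacteristics field sits 70 bytes into the optional header for both PE32 and PE32+ images, and on the documented flag value 0x8000.

```javascript
const IMAGE_DLLCHARACTERISTICS_TERMINAL_SERVER_AWARE = 0x8000;

// image: a Uint8Array (or Node.js Buffer) holding the bytes of an executable.
function isTerminalServerAware(image) {
  const view = new DataView(image.buffer, image.byteOffset, image.byteLength);

  // DOS header: "MZ" signature, then the PE header offset at 0x3C.
  if (view.getUint16(0, true) !== 0x5a4d) throw new Error("Not an MZ executable");
  const peOffset = view.getUint32(0x3c, true);

  // PE signature "PE\0\0", followed by the 20-byte COFF file header.
  if (view.getUint32(peOffset, true) !== 0x00004550) throw new Error("Missing PE signature");
  const optionalHeader = peOffset + 4 + 20;

  // DllCharacteristics is at offset 70 of the optional header.
  const dllCharacteristics = view.getUint16(optionalHeader + 70, true);
  return (dllCharacteristics & IMAGE_DLLCHARACTERISTICS_TERMINAL_SERVER_AWARE) !== 0;
}

// Synthetic image with just enough header to exercise the function.
const fakeImage = new Uint8Array(256);
const fakeView = new DataView(fakeImage.buffer);
fakeView.setUint16(0, 0x5a4d, true);        // "MZ"
fakeView.setUint32(0x3c, 0x80, true);       // PE header offset
fakeView.setUint32(0x80, 0x00004550, true); // "PE\0\0"
fakeView.setUint16(0x80 + 24 + 70, IMAGE_DLLCHARACTERISTICS_TERMINAL_SERVER_AWARE, true);
console.log(isTerminalServerAware(fakeImage)); // true
```

    Under Node.js, passing the result of reading the file into memory (for example with fs.readFileSync) should work, since a Buffer is a Uint8Array. Remember that only the flag on the primary executable matters.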

    The other tricky part is that the Terminal Server compatibility behaviors kick in only on a Terminal Server machine. The way you create a Terminal Server machine has changed a lot over the years, as has the name of the feature.

    • In Windows NT 4.0, it was a special edition of Windows, known as Windows NT 4.0 Terminal Server Edition.
    • In Windows 2000, the feature changed its name from Terminal Server to Terminal Services and became an optional server component rather than a separate product. You add the component from Add/Remove Programs.
    • In Windows Server 2003 and Windows Server 2008, you go to the Configure Your Server Wizard and add the server rôle "Terminal Server."
    • In Windows Server 2008 R2, the feature changed its name again. The instructions are the same as in Windows Server 2008, but the rôle name changed to "Remote Desktop Services".
    • In Windows Server 2012, the feature retained its name but became grouped under the category "Virtual Desktop Infrastructure." This time, you have to enable the server rôle "Remote Desktop (RD) Session Host."

    Terminal Services is the Puff Daddy of Windows technologies. It changes its name every few years, and you wonder what it will come up with next.

  • The Old New Thing

    Microspeak: Tell Mode / Ask Mode

    As a product nears release, the rate of change slows down, and along the way, the ship room goes through stages known as Tell Mode and Ask Mode.

    In Tell Mode, any changes to the product do not require prior approval, but you are required to present your changes to the next ship room meeting and be prepared to explain and defend them. The purpose of this exercise is to get teams accustomed to the idea of having to present their changes to the ship room as a warm-up for Ask Mode. There is also the psychological aspect: If you have to present and defend your changes, you are going to be more careful about deciding which changes to make, how you will go about making them, and how thoroughly you're going to validate those changes. For example, if a bug could be fixed by applying a targeted fix or by rewriting the entire class, you are probably not going to choose to rewrite. (In theory, the ship room may reject your changes after the fact, and then you have to go back and back them out. But this is rare in practice. The ship room usually lets you off with a warning unless your transgression was particularly severe.)

    The next stage of scrutiny is known as Ask Mode. In this stage, any proposed changes to the product must be presented to the ship room before they can be submitted. Rejection is more frequent here, both because time has passed and the bug bar has gone up, and because it is easier to get forgiveness than permission.

    Here is a more detailed explanation of how one team implements the two modes.

    Note that there can be multiple levels of ship room. There may be a local feature team ship room, then a group-wide ship room, then a product-wide ship room, and it is not uncommon for each ship room to be in a different mode. For example, the local feature team ship room may be in Ask Mode, the group-wide ship room is in Tell Mode, and the product-wide ship room isn't looking at individual bugs yet. This means that when you want to make a change, you need to get permission from your local feature team, and then after you commit the change, you need to get forgiveness from the group ship room.

  • The Old New Thing

    The alternate story of the time one of my colleagues debugged a line-of-business application for a package delivery service

    Some people objected to the length, the structure, the metaphors, the speculation, and fabrication. So let's say they were my editors. Here's what the article might have looked like, had I taken their recommendations. (Some recommendations were to text that was also recommended cut. I applied the recommendations before cutting; the cuts are in gray.) You tell me whether you like the original or the edited version.

    Back in the days of Windows 95 development, one of my colleagues debugged a line-of-business application for a major delivery service. This was a program that the company gave to its top-tier high-volume customers, so that they could place and track their orders directly. And by directly, I mean that the program dialed the modem (since that was how computers communicated with each other back then) to contact the delivery service's mainframe (it was all mainframes back then) a computer at the delivery service and upload the new orders and download the status of existing orders.¹

    [Length. The "top tier customer" part of the story is irrelevant.]
    [Length. The mainframe part of the story is irrelevant.]
    [Speculation. No proof that the computer being dialed is a mainframe. For all you know, it was an Apple ][ on the other end of the modem.]

    Version 1.0 of the application had a notorious bug: Ninety days after you installed the program, it stopped working. They forgot to remove the beta expiration code. I guess that's why they have a version 1.01. It told you that the beta period has expired.

    [Length. Version 1.0 is irrelevant.]
    [Speculation. No proof that the beta expiration code was left by mistake. It could have been intentional, for whatever reason. Probably some nefarious reason.]

    Anyway, the bug that my colleague investigated was that If you entered a particular type of order with a particular set of options in a particular way, then the application crashed your system. Setting up a copy of the application in order to replicate the problem was itself a bit of an ordeal, but that's a whole different story.

    [Length. Retransition no longer necessary. The "setting up" story is irrelevant.]

    Okay, the program is set up, and yup, it crashes exactly as described when run on Windows 95. Actually, it also crashes exactly as described when run on Windows 3.1. This is just plain an application bug.

    [Length. Irrelevant.]

    The initial crash

    [Structure. Create heading (even though it gives away some of the story).]

    Here's why it crashed: After the program dials up the mainframe to submit the order the order system, it tries to refresh the list of orders that have yet to be delivered a list box control. The code that does this assumes that the list of undelivered orders the list box control is the control with focus. But if you ask for labels to be printed, then the printing code changes focus in order to display the "Please place the label on the package exactly like this" dialog, under the specific circumstances, the control is no longer focus; as I recall, it was because a dialog box had appeared and changed focus, and as a result, the refresh code can't find the undelivered order list list box and crashes on a null pointer. (I'm totally making this up, by the way. The details of the scenario aren't important to the story.)

    [Fabrication. All that is known is that there was a list box that lost focus to a dialog box.]

    Okay, well, that's no big deal. A null pointer fault should just put up the Unrecoverable Application Error dialog box and close the program. Why does this particular null pointer fault crash the entire system?

    [Embellishment.]

    Recovering from the crash

    [Structure. Create heading.]

    The developers of the program saw that their refresh code sometimes crashed on a null pointer, and instead of fixing it by actually fixing the code so it could find the list of undelivered orders even if it didn't have focus, or fixing it by adding a null pointer check, they fixed it by adding a null pointer exception handler. (I wish to commend myself for resisting the urge to put the word fixed in quotation marks in that last sentence.) The program installed a null pointer exception handler.

    [Speculation. No way of knowing that this was what the developers were thinking when they wrote the code.]

    Now, 16-bit Windows didn't have structured exception handling. The only type of exception handler was a global exception handler, and this wasn't just global to the process. This was global to the entire system. Your exception handler was called for every exception everywhere. If you screwed it up, you screwed up the entire system. (I think you can see where this is going.)

    [Embellishment.]

    The developers of the program converted their global exception handler to a local one by going to every function that had a "We seem to crash on a null pointer and I don't know why" bug and making these changes: A few functions in the program took the following form:

    extern jmp_buf caught;
    extern BOOL trapExceptions;
    
    void scaryFunction(...)
    {
     if (setjmp(&caught)) return;
     trapExceptions = TRUE;
     ... body of function ...
     trapExceptions = FALSE;
    }
    

    Their global exception handler checks the trapExceptions global variable, and if it is TRUE, they set it back to FALSE and do a longjmp which sends control back to the start of the function, which detects that something bad must have happened and just returns out of the function.

    [Speculation. No way of knowing that this was what the developers were thinking when they wrote the code. No proof that the code was first written without a global exception handler, and that the handler was added later. No proof that every such function set this variable. No proof that the reason for adding the setjmp was to protect against null pointer failures.]

    Yes, things are kind of messed up as a result of this. Yes, there is a memory leak. But at least their application didn't crash.

    [Embellishment.]

    On the other hand, if the global variable is FALSE, because their application crashed in some other function that didn't have this special protection, or because some other totally unrelated application crashed, the global exception handler decided to exit the application by running around freeing all the DLLs and memory associated with their application.

    Okay, so far so good, for certain values of good.

    [Embellishment.]

    Failed recovery

    [Structure. Add heading here.]

    These system-wide exception handlers had to be written in assembly code because they were dispatched with a very strange calling convention. But the developers of this application didn't write their system-wide exception handler in assembly language. Their application was written in MFC, so they just went to Visual C++ (as it was then known), clicked through some Add a Windows hook wizard, and got some generic HOOKPROC. (I don't know if Visual C++ actually had an Add a Windows hook wizard; they could just have copied the code from somewhere.) Nevermind that these system-wide exception handlers are not HOOKPROCs, so the function has the wrong prototype. What's more, the code they used marked the hook function as __loadds. This means that the function For whatever reason, the handler they installed saves the previous value of the DS register on entry, then changes the register to point to the application's data, and on exit, the function restores the previous value of DS.

    [Speculation. No proof that the program was written with MFC in the Microsoft Visual C++ IDE. It could have been written with Notepad in assembly language that just happens to look like the assembly language generated by the Microsoft Visual C++ compiler when it compiles code written in MFC.]

    The DS is a register on the x86 CPU that describes the data currently being operated upon. All that's important here is that the value in the DS register must always be valid, or the CPU will raise an exception.

    [Need to explain the DS register in case the reader cannot infer this from the description that comes later. We have established that neither the author nor the reader is allowed to draw inferences.]

    Okay, now we're about to enter the set piece at the end of the movie: Our hero's fear of spiders, his girlfriend's bad ankle from an old soccer injury, the executive toy on the villain's desk, and all the other tiny little clues dropped in the previous ninety minutes come together to form an enormous chain reaction.

    [Embellishment.]

    The application crashes on a null pointer. The system-wide custom exception handler is called. The crash is not one that is being protected by the global variable, so the custom exception handler frees the application from memory. The system-wide custom exception handler now returns, but wait, what is it returning to?

    The crash was in the application, which means that the DS register it saved on entry to the custom exception handler points to the application's data. The custom exception handler freed the application's data and then returned, declaring the exception handled. As the function exited, it tried to restore the original DS register, but the CPU said, "Nice try, but that is not a valid value for the DS register (because you freed it)." The CPU reported this error by (dramatic pause) raising an exception.

    [Embellishment.]

    That's right, The system-wide custom exception handler crashed with an exception.

    [Embellishment]

    The chain reaction

    [Structure. Add heading here.]

    Okay, things start snowballing. This is the part of the movie where the director uses quick cuts between different locations, maybe with a little slow motion thrown in.

    [Embellishment.]

    Since an exception was raised, the custom exception handler is called recursively. Each time through the recursion, the custom exception handler frees all the DLLs and memory associated with the application. But that's okay, right? Because the second and subsequent times, the memory was already freed, so the attempts to free them again will just fail with an invalid parameter error.

    But wait, their list of DLLs associated with the application included USER, GDI, and KERNEL. Now, Windows is perfectly capable of unloading dependent DLLs when you unload the main DLL, so when they unloaded their main program, the kernel already decremented the usage count on USER, GDI, and KERNEL automatically. But they apparently didn't trust Windows to do this, because after all, it was Windows that was causing their application to crash, so they took it upon themselves to free those DLLs manually. For whatever reason, the handler frees the DLLs anyway.

    [Speculation. No way of knowing that this was what the developers were thinking when they wrote the code.]

    Therefore, each time through the loop, the usage counts for USER, GDI, and KERNEL drop by one. Zoom in on the countdown clock on the ticking time bomb.

    Beep beep beep beep beep. The reference count finally drops to zero. The window manager, the graphics subsystem, and the kernel itself have all been unloaded from memory. There's nothing left to run the show!

    [Embellishment.]

    Boom, bluescreen. Hot flaming death.

    The punch line to all this is that whenever you call the company's product support line and describe a problem you encountered, their response is always, "Yeah, we're really sorry about that one."

    [Length. Irrelevant.]

    Bonus chatter: What is that whole different story mentioned near the top?

    [Length. Cut the entire bonus chatter. Irrelevant story.]

    Well, when the delivery service sent the latest version of the software to the Windows 95 team, they also provided an account number to use. My colleague used that account number to try to reproduce the problem, and since the problem occurred only after the order was submitted, she would have to submit delivery requests, say for a letter to be picked up from 221B Baker Street and delivered to 62 West Wallaby Street, or maybe for a 100-pound package of radioactive material to be picked up from 1600 Pennsylvania Avenue and delivered to 10 Downing Street. all of which were fictitious.

    [Fabrication. No proof that these were the addresses and orders used. All that is known is that fictitious orders were placed.]

    After about two weeks of this, my colleague got a phone call from people identifying themselves as Microsoft's shipping department. "What the heck are you doing?"

    [Speculation. No proof that the call truly came from the shipping department. Could have been a lucky prank call.]
    [Fabrication. No transcript of this call exists.]

    It turns out that the account number my colleague was given was Microsoft's own corporate account number. As in a real live account. She was inadvertently prank-calling the delivery company and sending actual trucks all over the country to pick up nonexistent letters and packages. The people who identified themselves as Microsoft's shipping department and people from the delivery service's headquarters claimed that they were frantic trying to trace where all the bogus orders were coming from.

    [Hearsay.]

    ¹ Mind you, this sort of thing is the stuff that average Joe customers can do while still in their pajamas, but back in those days, it was a feature that only top-tier customers had access to, because, y'know, mainframe.

  • The Old New Thing

    How can I get the URL to the Web page the clipboard was copied from?

    • 18 Comments

    When you copy content from a Web page to the clipboard and then paste it into OneNote, OneNote pastes the content but also annotates it "Pasted from ...". How does OneNote know where the content was copied from?

    As noted in the documentation for the HTML clipboard format, Web browsers can provide an optional Source­URL property to specify the Web page the HTML was copied from.

    Let's write a Little Program that mimics what OneNote does, but just in plain text, because I don't want to try to parse HTML. This is much easier to do in C#, because the BCL provides most of the helper functions.

    using System;
    using System.IO;
    using System.Windows;
    
    class Program {
     [STAThread]
     public static void Main() {
      System.Console.WriteLine(Clipboard.GetText());
      using (var sr = new StringReader(
                   Clipboard.GetText(TextDataFormat.Html))) {
       string s;
       while ((s = sr.ReadLine()) != null) {
        if (s.StartsWith("SourceURL:")) {
         System.Console.WriteLine("Copied from {0}", s.Substring(10));
         break;
        }
       }
      }
     }
    }
    

    First, we get the text from the clipboard and print it. That's the easy part.

    Next, we get the HTML text from the clipboard. This is a bunch of text in a particular format. We look for an entry that specifies the Source­URL; if we find it, then we print the URL.

    This code is rather sloppy. For example, if the HTML itself contains the string SourceURL:haha-fakeout, we risk misdetecting it as the source. To do this properly, we would have to verify that the string appears in the header area of the HTML (before the first StartFragment).

    But this is a Little Program, so I can skip all that stuff.
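
    If you didn't want to skip it, the stricter check might look something like the following. This is only a sketch in portable C++17, operating on text you have already pulled off the clipboard. FindSourceUrl is a made-up helper name, and the sketch assumes the header is over at the first line that begins with <, which is cruder than honoring the StartHTML offset that the header itself records.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of any library): returns the SourceURL
// value from HTML clipboard text, looking only at the header lines.
// Assumption: the header ends at the first line that begins with '<'.
std::optional<std::string> FindSourceUrl(std::string_view html)
{
 constexpr std::string_view key = "SourceURL:";
 while (!html.empty()) {
  // Peel off one line, accepting either \r\n or \n line endings
  std::size_t eol = html.find('\n');
  std::string_view line = html.substr(0, eol);
  html.remove_prefix(eol == std::string_view::npos ? html.size() : eol + 1);
  if (!line.empty() && line.back() == '\r') line.remove_suffix(1);

  // Markup has begun, so anything from here on is content, not header
  if (!line.empty() && line.front() == '<') break;

  if (line.substr(0, key.size()) == key) {
   return std::string(line.substr(key.size()));
  }
 }
 return std::nullopt;
}
```

    With this version, a SourceURL:haha-fakeout line sitting inside the HTML body is ignored, because the scan has already stopped at the first line of markup.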

    Here's a sketch of the equivalent C/C++ version:

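    #include <windows.h>
    #include <stdio.h>
    #include <string.h>
    
    int __cdecl main(int, char **)
    {
     if (OpenClipboard(NULL)) {
    
      // Obtain the Unicode text and print it
      HANDLE h = GetClipboardData(CF_UNICODETEXT);
      if (h) {
       PCWSTR pszPlainText = (PCWSTR)GlobalLock(h);
       if (pszPlainText) {
        printf("%ls\n", pszPlainText);
        GlobalUnlock(h);
       }
      }
    
      // Obtain the HTML text and extract the SourceURL
      h = GetClipboardData(RegisterClipboardFormat(TEXT("HTML Format")));
      if (h) {
       // This sketch assumes the HTML text is null-terminated
       PCSTR pszHtmlFormat = (PCSTR)GlobalLock(h);
       // Walk the text one line at a time
       for (PCSTR psz = pszHtmlFormat; psz; ) {
        if (strncmp(psz, "SourceURL:", 10) == 0) {
         // The URL runs to the end of the line
         psz += 10;
         printf("Copied from %.*s\n", (int)strcspn(psz, "\r\n"), psz);
         break;
        }
        psz = strchr(psz, '\n');
        if (psz) psz++;
       }
       if (pszHtmlFormat) GlobalUnlock(h);
      }
      CloseClipboard();
     }
     return 0;
    }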
    
  • The Old New Thing

    The time one of my colleagues debugged a line-of-business application for a package delivery service

    • 41 Comments

    Back in the days of Windows 95 development, one of my colleagues debugged a line-of-business application for a major delivery service. This was a program that the company gave to its top-tier high-volume customers, so that they could place and track their orders directly. And by directly, I mean that the program dialed the modem (since that was how computers communicated with each other back then) to contact the delivery service's mainframe (it was all mainframes back then) and upload the new orders and download the status of existing orders.¹

    Version 1.0 of the application had a notorious bug: Ninety days after you installed the program, it stopped working. They forgot to remove the beta expiration code. I guess that's why they have a version 1.01.

    Anyway, the bug that my colleague investigated was that if you entered a particular type of order with a particular set of options in a particular way, then the application crashed your system. Setting up a copy of the application in order to replicate the problem was itself a bit of an ordeal, but that's a whole different story.

    Okay, the program is set up, and yup, it crashes exactly as described when run on Windows 95. Actually, it also crashes exactly as described when run on Windows 3.1. This is just plain an application bug.

    Here's why it crashed: After the program dials up the mainframe to submit the order, it tries to refresh the list of orders that have yet to be delivered. The code that does this assumes that the list of undelivered orders is the control with focus. But if you ask for labels to be printed, then the printing code changes focus in order to display the "Please place the label on the package exactly like this" dialog, and as a result, the refresh code can't find the undelivered order list and crashes on a null pointer. (I'm totally making this up, by the way. The details of the scenario aren't important to the story.)

    Okay, well, that's no big deal. A null pointer fault should just put up the Unrecoverable Application Error dialog box and close the program. Why does this particular null pointer fault crash the entire system?

    The developers of the program saw that their refresh code sometimes crashed on a null pointer, and instead of fixing it by actually fixing the code so it could find the list of undelivered orders even if it didn't have focus, or fixing it by adding a null pointer check, they fixed it by adding a null pointer exception handler. (I wish to commend myself for resisting the urge to put the word fixed in quotation marks in that last sentence.)

    Now, 16-bit Windows didn't have structured exception handling. The only type of exception handler was a global exception handler, and this wasn't just global to the process. This was global to the entire system. Your exception handler was called for every exception everywhere. If you screwed it up, you screwed up the entire system. (I think you can see where this is going.)

    The developers of the program converted their global exception handler to a local one by going to every function that had a "We seem to crash on a null pointer and I don't know why" bug and making these changes:

    extern jmp_buf caught;
    extern BOOL trapExceptions;
    
    void scaryFunction(...)
    {
     if (setjmp(caught)) return;
     trapExceptions = TRUE;
     ... body of function ...
     trapExceptions = FALSE;
    }
    

    Their global exception handler checks the trapExceptions global variable, and if it is TRUE, they set it back to FALSE and do a longjmp which sends control back to the start of the function, which detects that something bad must have happened and just returns out of the function.

    Yes, things are kind of messed up as a result of this. Yes, there is a memory leak. But at least their application didn't crash.

    On the other hand, if the global variable is FALSE, because their application crashed in some other function that didn't have this special protection, or because some other totally unrelated application crashed, the global exception handler decides to exit the application by running around freeing all the DLLs and memory associated with their application.
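
    To make the control flow concrete, here is a small portable C++ sketch of the trick. Nothing in it installs a real exception handler: the fault is simulated by calling a stand-in function directly, and the names GlobalExceptionHandler, ScaryFunction, unwound, and tornDown are invented for illustration. Only caught and trapExceptions come from the code above.

```cpp
#include <cassert>
#include <csetjmp>

std::jmp_buf caught;
bool trapExceptions = false;
int unwound = 0;        // hypothetical: times a function bailed out via longjmp
bool tornDown = false;  // hypothetical: stands in for freeing the DLLs and memory

// Stand-in for the system-wide handler. In this sketch it is called
// directly instead of being dispatched by the system on a fault.
void GlobalExceptionHandler()
{
 if (trapExceptions) {
  trapExceptions = false;
  std::longjmp(caught, 1); // control reappears at setjmp, which returns 1
 }
 tornDown = true; // unprotected crash: give up on the application
}

// Returns true if the body ran to completion, false if it was abandoned.
bool ScaryFunction(bool simulateFault)
{
 assert(!trapExceptions); // one shared jmp_buf, so protected functions cannot nest
 if (setjmp(caught)) { ++unwound; return false; }
 trapExceptions = true;
 if (simulateFault) GlobalExceptionHandler(); // pretend the body faulted
 trapExceptions = false;
 return true;
}
```

    Calling ScaryFunction(true) returns false and bumps unwound, because the longjmp lands back at the setjmp with a nonzero value and the rest of the body never runs. Calling GlobalExceptionHandler while trapExceptions is false takes the other branch, which is where the handler in the story started freeing things.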

    Okay, so far so good, for certain values of good.

    These system-wide exception handlers had to be written in assembly code because they were dispatched with a very strange calling convention. But the developers of this application didn't write their system-wide exception handler in assembly language. Their application was written in MFC, so they just went to Visual C++ (as it was then known), clicked through some Add a Windows hook wizard, and got some generic HOOKPROC. (I don't know if Visual C++ actually had an Add a Windows hook wizard; they could just have copied the code from somewhere.) Never mind that these system-wide exception handlers are not HOOKPROCs, so the function has the wrong prototype. What's more, the code they used marked the hook function as __loadds. This means that the function saves the previous value of the DS register on entry, then changes the register to point to the application's data, and on exit, the function restores the previous value of DS.

    Okay, now we're about to enter the set piece at the end of the movie: Our hero's fear of spiders, his girlfriend's bad ankle from an old soccer injury, the executive toy on the villain's desk, and all the other tiny little clues dropped in the previous ninety minutes come together to form an enormous chain reaction.

    The application crashes on a null pointer. The system-wide custom exception handler is called. The crash is not one that is being protected by the global variable, so the custom exception handler frees the application from memory. The system-wide custom exception handler now returns, but wait, what is it returning to?

    The crash was in the application, which means that the DS register it saved on entry to the custom exception handler points to the application's data. The custom exception handler freed the application's data and then returned, declaring the exception handled. As the function exited, it tried to restore the original DS register, but the CPU said, "Nice try, but that is not a valid value for the DS register (because you freed it)." The CPU reported this error by (dramatic pause) raising an exception.

    That's right, the system-wide custom exception handler crashed with an exception.

    Okay, things start snowballing. This is the part of the movie where the director uses quick cuts between different locations, maybe with a little slow motion thrown in.

    Since an exception was raised, the custom exception handler is called recursively. Each time through the recursion, the custom exception handler frees all the DLLs and memory associated with the application. But that's okay, right? Because the second and subsequent times, the memory was already freed, so the attempts to free them again will just fail with an invalid parameter error.

    But wait, their list of DLLs associated with the application included USER, GDI, and KERNEL. Now, Windows is perfectly capable of unloading dependent DLLs when you unload the main DLL, so when they unloaded their main program, the kernel already decremented the usage count on USER, GDI, and KERNEL automatically. But they apparently didn't trust Windows to do this, because after all, it was Windows that was causing their application to crash, so they took it upon themselves to free those DLLs manually.

    Therefore, each time through the loop, the usage counts for USER, GDI, and KERNEL drop by one. Zoom in on the countdown clock on the ticking time bomb.

    Beep beep beep beep beep. The reference count finally drops to zero. The window manager, the graphics subsystem, and the kernel itself have all been unloaded from memory. There's nothing left to run the show!

    Boom, bluescreen. Hot flaming death.

    The punch line to all this is that whenever you call the company's product support line and describe a problem you encountered, their response is always, "Yeah, we're really sorry about that one."

    Bonus chatter: What is that whole different story mentioned near the top?

    Well, when the delivery service sent the latest version of the software to the Windows 95 team, they also provided an account number to use. My colleague used that account number to try to reproduce the problem, and since the problem occurred only after the order was submitted, she would have to submit delivery requests, say for a letter to be picked up from 221B Baker Street and delivered to 62 West Wallaby Street, or maybe for a 100-pound package of radioactive material to be picked up from 1600 Pennsylvania Avenue and delivered to 10 Downing Street.

    After about two weeks of this, my colleague got a phone call from Microsoft's shipping department. "What the heck are you doing?"

    It turns out that the account number my colleague was given was Microsoft's own corporate account number. As in a real live account. She was inadvertently prank-calling the delivery company and sending actual trucks all over the country to pick up nonexistent letters and packages. Microsoft's shipping department and people from the delivery service's headquarters were frantic trying to trace where all the bogus orders were coming from.

    ¹ Mind you, this sort of thing is the stuff that average Joe customers can do while still in their pajamas, but back in those days, it was a feature that only top-tier customers had access to, because, y'know, mainframe.
