I just faced an interesting scenario: a .NET 3.0 WinForms-based application running on Windows Vista stops responding. Of course, common sense says "Find the root cause and fix it". It turns out that in this case it is not that easy. The application depends heavily on threading and asynchronous calls, and it needs to interact with several devices (which leads directly to hypotheses about device drivers) and with external systems through products from several vendors. The underlying root cause might lie in any one of those elements or in a combination of them. As I am writing this, the problem has not been identified.

But that's the technical aspect of the scenario. The real problem is with the end-user experience. The application is used to enter a lot of information but, as I stated above, it stops responding. This leaves the end-user no option but to kill the application (she was trained to do that), losing all of the information entered up to that moment. The impact on end-user productivity caused by reentering the information is the real problem.

While I was working on finding the technical root cause, a customer asked me whether there was something he could do to minimize the impact on end-user productivity. As it happens, Windows Vista (and Windows Server 2008) provides a new set of APIs collectively known as Application Recovery and Restart, which work together with the Windows Error Reporting Service (WER). In a nutshell, an application registers itself as a recoverable application, which means that if it stops responding (or hits an unhandled exception) the OS gives it a chance to do some processing before it is terminated. Usually this processing involves saving a copy of its internal state. The second part, Application Restart, lets the user launch another instance of the application, which restores that state and gets back to the point where the undesirable condition appeared.

I am not entitled to show the real application and what I did in it. But to demo the Application Recovery and Restart concepts I created the following form:

MainForm

The TextBox Text property is all the state I am interested in. The button's task is easy: a while(true); loop traps the UI thread, turning the window into a Not Responding one as shown in the next figure:

NotResponding

At this moment, all the end-user can do is click on the Close button, restart the application, and cry over the lost information. So the first thing I do is declare the Application Recovery and Restart Windows APIs so they can be invoked from within the C# demo application:

using System;
using System.Runtime.InteropServices;

namespace Win32
{
  public static class ApplicationRecoveryRestart
  {
    // Signature of the function Windows calls when recovery starts.
    public delegate Int32 ApplicationRecoveryCallbackDelegate(RecoveryInformation information);

    [DllImport("kernel32.dll")]
    public static extern Int32 RegisterApplicationRecoveryCallback
    (
      ApplicationRecoveryCallbackDelegate recoveryCallback,
      RecoveryInformation information,
      UInt32 pingInterval,
      UInt32 flags
    );

    [DllImport("kernel32.dll")]
    public static extern Int32 RegisterApplicationRestart
    (
      [MarshalAs(UnmanagedType.LPWStr)]
      String commandLineArgs,
      UInt32 flags
    );

    [DllImport("kernel32.dll")]
    public static extern void ApplicationRecoveryFinished(Boolean success);

    [DllImport("kernel32.dll")]
    public static extern Int32 ApplicationRecoveryInProgress(out Boolean canceled);

    [DllImport("kernel32.dll")]
    public static extern Int32 UnregisterApplicationRecoveryCallback();

    [DllImport("kernel32.dll")]
    public static extern Int32 UnregisterApplicationRestart();
  }

  // A class handed to native code needs an explicit layout so the marshaler can copy it.
  [StructLayout(LayoutKind.Sequential)]
  public class RecoveryInformation
  {
    public string Parameter;
  }
}

I am only using a subset of the Application Recovery and Restart Windows APIs, but you can find the rest in the MSDN documentation. To register the application for recovery, the Main() method is a good place to start:

[STAThread]
static void Main(string[] args)
{
  string restartFile = Guid.NewGuid().ToString("N") + ".dat";
  RegisterApplicationRecovery(restartFile);

  Application.EnableVisualStyles();
  Application.SetCompatibleTextRenderingDefault(false);
  Application.Run(mainForm = new MainForm());
}

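// Windows only receives a native function pointer, so the delegate is kept in a static
// field; held in a local variable it could be garbage collected before the callback fires.
private static ApplicationRecoveryRestart.ApplicationRecoveryCallbackDelegate recoveryCallback;

private static void RegisterApplicationRecovery(string restartFile)
{
  RecoveryInformation information = new RecoveryInformation();
  information.Parameter = restartFile;

  recoveryCallback = new ApplicationRecoveryRestart.ApplicationRecoveryCallbackDelegate(ApplicationRecoveryCallback);
  Int32 result = ApplicationRecoveryRestart.RegisterApplicationRecoveryCallback
  (
    recoveryCallback, // Function to be called to store application state when the recovery starts
    information,      // Anything useful for a restart. In this case, the file holding the state stored during the recovery
    3000,             // Recovery ping interval in milliseconds. It also has to do with ApplicationRecoveryInProgress. See MSDN documentation.
    0                 // Reserved for future use
  );
  if(result != 0)
  {
    // A non-zero HRESULT means the registration did not succeed...
    Trace.WriteLine("RegisterApplicationRecoveryCallback failed: 0x" + result.ToString("X8"));
  }
}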

public static Int32 ApplicationRecoveryCallback(RecoveryInformation information)
{
  bool canceled = false;

  // Let Windows know recovery is starting and check whether the user has canceled the recovery...
  ApplicationRecoveryRestart.ApplicationRecoveryInProgress(out canceled);
  if(canceled)
  {
    Trace.WriteLine("User canceled application recovery...");

    // Let Windows know recovery is complete...
    ApplicationRecoveryRestart.ApplicationRecoveryFinished(false);
  }
  else
  {
    Trace.WriteLine("Application is trying to save data and state information before the application terminates...");
    using(StreamWriter writer = new StreamWriter(information.Parameter))
    {
      writer.Write("{0}", theTextBox.Text); // The TextBox Text property is all the state to be saved...
    }

    // Let Windows know recovery is complete...
    ApplicationRecoveryRestart.ApplicationRecoveryFinished(true);
  }

  return 0;
}
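
The demo saves a single string, so one call to ApplicationRecoveryInProgress is enough. As I read the MSDN documentation, a callback that needs more time has to keep calling it within the ping interval, otherwise Windows assumes the recovery has stalled. The following is only a sketch of that pattern: ChunkedRecovery and PingDelegate are hypothetical names that are not part of the demo, and the ping delegate stands in for a wrapper around ApplicationRecoveryInProgress that returns true when the user canceled.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ChunkedRecovery
{
    // Hypothetical stand-in for ApplicationRecoveryInProgress:
    // returns true when the user has canceled the recovery.
    public delegate bool PingDelegate();

    // Writes the records one by one and pings before each write.
    // Returns true if everything was saved, false if the user canceled.
    // Assumption: a single record is written well within the ping interval.
    public static bool Save(IEnumerable<string> records, TextWriter writer, PingDelegate ping)
    {
        foreach (string record in records)
        {
            if (ping())
            {
                return false; // canceled: stop saving
            }
            writer.WriteLine(record);
        }
        writer.Flush();
        return true;
    }
}
```

Inside the recovery callback, the value returned by Save would be what gets passed to ApplicationRecoveryFinished.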

During startup, the application lets Windows know that, should it become "Not Responding", its ApplicationRecoveryCallback function is to be called. Any useful state is saved inside this function. Before running a quick test of the code and concepts covered so far, keep a couple of caveats in mind:

  1. If the application is started inside the Visual Studio hosting process, it won't work, because the hosting process is the one that gets recovered and restarted. Test the application through the Debug->Start Without Debugging menu item or press Ctrl+F5.
  2. Application Recovery and Restart and the Windows Error Reporting Service only consider applications that have been running for at least sixty seconds, so let the demo run for a minute before hanging it. For restarts, the MSDN documentation explains this limit as a guard against cyclical restarts.

Once the application has entered into the "Not Responding" state, the end-user, unable to do anything else, tries to close it. In response, Microsoft Windows shows the following message box:

CloseWait

When the application is busy with a lengthy operation, selecting "Wait for the program to respond" is usually enough: once the operation is done, control returns to the application and the user can continue. In this case, while(true); never relinquishes control, so the only thing the user can do is click on "Close the program". This is where the Windows Error Reporting Service calls the ApplicationRecoveryCallback function and the application state is saved. However, no provisions have been made yet to restart the application. This can be accomplished through the following code:

[STAThread]
static void Main(string[] args)
{
  string restartFile = Guid.NewGuid().ToString("N") + ".dat";
  RegisterApplicationRecovery(restartFile);
  RegisterApplicationRestart(restartFile);

  Application.EnableVisualStyles();
  Application.SetCompatibleTextRenderingDefault(false);
  Application.Run(mainForm = new MainForm());
}

private static void RegisterApplicationRestart(string restartFile)
{
  string commandLineArgs = restartFile;
  ApplicationRecoveryRestart.RegisterApplicationRestart(commandLineArgs, 0);
}
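
Passing 0 as the flags argument, as above, asks Windows to restart the application in every situation it supports. The Windows SDK also defines RESTART_NO_* values that opt out of individual cases. As a sketch only, they could be wrapped like this; RestartFlags and RestartRegistration are hypothetical names that are not part of the demo, and the Register call only works on Windows Vista or later.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around the RESTART_NO_* values from the Windows SDK.
[Flags]
public enum RestartFlags : uint
{
    None = 0,     // restart in every supported case
    NoCrash = 1,  // RESTART_NO_CRASH: do not restart after a crash
    NoHang = 2,   // RESTART_NO_HANG: do not restart after a hang
    NoPatch = 4,  // RESTART_NO_PATCH: do not restart after an update
    NoReboot = 8  // RESTART_NO_REBOOT: do not restart after an update-driven reboot
}

public static class RestartRegistration
{
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode)]
    private static extern int RegisterApplicationRestart(string commandLineArgs, uint flags);

    // Registers the restart command line and throws if Windows reports a failure HRESULT.
    public static void Register(string commandLineArgs, RestartFlags flags)
    {
        int hr = RegisterApplicationRestart(commandLineArgs, (uint)flags);
        Marshal.ThrowExceptionForHR(hr);
    }
}
```

For the demo, which only cares about hangs and crashes, something like RestartRegistration.Register(restartFile, RestartFlags.NoPatch | RestartFlags.NoReboot) would be one possible choice.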

Notice that the restartFile written during the recovery episode is passed as the command-line argument for the restart. The application must therefore determine during startup whether it is being started or restarted. The following is the modified Main function:

[STAThread]
static void Main(string[] args)
{
  // If a command-line argument is supplied then the application is being restarted and
  // the supplied argument is the name of the file containing the state to be restored...
  string restartFile;
  string restoredText = null;
  if(args.Length > 0)
  {
    restartFile = args[0];
    using(StreamReader reader = new StreamReader(restartFile))
    {
      restoredText = reader.ReadToEnd();
    }
  }
  else
  {
    restartFile = Guid.NewGuid().ToString("N") + ".dat";
    if(File.Exists(restartFile))
    {
      File.Delete(restartFile);
    }
  }

  RegisterApplicationRecovery(restartFile);
  RegisterApplicationRestart(restartFile);

  Application.EnableVisualStyles();
  Application.SetCompatibleTextRenderingDefault(false);
  mainForm = new MainForm();
  if(restoredText != null)
  {
    // The state is restored only after the form (and its TextBox) has been created...
    theTextBox.Text = restoredText;
  }
  Application.Run(mainForm);
}
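
The restart branch above assumes that the file named on the command line exists and can be read. A slightly more defensive variant could fall back to a normal start when it cannot. This is a minimal sketch: RestartState.TryLoad is a hypothetical helper, not part of the demo.

```csharp
using System;
using System.IO;

public static class RestartState
{
    // Tries to read the saved state named by the first command-line argument.
    // Returns false (and leaves state null) on a normal start, or when the
    // file is missing or cannot be read.
    public static bool TryLoad(string[] args, out string state)
    {
        state = null;
        if (args == null || args.Length == 0)
        {
            return false; // normal start: nothing to restore
        }

        string path = args[0];
        if (!File.Exists(path))
        {
            return false; // recovery never wrote the file
        }

        try
        {
            state = File.ReadAllText(path);
            return true;
        }
        catch (IOException)
        {
            return false; // file is locked or unreadable: start fresh
        }
    }
}
```

Main would then call RestartState.TryLoad(args, out restoredText) and only touch the TextBox when it returns true.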

Under these new circumstances, every time the application hangs and its Close button is clicked the following message box will be shown:

RestartCloseWait

Now there is a "Restart the program" option, which is self-explanatory. If the user selects it, the hung instance of the application is terminated and a new instance is launched while the following message box is shown:

IsRestarting

If everything works well, the application is restarted and its state is restored. That's it.

One final note. I have noticed that after the application has been started, hung, and restarted a number of times, a point is reached where the recovery/restart functionality stops working. RegisterApplicationRecoveryCallback returns E_FAIL, and as far as I can tell the reason is that the Windows Error Reporting Service runs out of some internal resource that keeps track of every application registered for recovery/restart. Like a good citizen, the application should unregister from the Windows Error Reporting Service through the UnregisterApplicationRecoveryCallback and UnregisterApplicationRestart APIs. At the time of this writing, and after several tests, I have not managed to find the correct place to unregister, which leaves me no option but to restart the Windows Error Reporting Service to make everything work again (yes, I know it is not very end-user friendly).
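
Because the failure shows up only as an HRESULT, it helps to log the value returned by every registration call. A small helper along these lines would do; HResult is a hypothetical name, not part of the demo, and the E_FAIL constant is the standard 0x80004005 value.

```csharp
using System;

public static class HResult
{
    public const int S_OK = 0;
    public const int E_FAIL = unchecked((int)0x80004005);

    // By convention, an HRESULT is a success code when it is not negative.
    public static bool Succeeded(int hr)
    {
        return hr >= 0;
    }

    // Builds a log line such as "SomeApi returned 0x80004005 (failure)".
    public static string Describe(string apiName, int hr)
    {
        return string.Format("{0} returned 0x{1:X8} ({2})",
            apiName, hr, Succeeded(hr) ? "success" : "failure");
    }
}
```

In the demo, Trace.WriteLine(HResult.Describe("RegisterApplicationRecoveryCallback", result)) right after the registration call would make the E_FAIL case visible in the trace output.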