AppDomains are the preferred means for isolating related managed components while still having the benefit of efficient in-process communication and relative ease of deployment/activation. The CLR Add-In framework, along with FrameworkElementAdapters, facilitates building UI consisting of pieces running in different AppDomains. Apart from the Add-In model abstractions, the basic interop interface is a native OS (“Win32”) window. While unfortunate for a composited UI framework like WPF, this is necessary because WPF cannot manage an element tree crossing an AppDomain boundary … and all previous UI frameworks for Windows use child windows as the basic building blocks. ;-) One benefit of having add-ins hosted in child windows, though, is that the “host” and the “add-in” don’t both need to be built on WPF. Here’s how WinForms can get involved too.

In some cases, however, AppDomain isolation does not suffice:

  • Reliability: A hang or a crash in a hosted UI component usually affects the host. In the most basic setup, the UI components run on the same thread, which crosses the AppDomain boundary to deliver window messages to the child window. The host is often not in a good position to catch an exception from the add-in or to distinguish it from an exception coming from its own code. That’s because exceptions often occur in the context of UI event handlers, which are ultimately responses to Win32 input messages, which get delivered to the target window by the message pump of the thread. WPF runs a standard Win32 message loop via Dispatcher.Run(). Application code is not directly involved in this loop, and even if it were, it could not identify the source of an exception that originated many abstraction layers away.

With some effort, a child window can be hosted on its own thread. Since each UI thread is supposed to run a message loop, the above problem could be largely avoided. (On top of that, the host would have to guard every direct call into the add-in’s entry points.) However, the hang problem remains. The native window manager automatically merges the input queues and states of threads that have windows in a parent-child relationship. This is generally very beneficial as it facilitates building seamless UI with focus integration and “accelerator” keys just working across the boundary (which requires Alt/Ctrl/etc. key states to be shared), but the message loops get implicitly synchronized: A call to GetMessage() blocks as long as there are messages at the front of the queue destined for the other window. You could opt out of this synchronization by detaching the two threads’ input queues with AttachThreadInput() (passing FALSE for fAttach), but then a lot of things that you normally take for granted “break”, and dealing with all the induced problems quickly becomes overwhelming. There is living proof, though, that this approach is viable and greatly beneficial (if you can afford the engineering cost ;-) ): Loosely Coupled IE.

  • Unloading: To begin with, AppDomain unloading is not guaranteed to succeed. The reference documentation for AppDomain.Unload() has big enough caveats. On top of that, WPF has its fair share of problems with releasing native resources and breaking object reference loops on unloading, which tend to show up in the context of CLR Add-Ins. Empirically, we know that explicitly calling Dispatcher.InvokeShutdown() in an add-in’s AppDomain, on every thread that has had a Dispatcher or WPF object used on it, is often the practical solution. However, this requires cooperation from the add-in and, at least in the case of partial-trust add-ins, loading a host assembly in the add-in’s AppDomain (to be able to call Dispatcher.InvokeShutdown(), which is a privileged operation). If an add-in gets stuck in native code or merely runs a message loop on some thread that gets no user input (presumably different from the host’s UI thread), AppDomain unloading is guaranteed to fail.
  • Native code cannot be easily convinced to follow AppDomain state isolation. This is a common integration challenge with legacy components that were not created with use from managed code in mind. Also, the bitness of the native component needs to be matched with that of the host process. This can be a problem because many legacy components are not available in 64-bit form. (Even the Flash and Silverlight ActiveX controls are not (yet).)

Cross-process UI isolation is the strongest, largely because it’s supported at the OS level. In particular, you can always kill a process and rely on complete cleanup. A typical add-in crash will just leave a blank “hole” in the host’s window. (Incidentally, this is what happens in the browser on the rare occasions the XBAP host process, PresentationHost.exe, just crashes.) The host can detect an unexpectedly vanished add-in process (for example, via Process.WaitForExit()) and restart the add-in. Unfortunately, hangs are still a challenge, due to the forced synchronization of UI thread message queues. But there is an effective mitigation for that: The host can have a “watchdog” thread that periodically pings the add-in’s UI thread and, if it finds the thread unresponsive, offers the user the option to forcibly recycle the add-in.
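The watchdog idea can be sketched independently of WPF. What follows is a small, platform-neutral Python model, not code from the attached solution: a queue-pumping thread stands in for the add-in’s UI thread, and a “ping” is just a no-op message that must be processed within a timeout. All names here (UiThread, ping, watchdog) are made up for illustration. In a real Windows host, the ping would presumably be a window message sent with a timeout, and on_hang would prompt the user.

```python
import queue
import threading
import time


class UiThread:
    """Stand-in for the add-in's UI thread: pumps callables from a queue."""

    def __init__(self):
        self._messages = queue.Queue()
        threading.Thread(target=self._pump, daemon=True).start()

    def _pump(self):
        while True:
            handler = self._messages.get()
            handler()  # a slow handler stalls everything queued behind it

    def post(self, handler):
        self._messages.put(handler)


def ping(ui, timeout):
    """Post a no-op 'ping' and report whether it ran within `timeout` seconds."""
    answered = threading.Event()
    ui.post(answered.set)
    return answered.wait(timeout)


def watchdog(ui, on_hang, rounds=3, interval=0.05, timeout=0.2):
    """Ping up to `rounds` times; call on_hang() at the first missed ping."""
    for _ in range(rounds):
        if not ping(ui, timeout):
            on_hang()  # e.g. ask the user whether to recycle the add-in
            return False
        time.sleep(interval)
    return True
```

The important design point is that the watchdog runs on its own thread and never waits without a timeout, so a stuck add-in cannot take the watchdog down with it.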

Unfortunately, WPF does not directly work with the CLR Add-In model’s out-of-process hosting (and provides subpar input & focus integration even in-process when the add-in’s UI runs on its own thread). The first basic problem is that the host process does not start on an STA thread. This can be worked around by using an add-in side adapter, but that gets awkward. A further problem is with input flow and tabbing across the thread/process boundary. The HwndSource/IKeyboardInputSink integration we have implemented in the FrameworkElementAdapters is designed for single-threaded cases. When a child window is running on its own thread, it doesn’t need to rely on the host for input, although selective delegation of accelerators may be desirable, but it is often expected to “bubble” unhandled input to the host and/or possibly give the host a “first chance” to handle select special keys. A future version of the framework may provide better support for cross-thread UI add-ins.

The easiest way to have part of the UI isolated in a separate process is via an XBAP. You can navigate the WebBrowser control directly to the .xbap file. (This can be WPF’s WebBrowser or WinForms’ or, in a native application, directly IE’s ActiveX control.) The XBAP will run in an instance of PresentationHost.exe, and its child window will be attached to the WebBrowser control. There are, however, some downsides to this approach:

  • The relatively significant runtime cost of ClickOnce deployment. (The first run can be particularly slow.)
  • You cannot control the bitness of PresentationHost.
  • Before .NET v4, there isn’t a convenient way for active two-way communication between the host application and the XBAP. This post presents one somewhat involved but effective technique.
    • In v4 we finally provide neat ‘script interop’ with the help of the Dynamic Language Runtime and the language-level support for it. The XBAP’s host application can expose its API through WebBrowser.ObjectForScripting, and the XBAP can access this object via BrowserInteropHelper.HostScript.

Finally, the attached solution illustrates doing cross-process UI hosting directly, using just HwndSource & HwndHost out of the box, running the two message loops, and routing input appropriately through the IKeyboardInputSink implementations of HwndHost and HwndSource. This example solves most of the above problems with in-process/cross-AppDomain hosting and provides good flexibility and performance. Much of the code is actually dedicated to somewhat extrinsic, “implementation-level” problems, but that’s how it typically is with solving real problems… ;-) The highlights:

  • Asynchronous loading of the add-in for better startup time and to avoid UI thread blocking
  • Use of .NET Remoting based on named pipes for two-way cross-process communication
  • Remotable object lifetime management. Graceful & fallback shutdown of the add-in process
  • Input & focus integration: tabbing in & out; “bubbling” unhandled keys from the add-in to the host
  • Solving threading issues

Input

The main thread of the “add-in” process runs a Win32-style message loop via Dispatcher.Run(). This would mostly suffice for the child-window HwndSource to get its input, but keyboard input handling generally works better if MSGs straight off the message loop are “preprocessed” before passing them to DispatchMessage() and DefWindowProc(). This helps with correct handling of accelerators, in particular with some ActiveX controls (like the one WebBrowser wraps), and avoids occasional IME translation mishaps. WPF’s Dispatcher provides a direct hook into its message loop via the ComponentDispatcher.ThreadPreprocessMessage event. When HwndSource owns a top-level window, it subscribes itself to ThreadPreprocessMessage and does all the needed preprocessing and routing internally (largely by delegating to its IKeyboardInputSink implementation). But we need a child window for the add-in. The trick I’ve used is to first initialize the HwndSource as a top-level window, while keeping the window invisible, and then immediately convert it to a child one. Fortunately, the native window manager supports this well, and WPF obliviously takes it well too.

To be able to “bubble” unhandled keys to the host, the add-in subscribes a handler of its own to ComponentDispatcher.ThreadPreprocessMessage, counting on it to be called after HwndSource’s. The MSGs are relayed to the host via a custom remotable interface. Then the host obtains the main window’s IKeyboardInputSink implementation and calls TranslateAccelerator() and OnMnemonic(). Where do the messages go from there? This arrangement invokes a special input event routing mode of HwndSource: It will see that a child window (“sink”) has focus and will create an event route using the wrapper element (the HwndHost here) for the child window as the “forced” target. Thus, any element up to the root visual has a chance to see and handle the events, which is much in keeping with how most WPF routed events work. (The sample application shows a couple of interesting cases of key input delegation.)
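The ordering dependence described above can be illustrated with a toy model. This Python sketch is not WPF code and makes a simplifying assumption: a “message” is just a key name, and the preprocess event calls its handlers strictly in subscription order. The class and function names (PreprocessEvent, Host, make_addin_loop, handle_bubbled_key) are hypothetical. Only the names translate_accelerator and on_mnemonic mirror the IKeyboardInputSink methods mentioned in the text.

```python
class PreprocessEvent:
    """Ordered event: handlers run in subscription order, and each one is
    told whether an earlier handler already handled the message."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def raise_event(self, key):
        handled = False
        for handler in self._handlers:
            handled = handler(key, handled) or handled
        return handled


class Host:
    """Host side: tries accelerators first, then mnemonics."""

    def __init__(self, accelerators, mnemonics):
        self.accelerators = set(accelerators)
        self.mnemonics = set(mnemonics)
        self.log = []

    def translate_accelerator(self, key):
        if key in self.accelerators:
            self.log.append(("accelerator", key))
            return True
        return False

    def on_mnemonic(self, key):
        if key in self.mnemonics:
            self.log.append(("mnemonic", key))
            return True
        return False

    def handle_bubbled_key(self, key):
        # Stands in for the custom remotable interface the add-in calls.
        return self.translate_accelerator(key) or self.on_mnemonic(key)


def make_addin_loop(own_keys, host):
    event = PreprocessEvent()
    # 1) The add-in's own UI gets the first chance (HwndSource's role).
    event.subscribe(lambda key, handled: handled or key in own_keys)
    # 2) Subscribed later, so it runs later: bubble only what is unhandled.
    event.subscribe(
        lambda key, handled: handled or host.handle_bubbled_key(key))
    return event
```

If the bubbling handler were subscribed first, the host would see every key before the add-in’s own UI did, which is why the add-in has to count on being called after HwndSource.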

For tabbing out, the KeyboardNavigation machinery gives us a special cue by calling IKeyboardInputSite.OnNoMoreTabStops(). The add-in provides a simple implementation of this interface and relays the call to the host, which merely calls MoveFocus() on the HwndHost. For tabbing into the add-in, the host implements IKeyboardInputSink.TabInto() on the HwndHost, which is appropriately called by the internal keyboard navigation code.

The host could selectively give the add-in an opportunity to handle keyboard input. Doing so, however, is not analogous to how the add-in bubbles unhandled key input. The main difference is that there is no defined target element within the add-in’s UI, because focus is somewhere within the host’s window. (That’s why the host’s thread is getting the input messages.) Thus, calling IKeyboardInputSink.TranslateAccelerator() on the child-window HwndSource will not route input across that element tree. However, the InputManager will raise its global (per-thread, really) events: PreProcessInput, PostNotifyInput, etc. The add-in could handle these events and filter for the specific keys it’s interested in. Or a custom interface could be defined across which to pass “managed” KeyDown events. There is one type of key input that still makes sense to delegate to the child IKIS: mnemonics (these are Alt+something key combinations). HwndSource’s built-in implementation of IKIS.OnMnemonic() does most of its work independently from focus: It tries to activate AccessKeys within its element tree and then “broadcasts” the input event to any child sinks of its own (say, a WebBrowser control).

Threading

The two message loops appear to operate independently, and they are certainly unaware of each other per se. But the OS does the input queue & input state merging I mentioned earlier. Each input MSG is destined to a specific window, but the two message loops are essentially forced to serialize access to their respective messages. This creates implicit synchronization between the two threads (specifically around their message-pumping activity). One unfortunate practical implication is that the add-in can hang the host. A good strategy to mitigate this risk is to avoid doing any non-trivial input processing on the UI thread. While this is healthy practice in general, it’s not a guarantee, it's difficult to be so disciplined, and an external add-in may not cooperate. The “watchdog” feature I described earlier can be implemented if the host needs a stronger reliability guarantee.

Regardless of the specific IPC method chosen, for the two threads to be able to talk to each other, the receiving one has to be awaiting a Win32 message, that is, waiting in a call to the native GetMessage() or similar function or blocked in a message-alertable state supported by MsgWaitForMultipleObjects(). This is the general convention in Windows user-mode programming: A thread can be “hijacked” to respond to an external event (relatively) safely only around message pumping or when explicitly blocked in the special alertable state. Thus, for a UI thread to be responsive to external “calls”, it’d better be doing what UI threads are supposed to do most of the time: message pumping.

When making a simple, “one shot” call, the calling thread might just block until the call is processed and returned. But if the called thread decides to call back before returning, we have a problem: a deadlock. This kind of “reentrant” call tends to occur naturally between related UI components, and even the handling of basic window state changes such as resizing and focus moves requires synchronous handling of window messages sent from one thread to another. Consequently, the calling thread should prepare itself to handle incoming messages while waiting for the call it has just made to complete. The most common traditional solution for this problem in Windows has been provided by COM. Being a means for cross-thread and cross-process communication and being used in UI scenarios, it had to. That’s one of the essential features of Single Threaded Apartments: Outgoing calls run an efficient local message loop waiting for the call to return and simultaneously dispatching all sent window messages and the COM runtime’s internal window messages used to marshal STA calls from RPC thread-pool threads to the right thread. The CLR has had to accommodate message-based reentrancy as well, and as a pretty tough compromise at that: “Managed” blocking, such as via WaitHandle.WaitOne(), is window message permeable on an STA thread. While helping with interop scenarios and crucially preventing deadlocks, this tends to bite unexpectedly due to the unintuitive reentrancy allowed by an explicit “blocking” operation and the potentially asynchronous/random nature of such reentrant calls. I should not digress too much, because Chris Brumme has written the definitive treatise on this subject.
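The difference between a hard-blocking call and a message-permeable one is easier to see in a toy model. The following Python sketch is only an illustration, with no COM or Win32 involved: each “thread” owns a queue of messages, and all names (MessageThread, call_blocking, call_pumping, make_addin_handler) are invented for the example. The pumping variant polls for brevity. A real implementation would wait on the reply and the message queue at the same time.

```python
import queue
import threading
import time


class MessageThread:
    """Owns an inbox of (function, reply_queue) messages."""

    def __init__(self):
        self.inbox = queue.Queue()

    def start(self):
        threading.Thread(target=self._pump_forever, daemon=True).start()
        return self

    def _pump_forever(self):
        while True:
            self.dispatch(self.inbox.get())

    @staticmethod
    def dispatch(message):
        func, reply = message
        reply.put(func())


def call_blocking(target, func, timeout):
    """Hard-blocking call: nothing is dispatched on the caller while it waits."""
    reply = queue.Queue()
    target.inbox.put((func, reply))
    return reply.get(timeout=timeout)  # raises queue.Empty on timeout


def call_pumping(caller, target, func, timeout):
    """Message-permeable call: keep dispatching the caller's own inbox
    while waiting, so the callee can call back before it returns."""
    reply = queue.Queue()
    target.inbox.put((func, reply))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            return reply.get(timeout=0.01)
        except queue.Empty:
            pass
        try:
            caller.dispatch(caller.inbox.get_nowait())
        except queue.Empty:
            pass
    raise TimeoutError("call did not complete")


def make_addin_handler(host):
    """Add-in handler that calls back into the host before returning."""
    def handle_key():
        try:
            state = call_blocking(host, lambda: "ready", timeout=0.3)
        except queue.Empty:
            return "callback never served"  # the timeout breaks the deadlock
        return "host said: " + state
    return handle_key
```

With call_blocking, the host never services the add-in’s callback, and only the timeout in the handler prevents a true deadlock. With call_pumping, the host dispatches the callback while it waits and the call completes normally. The price is the reentrancy discussed above: arbitrary handlers may run inside what looks like a blocking call.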

Because I’ve used .NET Remoting, and not COM, for the cross-process communication in the example implementation, I had the somewhat unexpected occasion to experience and ponder the above reentrancy vs. deadlocking issue. It turns out that the IPC transport I chose, named pipes (the most efficient one available out of the box), does pretty hard blocking when sending “messages” and awaiting a response: At the bottom of the abstraction stack, a call to the ReadFile() native function is made to read the response message stream, and this function blocks unconditionally. So, for example, if the add-in tried to bubble some unhandled key to the host and the host decided to talk to the add-in while handling the given key or performed some window operation that affects the add-in’s window too, we’d get a deadlock because the add-in’s main (UI) thread is hard-blocked. To avoid this, all deadlock-prone calls across the boundary (in either direction) are made not from the UI thread but from a worker thread. WPF’s Dispatcher is run on this worker thread to be able to conveniently marshal calls to it. And there is a crucial side effect of calling Dispatcher.Invoke() cross-thread: The calling thread gets blocked in the message-permeable way described above, thus permitting reentrant calls via window messages. How would such a call actually be delivered? Via the Dispatcher again, but that of the pseudo-blocked UI thread. A Remoting call is normally received first on some random thread-pool thread. (That’s just how Remoting works. MarshalByRefObjects are treated as agile, that is, not bound to any particular thread; thus, the runtime can deliver an incoming call on any thread.) For calls where some trivial processing is done not involving UI objects, no thread switching is needed. But WPF is pretty unforgiving if any UI object is accessed from a foreign thread. For such cases, that object’s Dispatcher is used to do a callback.
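The last step, marshaling an incoming call from an arbitrary thread-pool thread to the thread that owns the UI object, can be modeled in a few lines. This is a simplified Python sketch of the pattern, not of WPF’s actual Dispatcher: the Dispatcher, Label and remote_set_text names are illustrative stand-ins, and unlike Dispatcher.Invoke() the invoke() method here blocks the calling thread plainly rather than in a message-permeable way.

```python
import queue
import threading
from concurrent.futures import Future, ThreadPoolExecutor


class Dispatcher:
    """Minimal model of a per-thread dispatcher with a work queue."""

    def __init__(self):
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            func, result = self._queue.get()
            try:
                result.set_result(func())
            except Exception as exc:
                result.set_exception(exc)

    def check_access(self):
        return threading.current_thread() is self._thread

    def invoke(self, func):
        """Run func on the dispatcher's thread and return its result."""
        if self.check_access():
            return func()
        result = Future()
        self._queue.put((func, result))
        return result.result()


class Label:
    """UI-like object with thread affinity, as WPF objects have."""

    def __init__(self, dispatcher):
        self.dispatcher = dispatcher
        self.text = ""

    def set_text(self, text):
        if not self.dispatcher.check_access():
            raise RuntimeError("accessed from a foreign thread")
        self.text = text


def remote_set_text(label, text):
    """Entry point for an incoming call, running on an arbitrary pool thread."""
    label.dispatcher.invoke(lambda: label.set_text(text))
```

Calling remote_set_text from a ThreadPoolExecutor worker succeeds because the actual mutation happens on the dispatcher’s thread, whereas calling label.set_text directly from any other thread raises, which mirrors how unforgiving WPF is about foreign-thread access.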

General Notes

  • Most of the example solution and the foregoing discussion are valid for any cross-thread child window hosting, including in-process. In spite of the implicit UI thread synchronization, overall responsiveness may be better than with single-threaded UI because there is opportunity for true concurrent processing. While in-process hosting does not give you the stronger reliability guarantees afforded by having a separate host process for the child window, performance will be noticeably better, especially if the application-level APIs are chatty. Also, .NET Remoting has a special, optimized cross-AppDomain in-process marshaling mode.
  • Also of practical interest is hosting WPF content in non-WPF applications. Since the essential integration interfaces between the “host” and the “add-in” in the above solution are a native window handle (HWND) and the fairly abstract IKeyboardInputSink, it is conceivable the host side could be implemented without WPF. For a purely native host application, however, .NET Remoting would have to be replaced with another IPC method, the most obvious and versatile choice being COM (and with it the deadlock/reentrancy issue is solved for free :-) ).

[Update, 4/29/10] Windows hosted by HwndHost tend to flicker heavily on resizing. This really shows with the WebBrowser control in my sample project, perhaps exacerbated by the cross-process window messaging. The solution is shown in this KB article: http://support.microsoft.com/kb/969728. (WindowsFormsHost derives from HwndHost, but the problem really is in HwndHost.) I updated the attached project.