I described how profilers may attach to already-running processes in some previous posts (#1 and #2). In this post I’m writing about how profilers that are already loaded may detach from a running process before that process exits. Like Profiler Attach, this is a new feature available starting with CLR V4.
The Detach feature allows a profiler that the user is finished with to be unloaded. That means the application may return to its usual behavior and performance characteristics, with no profiler loaded and adding overhead. Also, since only one profiler may be loaded at a time, detaching a profiler makes room for a different (or the same) profiler to be loaded later on when the user wishes to do more diagnostics.
Not every V4 profiler is allowed to detach from a running process. The general rule is that a profiler which has caused an irreversible impact in the process it’s profiling should not attempt to detach. The CLR catches the following cases:
If the profiler attempts to detach after doing any of the above, the CLR will disallow the attempt (see below for details).
That said, there are still other irreversible things the profiler might do to a process (which would also make detaching a bad idea). Imagine a profiler that allocates memory without cleaning up after itself, creates threads without waiting for them to exit, uses metadata APIs to modify aspects of the running managed code, etc. Profiler writers need to use good judgment when considering whether to allow their profilers to detach from running processes. You don’t want to give your customers the experience of noticing the app they profile always behaves weirdly after detaching your profiler. So do not use the detach feature unless you’ve thought through the ramifications and can ensure the profiler does not leave the application in a noticeably different state.
By the way, you may notice I said nothing about a profiler needing to load via attach in order for it to be able to use the detach feature. In fact, any profiler that loads on startup of the application (i.e., via environment variables and not via the AttachProfiler API) is perfectly welcome to use the detach feature—so long as it does not leave an impact on the process as per above.
There’s one, deceptively simple-looking method the profiler calls to detach itself from the running process. However, detaching is a big responsibility, and profiler writers need to give thoughtful consideration to doing it properly. The CLR does its part to ensure it doesn’t accidentally call into the profiler via Profiling API methods after the CLR unloads the profiler DLL. However, if the profiler has set into motion extra threads, Windows callbacks, timer interrupts, etc., then the profiler must “undo” all of these things before it attempts to detach from the running process. Basically, any way for control to re-enter the profiler DLL must be disabled before detaching, or else your users will experience crashes after trying to detach your profiler.
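One way to keep track of everything that has to be "undone" is to record each re-entry source at the moment the profiler creates it. The following is only a sketch under that assumption: `ReentryRegistry` and its methods are hypothetical names, not part of the Profiling API, and a real profiler would supply the actual thread, timer, or callback shutdown logic in each `disable` function.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the Profiling API): remembers every way
// control can re-enter the profiler DLL, plus the function that disables it.
class ReentryRegistry {
public:
    // Call whenever the profiler starts a thread, arms a timer, or registers
    // an OS callback. `disable` must not return until re-entry is impossible
    // (for a thread, that means signalling it and waiting for it to exit).
    void Register(std::string name, std::function<void()> disable) {
        assert(disable && "every re-entry source needs a way to disable it");
        sources_.push_back(Source{std::move(name), std::move(disable)});
    }

    // Disables sources newest-first, so a source that depends on an earlier
    // one goes away before its dependency. Returns how many were disabled.
    std::size_t DisableAll() {
        std::size_t disabled = 0;
        while (!sources_.empty()) {
            Source source = std::move(sources_.back());
            sources_.pop_back();
            source.disable();
            ++disabled;
        }
        return disabled;
    }

    // Only once this returns true should the profiler request a detach.
    bool SafeToDetach() const { return sources_.empty(); }

private:
    struct Source {
        std::string name;               // kept for logging and diagnostics
        std::function<void()> disable;  // shuts this re-entry source down
    };
    std::vector<Source> sources_;
};
```

With something like this in place, the detach path can call `DisableAll()` and check `SafeToDetach()` before it ever asks the CLR to unload the DLL.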
So, the sequence works like this:
Let’s dive a little deeper into the method you call to detach your profiler:
HRESULT RequestProfilerDetach([in] DWORD dwExpectedCompletionMilliseconds);
First off, you’ll notice this is on ICorProfilerInfo3, the interface your profiler DLL uses from within the same process as your profilee. Although the AttachProfiler API is called from outside the process, this detach method is called from in-process. Why? Well, the general rule with profilers is that everything is done in-process. Attach is an exception because your profiler isn’t in the process yet. You need to somehow trigger your profiler to load, and you can’t do that from a process in which you have no code executing yet! So Attach is sort of a bootstrapping API that has to be called from a process of your own making.
Once your profiler DLL is up and running, it is in charge of everything, from within the same process as the profilee. And detach is no exception. Now with that said, it’s probably typical that your profiler will detach in response to an end user action—probably via some GUI that you ship that runs in its own process. So a case could be made that the CLR team could have made your life easier by providing an out-of-process way to do a detach, so that your GUI could easily trigger a detach, just as it triggered the attach. However, you could make that same argument about all the ways you might want to control a profiler via a GUI, such as these commands:
The point is, if you have a GUI to control your profiler, then you probably already have an inter-process mechanism for the GUI to communicate with your profiler DLL. So think of “detach” as yet one more command your GUI will send to your profiler DLL.
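To make the "one more command" idea concrete, here is a rough sketch of a command handler inside the profiler DLL. Everything except the `RequestProfilerDetach` name and its parameter is an assumption made for illustration: the `HRESULT`/`DWORD` definitions and `ProfilerInfoStandIn` are stand-ins so the example builds without the Windows SDK or corprof.h, and `CommandHandler`, `CommandResult`, the `"detach"` command string, and `FakeProfilerInfo` are hypothetical. How the command string actually travels from your GUI to the DLL is up to your own inter-process mechanism.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-ins so the sketch builds without the Windows SDK or corprof.h.
using HRESULT = std::int32_t;
using DWORD = std::uint32_t;
constexpr HRESULT S_OK = 0;
constexpr HRESULT E_FAIL = static_cast<HRESULT>(0x80004005u);
constexpr bool Failed(HRESULT hr) { return hr < 0; }

// Stand-in for the single ICorProfilerInfo3 method used in this sketch.
struct ProfilerInfoStandIn {
    virtual ~ProfilerInfoStandIn() = default;
    virtual HRESULT RequestProfilerDetach(DWORD dwExpectedCompletionMilliseconds) = 0;
};

enum class CommandResult { Accepted, Rejected, UnknownCommand };

// Hypothetical handler for commands the GUI sends to the profiler DLL.
class CommandHandler {
public:
    CommandHandler(ProfilerInfoStandIn* info, DWORD longestCallbackMs)
        : info_(info), longestCallbackMs_(longestCallbackMs) {
        assert(info_ != nullptr);
    }

    CommandResult Handle(const std::string& command) {
        if (command == "detach") {
            // A real profiler would first shut down its own threads, timers,
            // and OS callbacks so nothing can re-enter the DLL afterwards.
            const HRESULT hr = info_->RequestProfilerDetach(longestCallbackMs_);
            return Failed(hr) ? CommandResult::Rejected : CommandResult::Accepted;
        }
        return CommandResult::UnknownCommand;
    }

private:
    ProfilerInfoStandIn* info_;
    DWORD longestCallbackMs_;
};

// Test double that records the call instead of talking to a real CLR.
struct FakeProfilerInfo : ProfilerInfoStandIn {
    HRESULT resultToReturn = S_OK;
    DWORD lastTimeoutMs = 0;
    int calls = 0;

    HRESULT RequestProfilerDetach(DWORD dwExpectedCompletionMilliseconds) override {
        ++calls;
        lastTimeoutMs = dwExpectedCompletionMilliseconds;
        return resultToReturn;
    }
};
```

The handler reports a failed HRESULT back as `Rejected`, so the GUI can tell the user the profiler is still loaded rather than assuming the detach went through.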
Ok, fine, so your profiler DLL is the one to call RequestProfilerDetach. What should it specify for “dwExpectedCompletionMilliseconds”? The purpose of this parameter is for the profiler to give a guess as to how long the CLR should expect to wait until all control has exited the profiler, thus ensuring success of the CLR’s periodic safety checks (step 5). So consider all of your callback implementations and what they do. Pick the “longest” one—the one that does the most processing or blocking or complex calls back into the CLR via ICorProfilerInfo or other interfaces. Roughly how long will that callback implementation take? That’s the value (in milliseconds) that you specify for this parameter.
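If you would rather measure than guess, one possible approach is to have the profiler track the longest callback it has observed and use that as the value. This is just a sketch of that idea, not anything the CLR requires: `LongestCallbackTracker`, `ScopedCallbackTimer`, and the `floorMs` minimum are hypothetical names introduced here.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical helper: remembers the longest callback duration seen so far.
class LongestCallbackTracker {
public:
    void Record(std::chrono::milliseconds elapsed) {
        // Clamp into the range of a DWORD-sized value (negative becomes 0).
        const long long cap = std::numeric_limits<std::uint32_t>::max();
        const long long raw = static_cast<long long>(elapsed.count());
        const std::uint32_t ms =
            static_cast<std::uint32_t>(std::min(std::max(raw, 0LL), cap));

        // Lock-free "store if larger": callbacks may run on many threads.
        std::uint32_t seen = longestMs_.load();
        while (ms > seen && !longestMs_.compare_exchange_weak(seen, ms)) {
            // A failed exchange refreshed `seen`; the loop re-checks it.
        }
        assert(longestMs_.load() >= ms);  // the recorded maximum only grows
    }

    // Candidate value for dwExpectedCompletionMilliseconds, never below floorMs.
    std::uint32_t ExpectedCompletionMs(std::uint32_t floorMs) const {
        return std::max(floorMs, longestMs_.load());
    }

private:
    std::atomic<std::uint32_t> longestMs_{0};
};

// Place one of these at the top of a callback implementation to time it.
class ScopedCallbackTimer {
public:
    explicit ScopedCallbackTimer(LongestCallbackTracker& tracker)
        : tracker_(tracker), start_(std::chrono::steady_clock::now()) {}

    ~ScopedCallbackTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        tracker_.Record(
            std::chrono::duration_cast<std::chrono::milliseconds>(elapsed));
    }

    ScopedCallbackTimer(const ScopedCallbackTimer&) = delete;
    ScopedCallbackTimer& operator=(const ScopedCallbackTimer&) = delete;

private:
    LongestCallbackTracker& tracker_;
    std::chrono::steady_clock::time_point start_;
};
```

The floor guards against passing a tiny value when the profiler has only seen fast callbacks so far; what a sensible floor is depends on your profiler.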
The CLR uses that value in its Sleep() statement that sits between each periodic safety check done as part of step 5. Although the CLR reserves the right to change the details of this algorithm, currently during step 5 the CLR sleeps dwExpectedCompletionMilliseconds before checking whether all callback methods have popped off all stacks. If they haven’t, the CLR will sleep an additional dwExpectedCompletionMilliseconds (for a total sleep time of 2*dwExpectedCompletionMilliseconds) and try again. If callback methods are still on any stacks, then the CLR degrades to a steady-state of sleeping for 10 minutes and retrying, repeating until the profiler may be unloaded.
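The schedule just described can be modeled in a few lines. This is only an illustration of the behavior as stated above, not the CLR's actual code, and as noted the CLR may change the details; `SleepBeforeCheckMs`, `ElapsedAtCheckMs`, and `kSteadyStateSleepMs` are names made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Ten minutes, the steady-state sleep described above.
constexpr std::uint32_t kSteadyStateSleepMs = 10u * 60u * 1000u;
static_assert(kSteadyStateSleepMs == 600000u, "10 minutes in milliseconds");

// How long the runtime sleeps before safety check number `checkIndex`
// (0-based): the first two checks use the profiler's estimate, and every
// later check waits the steady-state interval.
constexpr std::uint32_t SleepBeforeCheckMs(unsigned checkIndex,
                                           std::uint32_t expectedMs) {
    return checkIndex < 2 ? expectedMs : kSteadyStateSleepMs;
}

// Total time slept by the moment check number `checkIndex` runs.
inline std::uint64_t ElapsedAtCheckMs(unsigned checkIndex,
                                      std::uint32_t expectedMs) {
    std::uint64_t total = 0;
    for (unsigned i = 0; i <= checkIndex; ++i) {
        total += SleepBeforeCheckMs(i, expectedMs);
    }
    assert(total >= expectedMs);  // the first sleep is always included
    return total;
}
```

For example, with an estimate of 500 ms the first two checks happen at 500 ms and 1,000 ms; if a callback is still on a stack at that point, the third check does not come until ten minutes later. That is why an estimate that is too low can make detach take much longer than an honest one.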
Until the profiler can be unloaded, it will be considered “loaded” (though deactivated in the sense that no new callback methods will be called). This prevents any new profiler from attaching.
Ok, that wraps up how detaching works. If you remember only one thing from this post, remember that if you’re not careful, it’s really easy to cause an application you profile to hit an access violation (AV) after your profiler unloads. While the CLR tracks outgoing ICorProfilerCallback* calls, it does not track any other way that control can enter your profiler DLL. Before your profiler calls RequestProfilerDetach: