12/6/2007 Update: The bug described below has been fixed in CLR 2.0 SP1.  (See this post for more info.)  However, you may still find this info useful as a tutorial on performing IL rewriting.

Holger, a helpful reader of this blog, has pointed out a bug in the CLR Profiling API, in how it enables profilers to inspect certain kinds of return values.  I'm going to explain the bug and offer a way for you to work around it.  Although we do plan to release a fix for this bug, it's still unclear when that fix will be available, and it could be a long, long time away. (Don't delay shipping your profiler until this gets fixed!)

Background

You can use SetEnterLeaveFunctionHooks2 to tell the CLR to call special hooks in your profiler whenever a managed function is entered and left. In these hooks you may opt in to receiving information about the parameters (on Enter) and return value (on Leave). Jonathan Keljo has the definitive blog entry on how you set this up.

In particular, you may opt in to receiving return values by using the profiler flags COR_PRF_MONITOR_ENTERLEAVE and COR_PRF_ENABLE_FUNCTION_RETVAL, and calling SetEnterLeaveFunctionHooks2(). In your FunctionLeave2 hook:

void FunctionLeave2(FunctionID funcId,
                    UINT_PTR clientData,
                    COR_PRF_FRAME_INFO func,
                    COR_PRF_FUNCTION_ARGUMENT_RANGE *retvalRange);

you may use retvalRange to explore the return value. retvalRange is a struct with two fields: startAddress and length.

The Bug

If all of the following are true, startAddress may be wrong:

  1. The managed function that you are hooking with FunctionLeave2 returns a value-type that is not a primitive. For example, it returns a struct. AND
  2. You're profiling on x86 architectures (this bug does not appear in x64 or ia64). AND
  3. When the managed function was JITted or NGENd, the JIT/NGEN compiler set up the stack frame in way that the Profiling API didn't expect. (This is vague; details below.)
Before I dive into clarifying #3, I just want to emphasize where this bug does NOT apply. You will NOT see this bug when inspecting arguments (in your FunctionEnter2 hook), as only return values are affected. You will NOT see this bug when the return value is a primitive (e.g., int, float, etc.). You will NOT see this bug when the return value is a reference type (e.g., Object, String, any class defined in C#, etc.). You will NOT see this bug on 64 bit platforms. The bug only applies when you're executing in x86, and you're inspecting a return value that's a non-primitive value-type, like structs in C#.

And even then, you may not hit the bug, as that brings us to #3. #3 talks about an invariant in our code base that was unfortunately broken... some of the time. The profiling API made an assumption that the return value will always be copied to the leaf-most end of the native stack by the time a managed function returns. Unfortunately, the JIT/NGEN compiler doesn't always generate code that does this. Here are some examples of when this assumption is invalid:

  • If the JIT/NGEN generated debuggable code. In C# you can force this by passing /debug /o- to the C# compiler (csc.exe).
  • If the JIT/NGEN performed inlining, which introduced new local variables spilled onto the stack.
  • If the method uses dynamic stack allocation (stackalloc in C#).
  • More...?

I'm trying to give you as much information as I can about this, but a key point here is you will never really know for sure when the JIT/NGEN compiler set up the stack frame in an unexpected way (i.e., when #3 of the 3 numbered items above is true).  The JIT is a complicated beast, and it would be foolish of me to try to list out every possible way the stack layout will cause this bug to occur. Although these examples may give you an idea of a minimum bound on how often #3 is true, you should always assume that #3 will be true. Your profiler will know at run-time when #1 and #2 are true. So if either of them is false, it's safe for you to inspect return values. But your profiler cannot know with 100% certainty when #3 is true. So if #1 & #2 are true, you must assume #3 is true as well, or else your profiler is doomed to report bogus data at seemingly random times.

The Workaround

We tossed around some ideas on how your profilers can work around this until we can officially release a fix. Indeed, I could probably write an interesting blog entry just on one of the ideas we dismissed. Eventually we decided that our best idea was for the profiler to use IL rewriting to get at the return values in a safe way.

Please don't cry. IL rewriting isn't that bad. :-) In the future, I plan to write some entries with tips on doing IL rewriting safely. For now, here is a general idea for how a profiler might use IL rewriting to inspect return values.

Intercept JITCompilationStarted callbacks, waiting for functions whose return values you're interested in. One gotcha to look out for here would be if the app you're profiling is loading NGENd images. If so, you won't be getting JITCompilationStarted messages for those functions that have already been compiled to native code. At least, not without a little help from you. You'll also want to intercept JITCachedFunctionSearchStarted(), which gets called for each function before the CLR chooses to use the NGENd version of it. If you set *pbUseCachedFunction to FALSE inside this callback, the CLR will ignore the NGENd version of the function, and will JIT the function instead. Then, you'll get your JITCompilationStarted callback.

But wait! You might not even get the JITCachedFunctionSearchStarted callback! The comment in corprof.idl warns us that the only reliable way to get JITCachedFunctionSearchStarted is if the profiled app is using the ngen /Profile flavor of the NGENd modules. You can ensure this happens if you set any of the special profiler event flags that force the CLR to use the /Profile flavor of all NGENd modules. COR_PRF_USE_PROFILE_IMAGES is the most obvious of these flags. But COR_PRF_MONITOR_ENTERLEAVE will do the trick just as well, and you were probably already setting that one if you were trying to inspect return values in your Leave2 hook. If your profiler sets any of the flags that force the CLR to load the /Profile-flavored NGEN modules, but some of the /Profile-flavored NGEN modules aren't available on the box running the profiled app, the CLR will just default back to JITting the functions in those modules (in which case you'll get the JITCompilationStarted callbacks anyway).

OK, so finally we're reliably getting JITCompilationStarted callbacks. When you receive JITCompilationStarted for a function whose return value interests you, grab its IL: GetFunctionInfo lets you go from FunctionID to ModuleID and the function's mdToken; GetILFunctionBody() takes the ModuleID and mdToken and gives you the IL.

The trick we'll use to rewrite the IL is based on the fact that the return value will be loaded onto the IL stack just before returning. As a brief aside, note that the IL "stack" is really a logical stack. When the JIT generates native code, pushes and pops against the IL stack don't necessarily translate to pushes and pops on the native stack. Before you go too deep into IL rewriting you will definitely want at least a basic understanding of how IL works. Free information can be found by running ildasm on any assembly (to view the IL that results from compiling C#, VB.NET, etc.), and by reading through the ECMA CLI Specification (currently on msdn at http://msdn.microsoft.com/netframework/ecma/). Partition III goes into the details of the IL instructions, and Partition II talks about metadata.

OK, back to the trick. By the time you come across a ret instruction, you know the return value should be sitting at the leaf of the IL stack. What we'd like to do is peek at this value somehow. Perhaps the easiest way to do this is to just write a managed assembly that contains a function to receive the value as a parameter, and then perform operations on it (e.g., calling its ToString() implementation). Let's call this function "MyInspector()". You can write MyInspector using your favorite managed language, editor, and compiler. You can write MyInspector at design time, and you don't need to use IL directly (that's what the compiler is for :-)). The dynamic IL you'll rewrite at run-time will just call MyInspector with the return value.

Let's start with the dynamic IL. You've found a ret instruction. Now what? You know you want to take the return value that was pushed onto the IL stack, and pass it to your MyInspector() function. Fortunately IL passes parameters by loading them onto the stack, so it would seem the return value is right where we want it! The catch is that your MyInspector() function will automatically pop the parameter off of the stack, leaving an empty stack by the time your ret instruction gets called. We need to push that return value again somehow. Luckily, "dup" comes to the rescue. It just takes the leaf-most item on the stack and pushes it again--a handy shortcut for doing this manually.

Now you have a choice. Do you pass the value directly to MyInspector or do you box it first? That depends on the type of parameter you design MyInspector to accept. If you pass the value directly, then you'll need many different inspector functions, each taking a parameter with a specific value type. You could write many such MyInspector functions for well-known non-primitive value types, like System.Decimal. Alternatively, you could have a single MyInspector that just takes an Object. To call this function, your dynamic IL would need to box the return value before calling MyInspector. Probably you'll want to do a combination of both approaches: have specific MyInspectors for known types, and default to the Object version for everything else. In this example, I'll just show a single Object-typed MyInspector():

namespace MyProfiler
{
    public class Helper
    {
        public static void MyInspector(Object o)
        {
            Console.WriteLine("MyProfiler.Helper.MyInspector: {0}", o.ToString());
        }
    }
}
In order to call this, we'll have to box our return value first. That means we'll have to know the type of our return value. This can be gotten from the signature of the function whose IL you're rewriting. You'll need to play around in metadata-land to fish this out, but luckily the SigParse sample will help out here. The IL transformation will look like this:

Original:

	ret
New and improved:
	dup
	box (TypeRef / TypeDef token)
	call MyProfiler.Helper.MyInspector
	ret
The call instruction could use a little explanation. Here's another place where you're going to need to do some metadata work. The call instruction expects a metadata token that references MyProfiler.Helper.MyInspector. In this case, since you're making a separate assembly to house MyInspector(), you'll need to get a MemberRef to it. Unfortunately, I doubt many of the apps you profile will have a clue that you happened to write your own assembly, much less come complete with references to your assembly's methods! You'll need to dynamically modify the metadata of any module that contains functions for which you plan to rewrite IL. A handy time to do this is during the ModuleLoadFinished callback for such a module. The following sequence of metadata calls will do what you need:
  • IMetaDataAssemblyEmit::DefineAssemblyRef() gives you an mdAssemblyRef to your assembly. A little work is necessary to get this right. A reliable way to reference your assembly is to sign your assembly, add it to the GAC, and use the public key that "gacutil /l" prints out for you
  • IMetaDataEmit2::DefineTypeRefByName goes from mdAssemblyRef to mdTypeRef for your class (MyProfiler.Helper)
  • IMetaDataEmit2::DefineMemberRef takes your class's mdTypeRef and the name of your function (MyInspector) and finally gives you the mdMemberRef. You'll need the signature of your MyInspector method, but you can easily find this by running ildasm against your assembly.

Caveats

Be careful which functions you perform IL rewriting on! Here are some things to keep in mind:

  • The return value will not be pushed onto the stack if the function returns void! Yeah, a little silly, but if you accidentally rewrite IL in this way on a void-returning function, the JIT will throw an exception complaining of a stack "underflow".
  • In general, it's wise to be careful about which functions you choose to modify. If you're not careful, you might accidentally modify the IL belonging to your own assembly and force MyProfiler.Helper.MyInspector to call itself.
  • This example involved adding a reference from a profiled module to your own assembly. You do NOT want to do this to mscorlib.dll! Rewriting IL in mscorlib.dll functions can be dangerous, and one trap to avoid is forcing mscorlib.dll to reference any other assembly. The CLR loader treats mscorlib.dll pretty specially. The loader expects that, while everyone in the universe may reference mscorlib.dll, mscorlib.dll had better not reference any other assembly. If you absolutely must instrument mscorlib.dll by modifying IL, and you must have that IL reference some nifty new function of yours, you had better put that function into mscorlib.dll by dynamically modifying mscorlib.dll's metadata when it is loaded. In this case you no longer have the option of creating a separate assembly to house your custom code.

To be continued...

For those of you who don't want to write any of your profiler's inspection logic using managed code, I'll blog soon about how you can rewrite IL to call into unmanaged code. That way, instead of authoring your own managed assembly to do inspection, you can have the rewritten IL call directly into your profiler's unmanaged code to do this inspection.

That wasn't too bad, was it?

To wrap up: we have a bug in the CLR, and the fix may be a long time coming. [12/6/2007 Update: This bug has been fixed in CLR 2.0 SP1.  (See this post for more info.)]  So in the meantime you'll want to get familiar with IL rewriting if you need to inspect non-primitive, value-type return values on x86.  Also, I want to give a quick shout out to a couple members of the CLR team who've helped considerably with this content: thanks, Jan Kotas and Rick Winter!  And of course, thanks again to Holger for bringing this to our attention.