|
|
This blog introduces some new features of managed code generation in Whidbey. We provided a new way of doing Lightweight CodeGen (LCG), added support for emitting generics in Reflection.Emit, and there are some exciting new token and handle stories going on in Reflection.
-
After being in Exchange for almost a year, I tend to forget how to use Reflection.Emit, which I was once so familiar with. Occasionally, I receive emails through this blog asking how to emit this or how to emit that. Before, the solutions came directly from memory, but now I need some help. My way of figuring it out is to compile the thing I want to emit into a binary file and run it through my little tool AssemblyRoundTrip.exe with verbose output.
I wrote the tool for testing purposes. I ran every assembly I could get through that tool, and it would tell me whether or not we could emit the assembly.
Now I find it really helpful for figuring out how things should be done. I am posting the tool here and hopefully it will help you work out some generics mysteries.
Here is an example. If I want to define a Test<T> : List<T> type, I write it in C# and compile it. Then I run
AssemblyRoundTrip.exe repro.dll /verbose
and here is the output:
new Asm Name is repro_Emit
Security Attribute number :0
setting ca:[System.Runtime.CompilerServices.CompilationRelaxationsAttribute((Int32)8)]
setting ca:[System.Runtime.CompilerServices.RuntimeCompatibilityAttribute(WrapNonExceptionThrows = True)]
new module name is repro_Emit.dll
Total Types:1
m_modBuilder.DefineType M`1 <---- Define the type
Adding M`1[T] to the create list
Define Generic Parameters on M`1 <---- Define the type parameter
Adding Generic Parameter M`1[T]::T on Type M`1 to the create type list
Found a match for :M`1::T <--- this step is a preparation for the type below. We are binding List`1[T] (the type below) to T
Set Parent for M`1 System.Collections.Generic.List`1[T] <--- set the parent
Found a match for :M`1::T
Set Interface for M`1 System.Collections.Generic.IList`1[T]
Found a match for :M`1::T
Set Interface for M`1 System.Collections.Generic.ICollection`1[T]
Found a match for :M`1::T
Set Interface for M`1 System.Collections.Generic.IEnumerable`1[T]
Set Interface for M`1 System.Collections.IEnumerable
Set Interface for M`1 System.Collections.IList
Set Interface for M`1 System.Collections.ICollection
+++ Create Method : M`1[T]::.ctor
Security Attribute for type :M`1[T] is 0
Adding M`1[T] to bake type list
+++ Created Type : M`1
So I translate the verbose output into the following code:
using System;
using System.Reflection;
using System.Collections.Generic;
using System.Reflection.Emit;

public class Test
{
    public static void Main()
    {
        string typename = "ListProxy";
        AssemblyName asmname = new AssemblyName("mytest");
        AssemblyBuilder asmbuild = AppDomain.CurrentDomain.DefineDynamicAssembly(asmname, AssemblyBuilderAccess.RunAndSave);
        ModuleBuilder moduleBuilder = asmbuild.DefineDynamicModule("mytest", "mytest.dll");
        TypeAttributes typeAttributes = TypeAttributes.Public | TypeAttributes.Class;
        TypeBuilder typeBuilder = moduleBuilder.DefineType(typename, typeAttributes);
        GenericTypeParameterBuilder[] tpbs = typeBuilder.DefineGenericParameters("T"); // defines a generic type definition
        typeBuilder.MakeGenericType(typeof(List<>).GetGenericArguments()[0]);
        typeBuilder.SetParent(typeof(List<>).MakeGenericType(tpbs[0])); // bind the correct base type
        typeBuilder.CreateType();
        asmbuild.Save("mytest");
    }
}
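To check that the emitted type really behaves like a List<T> subclass, here is a minimal sketch that builds the same type in memory and instantiates it. It is not part of the tool. It assumes a runtime where the static AssemblyBuilder.DefineDynamicAssembly method is available (a later equivalent of the AppDomain call above), and the names ListProxyDemo, BuildListProxy and "mytest_run" are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

public static class ListProxyDemo
{
    // Emits "public class ListProxy<T> : List<T>" into an in-memory assembly
    // and returns the open generic type definition.
    public static Type BuildListProxy()
    {
        AssemblyBuilder asm = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("mytest_run"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = asm.DefineDynamicModule("mytest_run");
        TypeBuilder typeBuilder = module.DefineType(
            "ListProxy", TypeAttributes.Public | TypeAttributes.Class);

        // Define T, then bind the base type List<T> to that same T.
        GenericTypeParameterBuilder[] tpbs = typeBuilder.DefineGenericParameters("T");
        typeBuilder.SetParent(typeof(List<>).MakeGenericType(tpbs[0]));

        // No constructor is defined here, so a default one is emitted at bake time.
        return typeBuilder.CreateType();
    }

    public static void Main()
    {
        Type open = BuildListProxy();
        Type closed = open.MakeGenericType(typeof(int));

        // The instance can be used through its List<int> base type.
        List<int> list = (List<int>)Activator.CreateInstance(closed);
        list.Add(42);

        Console.WriteLine("Base type: " + closed.BaseType);
        Console.WriteLine("Count: " + list.Count);
    }
}
```

Using Run access instead of RunAndSave keeps everything in memory, which is enough when all you want is to confirm the shape of the type before saving it.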
I have attached the tool as is, and I am unlikely to support it. I can send the source code upon request, but it is a little confusing: since it is a test tool, it tends to contain a lot of extra code for things we were interested in testing that are unrelated to its main purpose.
If you set EXT_ROOT to your runtime install path, the tool will run peverify for you at the end to verify the generated assembly.
You will find there is also an AssemblyPrinter.exe. It is a Reflection-based version of ildasm that outputs a text file. The only difference is that AssemblyPrinter.exe prints members in a deterministic order, so if you want to compare two assemblies, it is a good tool for that. ildasm.exe output for two assemblies compiled from the same source tends to differ because the runtime doesn't guarantee that the ordering of members is the same. (This should remind you not to index into results, as in GetMethods()[1] or GetCustomAttributes()[2], since the order tends to change!)
|
-
I am going to program in the server world using managed code! Maybe after some time I will start a new blog for IT professionals about clustering in Exchange. But for now, this blog is unlikely to have new content about the latest CLR technology.
Thanks.
Yiru
|
-
ProfilerCallback.cpp
// This is the function that will invoke the managed code through COM interop on another thread
// this function creates the CCW object
// [in] this pointer
DWORD WINAPI CreateManagedStub(LPVOID lpParam)
{
    _ManagedStub * pIManagedStub = NULL;
    HRESULT hr = CoCreateInstance(CLSID_ManagedStub, NULL, CLSCTX_INPROC_SERVER, IID__ManagedStub, (void **)&pIManagedStub);
    if (FAILED(hr))
    {
        printf("Fail to CoCreateInstance on ManagedStub class 0x%x\n", hr);
        return 1;
    }
    if (pIManagedStub == NULL)
    {
        printf("pIManagedStub is null 0x%x\n", hr);
        return 1;
    }

    // we have the managed instance now.
    ((CProfilerCallback*)lpParam)->m_pIManagedStub = pIManagedStub;
    return 0;
}

// this function is the actual caller to the managed world.
// this function can be used to get a managed dynamic method token back
// [in] this pointer; method token; assembly name
// [out] delegate type name
DWORD WINAPI ManagedPreStub(LPVOID lpParam)
{
    PMYDATA pData = (PMYDATA)lpParam;
    LPWSTR delegateName = NULL;
    if (pData->thisObj->m_pIManagedStub == NULL)
    {
        printf("pData->thisObj->m_pIManagedStub is null");
        return 1;
    }
    HRESULT hr = pData->thisObj->m_pIManagedStub->PreStub(pData->assemblyName, pData->methodTk, &delegateName);
    if (FAILED(hr))
    {
        printf("Fail to Call PreStub 0x%x\n", hr);
    }
    wcscpy_s(pData->delegateName, (wchar_t*)delegateName);
    return 0;
}
......
// Profiler callback function
HRESULT CProfilerCallback::JITCompilationStarted(UINT functionId, BOOL fIsSafeToBlock)
{
    HANDLE hThread = NULL;
    DWORD dwThreadId;

    /////////////////////////////////////////////////////////
    // We dont want to continue on JITCompilationStarted if the thread is for the "managed code" we are to call from this callback.
    // We store these thread ids in our skip thread hashtable
    // 1. if the JITCompilationStarted is on one of the skipping thread, avoid following steps.
    DWORD currentThreadId = GetCurrentThreadId();
    if (m_SkipJCHashTable->PLookup(currentThreadId))
        return S_OK;

    // create the ManagedStub object
    if (m_pIManagedStub==NULL) // if it is already created, no need to create twice
    {
        hThread = ::CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)CreateManagedStub, this, 0, &dwThreadId);
        // add the thread Id to skip pool so that JITCompilationStarted will do nothing on that thread.
        m_SkipJCHashTable->SetAt(dwThreadId, 0);
        if (hThread != NULL)
        {
            WaitForSingleObject(hThread, INFINITE);
            CloseHandle(hThread);
        }
        m_SkipJCHashTable->RemoveKey(dwThreadId);
    }
....
    ///////////////////////////////////////////////////////
    // try to define a managed dynamic method and get the dynamic method method token
    // prepare the parameters to be passed in
    PMYDATA pData = (PMYDATA)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, sizeof(MYDATA));

    if( pData == NULL )
        goto exit;

    pData->assemblyName = wszAssemblyName;
    pData->methodTk = tkMethod;
    pData->thisObj = this;

    ProfilerPrintf("Working on asm %ws func %ws.", wszAssemblyName, wszFunctionName);
    hThread = ::CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)ManagedPreStub, pData, 0, &dwThreadId);
    m_SkipJCHashTable->SetAt(dwThreadId, 0);
    if (hThread != NULL)
    {
        WaitForSingleObject(hThread, INFINITE);
        CloseHandle(hThread);
    }
    m_SkipJCHashTable->RemoveKey(dwThreadId);

    // we free up the memory here instead of in the thread because we want some return value back.
    wchar_t wszDelegateName[512];

    if (pData->delegateName == NULL)
        return S_OK;
    // copy the return value out
    wcsncpy_s(wszDelegateName, _countof(pData->delegateName), pData->delegateName, _countof(pData->delegateName));

    HeapFree(GetProcessHeap(), 0, pData);
...
We have these definitions in ProfilerCallback.h:
private:
CMap<DWORD, DWORD, int, int> * m_SkipJCHashTable;
...
Enjoy!
|
-
I think the CLR Profiling API is one of the coolest things in the CLR. Here is a nice article I just found on MSDN about the Whidbey Profiling APIs. Note that the Everett Profiling API and the Whidbey Profiling API are completely incompatible. You have to explicitly implement the ICorProfilerCallback2 interface to make profiling work on Whidbey.
Profiling makes runtime tracking, tracing and assembly rewriting possible. In many respects, it is exactly what a tester wants for testing managed projects.
To show some of the ideas behind this, I will introduce one of our test tools here.
We implemented a just-in-time LCG converter tool based on the Profiling API's ability to instrument methods. Testing LCG was always a headache because it is a new kind of method, completely dynamic, and touches nearly every aspect of the runtime. The only way to test it well is to leverage our existing tests, which already cover nearly all aspects of the CLR.
We first implemented a DynMethodConverter tool based on Reflection. The tool loads the assembly and, based on the MethodInfos, converts the methods to dynamic methods, links them up and invokes the entry point. Everything looked good at the beginning, but we ran into many dead ends. The tool only worked on simple, verifiable test libraries. Tests with unmanaged entry points (interop tests, hosting tests), unsafe tests (which cannot pass security checks while loading) and tests with drivers are difficult to adapt to the tool.
Then we looked at the possibility of solving this problem using the Profiling API.
When profiling is turned on, the JITCompilationStarted function gets called. We took that chance to change the method body to call a DynamicMethod delegate. One tricky part is that the LCG method has to be created in managed code, and the Profiling API callback cannot have any managed stack on top of it. We worked around this issue by starting another thread in the JITCompilationStarted function and invoking the managed code that creates the dynamic method there. We also cached that thread id in a global hashtable, so that when that managed code hits JITCompilationStarted again, we quickly know that there is nothing to do. On the managed side, it is very natural to use the token resolution APIs to get the calling method and leverage all the Reflection structures to create a dynamic method.
This tool can be adapted to an automated testing environment very easily. We only need to turn the profiler on and off and register the necessary managed assembly as needed. It also removed all the previous limitations of the Reflection-based tool.
I hope this sample helps you when you are designing your own testing tool.
|
-
I was reading Joel's blog about new stuff in Reflection, and he mentioned the ReflectionOnly context. Apart from generics, token and handle resolution, and LCG, the ReflectionOnly context (actually a loader feature) had a big impact on the Reflection area in Whidbey. It reminded me to talk about the ReflectionOnly context in detail on my blog.
The ReflectionOnly context is first of all a loader concept, and here is Junfeng's blog entry explaining that aspect in detail.
The ReflectionOnly context's impact on Reflection is also large. Basically, Reflection usage can be summarized into two kinds: inspecting assembly metadata and doing late-bound invocations. The ReflectionOnly context targets the first scenario *only*.
All APIs related to execution are disabled (they throw InvalidOperationException) if the assembly is loaded in a ReflectionOnly context. Those APIs are xxx.Invoke, FieldInfo.GetValue, FieldInfo.SetValue ... These changes are self-evident and need no explanation.
Several more APIs are affected by the ReflectionOnly context and do need some explanation:
- GetCustomAttributes: We have to disable this because GetCustomAttributes returns Attribute instances, and to create those we have to run the attribute constructor. To make custom attribute inspection possible in the ReflectionOnly context, we provide a new API, GetCustomAttributeData (exposed as the static CustomAttributeData.GetCustomAttributes methods), that returns CustomAttributeData objects. Such an object contains the ConstructorInfo and other raw information about the attribute arguments. Our new documentation contains a detailed explanation of that type. Currently, GetCustomAttributeData performs close to GetCustomAttributes, but it doesn't execute the constructor. It is possible that in the future we could provide an even faster GetCustomAttributeData.
- Handle related APIs: We disable all access to handles. A handle is really a runtime concept; it exists in the runtime and is used for invocation operations. By disabling handle access and usage in the ReflectionOnly context, we guard ourselves against many potential ways of hacking through the system to do invocation in the ReflectionOnly context. On the other hand, we are able to do all the assembly inspection work without handles. All token APIs that are not related to handles still work in the ReflectionOnly context. So in the ReflectionOnly context it is possible to build your own reflector that prints out assembly content together with the method bodies.
- You may see there are new APIs called GetRawConstantValue on FieldInfo and PropertyInfo, and RawDefaultValue on ParameterInfo. These APIs get out the values encoded in the metadata, which is supported in the ReflectionOnly context. In particular, if the field's constant value is an enum, a normal GetValue will try to return an enum value and fail, but GetRawConstantValue will return the encoded raw integer to you. You should use the GetRaw* APIs only when you are in the ReflectionOnly context.
The ReflectionOnly context is needed for certain scenarios to work. For example, suppose you are developing a managed build tool such as regasm.exe that is supposed to work on all platforms. If you build the tool on the normal context, it will have difficulty building a 64-bit app on a 32-bit platform, since we cannot load a 64-bit app on a 32-bit platform in the normal context. This scenario is common for build tools, since in many places people build their 64-bit apps on 32-bit machines. If your app uses Reflection only for content inspection, it is fair to recommend the ReflectionOnly context to avoid unnecessary limitations.
The downside of the ReflectionOnly context is that you cannot do any execution if the assembly is loaded that way (but for many tools, inspecting content is all Reflection is used for), and you have to build your own assembly resolver, since the loader won't resolve assembly references for you automatically. A ReflectionOnly load can also be used to load some unverifiable assemblies that a normal load would fail on, since it skips a lot of the verification steps that a normal load needs to do to ensure safe execution.
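To make the inspection-only APIs a bit more concrete, here is a small sketch of reading an attribute argument and an enum constant straight from metadata. For simplicity it runs against a normally loaded type rather than a ReflectionOnly-loaded assembly; the point is only that neither call runs user code. NoteAttribute, Color, Widget and InspectOnlyDemo are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public enum Color { Red = 1, Green = 2 }

[AttributeUsage(AttributeTargets.Class)]
public class NoteAttribute : Attribute
{
    public static int CtorCalls;   // counts how often the constructor really ran
    public NoteAttribute(string text) { CtorCalls++; }
}

[Note("inspect me")]
public class Widget
{
    public const Color Default = Color.Green;
}

public static class InspectOnlyDemo
{
    // Reads the first constructor argument of [Note] from metadata.
    // No NoteAttribute instance is created, so the constructor does not run.
    public static string ReadNote(Type t)
    {
        IList<CustomAttributeData> all = CustomAttributeData.GetCustomAttributes(t);
        foreach (CustomAttributeData cad in all)
        {
            if (cad.Constructor.DeclaringType == typeof(NoteAttribute)
                && cad.ConstructorArguments.Count > 0)
            {
                return (string)cad.ConstructorArguments[0].Value;
            }
        }
        return null;
    }

    // Returns the raw value stored in metadata for a literal (const) field.
    // For an enum constant this is the underlying integer, not the enum.
    public static object ReadRawConstant(Type t, string fieldName)
    {
        FieldInfo fi = t.GetField(fieldName, BindingFlags.Public | BindingFlags.Static);
        return fi.GetRawConstantValue();
    }

    public static void Main()
    {
        Console.WriteLine("Note text: " + ReadNote(typeof(Widget)));
        Console.WriteLine("Ctor calls so far: " + NoteAttribute.CtorCalls);

        object raw = ReadRawConstant(typeof(Widget), "Default");
        Console.WriteLine("Raw constant: " + raw + " (" + raw.GetType() + ")");
    }
}
```

In a real ReflectionOnly scenario you would load the assembly with Assembly.ReflectionOnlyLoad or ReflectionOnlyLoadFrom and handle AppDomain.ReflectionOnlyAssemblyResolve yourself. Type comparisons like the typeof(NoteAttribute) check above would then need to be done by name, since the inspected types live in a different context.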
|
-
I've received several emails asking about this issue, so I think it is a good idea to publish the problem here.
Say we have:
public class Test
{ public void Teste<U>(U teste, int i) { Console.WriteLine(teste); }
public void Teste<U> (U teste) { }
}
Is it possible to use GetMethod to get void Teste<U>(U teste, int i) out?
Unfortunately, the answer is that for now there is no way to get the method using only GetMethod, because we need the type U before we can describe the signature. It is a limitation of Reflection in Whidbey.
Here is a possible workaround: you can use GetMember to narrow down the MethodInfos you have to retrieve.
using System;
using System.Reflection;

public class Test
{
    public void Teste<U>(U teste, int i) { Console.WriteLine(teste); }

    public void Teste<U>(U teste) { }

    public static void Main()
    {
        MemberInfo[] mis = typeof(Test).GetMember("Teste*", BindingFlags.InvokeMethod | BindingFlags.Public | BindingFlags.Instance);
        if (mis.Length == 0) { Console.WriteLine("No Teste Methods"); return; }

        Type U = ((MethodInfo)mis[0]).GetGenericArguments()[0]; // assume we know the class structure above, for simplicity.
        MethodInfo mInfo = typeof(Test).GetMethod("Teste", new Type[] { U, typeof(int) });

        if (mInfo.IsGenericMethod)
        {
            mInfo = mInfo.MakeGenericMethod(typeof(string));
            mInfo.Invoke(new Test(), new object[] { "Test - calling generic method", 1 });
        }
    }
}
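If you would rather not depend on which overload comes back first, another way to sketch the same workaround is to enumerate the methods and filter on shape (name, generic arity and parameter count). This version uses LINQ, which arrived after Whidbey, and the helper name FindGenericMethod and the class names are made up for the example.

```csharp
using System;
using System.Linq;
using System.Reflection;

public class Sample
{
    public string Teste<U>(U teste, int i) { return string.Format("{0}:{1}", teste, i); }

    public void Teste<U>(U teste) { }
}

public static class GenericOverloadDemo
{
    // Picks the one public instance generic method definition that matches
    // the given name, number of type parameters and number of parameters.
    public static MethodInfo FindGenericMethod(Type type, string name, int genericArity, int parameterCount)
    {
        return type.GetMethods(BindingFlags.Public | BindingFlags.Instance)
            .Single(m => m.Name == name
                      && m.IsGenericMethodDefinition
                      && m.GetGenericArguments().Length == genericArity
                      && m.GetParameters().Length == parameterCount);
    }

    public static void Main()
    {
        MethodInfo open = FindGenericMethod(typeof(Sample), "Teste", 1, 2);
        MethodInfo closed = open.MakeGenericMethod(typeof(string));
        object result = closed.Invoke(new Sample(), new object[] { "hello", 1 });
        Console.WriteLine(result);
    }
}
```

Single throws if zero or several methods match, which is usually what you want here: a silent wrong pick is harder to debug than an exception.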
|
-
Our Reflection QA definitely knows a lot, and he now has a blog of his own:
http://blogs.msdn.com/haibo_luo
A little advertisement: I know he has something cool coming up soon. Keep an eye on it.
|
-
Users are not supposed to operate on the same Reflection.Emit object from multiple threads. Reflection.Emit objects are *NOT* thread-safe. It is certainly OK to operate on different Reflection.Emit objects on different threads.
We have been discovering issues when users try to do the first scenario, and we are trying hard to make the program not crash. :) But correctness cannot be guaranteed. So be careful.
|
-
We have found a limitation in Reflection.Emit: there are types that the C# compiler can compile but that you cannot emit. You are unlikely to hit it in your daily emit work, but in case you ever do, here is an example:
public class E
{
    public struct N1 { public E e; }

    public struct N2 { public N1 n1; }

    N2 n2;
}
There are two partial order rules that must be satisfied for ref.emit type loading to succeed, and sometimes there is no way to satisfy both. The order rules are:
1. All types of all struct fields must be loaded before the type (this one is necessary)
2. All enclosing types of all field types must be loaded before the type (this one is not necessary but is artificially imposed by the ref.emit path through the loader)
For rule 1 the orders are the subset of all orders where N1 is loaded before N2 and N2 is loaded before E which leaves us with only one:
N1 > N2 > E
For rule 2 the partial orders are the subset of all orders where E is loaded before N2 which leaves us with 2:
E > N1 > N2
N1 > E > N2
But there is no intersection of these sets and so no way to create this assembly with ref.emit.
Of course, the early bound loader doesn't have the rule 2 restriction so there is a code path that will work but it is likely too much work to move reflection emit over to that code path. So at least for Whidbey, we will have to live with such restrictions.
However, it is possible that you could try to solve the TypeLoadException by hooking your own type resolver up to the AppDomain's TypeResolve event.
|
-
In Whidbey, apart from the token and handle resolution APIs I mentioned earlier, there are some overloads that make these APIs look a bit more complicated. Again, I am using the APIs related to MethodInfo as a representative example:
Token -> Info
public class Module
{
public MethodBase ResolveMethod(int metadataToken, Type[] genericTypeArguments, Type[] genericMethodArguments)
}
Handle->Info
public class MethodBase
{
public static MethodBase GetMethodFromHandle(RuntimeMethodHandle handle, RuntimeTypeHandle declaringType)
}
Token->Handle
public class ModuleHandle
{
public RuntimeMethodHandle ResolveMethodHandle(int methodToken, RuntimeTypeHandle[] typeInstantiationContext, RuntimeTypeHandle[] methodInstantiationContext)
}
These APIs are used for Generic related method resolution.
Let's talk about the Handle->Info resolution first.
public class MethodBase
{
public static MethodBase GetMethodFromHandle(RuntimeMethodHandle handle, RuntimeTypeHandle declaringType)
}
Here is the basic concept of code sharing in generics. Again, Joel has a nice article about it, and within that article there are useful links for further reading. We are just going to put it simply here.
In generics, methods instantiated over reference types share the same body (they are represented by the same MethodDesc).
That is, G<String>.M and G<Object>.M share the same MethodDesc; C.M<String> and C.M<Object> share the same MethodDesc as well. Since a RuntimeMethodHandle is a pointer to the MethodDesc, those two methods have the same RuntimeMethodHandle. How do we resolve this handle back correctly?
The good news is that if M is a static method on G<T>, the runtime is able to know whether it is a method of G<String> or G<Object> through some additional internal data structures.
If M is a generic method, the runtime is also able to distinguish the two. So in those cases, the additional declaring type info is not needed for the MethodInfo to be resolved back.
The only case where the runtime cannot differentiate the two methods is when M is an instance method inside G<T>. The runtime depends on the type of the this pointer to figure out the declaring type of the method. In our handle resolution API, we have to tell the runtime what the declaring type is. That's when the second argument is needed.
So here is an interesting consequence: if M is an instance method and you emit code like
new G<String>.ctor
call G<Object>.M
G<String>.M will be called, because the runtime figures out the declaring type of the instance method through the this pointer. Of course, the runtime cannot be fooled if you pass in a G<int> as the this pointer for G<Object>.M; that won't pass JIT verification.
However, if M is a static method, the runtime is clever enough to figure out whether the handle is really for G<String>.M or G<Object>.M, since no "this" pointer is passed in.
One thing to note is that code sharing is an implementation detail of the CLR. You shouldn't build a program that relies on, say, G<String>.M.MethodHandle == G<Object>.M.MethodHandle, although this sharing is not likely to change. Maybe one day we will decide to share more (for example, share code between instantiations over value types of the same size) or share less. RuntimeMethodHandle exposes a quite low-level, raw notion of the runtime, so you should be careful to use it in the right way (the way we promised would work).
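Here is a minimal sketch of the two-argument Handle->Info resolution for an instance method on a generic type. It only shows the supported pattern of passing the declaring type along with the handle and deliberately makes no claim about whether two handles compare equal. The class G<T> and the helper Resolve are illustration names.

```csharp
using System;
using System.Reflection;

public class G<T>
{
    public void M() { }
}

public static class HandleDemo
{
    // Takes the handle of the instance method M on the given instantiation
    // and resolves it back, telling the runtime which declaring type we mean.
    public static MethodBase Resolve(Type declaringType)
    {
        RuntimeMethodHandle handle = declaringType.GetMethod("M").MethodHandle;
        return MethodBase.GetMethodFromHandle(handle, declaringType.TypeHandle);
    }

    public static void Main()
    {
        MethodBase onString = Resolve(typeof(G<string>));
        MethodBase onObject = Resolve(typeof(G<object>));

        Console.WriteLine(onString.DeclaringType);
        Console.WriteLine(onObject.DeclaringType);
    }
}
```

The design point is that the declaring type travels together with the handle. If you cache handles for methods on generic types, cache the RuntimeTypeHandle of the declaring type next to each one.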
Now what about tokens?
A token is the on-disk representation of all kinds of types and members. The best way to learn about tokens is to write a C# program and open it in ildasm.exe.
public class G<T> /* type def G`1<T> 02000002 */
{
    public void M(T t) /* method def M(!T t) 06000001 */ {}
}

public class C
{
    public static void GM<T>(T t) /* method def GM<T>(!!T t); !!T means a different generic parameter than !T 06000003 */ {}
}

public class Test
{
    public static void Main()
    {
        G<String> gString = new G<String>();
        gString.M("Hehe"); /* method ref under type spec G`1<string>::M(!0); method ref 0a000005, type spec 1b000001 */
        C.GM<String>("Haha"); /* method spec C::GM<string>(!!0) 2b000001 */
    }
}
This example illustrates the basic concepts of the different kinds of tokens (def token, ref token and spec token).
In ildasm we use ! to distinguish type parameters: !0 means the parameter is instantiated with whatever is bound to !T, where 0 is the type parameter index.
In this example G<T>.M, G<String>.M and C.GM<String> all have different tokens. However, the info->token API only returns def tokens, which means:
G<Object>.M.MetadataToken will return the token for G<!T>.M
C.GM<String>.MetadataToken will return the token for C.GM<!!T>
However, Reflection is able to resolve a spec token, or a ref token under a spec token. For those cases to work, you need to provide the "context" in which the token lives. The context is the calling method's generic arguments (if any) and the generic arguments of the calling method's declaring type (if any). If there are none, you can just pass null for the two parameters.
The reason why we need this API can be illustrated below.
using System;
using System.Reflection;

public class G1<T>
{
    public void M() { C.M<T>(); }
}

public class G2<V>
{
    public void M() { C.M<V>(); }
}

public class C
{
    public static void M<Z>() { }
}
If you compile this program and run ildasm on it, you will see there is only one MethodSpec, which says: instantiate the method M with the first type argument.
MethodSpec #1 (2b000001)
    Parent : 0x06000005
    CallCnvntn: [GENERICINST]
    1 Arguments
        Argument #1: Var!0
In the context of the first call, C.M<T>, the instantiation should be T, and in the second call the right argument for M<Z> is V. From the metadata alone we don't know which method it actually is, and that's why the context is needed to understand the token.
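A rough sketch of resolving such a spec token with its context follows. To get hold of the token it scans the IL of G1<T>.M for the first call instruction, which is a shortcut that only works because the body is known to be just an optional nop, a call and a ret; a real tool would decode opcodes properly. It also assumes a little-endian machine when reading the 4-byte token. FindCallToken and ResolveCallee are made-up helper names.

```csharp
using System;
using System.Reflection;

public class G1<T>
{
    public void M() { C.M<T>(); }
}

public class C
{
    public static void M<Z>() { }
}

public static class SpecTokenDemo
{
    // Returns the 4-byte token operand of the first "call" (opcode 0x28).
    // Assumption: only nop (0x00) may precede the call in this tiny body.
    public static int FindCallToken(MethodInfo method)
    {
        byte[] il = method.GetMethodBody().GetILAsByteArray();
        for (int i = 0; i < il.Length; i++)
        {
            if (il[i] == 0x28) return BitConverter.ToInt32(il, i + 1);
            if (il[i] != 0x00) break; // unexpected opcode, give up
        }
        throw new InvalidOperationException("call instruction not found");
    }

    // Resolves the callee of G1<...>.M in the context of the given closed type.
    public static MethodBase ResolveCallee(Type closedG1)
    {
        MethodInfo caller = closedG1.GetMethod("M");
        int token = FindCallToken(caller);

        // Context: the declaring type's generic arguments; the calling
        // method itself is not generic, so its arguments are null.
        return caller.Module.ResolveMethod(token, closedG1.GetGenericArguments(), null);
    }

    public static void Main()
    {
        Console.WriteLine(ResolveCallee(typeof(G1<int>)));
        Console.WriteLine(ResolveCallee(typeof(G1<string>)));
    }
}
```

The same token resolves to a different instantiation of C.M depending on the type arguments passed in, which is exactly why the context parameters exist.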
For further reading about tokens, the best source is the ECMA CIL specification:
http://www.ecma-international.org/publications/standards/Ecma-335.htm
Today I searched for myself on Google; however, this blog didn't show up among the top hits. I guess it is because I didn't put my name, Yiru Tang, here. :)
|
-
I haven't blogged for a long time because we were busy hitting ZBB for the RTM milestone.
I can't think of anything special to talk about, so let's talk about one of the main new features in Whidbey: the token and handle resolution APIs.
We have talked about tokens before, and now it is time for handles. We have RuntimeTypeHandle, RuntimeMethodHandle, RuntimeFieldHandle etc., one for each Reflection entity. They contain a pointer into the real loaded managed image; for example, a RuntimeMethodHandle points to a MethodDesc, which is used when invoking the method. In the ReflectionOnly context (which disallows execution), we disallow access to these handles simply because handles carry a strong notion of execution.
We provide a complete set of token -> handle -> info resolution APIs in Reflection. I am listing them below just for MethodInfo (or MethodBase); for fields and types, the APIs are similar.
Token -> Info
public class Module
{
public MethodBase ResolveMethod(int metadataToken);
}
Info -> Token
public class MemberInfo
{
public int MetadataToken { get; }
}
Handle->Info
public class MethodBase
{
public static MethodBase GetMethodFromHandle(RuntimeMethodHandle handle);
}
Info->Handle
public class MethodInfo
{
public RuntimeMethodHandle MethodHandle { get; }
}
Token->Handle
public class ModuleHandle
{
public RuntimeMethodHandle ResolveMethodHandle(int methodToken);
}
Let's talk about non-generics first. The story for generics is quite complicated and deserves another post. For non-generics, Handle -- Info -- Token have a roughly one-to-one correspondence.
I said roughly because sometimes the one-to-one correspondence is broken. For example, the identity of a MethodInfo is determined by the handle and the ReflectedType.
public class A
{
public void M(){}
}
public class B:A
{}
MethodInfo mi1 = typeof(A).GetMethod("M");
MethodInfo mi2 = typeof(B).GetMethod("M");
The two MethodInfos have the same DeclaringType and the same RuntimeMethodHandle, but they have different ReflectedTypes, so if you compare the two, they won't be equal. Since their RuntimeMethodHandles are the same, invoking either of them always invokes the same method. So a RuntimeMethodHandle really represents an execution entity, and a MethodInfo can carry more information.
Tokens are completely module based, so the same method has different tokens with respect to different modules. Within the same module, tokens have a one-to-one correspondence to MethodInfos as long as there are no generics in the picture and the method is not a vararg method.
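The non-generic round trips can be sketched in a few lines. This is only an illustration of the APIs listed above applied to the A/B example; RoundTripDemo, FromToken and FromHandle are names I made up.

```csharp
using System;
using System.Reflection;

public class A
{
    public void M() { }
}

public class B : A { }

public static class RoundTripDemo
{
    // Info -> Token -> Info, within the method's own module.
    public static MethodBase FromToken(MethodInfo mi)
    {
        return mi.Module.ResolveMethod(mi.MetadataToken);
    }

    // Info -> Handle -> Info.
    public static MethodBase FromHandle(MethodInfo mi)
    {
        return MethodBase.GetMethodFromHandle(mi.MethodHandle);
    }

    public static void Main()
    {
        MethodInfo mi1 = typeof(A).GetMethod("M");
        MethodInfo mi2 = typeof(B).GetMethod("M");

        Console.WriteLine("Token: 0x{0:x8}", mi1.MetadataToken);
        Console.WriteLine("From token: {0}.{1}", FromToken(mi1).DeclaringType, FromToken(mi1).Name);
        Console.WriteLine("From handle: {0}.{1}", FromHandle(mi1).DeclaringType, FromHandle(mi1).Name);

        // Same execution entity, different reflected types.
        Console.WriteLine("Same handle: " + mi1.MethodHandle.Equals(mi2.MethodHandle));
        Console.WriteLine("Reflected types: {0} vs {1}", mi1.ReflectedType, mi2.ReflectedType);
    }
}
```

Note that the token is only meaningful together with its module, so if you cache tokens you need to remember which Module each one came from.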
Why do we want to provide the notion of handles and tokens?
By providing tokens, we expand Reflection's abilities from purely string based into more possible dimensions (as shown in my examples in the first post). Token->info resolution is faster than string->info resolution (our PM Joel has an MSDN article on the performance topic), and tokens occupy less space than infos. If you cache a lot of MethodInfos in memory, it is going to be expensive. Instead, you can get the tokens from the infos and cache those. When you need an info, use the relatively cheap token->info resolution to get it back. Or you can cache the handles. Handles can even be used directly for invocation if you have the right type of delegate defined.
// Get the delegate's constructor
ConstructorInfo ctor = typeof(MyDelegate).GetConstructor(new Type[] {typeof(object), typeof(IntPtr)});
// invoke on the delegate
MyDelegate del = (MyDelegate)ctor.Invoke(new Object[] {null/*parameters to be pushed*/, myMethodHandle.GetFunctionPointer()});
Of course, the call above isn't meant to save you much invocation time or working set. But it shows how handles represent execution entities. In the future, we could possibly provide more APIs that use handles directly for managed code gen, but I cannot guarantee that now.
We provide a new type called MethodBody in Whidbey. From a MethodInfo you can get a MethodBody object, and from the MethodBody you can get information such as the local variables, the exception clauses and the IL byte array. Using the token resolution APIs and the opcode.def file, you can easily parse the array and reassemble the opcodes. We could do all of this with the metadata API before, but now we can do it through Reflection APIs. In other words, you should be able to write a managed ildasm.exe now!
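As a starting point for such a tool, here is a small sketch that dumps what MethodBody exposes for one method. It stops at printing the raw IL bytes; decoding them into opcodes is left out. BodyDemo, Sample and GetBody are names invented for the example.

```csharp
using System;
using System.Reflection;

public static class BodyDemo
{
    // A method with one local-ish computation and one catch clause to inspect.
    public static int Sample(int x)
    {
        try { return 100 / x; }
        catch (DivideByZeroException) { return -1; }
    }

    public static MethodBody GetBody()
    {
        return typeof(BodyDemo).GetMethod("Sample").GetMethodBody();
    }

    public static void Main()
    {
        MethodBody body = GetBody();

        Console.WriteLine("Max stack: " + body.MaxStackSize);
        foreach (LocalVariableInfo local in body.LocalVariables)
        {
            Console.WriteLine("Local {0}: {1}", local.LocalIndex, local.LocalType);
        }
        foreach (ExceptionHandlingClause clause in body.ExceptionHandlingClauses)
        {
            Console.WriteLine("Clause {0}: try at IL_{1:x4}, handler at IL_{2:x4}",
                clause.Flags, clause.TryOffset, clause.HandlerOffset);
        }

        // The raw IL; a disassembler would walk this array opcode by opcode.
        byte[] il = body.GetILAsByteArray();
        Console.WriteLine("IL ({0} bytes): {1}", il.Length, BitConverter.ToString(il));
    }
}
```

The number of locals and the exact bytes depend on the compiler and on debug versus release settings, so treat the printed values as something to explore rather than something to rely on.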
|
-
-
Maybe it is a little too early to jump into this topic, but I think a user starting out with LCG will hit this problem right away. When you emit a dynamic method and don't have a way to persist it, you can imagine that it could be hard to debug. But SOS has some built-in support for LCG that works surprisingly well (thanks to Michael Stanton, our SOS dev lead, and Dario Russi, who created LCG).
Let me start with a program that causes LCG to throw:
using System.Reflection;
using System.Reflection.Emit;
using System;

public class Test
{
    public static void Main()
    {
        DynamicMethod dm = new DynamicMethod("dm", typeof(Int64), new Type[]{}, typeof(Test).Module);
        ILGenerator ilgen = dm.GetILGenerator();
        ilgen.EmitWriteLine("Insdie DM");
        ilgen.Emit(OpCodes.Ldc_I8, 0);
        ilgen.Emit(OpCodes.Ret);
        dm.CreateDelegate(typeof(D)).DynamicInvoke(null);
    }
}

delegate Int64 D();
When executing the program, we throw:
Unhandled Exception: System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.InvalidProgramException: Common Language Runtime detected an invalid program.
   at dm()
   at D.Invoke()
   --- End of inner exception stack trace ---
   at System.RuntimeMethodHandle._InvokeMethodFast(Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeTypeHandle typeOwner)
   at System.RuntimeMethodHandle.InvokeMethodFast(Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeTypeHandle typeOwner) in c:\vbl\ndp\clr\src\BCL\System\RuntimeHandles.cs:line 686
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks) in c:\vbl\ndp\clr\src\BCL\System\Reflection\XXXInfos.cs:line 1472
   at System.Delegate.DynamicInvokeImpl(Object[] args) in c:\vbl\ndp\clr\src\BCL\System\Delegate.cs:line 102
   at System.Delegate.DynamicInvoke(Object[] args) in c:\vbl\ndp\clr\src\BCL\System\Delegate.cs:line 94
   at Test.Main()
Welcome to the LCG world. :-) How could we debug this issue?
Let's start "windbg repro.exe" and make it stop at CLR exceptions by typing "sxe clr". (I usually type "sxe eh" as well, just out of habit.)
It will stop at the place it is throwing and the call stack looks like this:
0:000> kb
ChildEBP RetAddr  Args to Child
0012ba94 5b15cfc9 0012bcac 5b0fc1a6 00000727 mscorjit!debugError+0xc8 [c:\vbl\ndp\clr\src\jit\il\error.cpp @ 113]
0012beac 5b16488d 5b0fc8d8 5b0fca20 00000009 mscorjit!badCode3+0x69 [c:\vbl\ndp\clr\src\jit\il\error.cpp @ 219]
0012bf08 5b1660c1 00217031 00000006 0022774c mscorjit!Compiler::fgFindJumpTargets+0x51bd [c:\vbl\ndp\clr\src\jit\il\flowgraph.cpp @ 1834]
0012bfa0 5b1598d2 0012d558 0012d520 00162a80 mscorjit!Compiler::fgFindBasicBlocks+0x81 [c:\vbl\ndp\clr\src\jit\il\flowgraph.cpp @ 2418]
0012c034 5b15ac98 02e21b49 0012cfbc 0012c700 mscorjit!Compiler::compCompile+0x702 [c:\vbl\ndp\clr\src\jit\il\compiler.cpp @ 2376]
0012c0e0 5b15be5b 0012cfbc 0012c700 0012c128 mscorjit!jitNativeCode+0x178 [c:\vbl\ndp\clr\src\jit\il\compiler.cpp @ 2845]
0012c13c 5dd01521 5b249ed4 0012cfbc 0012c700 mscorjit!CILJit::compileMethod+0xbb [c:\vbl\ndp\clr\src\jit\il\ee_il_dll.cpp @ 212]
0012c1f0 5dd0185d 0020e548 0012cfbc 0012c700 mscorwks!invokeCompileMethodHelper+0x101 [c:\vbl\ndp\clr\src\vm\jitinterface.cpp @ 9763]
0012c2e8 5dd01adb 0020e548 0012cfbc 0012c700 mscorwks!invokeCompileMethod+0x17d [c:\vbl\ndp\clr\src\vm\jitinterface.cpp @ 9791]
0012c354 5dd0288e 0020e548 0012cfbc 0012c700 mscorwks!CallCompileMethodWithSEHWrapper+0x1db [c:\vbl\ndp\clr\src\vm\jitinterface.cpp @ 9881]
0012d144 5dfb0042 02e21430 00000000 00107410 mscorwks!UnsafeJitFunction+0x5de [c:\vbl\ndp\clr\src\vm\jitinterface.cpp @ 10289]
0012d234 5dfb2b12 00000000 00000000 0012d558 mscorwks!MethodDesc::MakeJitWorker+0x362 [c:\vbl\ndp\clr\src\vm\prestub.cpp @ 437]
0012d3e0 5dfb1726 00000000 0012d558 0012d520 mscorwks!MethodDesc::DoPrestub+0xdd2 [c:\vbl\ndp\clr\src\vm\prestub.cpp @ 1447]
0012d4f0 00391f0a 0012d520 12345678 5e58f190 mscorwks!PreStubWorker+0x326 [c:\vbl\ndp\clr\src\vm\prestub.cpp @ 906]
WARNING: Frame IP not in any known module. Following frames may be wrong.
0012d508 00ba0256 00000000 0012d558 00162a80 0x391f0a
0012d540 5d86cf0d 0012e648 00bc46e4 00bc46e4 CLRStub[StubLinkStub]@ba0256
0012d570 5da1710d 0012db30 00000000 0012db00 mscorwks!CallDescrWorkerInternal+0x33 [c:\vbl\ndp\clr\src\vm\i386\asmhelpers.asm @ 930]
0012d9a4 5da16e07 0012db30 00000000 0012db00 mscorwks!CallDescrWorker+0x10d [c:\vbl\ndp\clr\src\vm\class.cpp @ 13529]
0012dae4 5dd0e6ad 0012db30 00000000 0012db00 mscorwks!CallDescrWorkerWithHandler+0x2b7 [c:\vbl\ndp\clr\src\vm\class.cpp @ 13436]
0012def4 5dd0d93a 02e22534 0012e22c 0012e0f0 mscorwks!MethodDesc::CallDescr+0xd1d [c:\vbl\ndp\clr\src\vm\method.cpp @ 2288]
We can see that the JIT is throwing because we emitted some invalid IL.
The first argument to UnsafeJitFunction (02e21430 in the stack above) is the dynamic method's MethodDesc.
If we do:
0:000> .loadby sos mscorwks
0:000> !dumpmd 02e21430
Method Name: DynamicClass.dm()
Class: 02e21288
MethodTable: 02e212f8
mdToken: 06000000
Module: 0020e0b0
IsJitted: no
m_CodeOrIL: ffffffff
We can see the name and signature of the method. Moreover, if we do
0:000> !dumpil 02e21430
This is dynamic IL. Exception info is not reported at this time.
If a token is unresolved, run "!do <addr>" on the addr given in parenthesis.
You can also look at the token table yourself, by running "!DumpArray 00bc4644".
IL_0000: ldstr 70000002 "Insdie DM"
IL_0005: call 6000003 System.Console.WriteLine(System.String)
IL_000a: ldc.i8 0
We can see the IL of the method body. This method has a problem, so the IL body shown is not complete.
If we run !DumpArray on the address that !dumpil printed, we find the token table:
0:000> !DumpArray 00bc4644
Name: System.Object[]
MethodTable: 5bcb9434
EEClass: 5bcb9a14
Size: 32(0x20) bytes
Array: Rank 1, Number of elements 4, Type CLASS
Element Methodtable: 5bca30dc
[0] null
[1] 00bc42b8
[2] 00bc3eb0
[3] 00bc72b8
LCG keeps its own token table for handles and strings, so this table is very important. The ldstr token in the IL body is 70000002, so the string should live in position [2] of the token table. We can verify it by doing:
0:000> !do 00bc3eb0
Name: System.String
MethodTable: 5bca3760
EEClass: 5bca36f0
Size: 36(0x24) bytes
(C:\WINDOWS\Microsoft.NET\Framework\v2.0.50523dbg\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: Insdie DM
Fields:
MT Field Offset Type VT Attr Value Name
5bca76f4 4000095 4 System.Int32 0 instance 10 m_arrayLength
5bca76f4 4000096 8 System.Int32 0 instance 9 m_stringLength
5bca511c 4000097 c System.Char 0 instance 49 m_firstChar
5bca3760 4000098 10 System.String 0 shared static Empty
>> Domain:Value 0015d8f8:5bc567a4 <<
5bcb98d0 4000099 14 System.Char[] 0 shared static WhitespaceChars
>> Domain:Value 0015d8f8:00bc12c8 <<
The call token is 6000003, so the method handle for Console.WriteLine should be in [3]:
0:000> !do 00bc72b8
Name: System.RuntimeMethodHandle
MethodTable: 5bcadfcc
EEClass: 5bd7a614
Size: 12(0xc) bytes
(C:\WINDOWS\Microsoft.NET\Framework\v2.0.50523dbg\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
Fields:
MT Field Offset Type VT Attr Value Name
5bca6cbc 40004df 4 System.IntPtr 0 instance 1541425360 m_ptr
0:000> !dumpmd 0n1541425360
Method Name: System.Console.WriteLine(System.String)
Class: 5bd75708
MethodTable: 5bca66f0
mdToken: 06000767
Module: 5bc44000
IsJitted: yes
m_CodeOrIL: 5b704610
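The slot lookup in this walkthrough can be written down as a small helper. This is only a sketch of what the dump above suggests: the top byte of a token in dynamic IL names the kind of thing it refers to (0x70 for the string, 0x06 for the method), and the remaining bits are the position in the array shown by !DumpArray. The class and method names (LcgToken, Kind, Index) are made up for illustration and are not part of any CLR or SOS API.

```csharp
using System;

// Hypothetical helper: splits a token taken from "!dumpil" output into its parts.
static class LcgToken
{
    // Upper 8 bits: what the token refers to (0x70 = string, 0x06 = method).
    public static int Kind(int token)
    {
        return (int)((uint)token >> 24);
    }

    // Lower 24 bits: position in the dynamic method's token table.
    public static int Index(int token)
    {
        return token & 0x00FFFFFF;
    }

    static void Main()
    {
        // The two tokens that appear in the IL dump above.
        int[] tokens = { 0x70000002, 0x06000003 };
        foreach (int t in tokens)
        {
            Console.WriteLine("token {0:x8}: kind 0x{1:x2}, table slot [{2}]",
                              t, Kind(t), Index(t));
        }
    }
}
```

For the two tokens in the dump this gives slots [2] and [3], which are the entries inspected with !do above.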
The two most common causes of InvalidProgramException are:
1. You are emitting bad IL;
2. If you are using the open scope APIs, you could be messing up the token tables.
So these SOS commands should help you track down most InvalidProgramExceptions.
I will explain the open scope APIs and token scopes in later posts.
Another easy way to check your IL stream is to emit the same IL with Reflection.Emit, so that you can bake the method and use tools such as peverify to find out where you went wrong.
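Here is a minimal sketch of that approach. It assumes the .NET Framework, because AssemblyBuilderAccess.RunAndSave and AssemblyBuilder.Save are only available there. The names DmCheck, EmitForPeVerify and EmitBody are made up for the illustration, and the body is a stand-in: replace it with the IL you actually feed to your DynamicMethod.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

static class EmitForPeVerify
{
    // Put the IL under test here; both DynamicMethod and MethodBuilder
    // hand out an ILGenerator, so the same routine can feed either one.
    static void EmitBody(ILGenerator il)
    {
        il.Emit(OpCodes.Ldstr, "Inside DM");
        il.Emit(OpCodes.Call,
                typeof(Console).GetMethod("WriteLine", new Type[] { typeof(string) }));
        il.Emit(OpCodes.Ret);
    }

    // Emits the body into a static method of a saved assembly (DmCheck.dll).
    public static void Build()
    {
        AssemblyName name = new AssemblyName("DmCheck");
        AssemblyBuilder asm = AppDomain.CurrentDomain.DefineDynamicAssembly(
            name, AssemblyBuilderAccess.RunAndSave);
        ModuleBuilder mod = asm.DefineDynamicModule("DmCheck", "DmCheck.dll");
        TypeBuilder type = mod.DefineType("DynamicClass", TypeAttributes.Public);
        MethodBuilder method = type.DefineMethod(
            "dm", MethodAttributes.Public | MethodAttributes.Static,
            typeof(void), Type.EmptyTypes);

        EmitBody(method.GetILGenerator());

        type.CreateType();        // bake the type
        asm.Save("DmCheck.dll");  // write it to the current directory
    }

    static void Main()
    {
        Build();
        Console.WriteLine("Saved DmCheck.dll; now run: peverify DmCheck.dll");
    }
}
```

Running peverify on the saved file should then point at the offending method and IL offset if the body is invalid.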
Finally, can you tell me what I did wrong in this small program?
|
-
I think I will start several topics around LCG in the future. Our PM Joel has an excellent post about various method invocations, and its paragraph about LCG is a good starting point for Lightweight Code Gen.
http://blogs.msdn.com/joelpob/archive/2004/04/01/105862.aspx
Here is a list of basic dynamic features:
1. LCG Methods are reclaimable.
If you have used Reflection.Emit, you have probably found that you cannot delete what you emitted within the same AppDomain. That is because everything Reflection.Emit produces lives on the loader heap, and that memory is not reclaimable until the AppDomain shuts down. For a dynamic method, the code heap can be fully reclaimed by the GC, and its method desc can be reused. So if you emit a lot of methods within an AppDomain using Reflection.Emit, memory usage will increase linearly until you unload the AppDomain, even after you have finished using most of them. With dynamic methods, if you only hold on to a bounded number of methods while creating and releasing many of them, memory usage will stay flat.
2. An LCG method is a global static method on a module.
An LCG method is basically a piece of code with a minimum amount of metadata associated with it. Although we have a creation API that lets you specify a parent type for the dynamic method, that type is only used for visibility checks: the dynamic method will be able to access all private members of the type and all internal members in the type's module. Metadata-wise, it doesn't belong to that type.
You can imagine, then, that LCG methods can be used to write compiled regular expression evaluators, XSLT transforms, and database queries. They provide an efficient way to do server-side code generation. LCG methods can be invoked early bound (through Emit(OpCodes.Call, DynamicMethod)), late bound (DynamicMethod.Invoke), and through a delegate.
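To make the last two points concrete, here is a small sketch. Counter, LcgDemo, BuildReader and BuildCaller are hypothetical names invented for this example. The first dynamic method names Counter as its owner type so that it can read a private field. The second one calls the first early bound, and Main shows the late bound and delegate paths.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical type, used only for this illustration.
class Counter
{
    private int count = 41;
    public void Increment() { count++; }
}

delegate int CounterReader(Counter c);

static class LcgDemo
{
    // A dynamic method owned by Counter: the owner type only matters for
    // visibility checks, so the body may load the private field.
    public static DynamicMethod BuildReader()
    {
        FieldInfo field = typeof(Counter).GetField(
            "count", BindingFlags.Instance | BindingFlags.NonPublic);
        DynamicMethod dm = new DynamicMethod(
            "ReadCount", typeof(int), new Type[] { typeof(Counter) }, typeof(Counter));
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldfld, field);
        il.Emit(OpCodes.Ret);
        return dm;
    }

    // Early bound: another dynamic method calls the first one with OpCodes.Call
    // and adds one to the result.
    public static DynamicMethod BuildCaller(DynamicMethod target)
    {
        DynamicMethod dm = new DynamicMethod(
            "ReadPlusOne", typeof(int), new Type[] { typeof(Counter) }, typeof(Counter));
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Call, target);
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);
        return dm;
    }

    static void Main()
    {
        Counter c = new Counter();
        DynamicMethod reader = BuildReader();
        DynamicMethod caller = BuildCaller(reader);

        // Late bound: DynamicMethod.Invoke, as with any MethodInfo.
        object late = reader.Invoke(null, new object[] { c });

        // Through a delegate: usually the cheapest way to call repeatedly.
        CounterReader read = (CounterReader)reader.CreateDelegate(typeof(CounterReader));

        Console.WriteLine("late bound: {0}, delegate: {1}, early bound + 1: {2}",
                          late, read(c), caller.Invoke(null, new object[] { c }));
    }
}
```

Once the program drops its references to reader, caller and read, nothing keeps the generated code alive, which is the reclaimable behavior described in point 1.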
|
-
Shreeman suggested that we publish a list of known Reflection.Emit restrictions (ones that are not going to be fixed in Whidbey). I happen to have such a list on my machine for my own reference, so I am just posting it here:
- Cannot Emit nested enum type
- Cannot Emit global field
- Cannot Emit private enum field
- Small things like:
- Cannot Emit some of the AssemblyNameFlags such as 0; Cannot Emit CallingConventions.WinAPI
- EventBuilder should derive from EventInfo
- This will probably never be fixed, because fixing it would be a breaking change from previous versions.
- Cannot Emit new format security attribute
In Whidbey, security attributes got a new format that has a shorter blob and a layout closer to that of normal custom attributes.
For example, a ReflectionPermission attribute with the MemberAccess flag looks like this in ildasm:
.permissionset reqmin
= {[mscorlib]System.Security.Permissions.ReflectionPermissionAttribute = {property enum class 'System.Security.Permissions.ReflectionPermissionFlag, mscorlib, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089' 'Flags' = int32(2)}}
The old format in Everett looks like this:
.permissionset reqmin
"<PermissionSet class=\"System.Security.PermissionSe"
+ "t\"\r\nversion=\"1\">\r\n<IPermission class=\"System.Security.Permis"
+ "sions.ReflectionPermission, mscorlib, Version=2.0.0.0, Cultu"
+ "re=neutral, PublicKeyToken=b77a5c561934e089\"\r\nversion=\"1\"\r\nF"
+ "lags=\"MemberAccess\"/>\r\n</PermissionSet>\r\n"
If you press Ctrl+M to open up the metadata info, you will find that the new attribute's blob is shorter. The two formats serve the same purpose and have the same effect. The only problem is that Reflection does not support reflecting on the old-format security attribute.
- When emitting a method, some orderings of the API calls can cause the method not to be emitted quite right.
The order in which you set custom attributes, parameters, and implementation flags on a method can matter.
A best practice is to always call SetImplementationFlags last, and to always set the return type before setting the parameter types.
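As an illustration of that ordering advice, here is a sketch of a method definition that sets the return type first, then the parameters, then a custom attribute and the body, and touches the implementation flags last. The names OrderDemo, Calc, Add and MethodEmitOrder are made up, and the sketch does not try to reproduce the orderings that go wrong. It uses the static AssemblyBuilder.DefineDynamicAssembly, which assumes .NET Framework 4.5 or later; on older versions AppDomain.CurrentDomain.DefineDynamicAssembly takes the same arguments.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

static class MethodEmitOrder
{
    public static MethodInfo BuildAdd()
    {
        AssemblyBuilder asm = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("OrderDemo"), AssemblyBuilderAccess.Run);
        ModuleBuilder mod = asm.DefineDynamicModule("OrderDemo");
        TypeBuilder type = mod.DefineType("Calc", TypeAttributes.Public);

        MethodBuilder mb = type.DefineMethod(
            "Add", MethodAttributes.Public | MethodAttributes.Static);

        // 1. Return type first, then parameter types.
        mb.SetReturnType(typeof(int));
        mb.SetParameters(typeof(int), typeof(int));
        mb.DefineParameter(1, ParameterAttributes.None, "a");
        mb.DefineParameter(2, ParameterAttributes.None, "b");

        // 2. Custom attributes.
        ConstructorInfo ctor = typeof(ObsoleteAttribute).GetConstructor(Type.EmptyTypes);
        mb.SetCustomAttribute(new CustomAttributeBuilder(ctor, new object[0]));

        // 3. The body.
        ILGenerator il = mb.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);

        // 4. Implementation flags last.
        mb.SetImplementationFlags(MethodImplAttributes.IL
                                  | MethodImplAttributes.Managed
                                  | MethodImplAttributes.NoInlining);

        return type.CreateType().GetMethod("Add");
    }

    static void Main()
    {
        MethodInfo add = BuildAdd();
        Console.WriteLine(add.Invoke(null, new object[] { 2, 3 }));
    }
}
```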
Finally, a user asked me how CodeDom compares with Reflection.Emit. I know little about CodeDom, so I'd rather not comment on it here. Here is an article I found on the web that sheds some light on the topic:
http://www.fawcette.com/reports/vslivesf/2004/holmes/
Edited: 7/2/2005
Removed some restrictions since they are going to be fixed in Whidbey.
|
|
|
|