Firstly, the big metadata diagram. Thanks to Chris King for this absolute gem. Now, on to the various ways developers can read and write metadata bits...

 

Unmanaged Metadata Reader APIs

I’ve mentioned several times before that there are unmanaged metadata reader/writer APIs that you can use to traverse and write out metadata structures. They’re a great workaround for those subtle Reflection (or Reflection.Emit) bugs, or for gaping holes in the Reflection API. I get a few questions here and there about these unmanaged APIs (mostly bootstrapping-related questions), so I’ve decided to drop in documentation and a bit of source code I cooked up today:

 

Some documentation for the unmanaged metadata APIs can be found here (also on Google, MSDN and various newsgroups).

 

A quick bootstrap code snippet is below; it prints the names of all methods on every type definition in a given assembly:

 

:/> MDRExample.exe System.Web.dll

// MDRExample.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include "cor.h"
#include <stdio.h>
#include <objbase.h>

// wmain hands us the command line as wide strings, which is what
// OpenScope expects, so no narrow-to-wide conversion is needed
int wmain(int argc, wchar_t* argv[])
{
    if ( argc < 2 )
    {
        printf("usage: MDRExample.exe <assembly>\n");
        return 1;
    }

    IMetaDataDispenser* _IMetaDataDispenser = 0;
    IMetaDataImport* _IMetaDataImport = 0;
    IMetaDataAssemblyImport* _IMetaDataAssemblyImport = 0;

    CoInitialize( 0 );
    HRESULT hr;

    // go create all the interfaces
    hr = CoCreateInstance(CLSID_CorMetaDataDispenser, 0,
                                    CLSCTX_INPROC_SERVER,
                                    IID_IMetaDataDispenser,
                                    (LPVOID*)&_IMetaDataDispenser );
    if ( FAILED(hr) )
        throw "failed on IMetaDataDispenser";

    hr = _IMetaDataDispenser->OpenScope(argv[1], ofRead,
                                    IID_IMetaDataImport,
                                    (LPUNKNOWN *)&_IMetaDataImport );
    if ( FAILED(hr) )
        throw "failed on IMetaDataImport";

    hr = _IMetaDataDispenser->OpenScope(argv[1], ofRead,
                                    IID_IMetaDataAssemblyImport,
                                    (LPUNKNOWN *)&_IMetaDataAssemblyImport);
    if ( FAILED(hr) )
        throw "failed on IMetaDataAssemblyImport";

    HCORENUM    _hTypeEnum = 0;
    mdTypeDef   _typeDefs[2048];
    ULONG       _maxTypeDefs = sizeof(_typeDefs) / sizeof(_typeDefs[0]);
    ULONG       _countTypeDefs = 0;

    // go get all the type defs defined in the assembly; the buffer size is
    // an element count (not a byte count), and the enumerator is called
    // until it has no more tokens to hand back
    while ( SUCCEEDED(_IMetaDataImport->EnumTypeDefs(&_hTypeEnum,
                                    _typeDefs,
                                    _maxTypeDefs,
                                    &_countTypeDefs)) && _countTypeDefs > 0 )
    {
        for (ULONG i = 0; i < _countTypeDefs; i++)
        {
            HCORENUM    _hMethodEnum = 0;
            mdMethodDef _methodDefs[2048];
            ULONG       _maxMethodDefs = sizeof(_methodDefs) / sizeof(_methodDefs[0]);
            ULONG       _countMethodDefs = 0;

            // go get all the methods defined on the typedef
            while ( SUCCEEDED(_IMetaDataImport->EnumMethods(&_hMethodEnum,
                                    _typeDefs[i],
                                    _methodDefs,
                                    _maxMethodDefs,
                                    &_countMethodDefs)) && _countMethodDefs > 0 )
            {
                // now print out each method's name
                for (ULONG j = 0; j < _countMethodDefs; j++)
                {
                    wchar_t _methodName[1024];
                    ULONG   _maxMethodName = sizeof(_methodName) / sizeof(_methodName[0]);
                    ULONG   _lengthMethodName = 0;

                    hr = _IMetaDataImport->GetMethodProps(_methodDefs[j], 0,
                                    _methodName, _maxMethodName,
                                    &_lengthMethodName, 0, 0, 0, 0, 0);
                    if ( SUCCEEDED(hr) )
                        wprintf(L"%ls\n", _methodName);
                }
            }
            if ( _hMethodEnum )
                _IMetaDataImport->CloseEnum(_hMethodEnum);
        }
    }
    if ( _hTypeEnum )
        _IMetaDataImport->CloseEnum(_hTypeEnum);

    // release the interfaces and shut COM down
    _IMetaDataAssemblyImport->Release();
    _IMetaDataImport->Release();
    _IMetaDataDispenser->Release();
    CoUninitialize();

    return 0;
}

 

AbsIL (Abstract IL)

AbsIL is a toolkit from the smart people (yeah that’s you Don) at Microsoft Research that provides an abstracted view of metadata (generally in the form of an abstract syntax tree). Of course, having this transformation from metadata relationships to an abstract tree view gives you all sorts of opportunities for analysis and manipulation. Some interesting uses include assembly rewriting, contract and performance analysis, discovery for plug-in style architectures, investigating exception handling issues, and more.

 

Consumers of this toolkit beware: you’ll need to brush up on OCaml, as the AbsIL libs have been written in F# (and therefore expose a functional-style API). Writing an AbsIL program in C# can be very frustrating to begin with, but sometimes power comes with pain. Here’s something I hacked up in C# that’s similar to the unmanaged metadata reader example above. It prints the name of each type and method in an assembly provided as a command line argument:

 

using System;

class AbsILExample
{
    // Prints the name of a type definition or a method definition.
    static object writeName(object t)
    {
        if (t is Il.type_def)
            Console.WriteLine(((Il.type_def) t).tdName);
        else if (t is Il.method_def)
            Console.WriteLine(((Il.method_def) t).mdName);
        return t;
    }

    // Prints the name of every method defined on the given type definition.
    static object writeMethodName(object t)
    {
        List.iter(new System.Func(writeName), Il.dest_mdefs(((Il.type_def)t).tdMethodDefs));
        return t;
    }

    static void Main(string[] args)
    {
        // Read the assembly and pull out its list of type definitions.
        Il.modul module = Ilread.read_binary(args[0], FS.option.MkNone());
        FS.list l = Il.dest_tdefs(module.modulTypeDefs);

        Console.WriteLine("type names:");
        List.iter(new System.Func(writeName), l);

        Console.WriteLine("\nmethod names:");
        List.iter(new System.Func(writeMethodName), l);
    }
}

 

Find AbsIL at the Microsoft Research website.

 

PEAPI (or Perrrwapi)

My old research lab has a managed metadata reader/writer called PEAPI. From the website it looks like they’ve kicked some butt on perf (“The new backend can create a program executable file in almost exactly the same length of time as it takes to write the equivalent CIL text file”). Crazy fast. I believe the Mono C# compiler uses this component as its code generator, although I can’t verify that.

 

I don’t have much experience with this API and haven’t had a chance to write anything with the latest bits, although I do remember years back when the component author spat out the first alpha - fun stuff. Ahh the good old days…

 

Find it here.

 

IL Reader for .NET

A fellow Microsoftee, Lutz Roeder, wrote a great .NET library for IL reading called IL Reader. While it doesn’t exist for 2.0 (not exactly sure what his plans are for a 2.0 rev), it supports reading a type’s methods in 1.0 and 1.1 assemblies. The great thing about this library is that it returns its instructions as Reflection.Emit opcodes, giving you the ability to read in, manipulate, and write out via Reflection.Emit with relative ease.

 

Find IL Reader here.

 

Whidbey’s MethodBody API

I mention this API here and here. It’s similar to Lutz’s IL Reader, but doesn’t have the complete opcode abstraction yet. You’ll have to take the byte stream, go find the ECMA spec or something, and decode the bytes yourself (which is fairly trivial, yet fun to do).
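To give a feel for what "decode the bytes yourself" involves, here is a rough sketch in portable C++ rather than against the MethodBody API itself. It assumes you already have a method's raw IL bytes in a buffer. The names decodeIL, lookupOpcode and OpInfo are made up for this example, and the table covers only a handful of single-byte opcodes (values taken from the ECMA-335 instruction set); a real decoder would need the full table, the 0xFE two-byte prefix, and the variable-length switch instruction.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical record for one opcode: its mnemonic and how many
// inline operand bytes follow it in the IL stream.
struct OpInfo { const char* name; int operandBytes; };

// Hypothetical lookup covering a tiny subset of single-byte CIL opcodes.
static bool lookupOpcode(uint8_t op, OpInfo& out)
{
    switch (op)
    {
        case 0x00: out = {"nop", 0};      return true;
        case 0x02: out = {"ldarg.0", 0};  return true;
        case 0x1F: out = {"ldc.i4.s", 1}; return true;
        case 0x20: out = {"ldc.i4", 4};   return true;
        case 0x28: out = {"call", 4};     return true;
        case 0x2A: out = {"ret", 0};      return true;
        case 0x58: out = {"add", 0};      return true;
        case 0x72: out = {"ldstr", 4};    return true;
        default:   return false;
    }
}

// Walks the IL bytes and returns one text line per instruction.
// Stops at the first opcode it does not know or a truncated operand.
std::vector<std::string> decodeIL(const std::vector<uint8_t>& il)
{
    std::vector<std::string> lines;
    size_t pos = 0;
    while (pos < il.size())
    {
        char buf[64];
        OpInfo info;
        uint8_t op = il[pos];
        if (!lookupOpcode(op, info) || pos + 1 + info.operandBytes > il.size())
        {
            snprintf(buf, sizeof buf, "IL_%04x: <unknown 0x%02x>",
                     (unsigned)pos, (unsigned)op);
            lines.push_back(buf);
            break;
        }
        assert(info.operandBytes <= 4);

        // Inline operands are stored little-endian; shown here as raw hex.
        uint32_t operand = 0;
        for (int i = 0; i < info.operandBytes; i++)
            operand |= (uint32_t)il[pos + 1 + i] << (8 * i);

        if (info.operandBytes == 0)
            snprintf(buf, sizeof buf, "IL_%04x: %s", (unsigned)pos, info.name);
        else
            snprintf(buf, sizeof buf, "IL_%04x: %s 0x%x",
                     (unsigned)pos, info.name, (unsigned)operand);
        lines.push_back(buf);
        pos += 1 + info.operandBytes;
    }
    return lines;
}
```

Feeding it the bytes 02 1F 2A 58 2A should give four lines: ldarg.0, ldc.i4.s 0x2a, add and ret, each prefixed with its IL offset. The operand is printed as raw hex to keep the sketch short; a fuller decoder would interpret it per opcode (a signed byte for ldc.i4.s, a metadata token for call and ldstr).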

 

 

Feel free to leave comments below if you know of other interesting metadata reading/writing toolkits - I've only done a quick braindump here. In any case, hopefully there's some value in this post for those who want to read and write any metadata bit they like!