Generating Dynamic Methods with Expression Trees in Visual Studio 2010


Expression trees first appeared in Visual Studio 2008, where they were mainly used by LINQ providers. You can use expression trees to represent code in a tree-like format, where each node is an expression. You can also convert expression trees into compiled code and run it. This transformation enables dynamic modification of executable code as well as the execution of LINQ queries in various databases and the creation of dynamic queries. Expression trees in Visual Studio 2008 are explained in Charlie Calvert’s blog post Expression Tree Basics.

In this post I’m going to show how expression trees were extended in Visual Studio 2010 and how you can use them to generate dynamic methods (a problem that previously could be solved only by emitting MSIL). I strongly recommend reading Charlie’s blog post first, but I will still repeat some basics here to spell out certain nuances.

Creating Expression Trees

The easiest way to generate an expression tree is to create an instance of the Expression<T> type, where T is a delegate type, and assign a lambda expression to this instance. Let’s take a look at the code.

    // Creating an expression tree by providing a lambda expression.
    Expression<Action<int>> printExpr = (arg) => Console.WriteLine(arg);

    // Compiling and invoking the expression tree.
    printExpr.Compile()(10);
    // Prints 10.

In this example, the C# compiler generates the expression tree from the provided lambda expression. Note that if you use Action<int> instead of Expression<Action<int>> as the type of the printExpr object, no expression tree is created: the compiler turns the lambda directly into a delegate, and a delegate cannot be converted back into an expression tree.
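
To make the "code as a tree of nodes" idea more concrete, here is a minimal sketch that inspects the tree the compiler builds for the lambda above. The class name InspectTreeDemo and the Describe helper are made up for illustration; the properties it reads (NodeType, Body, Parameters, Method) are part of System.Linq.Expressions.

```csharp
using System;
using System.Linq.Expressions;

static class InspectTreeDemo
{
    // Hypothetical helper: summarizes the root node, the body node,
    // the called method, and the lambda's single parameter.
    public static string Describe(Expression<Action<int>> expr)
    {
        // For this lambda the body is a single method call node.
        var call = (MethodCallExpression)expr.Body;
        return expr.NodeType + " -> " + call.NodeType + " " +
               call.Method.Name + "(" + expr.Parameters[0].Name + ")";
    }

    static void Main()
    {
        Expression<Action<int>> printExpr = arg => Console.WriteLine(arg);
        Console.WriteLine(Describe(printExpr));
        // Prints: Lambda -> Call WriteLine(arg)
    }
}
```

The same lambda assigned to a plain Action<int> has no Body or Parameters to inspect, which is the practical difference between the two declarations.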

However, this is not the only way to create an expression tree. You can also use classes and methods from the System.Linq.Expressions namespace. For example, you can create the same expression tree by using the following code.

    // Creating a parameter for the expression tree.
    ParameterExpression param = Expression.Parameter(typeof(int), "arg");

    // Creating an expression for the method call and specifying its parameter.
    MethodCallExpression methodCall = Expression.Call(
        typeof(Console).GetMethod("WriteLine", new Type[] { typeof(int) }),
        param
    );

    // Compiling and invoking the methodCall expression.
    Expression.Lambda<Action<int>>(
        methodCall,
        new ParameterExpression[] { param }
    ).Compile()(10);
    // Also prints 10.

Of course this looks much more complicated, but this is what actually happens when you supply a lambda expression to an expression tree.
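
One rough way to convince yourself that the two approaches produce the same shape of tree is to compare their textual form. The sketch below is only an illustration: the SameTreeDemo class and its two factory methods are hypothetical names, and comparing ToString() output is a quick sanity check rather than a real structural equality test.

```csharp
using System;
using System.Linq.Expressions;

static class SameTreeDemo
{
    // Tree generated by the C# compiler from a lambda expression.
    public static Expression<Action<int>> FromLambda()
    {
        return arg => Console.WriteLine(arg);
    }

    // The same tree assembled by hand with the expression trees API.
    public static Expression<Action<int>> FromApi()
    {
        ParameterExpression param = Expression.Parameter(typeof(int), "arg");
        MethodCallExpression call = Expression.Call(
            typeof(Console).GetMethod("WriteLine", new Type[] { typeof(int) }),
            param);
        return Expression.Lambda<Action<int>>(call, param);
    }

    static void Main()
    {
        // Both lines should show the same text.
        Console.WriteLine(FromLambda());
        Console.WriteLine(FromApi());
    }
}
```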

Expression Trees vs. Lambda Expressions

A common misconception is that expression trees are identical to lambda expressions. This is not true. On the one hand, as I have already shown, you can create and modify expression trees by using API methods, without using lambda expression syntax at all. On the other hand, not every lambda expression can be implicitly converted into an expression tree. For example, multiline lambdas (also called statement lambdas) cannot be implicitly converted into expression trees.

    // You can use multiline lambdas in delegates.
    Action<int> printTwoLines = (arg) =>
    {
        Console.WriteLine("Print arg:");
        Console.WriteLine(arg);
    };

    // But in expression trees this generates a compiler error.
    Expression<Action<int>> printTwoLinesExpr = (arg) =>
    {
        Console.WriteLine("Print arg:");
        Console.WriteLine(arg);
    };
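
The other half of the argument, that trees can be modified through the API, deserves a small illustration too. Expression trees are immutable, so "modifying" one really means building a new tree that reuses nodes from the old one. The sketch below assumes a lambda whose body is a single addition; RebuildTreeDemo and AddToMultiply are hypothetical names.

```csharp
using System;
using System.Linq.Expressions;

static class RebuildTreeDemo
{
    // Builds a new lambda that multiplies the two operands the source lambda adds.
    // Assumes source.Body is a binary node such as x + 3; the cast fails otherwise.
    public static Expression<Func<int, int>> AddToMultiply(
        Expression<Func<int, int>> source)
    {
        var add = (BinaryExpression)source.Body;
        BinaryExpression multiply = Expression.Multiply(add.Left, add.Right);

        // Reuse the original parameter objects so the new body stays bound to them.
        return Expression.Lambda<Func<int, int>>(multiply, source.Parameters);
    }

    static void Main()
    {
        Expression<Func<int, int>> plus = x => x + 3;
        Expression<Func<int, int>> times = AddToMultiply(plus);

        Console.WriteLine(plus.Compile()(5));   // Prints 8.
        Console.WriteLine(times.Compile()(5));  // Prints 15.
    }
}
```

The original tree is left untouched, which is why both delegates can be compiled and used side by side.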

Expression Trees in Visual Studio 2010

All the code examples I have shown so far work (or don’t work) the same in both Visual Studio 2008 and Visual Studio 2010. Now let’s move to C# 4.0 and Visual Studio 2010.

In Visual Studio 2010, the expression trees API was extended and added to the dynamic language runtime (DLR), so that language implementers can emit expression trees rather than MSIL. To support this new goal, control flow and assignment features were added to expression trees: loops, conditional blocks, try-catch blocks, and so on.

There is a catch: you cannot use these new features the easy way, through lambda expression syntax. You must use the expression trees API. So the last code example in the previous section still generates a compiler error, even in Visual Studio 2010.

But now you have a way to create such an expression tree by using API methods that were not available in Visual Studio 2008. One of these methods is Expression.Block, which enables the execution of several expressions sequentially, and this is exactly the method that I need for this example.

    // Creating a parameter for an expression tree.
    ParameterExpression param = Expression.Parameter(typeof(int), "arg");

    // Creating an expression for printing a constant string.
    MethodCallExpression firstMethodCall = Expression.Call(
        typeof(Console).GetMethod("WriteLine", new Type[] { typeof(String) }),
        Expression.Constant("Print arg:")
    );

    // Creating an expression for printing a passed argument.
    MethodCallExpression secondMethodCall = Expression.Call(
        typeof(Console).GetMethod("WriteLine", new Type[] { typeof(int) }),
        param
    );

    // Creating a block expression that combines two method calls.
    BlockExpression block = Expression.Block(firstMethodCall, secondMethodCall);

    // Compiling and invoking an expression tree.
    Expression.Lambda<Action<int>>(
        block,
        new ParameterExpression[] { param }
    ).Compile()(10);
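
A block is not limited to void statements. It can declare local variables, and its value is the value of its last expression. The following is a minimal sketch of that behavior; BlockValueDemo, BuildSquarePlusOne, and the variable name tmp are made up for this example.

```csharp
using System;
using System.Linq.Expressions;

static class BlockValueDemo
{
    // Builds the equivalent of: x => { int tmp = x * x; return tmp + 1; }
    public static Func<int, int> BuildSquarePlusOne()
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ParameterExpression tmp = Expression.Variable(typeof(int), "tmp");

        BlockExpression body = Expression.Block(
            new[] { tmp },                                      // local variables of the block
            Expression.Assign(tmp, Expression.Multiply(x, x)),  // tmp = x * x
            Expression.Add(tmp, Expression.Constant(1))         // last expression is the block's value
        );

        return Expression.Lambda<Func<int, int>>(body, x).Compile();
    }

    static void Main()
    {
        Console.WriteLine(BuildSquarePlusOne()(4));  // Prints 17.
    }
}
```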

I’ll repeat this: Although the expression trees API was extended, the way expression trees work with lambda expression syntax did not change. This means that LINQ queries in Visual Studio 2010 have the same features (and the same limitations) that they had in Visual Studio 2008.

But because of the new features, you can find more areas outside of LINQ where you can use expression trees.
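
As one illustration of the control flow nodes mentioned above, here is a sketch of a try-catch block built with the API. It assumes you want a division that returns -1 instead of throwing; the TryCatchDemo class, the BuildSafeDivide method, and the -1 fallback are all choices made for this example only.

```csharp
using System;
using System.Linq.Expressions;

static class TryCatchDemo
{
    // Builds the equivalent of:
    // (a, b) => { try { return a / b; } catch (DivideByZeroException) { return -1; } }
    public static Func<int, int, int> BuildSafeDivide()
    {
        ParameterExpression a = Expression.Parameter(typeof(int), "a");
        ParameterExpression b = Expression.Parameter(typeof(int), "b");

        // The try body and the catch body must have the same type (int here).
        TryExpression body = Expression.TryCatch(
            Expression.Divide(a, b),
            Expression.Catch(typeof(DivideByZeroException), Expression.Constant(-1))
        );

        return Expression.Lambda<Func<int, int, int>>(body, a, b).Compile();
    }

    static void Main()
    {
        Func<int, int, int> safeDivide = BuildSafeDivide();
        Console.WriteLine(safeDivide(10, 2));  // Prints 5.
        Console.WriteLine(safeDivide(1, 0));   // Prints -1.
    }
}
```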

Generating Dynamic Methods

Now let’s move to the real problems that the new API can help with. The best-known one is creating dynamic methods. The common solution to this problem is to use System.Reflection.Emit and work directly with MSIL. Needless to say, the resulting code is hard to write and read.

Strictly speaking, the expression tree shown earlier that prints two lines to the console is already an example of a dynamic method. But let’s try a slightly more complex one to demonstrate more features of the new API. Thanks to John Messerly, a developer on the DLR team, for providing the following example.

Assume that you have a simple method that calculates the factorial of a number.

    static int CSharpFact(int value)
    {
        int result = 1;
        while (value > 1)
        {
            result *= value--;
        }
        return result;
    }

Now you want a dynamic method that does the same thing. We have several essential elements here: a parameter that is passed to a method, a local variable, and a loop. This is how you can represent these elements by using the expression trees API.

    static Func<int, int> ETFact()
    {
        // Creating a parameter expression.
        ParameterExpression value = Expression.Parameter(typeof(int), "value");

        // Creating an expression to hold a local variable.
        // (Expression.Variable creates the same node type as Expression.Parameter,
        // but states the intent more clearly.)
        ParameterExpression result = Expression.Variable(typeof(int), "result");

        // Creating a label to jump to from a loop.
        LabelTarget label = Expression.Label(typeof(int));

        // Creating a method body.
        BlockExpression block = Expression.Block(
            // Adding a local variable.
            new[] { result },

            // Assigning a constant to a local variable: result = 1
            Expression.Assign(result, Expression.Constant(1)),

            // Adding a loop.
            Expression.Loop(
                // Adding a conditional block into the loop.
                Expression.IfThenElse(
                    // Condition: value > 1
                    Expression.GreaterThan(value, Expression.Constant(1)),

                    // If true: result *= value--
                    Expression.MultiplyAssign(result,
                        Expression.PostDecrementAssign(value)),

                    // If false, exit the loop and go to the label.
                    Expression.Break(label, result)
                ),
                // Label to jump to.
                label
            )
        );

        // Compile an expression tree and return a delegate.
        return Expression.Lambda<Func<int, int>>(block, value).Compile();
    }

Yes, this may look more complicated and less clear than the original C# code. But compare it to what you have to write to generate MSIL.

    // Requires: using System.Reflection.Emit;
    static Func<int, int> ILFact()
    {
        var method = new DynamicMethod(
            "factorial", typeof(int),
            new[] { typeof(int) }
        );
        var il = method.GetILGenerator();
        var result = il.DeclareLocal(typeof(int));
        var startWhile = il.DefineLabel();
        var returnResult = il.DefineLabel();

        // result = 1
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Stloc, result);

        // if (value <= 1) branch end
        il.MarkLabel(startWhile);
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Ble_S, returnResult);

        // result *= (value--)
        il.Emit(OpCodes.Ldloc, result);
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Dup);
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Sub);
        // Starg_S takes a one-byte operand, so pass the argument index as a byte.
        il.Emit(OpCodes.Starg_S, (byte)0);
        il.Emit(OpCodes.Mul);
        il.Emit(OpCodes.Stloc, result);

        // end while
        il.Emit(OpCodes.Br_S, startWhile);

        // return result
        il.MarkLabel(returnResult);
        il.Emit(OpCodes.Ldloc, result);
        il.Emit(OpCodes.Ret);

        return (Func<int, int>)method.CreateDelegate(typeof(Func<int, int>));
    }

If You Want to Know More

In Visual Studio 2010, expression trees were developed as a part of the dynamic language runtime, which is also released as an open-source project. You can download the source code and find the specification and documentation for expression trees on the CodePlex Web site.

A more advanced example of using the new expression trees API is shown in Bart De Smet's blog post Expression Trees, Take Two – Introducing System.Linq.Expressions v4.0.

And of course, to try the examples, you need to download Visual Studio 2010 and .NET Framework 4 Beta 1 here.

Comments
  • Nice article!!

  • I know this is too late to change, but I just want to voice my concern that the inclusion of imperative constructs might turn out to be a mistake.

    What I mean is the following: only half of the reason to use expression trees is to generate dynamic methods. The other half is traversing expression trees in order to analyze them; perhaps I'm building a LINQ provider and want to build up a SQL statement.

    While the imperative constructs are great for the first half (generating dynamic methods), I think they complicate the second half (LINQ providers) a lot.

    That said, there are situations where expression trees were lacking, but then I think we should have looked to the functional programming domain for solutions.

    Loops - Loops are easy enough: use recursion or provide higher-order functions to do it (such as .Where).

    Calculating values to be used later in the same expression - There are no good workarounds for these in .NET 3.5. Workarounds exist, such as recalculating the value every time or introducing a new intermediate lambda expression. But instead of introducing statements, I think one should have considered the functional approach and introduced an expression for "let <name> = <expr> in <expr_using_name>".

    Sequential statements - These also don't have a good workaround today. For instance, I've created a generic dispose method that disposes all fields that are disposable. Lacking statements makes it hard but not impossible. What I've done in .NET 3.5 is create a helper object, so the expression becomes helper.DisposeMe(x.Field1).DisposeMe(x.Field2).DisposeMe(x.Field3). I still think statements are unnecessary. The let expression could have solved this as well.

    So, then you say "this is not a problem because C# doesn't allow conversion of code that uses statements and loops into Expression<Func<>>". But you can create it programmatically, and that sort of puts pressure on me to at least consider implementing them.

    I know .NET 4.0 will ship with these changes, which means that life for those of us who like to write LINQ providers suddenly became a lot harder. Nothing I say will change that. I'm just saying I think that other options should have been considered (I'm sure they were and rejected for some rational reasons).

    Regarding expressions, what I'm desperately looking for is a solution to the well-known issue that

    void F(int y)
    {
      Expression<Func<int, int>> exp = x => x + y;
    }

    The expression tree here will be constructed fully from the leaf to the root on every call, even though the major structure of the expression is identical each time. The problem is that 'y' becomes a constant expression that by its nature is a leaf. Due to the immutability of the trees, the whole tree has to be reconstructed.

    The immutability I don't question, immutability is a good thing.

    Two issues:

    1. Performance - Reconstructing a complex tree can be quite CPU intensive because expression trees use Reflection internally (interestingly, trees with field references are faster to construct than trees with property references). I personally would like to use expression trees a lot more, not only for LINQ, and for that they have to be very fast to construct.

    2. Makes caching impossible - Let's say I build some smart transformation of an expression tree into an SQL statement. Next time I receive a tree I'd like to check if the tree actually is the same as something I already processed thus using the cached response. But since it always creates a new instance I can't really cache trees. Sure I as a programmer can declare a global variable that holds the expression and always pass that but most programmers wouldn't know to do that.

    In order to retain immutability, inserting the variable reference has to be moved from the leaf of the tree to the top somehow. I've sketched some ideas in which typing the expression above really becomes this:

    static readonly Expression<Func<int,int, int>> compiler_generated_exp = (x,y) => x + y;

    void F(int y)
    {
    //   Expression<Func<int, int>> exp = x => x + y;
      Expression<Func<int, int>> exp = x => compiler_generated_exp(x, y);
    }

    Then at least the compiler_generated_exp would always be the same and therefore cacheable. I think it would be especially useful if compiler_generated_exp(x, y) used a special expression (called Closure, perhaps), because then my expression tree traverser would know that this is the closure case.

    In addition I think the .Compile() method on the Lambda Expression should benefit from this as well (.Compile() takes around 1ms on my machine which is way too long for my intended usage).

  • Hello, Alexandra!

    Another way to construct dynamic methods is CodeDOM. Look at http://linq2codedom.codeplex.com/. It uses expression trees to create a CodeDOM tree and compiles it at run time.

    With multistatement lambdas the approach will be more efficient and easy.

  • Hi, Marten.  Thanks for your comments on Alexandra’s post.  I thought I’d take a stab at responding to some of your comments.  I work on the DLR.

    If you’re just working with LINQ providers, you likely won’t see any of the new Expression Tree (ET) nodes.  Our languages do not generate them now, and I’m not sure it would ever make sense for LINQ expressions.  We did not want trees for LINQ, trees for the DLR, trees for code modeling (of which we have three now), and so on.  It made a lot of sense to have one unified design for ETs, but then to document for a given subsystem, which ET nodes they work with.

    In a sense our ETs have a functional style to them in that they are expression based, like Lisp, Scheme, Ruby, etc.  You can avoid using the Loop node and generate recursion for looping as you suggest, and we even let you set a flag on your lambdas to do tail calling.  However, we had a design guideline to provide general nodes that were close to language constructs for many languages so that the ETs produced by a certain language could be viewed and processed close to that language’s expressions.  Of course, they won’t ever be one to one, and we had hoped to provide more higher-level nodes that reduced to other nodes, such as Loop.

    We have something very close to “let blah in blah” expressions with our Block.  You effectively create a binding scope, provided expression to execute within that scope, and the last expression implicitly provides the result value.  We omitted the initialization expressions to avoid a more complex model with options for definite assignment, nulls, default values, etc.  You could easily define a Let node that reduced to our Block node and use your own Let node, which the DLR can compile for you by reducing it.

    Again, LINQ providers should not worry about supporting more than ETs v1.

    Anyway, a key point of our design is that you can create extension nodes that are more purely functional for your processing, and if you make them reducible, you can always include them in lambdas for compilation and executions.

    Thanks,

    bill


  • This is absolutely fantastic news -

  • This is cool stuff, but I have to admit that the Expressions to CodeDOM looks much easier (and cleaner).

    I don't know how stable that is though, but if it is good then I may give that a try first.

    It is nice to know that there is language support for this now.

  • CodeDOM is _much_ more heavyweight, because of all the overhead of spinning up an out-of-process compiler (it actually runs csc.exe!), parsing the code, writing IL to a disk file, and then loading the resulting assembly.

  • Hi,

    Good article about Visual Studio 2010. It will be useful for understanding the difference between expression trees and lambda expressions.

  • Great post. This article inspired me to refactor my data access layer into one that dynamically generates data access code based on entity meta-data.

    It works beautifully, and will in the end save me a lot of code.

    Thanks,

    Pete

  • Expressions are used to evaluate something. We usually use them on the right side of an assignment or at any place where a value is expected. They can be composed with many other values, with computations or function calls...
  • Very nice article, please do something about text formatting....

  • Pete, I'd love to see some samples of your refactor? I've been considering the same thing and would be interested in seeing how someone else has achieved it.

  • I know there is already CodeDOM. But, if I am not wrong, code generated by CodeDOM is slow to compile. Generating MSIL is a lot faster, but harder.

    How do expression trees compare to both?

    And is there a way to generate full classes using expression trees? I must say that I would love to do this... I think expression trees can really do that type of job and even make CodeDOM obsolete.

    And is there any possibility of "decompiling" methods to expression trees? Sometimes I really want to decompile a method, change something and create a new method based on it.

    Thanks in advance.

  • Hi Paulo,

    No, I don't think you can generate classes with expression trees. In fact, all you can do is to create a delegate (strictly speaking, not even a method).

    Decompilation of methods is also not there yet... As I mentioned in the article, this implicit conversion is supported for lambda expressions only.

    The thing is that the main goals of expression trees are to serve LINQ and the new interop with dynamic languages. Tricks like this one are really nice advantages, but they were not the main focus of this API.
