In the last post we experimented with expressing a dynamic behavior (addition) using a helper function and came to the conclusion that such an implementation is definitely feasible (after all, IronPython 1.x uses just that), but it has its drawbacks. Today we'll address those issues and get into the real magic of the Dynamic Language Runtime.

The second way to express dynamic behavior in the DLR Tree is using the ActionExpression. In general terms, ActionExpression represents a dynamic invocation of an "action" with a list of arguments. The addition operation is an action "Add" and it accepts two arguments, as we can see in the tree produced by ToyScript from the script:

def add(a, b) {
    return a + b
}

the DLR Tree for the function body is:

//
// CODE BLOCK: add (1)
//

.codeblock Object add ()(
    .arg Object a (Parameter,InParameterArray)
    .arg Object b (Parameter,InParameterArray)
) {
    {
        .return .action (Object) Do Add( // DoOperation Add
            (.bound a)
            (.bound b)
        );
    }
}

The tree that ToyScript compiler produced essentially says: "There's an addition happening here, but the exact details will be figured out at runtime".

But who figures out the "exact details" and how?

There are two things that the DLR code generation does now when it encounters ActionExpression. First, it will emit an instance of what we call a dynamic site and initialize it with the appropriate action. If we were to write a C# pseudo-code for the generated code, it would be close to:

static DynamicSite a_plus_b_site = new DynamicSite(Add);

In the place of the actual operation the code generator will emit an invocation of the dynamic site:

a_plus_b_site.Invoke(a, b);

And that's pretty much it. There are two details that you would notice if you looked at the generated code in the reflector:

First, the DynamicSite is a generic type. Its generic parameters are the types of the ActionExpression's arguments and the desired result type. In our case they would be all objects, but there are cases when even ToyScript generates ActionExpression with more specific concrete types. For example when adding constants directly:

x = 2 + 3

The ToyScript compiler will create an ActionExpression whose arguments are of type System.Double and as a result the dynamic site will be strongly typed as well, as: DynamicSite<Double, Double, Object>. The result is still object because that is what we desire for the result to be so that we can assign it to the variable x (typed as object). The tree created by ToyScript for the assignment above is:

(.bound x) = .action (Object) Do Add( // DoOperation Add
    2
    3
);

Python generates even tighter trees. For example if the action expression is used as a condition in the conditional expression, Python will produce an ActionExpression returning bool directly so that a cast (and moreover, boxing and un-boxing) can be avoided if possible.

The second omission I made in the above pseudo-code is that when invoking the dynamic site, the compiler passes not only the arguments, but also an additional argument - a code context. CodeContext is a class which contains one thing that is essential for the whole system to work - a reference to the executing language.

Now that we know all the details, we should look at the function "add" in the reflector. Unfortunately, at the time of writing this post, the optional command line argument which forces the DLR to save generated assemblies to disk is temporarily out of commission in ToyScript (to be fixed within a day or two). Good thing we have Python around:

def py_add(a, b):
    return a + b

will actually compile into an identical DLR Tree to that which ToyScript produced above:

//
// CODE BLOCK: py_add (1)
//

.codeblock Object py_add ()(
    .arg Object a (Parameter,InParameterArray)
    .arg Object b (Parameter,InParameterArray)
) {

    .return .action (Object) Do Add( // DoOperation Add
        (.bound a)
        (.bound b)
    );
}

Now isn't that curious? Two different languages with potentially rather different semantics compiling into the very same tree? It actually makes a lot of sense. The behavior of the addition is determined at runtime, so each language can still apply its own semantics.
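To make that point concrete, here is a minimal Python sketch. It is not DLR code: `StrictLang` and `LooseLang` are made-up languages with invented semantics, and `CodeContext` is reduced to the one thing mentioned earlier, a reference to the executing language. The sketch also lets the language perform the addition directly and ignores any caching, so it only illustrates how one "Add" action can mean different things.

```python
class CodeContext:
    """Simplified stand-in: carries only a reference to the executing language."""
    def __init__(self, language):
        self.language = language


class StrictLang:
    """Hypothetical language: mixing a string with a non-string is an error."""
    def add(self, a, b):
        if isinstance(a, str) != isinstance(b, str):
            raise TypeError("cannot add a string and a non-string")
        return a + b


class LooseLang:
    """Hypothetical language: mixed operands are concatenated as text."""
    def add(self, a, b):
        if isinstance(a, str) or isinstance(b, str):
            return str(a) + str(b)
        return a + b


def invoke_add_site(context, a, b):
    # The "tree" is identical for both languages: an Add action on two
    # arguments. Only the language reached through the context differs.
    return context.language.add(a, b)


strict = CodeContext(StrictLang())
loose = CodeContext(LooseLang())

print(invoke_add_site(strict, 1, 2))    # 3
print(invoke_add_site(loose, "a", 1))   # a1
```

Passing `"a"` and `1` through the `strict` context raises a `TypeError` instead, even though the call site itself has not changed.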

As for the actual generated code (saved by the DLR into debugSnippets1.dll), here it is:

[image: the generated code for py_add, as shown in the reflector]

You can clearly see the invocation of the dynamic site (and the additional argument - the code context).

Now we are finally getting to the place where the really interesting stuff happens... It is inside the invocation of the dynamic site. But even that is quite simple:

public Tret Invoke(CodeContext context, T0 arg0, T1 arg1) {
    Validate(context);
    return _target(this, context, arg0, arg1);
}

After some validation of the incoming context (and the validation is debug mode only), the site blindly calls the _target delegate with whatever arguments it itself received.

While DynamicSite.Invoke itself doesn't do much of anything, it is the delegate "_target" which is the key to things. The benefit of it being a delegate is that it can be dynamically updated (replaced with a 'better version').

When the dynamic site gets initialized at the beginning of the program's execution, its delegate does only one thing: it cries for help. No matter what arguments are coming into the dynamic site, the delegate cannot know what to do with them. It is the language who knows what to do. When the dynamic site cries for help, it hopes to find a listening ear with the language implementation, which conveniently travels with the CodeContext $context parameter.

A key thing to this mechanism is the format of the DLR's cry. Instead of asking: "Hey language, please perform this operation on these arguments!", it will choose rather different wording:

"Hey, language, HOW can I perform this operation on these arguments?"

And it is this choice of words on the part of the Dynamic Language Runtime that makes a world of difference. By asking the "how" question instead of demanding that the language actually does the work, the DLR can learn. So long as whatever the DLR has learnt so far still applies, there is no need to cry for help again. Only when the DLR is faced with objects it doesn't know what to do with does the cry for help come again, and the language provides more answers.
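Here is a small Python sketch of the difference the "how" question makes. Everything in it is hypothetical: `Rule`, `ToyLanguage`, `how_to_add` and this `DynamicSite` are illustration names, not the DLR's types, and the site keeps only the most recent answer. The point is just that the language hands back a recipe (a test plus an implementation) and is asked again only when the recipe stops applying.

```python
class Rule:
    """The language's answer to "how": when it applies and what to do."""
    def __init__(self, test, body):
        self.test = test    # does this rule apply to these arguments?
        self.body = body    # performs the operation when it does


class ToyLanguage:
    """Hypothetical language; counts how often it is asked for help."""
    def __init__(self):
        self.cries_for_help = 0

    def how_to_add(self, a, b):
        self.cries_for_help += 1
        kind = type(a)
        if kind is not type(b) or kind not in (int, str):
            raise TypeError("don't know how to add these")
        return Rule(
            test=lambda x, y: type(x) is kind and type(y) is kind,
            body=lambda x, y: x + y,
        )


class DynamicSite:
    def __init__(self):
        # A fresh site knows nothing, so its first call is a cry for help.
        self._target = self._update_binding_and_invoke

    def invoke(self, language, a, b):
        return self._target(language, a, b)

    def _update_binding_and_invoke(self, language, a, b):
        rule = language.how_to_add(a, b)

        def learned_target(lang, x, y):
            if rule.test(x, y):
                return rule.body(x, y)    # what was learnt still applies
            return self._update_binding_and_invoke(lang, x, y)

        self._target = learned_target     # replace the delegate
        return rule.body(a, b)


language = ToyLanguage()
site = DynamicSite()
for _ in range(1000):
    site.invoke(language, 1, 2)
print(language.cries_for_help)    # 1

site.invoke(language, "a", "b")
print(language.cries_for_help)    # 2
```

A thousand integer additions cost a single trip to the language; the second trip happens only when strings show up.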

This learning is expressed in the "_target" delegate inside the dynamic site. When the DLR learns something new it will update the delegate and continue running. It will therefore be interesting to look at the evolution of the delegate contents. We'll use yet another nifty command line argument (-X:StaticMethods) to look at those, and we have to do it in Python again, while ToyScript is temporarily indisposed.

Let's add a call to our py_add function, passing a couple of integers:

def py_add(a, b):
    return a + b

py_add(1, 2)

... and see what code we find in the reflector (it may take some digging around, because the DLR creates many delegates, but with some practice and intuition it is not too hard to find). This time I copied the function body out so I could reformat it and make it a little more readable:

public static object $IronPython.Runtime.Operations.Int32Ops.Add(
    object[],
    DynamicSite<object, object, object> site1,
    CodeContext context1,
    object obj1,
    object obj2)
{
    //
    // The language's answer
    //
    if (((obj1 != null) && (obj1.GetType() == typeof(int))) &&
        ((obj2 != null) && (obj2.GetType() == typeof(int))))
    {
        return Int32Ops.Add((int) obj1, (int) obj2);
    }

    //
    // Cry for help !!!
    //
    return site1.UpdateBindingAndInvoke(context1, obj1, obj2);
}

There are two parts to the body of the delegate updated by addition of two integers. The first part comes from the language and expresses what it means to add two integers (in Python). The Int32Ops.Add method will for example overflow from integers to big integers if the numbers are too large.

The second part of the delegate is, again, crying for help. This will happen, for example, if we were to add, say ... strings, and here would be the updated delegate:

public static object $IronPython.Runtime.Operations.StringOps.Add(
    object[],
    DynamicSite<object, object, object> site1,
    CodeContext context1,
    object obj1, object obj2)
{
    //
    // Adding strings
    //
    if (((obj1 != null) && (obj1.GetType() == typeof(string))) &&
        ((obj2 != null) && (obj2.GetType() == typeof(string))))
    {
        return StringOps.Add((string) obj1, (string) obj2);
    }

    //
    // Adding integers
    //
    if (((obj1 != null) && (obj1.GetType() == typeof(int))) &&
        ((obj2 != null) && (obj2.GetType() == typeof(int))))
    {
        return Int32Ops.Add((int) obj1, (int) obj2);
    }

    //
    // Cry for help !!!
    //
    return site1.UpdateBindingAndInvoke(context1, obj1, obj2);
}

Doesn't that look familiar? It probably does. In fact, it is very similar to our attempt to implement dynamic behavior in the previous post, when we iterated over ToyHelpers.Add to make it increasingly smarter: first implementing addition for doubles, then adding strings, and ultimately the reflected call to op_Addition.

There is an important difference here though! There is no reflective call (something we identified as a possible performance problem with our helper solution) and the generated delegate only contains code to deal with those types which actually appear in the program being executed. If my program uses only addition on integers, the code of the delegate will never evolve beyond the first updated version which handles integers and cries for help on anything else.

Here is where I confess that I played a little trick. The DLR is just a little too smart, and if you execute code like:

def py_add(a, b):
    return a + b

py_add(1, 2)
py_add("Hello", "Python")

The first time around in the addition, DLR sees two integers and will cry for help, yielding exactly the delegate which we saw above. Second time around, DLR sees two strings, cries for help again, but doesn't exactly produce the second delegate I showed you above. Yet! At that point DLR has a choice to make:

Either the program ran on integers until now and the fact it now sees strings is a sign that it is going to be dealing with strings from now on, in which case the presence of the tests for integers may be just a waste of code in the delegate.

Or, the string is just a singularity and the program will be back with integers, or will even alternate between the two types (or among even more types, which will trigger more cries for help).

The DLR takes the first approach. It will assume that from now on it is strings only and will generate a delegate to deal with strings and cry for help otherwise. It is only when an integer shows up again that the DLR will realize that it has seen integers before and still remembers the recipe for dealing with them, and it will combine both recipes into the final delegate I showed you above. So my trick was really quite innocent. Instead of the Python code which only adds two integers and then two strings, I used the following:

def py_add(a, b):
    return a + b

py_add(1, 2)
py_add("Hello", "Python")
py_add(1, 2)

And that did the trick.
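The strategy can be modelled in a few lines of Python. This is only a sketch of the behavior as I described it above, not the DLR's actual implementation: `HistorySite`, `how_to_add` and `delegate_contents` are hypothetical names, and the order in which the combined delegate tests its rules is an assumption chosen to match the listing shown earlier.

```python
def how_to_add(a, b):
    """Hypothetical language answer: a (name, test, body) recipe."""
    kind = type(a)
    if kind is not type(b) or kind not in (int, str):
        raise TypeError("don't know how to add these")
    return (kind.__name__,
            lambda x, y: type(x) is kind and type(y) is kind,
            lambda x, y: x + y)


class HistorySite:
    """Sketch of a site that remembers every recipe it has learnt."""
    def __init__(self, how):
        self._how = how        # the language's "how" callback
        self._history = {}     # every recipe learnt so far, by name
        self._active = []      # names of recipes in the current delegate

    def invoke(self, a, b):
        for name in self._active:
            test, body = self._history[name]
            if test(a, b):
                return body(a, b)
        return self._update_and_invoke(a, b)    # cry for help

    def _update_and_invoke(self, a, b):
        for name, (test, body) in self._history.items():
            if test(a, b):
                # Seen before but dropped: combine it with the current rules.
                self._active.append(name)
                return body(a, b)
        # Brand new: ask the language and assume the program has moved on.
        name, test, body = self._how(a, b)
        self._history[name] = (test, body)
        self._active = [name]
        return body(a, b)

    def delegate_contents(self):
        return list(self._active)


site = HistorySite(how_to_add)
site.invoke(1, 2)
print(site.delegate_contents())    # ['int']
site.invoke("Hello", "Python")
print(site.delegate_contents())    # ['str']
site.invoke(1, 2)
print(site.delegate_contents())    # ['str', 'int']
```

The three printed lists mirror the three stages: integers only, strings only, and finally the combined delegate that tests for strings first and integers second.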

The next question we'll explore is going to be how it exactly works when the DLR cries for help and how the language provides the answers. But we'll do it next time. Until then, happy hacking!