Welcome to MSDN Blogs Sign in | Join | Help

InvocationExpression and LINQ to Entities

I talked a little bit about patterns using InvocationExpression in a previous post (you might want to review the post for some context). It turns out my advice wasn’t that easy to implement… Fortunately, the EFExtensions  helper library now includes an ExpandInvocations method that implements the same functionality (and a little bit more). I’ll return to the same example I used in my previous post, “all cars that are red or cheap”:

 

Expression<Func<Car, bool>> theCarIsRed = c1 => c1.Color == "Red";

Expression<Func<Car, bool>> theCarIsCheap = c2 => c2.Price < 10.0;

ParameterExpression p = theCarIsRed.Parameters.Single();

Expression<Func<Car, bool>> theCarIsRedOrCheap = Expression.Lambda<Func<Car, bool>>(

    Expression.Or(theCarIsRed.Body, Expression.Invoke(theCarIsCheap, p)), p);

var query = carQuery.Where(theCarIsRedOrCheap);

 

Unfortunately, the Entity Framework does not support InvocationExpressions, so the query fails with a NotSupportedException. Our new helper method does the work of rewriting the query to avoid the unsupported invocations:

 

// Rewrite query to remove InvocationExpression

query = query.ExpandInvocations();

 

A lot easier! If you’re interested in the implementation, read on…

 

Remember math class? Let’s define a couple of functions:

 

f(x) = x + 1, g(y) = f(y + 2) * f(y + 3)

 

To evaluate g(4), we just expand the values inline:

 

g(4) = f(4 + 2) * f(4 + 3) = ((4 + 2) + 1) * ((4 + 3) + 1)

 

We just need to teach EFExtensions how to do the same trick. Whenever an invocation is encountered (e.g. “f(y + 2)” in our example), we want to replace the parameter in the body of the function with the corresponding argument. If we’re expanding an expression tree, we just need to remember which argument values correspond to which parameters:

 

private sealed class InvocationExpander : ExpressionVisitor {

    private readonly ParameterExpression _parameter;

    private readonly Expression _expansion;

    private readonly InvocationExpander _previous;

   

    protected override Expression VisitInvocation(InvocationExpression iv) {

        if (iv.Expression.NodeType == ExpressionType.Lambda) {

            LambdaExpression lambda = (LambdaExpression)iv.Expression;

            return lambda

                .Parameters

                // zip together parameters and the corresponding argument values

                .Zip(iv.Arguments, (p, e) => new { Parameter = p, Expansion = e })

                // add to the stack of available parameters bindings (this class doubles as an immutable stack)

                .Aggregate(this, (previous, pair) => new InvocationExpander(pair.Parameter, pair.Expansion, previous))

                // visit the body of the lambda using an expander including the new parameter bindings

                .Visit(lambda.Body);

        }

        return base.VisitInvocation(iv);

    }

   

 

A couple of details of the VisitInvocation method might need explanation. First of all, I’m using the (very handy) Zip method, which allows me to consider the elements of two sequences pair-wise, like the teeth of a zipper coming together. (Good news: Zip is in .NET 4.0!) In this case, the lambda expression’s Parameters are aligned pair-wise with the invocation expression’s Arguments. Second of all, I’m using the call stack to keep track of which parameters are currently in scope. If you squint, InvocationExpander looks a lot like an immutable stack implementation… The _previous field is the ‘tail’ and the _parameter and _expansion fields are the ‘head’. The use of the Aggregate method gives me implicit the Push behavior I’m looking for, adding parameter bindings like layers in an onion. When control returns to the caller of VisitInvocation, the parameters are Popped out of scope, or peeled away, as a side effect of ceding control to the _previous visitor. At every position in the call stack, the InvocationVisitor instance that is in scope also captures the relevant parameter stack. When we run into a parameter expression that potentially requires expansion, we just iterate over the stack until we find a match:

 

protected override Expression VisitParameter(ParameterExpression p) {

    InvocationExpander expander = this;

    while (null != expander) {

        if (expander._parameter == p) {

            return base.Visit(expander._expansion);

        }

        expander = expander._previous;

    }

    return base.VisitParameter(p);

}

 

OK… There might be easier ways to write this solution, but I like immutable types in general. Why? I’m not smart enough to figure out state changes. I hate reviewing code and having to ask questions like “have I initialized field x at this point?”, “what happens if someone modifies collection y?” or “do I need to use a lock here?” I’m sure you can find plenty of other good reasons to like them.

Stored Procedures and First-Class Relationships in the Entity Framework

A user of the EFExtensions library asked me how to retrieve a set of ‘categories’ and their related ‘products’ through a single stored procedure call. I made some changes to the sample application to illustrate this pattern, but I thought I should also spend some time explaining how it works. The full explanation takes a little bit of patience, because it gets into some of the unique features of the Entity Data Model (EDM).

 

The EDM has “first-class relationships”. What does this mean? Take a look at products and categories in the EFExtensions sample application. The dbo.Products table has a cid column with a foreign key constraint. The corresponding Product entity type does not have a property corresponding to that column. Where did the foreign key go? Typically, foreign keys are mapped to a separate "association set" in your model, in this case ProductCategoryAssociations, so they would be redundant if included in the entity type as well. Consider the following table contents:

 

dbo.Categories

cid

name

1

Beverage

2

Food

 

dbo.Products

pid

name

cid

1

Beer

1

3

Water

1

 

In our entity model, there are three independent sets:

 

Categories

CategoryID

Name

1

Beverage

2

Food

 

Products

ProductID

Name

1

Beer

3

Water

 

ProductCategoryAssociations

Products<ProductID>

Categories<CategoryID>

<1>

<1>

<3>

<1>

 

The products and the relationships between products and categories are actually separate ‘things’ in the conceptual model, whereas in the database they’re represented by a single table. The nice thing about this independence in the EDM is that you can define a relationship between two entity types without modifying the entities. There are also some potentially unintuitive side effects to consider. I’ll talk about a few of them here.

 

Let’s say you decide that ‘Beer’ is a meal in and of itself and should be classified as a ‘Food’ rather than a ‘Beverage’. You start off by loading the relevant category and product:

 

Category food = context.Categories.Where(c => c.Name == "Food").First();

Product beer = context.Products.Where(p => p.Name == "Beer").First();

 

So where is our "first-class" relationship hiding? Let’s look at the contents of the state manager (context.ObjectStateManager.GetStateEntries):

 

  ProductCategoryAssociations(Unchanged)

    <CategoryID=1, ProductID=1>

  Categories(Unchanged)

    C2: Food

  Products(Unchanged)

    P1: Beer

  Categories(Unchanged)

    Stub entity CategoryID=1

 

Although we only queried two entities, there are four state entries! As expected, we have the ‘Beer’ product and the ‘Food’ category. We also have a ProductCategoryAssocations entry, representing the relationship between ‘Beer’ and ‘Beverage’, and a so-called stub entry representing the ‘Beverage’ category (at this point we only know that the key value is ‘1’). This is a little bit strange. After all, I didn’t query the association set. This is because the Entity Framework (EF) implicitly “spans” in reference relationships when you issue certain queries. In this case, Product has a reference (as opposed to collection) relationship with Category, so the query issued automatically includes the relationship information as well.

 

Warning: implicit relationship span does not happen for EntityCommand, ObjectQuery (LINQ or Entity-SQL) with MergeOption.NoTracking, or FunctionImport stored procedures (though I demonstrate how to work around this last limitation using EFExtensions in this post).

 

What happens when I change the category?

 

beer.Category = food;

 

Although it may look like I’m modifying the product, I’m really deleting one relationship and adding another. If you examine the state manager again, you’ll notice there are now two relationship state entries:

 

  ProductCategoryAssociations(Added)

    <CategoryID=2, ProductID=1>

  ProductCategoryAssociations(Deleted)

    <CategoryID=1, ProductID=1>

 

When I call context.SaveChanges, the EF is smart enough to issue an UPDATE rather than attempting to INSERT a new row and DELETE the existing row. The moral of the story: things are not necessarily what they seem. The ObjectContext does a pretty good job of sheltering us from the complexities of first-class relationships, but it’s good to understand them in case something goes wrong. Which brings me back to the original point of this post… What if I want to call a stored procedure that returns multiple categories and multiple products? Most likely, I’d like to know which categories and products are related. My original samples showed how you could return a single category and its related products. If there are multiple categories, we no longer know how the products line up with categories, and we end up with a disconnected graph. After attaching products and categories to my context, I still don’t know about the relationships, obvious when you examine the state manager:

 

  Categories(Unchanged)

    C1: Beverage

  Categories(Unchanged)

    C2: Food

  Products(Unchanged)

    P1: Beer

  Products(Unchanged)

    P3: Water

 

If we change the definition of our product ‘materializer’ (I describe these techniques in detail here), we can include the relationship data with the product entity:

 

new Product {

    ProductID = r.Field<int>("pid"),

    Name = r.Field<string>("name"),

    CategoryReference = {

        EntityKey = new EntityKey("MiniNWEntities.Categories", new[] { new EntityKeyMember("CategoryID", r.Field<int>("cid")) }),

    },

}

 

Now we’re mimicking the “relationship span” behavior of the EF by including the EntityKey for our category reference along with the product, and the state manager reflects the relationships as well as the entities:

 

  ProductCategoryAssociations(Unchanged)

    <CategoryID=1, ProductID=1>

  ProductCategoryAssociations(Unchanged)

    <CategoryID=1, ProductID=3>

 

In general, I think it’s useful to adopt a “first-class relationships” view of the world when you work with the EF because it is central to the design of the product.

EFExtensions update

The EFExtensions helper library has a few updates:

 

-          CreateCommand now respects the CommandTimeout property on ObjectContext. A user of the library reported this bug to me a while ago, and I’m finally getting around to fixing it.

-          One tricky scenario involves multiple stored procedure result sets that need to be combined to produce a single object graph. No new code in the library is required, but I have added a new sample illustrating how you can return a set of categories and their related products through a single stored procedure call. See the Materializer_MultipleResultSetsMerge sample for details. See this post for background information.

-          We get frequent requests for LINQ patterns involving InvocationExpressions. I’ve added a couple of helper ExpandInvocations overload to simplify use of these patterns in the Entity Framework (EF). See the Misc_ExpandInvocationExpressions sample for details. See this post for details of the implementation soon.

Using LINQ Expressions to Generate Dynamic Methods II

A beta of Visual Studio 2008 SP1 was released on Monday and the ADO.NET Entity Framework (EF) is now in the box! You can download and install the Beta here. The EF Extensions library has been updated to work with the beta and includes several public and internal changes. Source code is available at http://code.msdn.com/EFExtensions. The latest release introduces some performance improvements in the materializer (you can read about the library here). These improvements illustrate another powerful expression pattern.

To improve code clarity, the EF Extensions API encourages you to write:

var products = command.Materialize<Product>(r => new Product {

    ProductID = r.Field<int>("ProductID"),

    Name = r.Field<string>("Name"),

   

}).ToList();


instead of:

List<Product> products = new List<Product>();

using (SqlDataReader reader = command.ExecuteReader()) {

    int idOrdinal = reader.GetOrdinal("ProductID");

    int nameOrdinal = reader.GetOrdinal("Name");

   

    while (reader.Read()) {

        Product product = new Product {

            ProductID = (int)reader.GetValue(idOrdinal),

            Name = reader.IsDBNull(nameOrdinal) ? (string)null : (string)reader.GetValue(nameOrdinal),

           

        };

        products.Add(product);

    }

}


There’s usually a tradeoff. This is no exception... While the code in the first example is easier to write, read and maintain, looking up column ordinals on each call to Field<T> is expensive: for every row in every column, I’m incurring the cost of the lookup. Field<T> also verifies arguments on every call and checks for DBNull whether or not the requested type accepts nulls. Most of this work is redundant or unnecessary for materialization.

Fortunately, there’s a simple solution to these problems. If we represent the “shaper” delegate as a LINQ expression, we can rewrite it (using the technique described here) for efficiency. Basically, we can rewrite calls to Field<T> to calls to the underlying reader, caching column ordinals for efficiency. In the above example, the expression:

r => new Product() {ProductID = r.Field("ProductID"), Name = r.Field("Name")}


now becomes

r => new Product() {ProductID = Convert(r.GetValue(0)), Name = Convert(IIF(r.IsDBNull(1), null, r.GetValue(1)))}


The rewritten version is identical to the more performant version we wrote by hand.

The EFExtensions library uses an extensible pattern to perform these optimizations.  Methods that can be optimized or rewritten are flagged with an attribute indicating a handler, in this example FieldMethodOptimizer:

[MaterializerOptimizedMethod(typeof(FieldMethodOptimizer))]

public static T Field<T>(this IDataRecord record, string name);


When materialization begins, field names from the reader are immediately retrieved. Whenever a method with this attribute is encountered in the shaper expression, the corresponding optimizer is called to rewrite the expression:

protected override Expression VisitMethodCall(MethodCallExpression m) {

    Expression result = base.VisitMethodCall(m);

    if (result.NodeType == ExpressionType.Call) {

        m = (MethodCallExpression)result;

        MaterializerOptimizedMethodAttribute attribute = m.Method.GetCustomAttributes(typeof(MaterializerOptimizedMethodAttribute), false)

            .Cast<MaterializerOptimizedMethodAttribute>()

            .SingleOrDefault(); // multiple attributes not permitted; not inherited

        if (null != attribute) {

            return attribute.Optimizer.OptimizeMethodCall(this.fieldNames, this.recordParameter, m);

        }

    }

    return result;

}


As in my previous post, I’m leveraging the ExpressionVisitor to do the rewrite. In this case, I’m intercepting and replacing MethodCallExpressions only.

End result: we can now use a more concise coding pattern without sacrificing performance. Unfortunately, we still need to pay the cost of compiling the materializer delegate, but this can be offset by reusing the delegate. To facilitate reuse, the Materializer class in EFExtensions is thread-safe and stores the optimized delegate on first use.

LINQ to Entities: Combining Predicates

Someone asked a great question on the ADO.NET Entity Framework forums yesterday: how do I compose predicates in LINQ to Entities? I’ll give three answers to the question.

Answer 1: Chaining query operators

Basically, you have some query and you have some predicates you want to apply to that query (“the car is red”, “the car costs less than $10”). If both conditions need to be satisfied, you can just chain together some calls to Where (“the car is red and costs less than $10”):

Expression<Func<Car, bool>> theCarIsRed = c => c.Color == "Red";

Expression<Func<Car, bool>> theCarIsCheap = c => c.Price < 10.0;

IQueryable<Car> carQuery = ;

var query = carQuery.Where(theCarIsRed).Where(theCarIsCheap);

 

If you’re willing to exceed the $10 budget for cars that are red, you can chain Unions instead (“the car is red or the car costs less than $10”):

var query2 = carQuery.Where(theCarIsRed).Union(carQuery.Where(theCarIsCheap));

 

This last query has a couple of problems: it’s inefficient (because of the unions) and it eliminates duplicates in the results, something that would not happen if I applied a single predicate.

Answer 2: Build expressions manually

The LINQ Expressions API includes factory methods that allow you to build up the predicate by hand. I can define the conditions (with respect to a “car” parameter) as follows:

ParameterExpression c = Expression.Parameter(typeof(Car), "car");

Expression theCarIsRed = Expression.Equal(Expression.Property(c, "Color"), Expression.Constant("Red"));

Expression theCarIsCheap = Expression.LessThan(Expression.Property(c, "Price"), Expression.Constant(10.0));

Expression<Func<Car, bool>> theCarIsRedOrCheap = Expression.Lambda<Func<Car, bool>>(

    Expression.Or(theCarIsRed, theCarIsCheap), c);

var query = carQuery.Where(theCarIsRedOrCheap);

 

Building queries by hand isn’t very convenient. If you’re already building expressions from scratch, this is a good approach but otherwise I’d suggest something different…

Answer 3: Composing Lambda Expresions

The Albaharis suggest combining bodies of lambda expressions in their C# 3.0 book (a great resource for all things C# and LINQ). This allows you to describe the parts of the expression using the lambda syntax and build an aggregate expression:

Expression<Func<Car, bool>> theCarIsRed = c1 => c1.Color == "Red";

Expression<Func<Car, bool>> theCarIsCheap = c2 => c2.Price < 10.0;

Expression<Func<Car, bool>> theCarIsRedOrCheap = Expression.Lambda<Func<Car, bool>>(

    Expression.Or(theCarIsRed.Body, theCarIsCheap.Body), theCarIsRed.Parameters.Single());

var query = carQuery.Where(theCarIsRedOrCheap);

 

I’m taking the bodies of the two conditions and Oring them in a new lambda expression. There is a subtle problem however: the parameter for the merged expression (c1) is taken from “theCarIsRed”, which leaves us with a dangling parameter (c2) from “theCarIsCheap”. The resulting query is invalid. How can I force “theCarIsCheap” to use the same parameter? The answer is to invoke the expression using the common parameter:

ParameterExpression p = theCarIsRed.Parameters.Single();

Expression<Func<Car, bool>> theCarIsRedOrCheap = Expression.Lambda<Func<Car, bool>>(

    Expression.Or(theCarIsRed.Body, Expression.Invoke(theCarIsCheap, p)), p);

 

Here’s the problem: LINQ to Entities does not support InvocationExpressions. Rather than invoking the expression with c1, I can manually rebind the parameter. Matt Warren’s series of articles on IQueryable providers includes an ExpressionVisitor implementation that makes it easy to rewrite expression trees. If you do any LINQ expression manipulation, this class is a crucial tool. Here’s an implementation of the visitor that rebinds parameters:

public class ParameterRebinder : ExpressionVisitor {

    private readonly Dictionary<ParameterExpression, ParameterExpression> map;

 

    public ParameterRebinder(Dictionary<ParameterExpression, ParameterExpression> map) {

        this.map = map ?? new Dictionary<ParameterExpression, ParameterExpression>();

    }

 

    public static Expression ReplaceParameters(Dictionary<ParameterExpression, ParameterExpression> map, Expression exp) {

        return new ParameterRebinder(map).Visit(exp);

    }

 

    protected override Expression VisitParameter(ParameterExpression p) {

        ParameterExpression replacement;

        if (map.TryGetValue(p, out replacement)) {

            p = replacement;

        }

        return base.VisitParameter(p);

    }

}

 

Now I can write a general utility method to compose lambda expressions without using invoke (I’ll call it Compose), and leverage it to implement EF-friendly And and Or builder methods:

public static class Utility {

    public static Expression<T> Compose<T>(this Expression<T> first, Expression<T> second, Func<Expression, Expression, Expression> merge) {

        // build parameter map (from parameters of second to parameters of first)

        var map = first.Parameters.Select((f, i) => new { f, s = second.Parameters[i] }).ToDictionary(p => p.s, p => p.f);

 

        // replace parameters in the second lambda expression with parameters from the first

        var secondBody = ParameterRebinder.ReplaceParameters(map, second.Body);

 

        // apply composition of lambda expression bodies to parameters from the first expression 

        return Expression.Lambda<T>(merge(first.Body, secondBody), first.Parameters);

    }

 

    public static Expression<Func<T, bool>> And<T>(this Expression<Func<T, bool>> first, Expression<Func<T, bool>> second) {

        return first.Compose(second, Expression.And);

    }

 

    public static Expression<Func<T, bool>> Or<T>(this Expression<Func<T, bool>> first, Expression<Func<T, bool>> second) {

        return first.Compose(second, Expression.Or);

    }

}

 

To combine lambda expressions, I can write:

Expression<Func<Car, bool>> theCarIsRed = c => c.Color == "Red";

Expression<Func<Car, bool>> theCarIsCheap = c => c.Price < 10.0;

Expression<Func<Car, bool>> theCarIsRedOrCheap = theCarIsRed.Or(theCarIsCheap);

var query = carQuery.Where(theCarIsRedOrCheap);

 

I’ll use this last answer as an excuse to discuss variations on the visitor pattern in a future post...

Using LINQ Expressions to Generate Dynamic Methods

This week at DevConnections in Orlando, I gave a “deep-dive” talk on LINQ. I wanted to give people a feel for what's possible with the new language features and core APIs in .NET 3.5. I spent most of the talk discussing a single example: take an ADO.NET 2.0 code sample and simplify. Instead of using an existing library like LINQ to SQL or LINQ to Entities, I built a (limited) LINQ provider from scratch. As boilerplate code is moved into helper methods and a proto-LINQ-to-SQL API evolves, the sample is boiled down to something much more compact:

static List<Customer> GetCustomers()

{

    using (Table<Customer> table = new Table<Customer>(GetConnectionString(), "Customers"))

    {

        IEnumerable<Customer> query = from customer in table

                                      where customer.City == "London"

                                      select customer;

        return query.ToList();

    }

}

 

In this post, I’ll drill down on one component developed in the talk, which was also included in the EFExtensions library. This component takes rows from a data reader and transforms them (or “shapes” them) into typed results. For instance, I may want to transform records into customers. Using a couple of extension methods (Field<T> and Materialize<T>), we can leverage a “shaper” delegate to do just this:

SqlCommand command = …;

 

return command.Materialize<Category>(r =>

    new Category

    {

        CategoryID = r.Field<int>("CategoryID"),

        CategoryName = r.Field<string>("CategoryName"),

       

    });

 

The shaper delegate code (highlighted above) is still annoying though: for every property of the customer, I’m retrieving a column of the same name. The code is mechanical and repetitive, the sort of thing you want a machine to do rather than a programmer.

Here are three strategies you can use to automate this pattern in .NET 2.0:

1.       Reflection: Write a general purpose delegate that uses reflection to construct an instance of T and then dynamically invokes property setters. While this code is relatively easy to write, the performance is sub-optimal. For information on the performance of various method dispatch patterns, take a look at this great talk by Joel Pobar and Joe Duffy.

2.       Automatically generate the code: You can automatically generate wrapper classes encapsulating shaping logic. Code generation has its challenges however, in particular integration with build systems and the development environment. It’s also a lot of code to maintain.

3.       Create a DynamicMethod implementing the pattern: .NET allows you to compile a delegate at runtime using dynamic methods. This resolves the performance and maintenance problems of solutions 1 and 2. MSIL generation is hard to get right unfortunately and the code does not reflect the intent of the generated delegate.

Digression: random thoughts on APIs

I recently heard the expression “Swiss Army Knife API”. These are interfaces that handle all kinds of little (possibly unrelated) problems. They can be useful but they are also hard to package and discover. The Zip method in EFExtensions illustrates these problems. It’s a handy method, but what is it doing in an EF library? It has nothing to do with the EF or with the scenarios addressed by the library (it’s poorly packaged), and no one trying to pair the elements of two iterators would think to look in that particular library (it’s not discoverable). If you need to install a PCI card, fillet a fish and hand-stitch a saddle, you might need a Swiss Army API.

At the other extreme, there are narrowly targeted APIs that can solve complex problems but most often require specialized knowledge or training. To make matters worse, once you’ve mastered them, you can rarely apply your knowledge to different domains. You can probably think of a few examples of this pattern.

LINQ achieves a useful balance. While the LINQ project was motivated by a specific requirement – seamless support for non-object data within .NET applications – all components of the solution are generically useful. Consider…

The System.Linq.Expressions API serves a specific need for integrated queries: it allows the compiler to describe the user’s code as a data structure that can then be translated to targets other than MSIL at runtime, like SQL, Web Services, etc. Expressions can also be compiled into delegates at runtime, which brings me to a .NET 3.5 solution to the default shaper problem… If the compiler can use expressions to describe code, so can we!

Here’s the code pattern we want to generate:

r => new T

{

    Property1 = r.Field<Type[Property1]>("Property1"),

    Property2 = r.Field<Type[Property2]>("Property2"),

   

}

 

Shortcut: learning how to build an expression programmatically

If you want to figure out how to build expressions programmatically, a simple trick will probably save you some time. Just follow the compiler’s lead. First, write an example of the pattern, e.g.:

Expression<Func<IDataRecord, Customer>> example =

    r => new Customer { City = r.Field<string>("City"), };

 

After compiling your program, use a tool like .NET Reflector to figure out how the compiler builds the expression tree. In its default mode, this tool is actually a little bit too smart, disassembling the code into precisely what we wrote to begin with. If you change the disassembler settings (View à Options à Disassembler) to use “.NET 2.0” optimizations, it will give more useful output, e.g.:

ParameterExpression CS$0$0000;

Expression<Func<IDataRecord, Customer>> example =

    Expression.Lambda<Func<IDataRecord, Customer>>(

        Expression.MemberInit(

            Expression.New((ConstructorInfo) methodof(Customer..ctor), new Expression[0]),

            new MemberBinding[] {

                Expression.Bind(

                    (MethodInfo) methodof(Customer.set_City),

                    Expression.Call(null, (MethodInfo) methodof(Utility.Field), new Expression[] { CS$0$0000 = Expression.Parameter(typeof(IDataRecord), "r"), Expression.Constant("City", typeof(string)) })) }),

        new ParameterExpression[] { CS$0$0000 });

 

Once you’ve figured out the expression pattern, you can implement a general solution, e.g.:

private static Expression<Func<IDataRecord, T>> CreateDefaultShaper<T>()

{

    // Compiles a delegate of the form (IDataRecord r) => new T { Prop1 = r.Field<Prop1Type>("Prop1"), ... }

    ParameterExpression r = Expression.Parameter(typeof(IDataRecord), "r");

 

    // Create property bindings for all writable properties

    List<MemberBinding> bindings = new List<MemberBinding>();

    foreach (PropertyInfo property in GetWritableProperties<T>())

    {

        // Create expression representing r.Field<property.PropertyType>(property.Name)

        MethodCallExpression propertyValue = Expression.Call(

            typeof(Utility).GetMethod("Field").MakeGenericMethod(property.PropertyType),

            r, Expression.Constant(property.Name));

 

        // Assign the property value to property through a member binding

        MemberBinding binding = Expression.Bind(property, propertyValue);

        bindings.Add(binding);

    }

   

    // Create the initializer, which instantiates an instance of T and sets property values

    // using the member bindings we just created

    Expression initializer = Expression.MemberInit(Expression.New(typeof(T)), bindings);

 

    // Create the lambda expression, which represents the complete delegate (r => initializer)

    Expression<Func<IDataRecord, T>> lambda = Expression.Lambda<Func<IDataRecord, T>>(

        initializer, r);

 

    return lambda;

}

 

Expression<T> has a Compile method which produces an instance of T. In this case, T is our delegate type, Func<IDataRecord, T>, so we’re done! We build up a description of the logic using LINQ expressions, and then compile the logic to produce a delegate. This code is an order of magnitude easier to write using the expressions API than using custom IL generation (if you don’t believe me, give it a try!)

I can’t resist offering another implementation of the CreateDefaultShaper method that uses query expressions. This example nicely closes the loop, using LINQ to create LINQ expressions:

private static Expression<Func<IDataRecord, T>> CreateDefaultShaper<T>()

{

    ParameterExpression r = Expression.Parameter(typeof(IDataRecord), "r");

    return

        Expression.Lambda<Func<IDataRecord, T>>(

            Expression.MemberInit(

                Expression.New(typeof(T)),

                from property in GetWritableProperties<T>()

                let fieldMethod = typeof(Utility).GetMethod("Field")

                    .MakeGenericMethod(property.PropertyType)

                let propertyValue = Expression.Call(

                    fieldMethod, r, Expression.Constant(property.Name))

                select (MemberBinding)Expression.Bind(property, propertyValue)),

            r);

}

 

Problem:

Before signing off, I’d like to leave you with a simple problem: the CreateDefaultShaper method currently assumes that record instances have columns exactly corresponding to every property. Try adding support for column renames (e.g. the “customer_id” column maps to the “CustomerID” property). It’s probably easiest to implement this pattern with custom attributes on the materialized type properties, but it’s probably best to decouple the declaration of the mapping from its usage. This allows the mapping to be retrieved from an arbitrary location (e.g. either from CLR custom type attributes or from a separate configuration file).

Posted by cmeek | 5 Comments

ADO.Entity Framework: Stored Procedure Customization

The ADO.NET Entity Framework (EF) allows you to map stored procedures to functions that return typed results in the conceptual model. However, when stored procedures results don't match the patterns supported by the EF, reading and tracking results is quite difficult. Fortunately, it is possible to run arbitrary stored procedures and Transact-SQL commands through the ObjectContext. This post describes these facilities and leverages a new utility library recently posted on Code Gallery (EFExtensions) to make the job much easier. In most cases, I will show the coding patterns directly against the EF and then illustrate the same behavior using EFExtensions, e.g.:

// EF example

 

// EFExtensions example

 

The following patterns are discussed:

·         Creating and executing store commands.

·         Materializing typed results.

·         Tracking results.

·         Multiple result sets.

There’s lots of code between here and the end, so as a motivation I’ll start by showing you what all of these extensions methods working in concert can do.  Without the EFExtensions library the code to call a stored procedure in the database (including properly handling connection lifetime), create a set of objects and identity resolve them against the context would be surprisingly large.  With EFExtensions, that code is reduced to:

var results = context

    .CreateStoreCommand("GetCategories", CommandType.StoredProcedure)

    .Materialize<Category>()

    .Bind(categorySet);

Store commands

You can reuse the connection on an ObjectContext to create a store command as follows:

using (MyContext context = new MyContext())

{

    // In V1 of the EF, the context connection is always an EntityConnection

    EntityConnection entityConnection = (EntityConnection)context.Connection;

 

    // The EntityConnection exposes the underlying store connection

    DbConnection storeConnection = entityConnection.StoreConnection;

    DbCommand command = storeConnection.CreateCommand();

    command.CommandText = "GetCategory";

    command.CommandType = CommandType.StoredProcedure;

    command.Parameters.Add(new SqlParameter("cid", 1));

}

 

First you’ll need to add the library namespace to your sources files in order to use the extension methods provided by this library:

using Microsoft.Data.Extensions;

 

Now you can use the CreateStoreCommand method which extends ObjectContext and packages the code shown above:

using (MyContext context = new MyContext())

{

    DbCommand command = context.CreateStoreCommand("GetCategory",

        CommandType.StoredProcedure, new SqlParameter("cid", 1));

}

 

The ObjectContext does a nice job of managing the store connection for you when you run a query. If the connection is not already open, it will be opened for the duration of the query. I’ve included a general purpose extension method called CreateConnectionScope on DbConnection to support the same behavior, so instead of writing:

bool openingConnection = command.Connection.State == ConnectionState.Closed;

if (openingConnection) { command.Connection.Open(); }

int result;

try

{

    result = command.ExecuteNonQuery();

}

finally

{

    if (openingConnection && command.Connection.State == ConnectionState.Open) { command.Connection.Close(); }

}

 

I can write

int result;

using (context.Connection.CreateConnectionScope())

{

    result = command.ExecuteNonQuery();

}

 

Alazel Acheson, a developer on the ADO.NET team, wrote a more extensive connection scope utility that you can find here.

Materializing typed results

Given a store command, you can manually produce typed results, which EF team calls “result materialization”: turning relational records we get from the store into “real” objects!

Here’s some code that takes results from the command we created in the last section and uses them to populate Category instances:

using (DbDataReader reader = command.ExecuteReader())

{

    while (reader.Read())

    {

        int categoryIDOrdinal = reader.GetOrdinal("cid");

        int nameOrdinal = reader.GetOrdinal("name");

        yield return new Category

        {

            CategoryID = reader.GetInt32(categoryIDOrdinal),

            Name = reader.IsDBNull(nameOrdinal) ? (string)null : reader.GetString(nameOrdinal),

        };

    }

}

 

EFExtensions includes a Materializer class and several Materialize extension method overloads that make this a little bit easier.

command.Materialize(r =>

    new Category {

        CategoryID = r.Field<int>("cid"),

        Name = r.Field<string>("name"),

    });

 

Internally, all Materialize overloads taking commands leverage CreateConnectionScope and handle the lifetime of the reader. Materialize overloads optionally take a “shaper” delegate (as shown in the above example) which maps from IDataRecord to the result type. Note the Field method overload as well which handles the magic around type conversion and DBNull handling from data records (similar to the LINQ to DataSet method).

When no materialization delegate is specified, we generate a default shaper which assigns column values to public writable properties of the result type of the same name. In the following example, the store command explicitly projects two columns with the names of CLR properties on the Category type:

context.CreateStoreCommand("select cid as CategoryID, [name] as [Name] from dbo.Categories")

    .Materialize<Category>();

 

Why are shapers provided as Expression<Func<IDataRecord, T>> rather than just Func<IDataRecord, T>? We plan on exploiting the expression representation of the shaper in a future release of EFExtensions (details to follow).

While the Materialize extension methods on DbCommand and DbDataReader are convenient, it is more efficient to create a single Materializer instance and reuse it. Here are a couple of materializers that I will reuse in the rest of this post:

// Materializer with column renames.

private static readonly Materializer<Category> s_categoryMaterializer = new Materializer<Category>(r =>

    new Category

    {

        CategoryID = r.Field<int>("cid"),

        Name = r.Field<string>("name"),

    });

 

// Materializer returning different types based on a condition.

private static readonly Materializer<Product> s_productMaterializer = new Materializer<Product>(r =>

    r.Field<DateTime?>("discontinued_date").HasValue ?

    (Product)new DiscontinuedProduct

    {

        ProductID = r.Field<int>("pid"),

        Name = r.Field<string>("name"),

        DiscontinuedDate = r.Field<DateTime>("discontinued_date")

    } :

    new Product

    {

        ProductID = r.Field<int>("pid"),

        Name = r.Field<string>("name")

    });

Tracking results

Using the techniques I’ve described so far, you can retrieve arbitrary CLR objects from store commands, including entities. Entities aren’t very useful if they aren’t tracked however – you cannot modify them or delete them using the context’s state manager. To facilitate tracking, EFExtensions includes an EntitySet class which manages activities related to Entity Data Model (EDM) entity sets. The class includes a FindOrAttach method which either attaches the given entity to the state manager for tracking purposes or returns an existing entity with the same key. This simulates the identity resolution behavior that the EF provides when running queries.

In the following example, we run a command returning a single category and track it

Category category = s_categoryMaterializer.Materialize(command).SingleOrDefault();

 

// Create a category set

EntitySet<Category> categorySet = new EntitySet<Category>(context, "categories");

 

// Track command result

category = categorySet.FindOrAttach(category);

 

An EntitySet<T> similar to an ObjectQuery<T> in the EF but it is bound to a specific EDM EntitySet instance (which you can examine by looking at the EntitySet<T>.Metadata property), and encapsulates various services related to the set. I will cover this class in greater detail in a future post.

For convenience, a Bind extension method that attaches a sequence of entities is also included. This makes it easy to stream results, as in the following example:

IEnumerable<Product> products = s_productMaterializer.Materialize(command).Bind(productSet);

 

I recommend creating EntitySet properties on your ObjectContext partial class definition. These properties can also be used as roots for LINQ and Entity-SQL queries.

public partial class MyObjectContext : ObjectContext

{

    private EntitySet<Product> productSet;

    /// <summary>

    /// Gets alternative entry point for the Products entity set.

    /// </summary>

    public EntitySet<Product> ProductSet

    {

        get

        {

            if (null == this.productSet)

            {

                this.productSet = new EntitySet<Product>(this, "Products");

            }

            return this.productSet;

        }

    }

   

}

Multiple result sets

I’ll give one final example that ties together many of the concepts described in this post. Assume we have a stored procedure returning two result sets. The first result set includes a Category instance and the second result set includes all related Products. For convenience, I’ll define a method on my object context that takes a category ID, and returns that category with all related products attached. We’ve already defined the necessary materializers (s_categoryMaterializer and s_productMaterializer), as well as our entity set properties (ProductSet and CategorySet). Note that we leverage an existing EF method, EntityCollection.Attach, to associate the returned category with the related products.

public Category GetCategoryAndRelatedProducts(int categoryID)

{

    DbCommand command = this.CreateStoreCommand("GetCategoryAndProducts", CommandType.StoredProcedure, new SqlParameter("cid", categoryID));

    Category category;

 

    using (command.Connection.CreateConnectionScope())

    using (DbDataReader reader = command.ExecuteReader())

    {

        // first result set includes the category

        category = s_categoryMaterializer

            .Materialize(reader)

            .Bind(this.CategorySet)

            .SingleOrDefault();

 

        // second result set includes the related products

        if (null != category && reader.NextResult())

        {

            category.Products.Attach(s_productMaterializer

                .Materialize(reader)

                .Bind(this.ProductSet));

        }

    }

 

    return category;

}

 

Expect more posts on the EFExtensions library over the months leading up to the release of the ADO.NET Entity Framework V1!

 

 
Page view tracker