Code Only
10 June 09 09:04 PM | efdesign | 26 Comments   

There are currently two ways to get Entity Framework apps up and running, we call these Database First and Model First respectively.

Database First has been with us since .NET 3.5 SP1, and involves reverse engineering an Entity Data Model from an existing database.

Model First is new to Visual Studio 2010/.NET 4.0 and allows you to create an Entity Data Model from scratch and then generate a database and mapping for it.

However many developers view their Code as their model.

Ideally these developers just want to write some Domain classes and without ever touching a designer or a piece of XML be able to use those classes with the Entity Framework.
Basically they want to write ‘Code Only’.

We are pleased to announce that we’ll be shipping a preview of “Code Only” on top of the .NET Framework 4.0 Beta 1 (more on that in the coming days).

How it works:

To use Code only you simply write some POCO classes, here is a simple example:

public class Category
{
   public int ID { get; set; }
   public string Name { get; set; }
   public List<Product> Products { get; set; }
}

public class Product
{
   public int ID { get; set; }
   public string Name { get; set; }
   public Category Category { get; set; }
}

Then write a class that derives from ObjectContext that describes the shape of your model and how you want to access your POCO classes, so something like this:

public class MyContext : ObjectContext
{
   public MyContext(EntityConnection conn)
    : base(conn){ } 
   public ObjectSet<Category> Categories {
      get {
         return base.CreateObjectSet<Category>();
      }
   }
   public ObjectSet<Product> Products {
      get {
         return base.CreateObjectSet<Product>();
      }
   }
}

NOTE: this persistence / EF aware class could easily be in another assembly so you can keep it separate from your nice pristine domain classes.

At this point you have everything you need from a CLR perspective, but you won’t be able to use the MyContext class without some EF metadata, which is normally stored in the EDMX file.

But remember Code-Only means there isn’t an EDMX file.
So where do we get the metadata?
That is where the ContextBuilder class comes in:

SqlConnection connection = …; // some un-opened SqlConnection
MyContext context = ContextBuilder.Create<MyContext>(connection);

What is happening here is that the ContextBuilder is looking at the Properties on MyContext, inferring a default Conceptual Model, Storage Model and Mapping by convention, it the uses that metadata plus the SqlConnection you passed in to create an EntityConnection, and finally it constructs an instance of MyContext by passing the EntityConnection to the constructor we created previously.

Once you have an instance of your context, we ship several extension methods you can use to automatically create a database script, see if the database exists, drops the database, and/or create the database,. For example this code snippet uses two of those extension methods to create the database if it doesn’t already exist.

if (!context.DatabaseExists())
    context.CreateDatabase();

The CreateDatabase call looks at the EF metadata, in particular the Storage Model, aka SSDL, and uses it to produce Database Definition Language (or DDL) which it then executes against against the connection.

Overriding Conventions

In the examples so far everything is done by convention. This is great if you are happy with our conventions. However you may want to override them, for example to register different keys, use different mappings or use a different inheritance strategy etc.

To override the convention instead of creating an ObjectContext directly you create an instance of a ContextBuilder that you can configure.

var contextBuilder = new ContextBuilder<MyContext>();

In the first Feature CTP, Code Only provides the ability to override how the key properties are inferred:

contextBuilder.RegisterKey((Product p) => p.ID);

This code tells the builder that the key of Product is its ID property.

When you’ve finished configuring how you want your model and mappings to work in your code, you create an instance of your context, this time by calling Create on the builder instance you have been configuring:

MyContext context = contextBuilder.Create(connection);

It is that simple.

Over time the features of the ContextBuilder will grow so you will be able to setup custom mappings, mark properties as transient, setup up Facets (e.g. MaxLength), change inheritance strategies, link properties that are the inverse of each other together (e.g. product.Category and category.Products) and more. In fact here is an example of the sort of thing we are aiming to support for the next release of the Feature CTP:

builder[“dbo.prds”] = 
  from c in builder.OfType<Product>()
  select new {
    pcode = p.ID,
    p.Name,
    cat = p.Category.ID
  }
);

This snippet:

  • Maps Product entities to the ‘prds’ table.
  • Maps Product.ID to a ‘pcode’ column.
  • Maps Product.Name to a ‘Name’ column.
  • Maps the FK under the Product.Category relationship to the ‘cat’ column.

Here is another example that does basic TPH inheritance:

builder[“dbo.Cars”] = (
  from b in builder.OfTypeOnly<Car>()
  select new {
     id = b.CID.AsKey(),
     b.Color,
     b.Manufacturer,
     disc = “C”
   }
).Merge(
  from s in builder.OfType<SportsCar>()
  select new {
     id = s.CID.AsKey(), 
     s.Color, 
     s.Manufacturer, 
     disc = “S”, 
     hp = s.HorsePower, 
     tq = s.Torque
  }
);

These two examples just scratch the surface of what we are planning on doing, hopefully though they give you a sense of where we are heading.

We are super excited about Code Only. But as of right now our plans to support overriding conventions aren’t finalized, so we’d love to get your feedback, this really is your chance to help us get it right.

Alex James,
Program Manager, Entity Framework Team, Microsoft

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Self-Tracking Entities in the Entity Framework
24 March 09 02:50 AM | efdesign | 63 Comments   

Background

One of the biggest pieces of feedback we received from the N-Tier Improvements for Entity Framework post as well as other sources was: “low level APIs are great, but where is the end-to-end architecture for N-tier and the Entity Framework?”.   This post outlines some of the additional feedback we’ve received and describes the self-tracking entities architecture that will ship along side of VisualStudio 2010 and .NET Framework 4.0.

When an application goes multi-tier, there are important architectural decisions that the application developer makes around how to communicate between tiers. There are a lot of choices, and the choice depends on a variety of conditions. How rapidly is each tier expected to change? How much control is there over each tier? Are their specific protocol, security, or business policy concerns? The answers to these questions often drive the selection of whether to use data transfer objects (DTOs) or DataSets or something different altogether.

Self-Tracking Entities

Self-tracking entities know how to do their own change tracking regardless of which tier those changes are made on. As an architecture, self-tracking entities falls between DTOs and DataSets and includes some of the benefits of each.

Drawing from DataSet

DataSet on the client tier is very easy to use because there is no need to track changes separately or maintain any extra data structures that include change tracking information. DataSet takes care of serializing state information for each row of data. On the mid tier, applying the changes stored within a DataSet is straightforward. DataSets have also gained popularity because of the number of tools that work with DataSets, and because they are easily bound to many UI/presentation controls. Since it works in so many scenarios, for many applications there is never a need to transform data outside of a DataSet allowing a single paradigm to be used up and down the stack.

However, there are disadvantages to DataSets when used as the communication payload between tiers. The first is the cost of getting the data into a serializable format, and the second is that the DataSet serialization format is generally not very interoperable with other languages that are used to expose services. Another disadvantage of using a DataSet is that you can quickly lose the intent of your service call because so many kinds of things can be included in a DataSet. For example, if the service method is declared as:

DataSet GetCustomer(string id);

It is extra work to ensure that the DataSet that is returned contains only rows of data that pertain to “customers”. It is also more difficult to specify a service contract that says the DataSet is also supposed to return data for each customer’s orders and their order details.

Self-tracking entities share many advantages with DataSet. They also encapsulate change tracking information, which is serialized along with the data contained in the entity. On the mid-tier applying changes from a graph of self-tracking entities to a persistent context is equally straightforward. The Entity Framework will also provide tools for generating self-tracking entities and because these entities are just objects, they can be easily made to work with UI/presentation controls.

Drawing from DTOs

DTOs and SOA are used to give the developer more control over the service contract and payload for tier to tier communication. DTOs themselves do not have behavior, so they are typically very simple classes designed just to provide the needed information to perform a specific service operation. Not only does this provide the opportunity to optimize the wire format (some believe a DTO is all about the wire format), but it makes it possible and easy to capture the intent of each service method. The data contract used with DTOs is typically interoperable which makes it easy to use services that run on different platforms. DTOs also provide a way to separate messaging contracts from the presentation layer, the business logic, and the persistence layer which in many cases creates a maintainable architecture.

There are disadvantages to using DTOs and the primary one is complexity. DTOs are often hand-crafted to include only the specific information that is needed for an operation. When there is a common pattern for mapping DTOs to entity classes, there are some tools available that will do DTO generation and mapping but it is not always possible. With DTOs, it is up to the developer to decide how to do change tracking on each tier (especially the client) which increases the complexity of the presentation layer. The complexity of the service implementation also increases because the developer is responsible for translating the DTO into entities for doing business logic validation, as well as being able to report changes stored in the DTO to the persistence framework.

Not only do self-tracking entities share advantages with DataSets as mentioned above, they also share many advantages with DTOs. Self-tracking entities expose a simple and interoperable wire format and it is clear what kinds of data a service method requires or returns. You can also use a self-tracking entity as part of a message. However, self-tracking entities don’t give quite as much architectural separation as using pure DTOs, but you do gain a less complex solution that requires fewer data transformations. It is important to note that self-tracking entities can be made to be ignorant of any particular persistence framework making them essentially POCO objects. Particular persistence frameworks such as the Entity Framework will have the capabilities to create these entities and interpret the change tracking information when saving changes.

.NET Framework 3.5SP1 Challenge

With the .NET Framework 3.5SP1, this sort of solution was very hard to implement using the Entity Framework because change tracking was always done by a centralized ObjectContext which contains an ObjectStateManager. In particular, “reattaching” to report changes back to the ObjectStateManager was all but impossible without completely shredding your entity graph and applying changes one entity and one relationship at a time. The new API changes that are being added to the Entity Framework in .NET Framework 4.0 are enablers for an easier experience of reporting changes back to the ObjectStateManager, making self-tracking entities (as well as other architectures) easier to build.

Mid-tier experience

Using self-tracking entities on the mid-tier is about working with entity graphs and the Entity Framework. The service contract that is used in the following examples contains two simple methods for retrieving a Customer entity graph and applying updates to that entity graph:

interface ICustomerService
{
    Customer GetCustomer(string customerID);
    bool UpdateCustomer(Customer customer);
}

The implementation of the GetCustomer service method can be done using an Entity Framework ObjectContext and using LINQ or query builder methods to retrieve the entity you want. In the example below, a query is issued for a particular customer entity and all of the Orders and OrderDetails are included.

public Customer GetCustomer(string customerID)
{
    using (NorthwindEFContext context = new NorthwindEFContext())
    {
        var result = context.Customers.
              Include("Orders.OrderDetails").
              Single(c => c.CustomerID == customerID);
        return result;
    }
}

UpdateCustomer is example of how to save the changes that are made to a graph of self-tracking entities. Similar to DataSet, applying these changes to the persistence layer and saving them should be simple. In the below example, a new Entity Framework API, “ApplyChanges” is used which understands how to interpret the change tracking information that is stored by each entity and how to tell the ObjectContext’s ObjectStateManager about those changes.

public void UpdateCustomer(Customer customer)
{
    using (NorthwindEFContext context = new NorthwindEFContext())
    {
        context.Customers.ApplyChanges(customer);
        context.SaveChanges();
    }
}

Client Experience

The client experience when working with self-tracking entities is similar to how you would manipulate any object graph. You can make changes to scalar or complex properties, add or remove references, and add or remove from collections of related entities. The key part of the experience is that tracking changes is hidden from the client because it is done internally on each entity. There is no ObjectContext and there is no extra state that has to be maintained or passed from client to the tier that does persistence.

In this example, the test first queries for a customer, then deletes one of the orders. The resulting entity graph is sent back to the service tier using  the UpdateCustomer method.

public void DeleteObjectsTest()
{
    using (CustomerServiceClient client = new CustomerServiceClient())
    {
        var customer = client.GetCustomer("ALFKI");
        customer.Orders.First().Delete();
        client.UpdateCustomer(customer);
    }
}

The next test shows how to add a new Order with two OrderDetails to an existing customer. By default, the constructor of a self-tracking entity puts the entity in the “Added” state. There are conveience methods on each self-tracking entity to change this state if needed (Delete() and SetUnchanged()).

public void AddObjectsTest()
{
    using (CustomerServiceClient client = new CustomerServiceClient())
    {
        var customer = client.GetCustomer("ALFKI");
        customer.Orders.Add(new Order() { 
            OrderID = 100,
            OrderDetails = {
                new OrderDetail{
                     ProductID = 3,
                     Quantity = 7
                },
                new OrderDetail{
                     ProductID = 4,
                     Quantity = 8
                },
            }
        });
        client.UpdateCustomer(customer);
    }
}

The final example shows that a self-tracking entity graph  can contain any number of changes and those changes can be any combination of adds, modifications, and deletes.

public void ComplexModificationsTest ()
{
    using (CustomerServiceClient client = new CustomerServiceClient())
    {
        var cust = client.GetCustomer("ALFKI");
        // remove first order - will do cascade in the database
        cust.Orders.Remove(cust.Orders.First());
        // modify one of the orders
        var order = cust.Orders.Last();
        order.RequiredDate = DateTime.Now;
        order.OrderDetails.Add(
            new OrderDetail{
                ProductID = 7, 
                Quantity = 3 
        }); 
       // add new order
        cust.Orders.Add(new Order()
        {
            OrderID = 100,
            OrderDate = DateTime.Now,
            ShipCity = "Redmond",
            ShipAddress = "One Microsoft Way",
            ShipRegion = "WA",
            ShipPostalCode = "98052",
            OrderDetails = {
                new OrderDetail{
                    ProductID = 3,
                    Quantity = 7
                },
                new OrderDetail
                    ProductID = 4,
                    Quantity = 8
                },
            }
        });
        client.UpdateCustomer(cust);
    }
}

Generating Self-Tracking Entities

Building self-tracking entities should be as easy as using them. We plan to ship a T4 template that will do the code generation for creating self-tracking entities from your EDM. This template will show up in the list of available templates when you right click on your model and choose Add New Artifact Generation Item.

clip_image002 

Design Notes

Inside a Self-Tracking Entity

There have not been final decisions about what exactly will be included inside of a self-tracking entity, but it is important to track:

  • The state of the entity, Added, Deleted, Modified, or Unchanged
  • The original value from reference relationship properties
  • Adds and removes from collection relationship properties

This information will be included with the data contract of the entity. As part of the code generation of each self-tracking entity, scalar and complex property changes will mark the entity as “dirty” meaning that its state will change from Unchanged to Modified.

Inside ApplyChanges

ApplyChanges is a new API on the ObjectContext and ObjectSet classes that attaches an entity graph and interprets the change tracking information stored in each entity. The design  for discovering the change tracking information has not yet been finalized, but there are a couple of options available. One would be to code generate an interface as part of the T4 template for self-tracking entities as well as a version of ApplyChanges that knew about the interface. Another option would be fall back to runtime discovery of the particular properties using a convension and some of the new dynamic capabilities of the framework. The thing we want to avoid is causing the self-tracking entity implementation to have a dependency on any of the Entity Framework assemblies.

The algorithm that ApplyChanges uses is:

1.       Attaching the entity graph to the ObjectContext

2.       Changing the state of each entity using the ChangeObjectState API.

3.       For any reference relationship property, if there is an original value changing the state of the original reference relationship to Deleted and change the state of the current reference relationship to Added. This is done using the ChangeRelationshipState API.

4.       For any collection relationship property, use the ChangeRelationshipState API to mark any removed relationships as Deleted and to mark any added relationships as Added.

Summary

We’ve received a lot of feedback and suggestions on how to make developing multi-tier applications easier using the Entity Framework. One of the components of improving this experience is the introduction of an end-to-end architecture around self-tracking entities. We’d like to hear your feedback on this addition to the Entity Framework, and any other comments you have on the matter of multi-tier development using domain models and entities.

Jeff Derstadt,
Dev Lead, Entity Framework Team, Microsoft

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.
Foreign Keys in the Entity Framework
16 March 09 09:47 PM | efdesign | 30 Comments   

Background

A number of months ago we asked whether Foreign Keys (FKs) in Conceptual and Object models were important.

The feedback we got indicated that there are a real mix of opinions on the topic. Some people are all for them while others think that FKs pollute the conceptual model.

The fact that our customers were so divided meant we thought it was important to provide options here. Those who want FKs in their Entities should be able to do so, so they can gain all the benefits that having FKs in your Entities undoubtedly provide. On the other hand customers who are concerned that have FKs in their Entities in someway pollutes their model can continue to use the type of associations we had in .NET 3.5 SP1.

In the past we've called the .NET 3.5 SP1 style Associations "first class" associations. Which while accurate somehow implies that associations based on FKs in Entities are "second class". In fact when we talked about this new work recently with our MVPs one of the first questions was "are these second class?", which they most definitely are not.

The fact is these new associations are different but first class nevertheless.

So what's been added?

In .NET 4.0 we will add support for a new type of association called "FK Associations".

The way this works is we will allow you to include your FK columns in your entities as "FK properties", and once you have FK Properties in your entity you can create an "FK Association" that is dependent upon those properties.

Okay so we will have "FK Associations", what are we going to call the older style associations? Obviously saying something like ".NET 3.5 SP1 style Associations" every time isn't ideal. So we are going to call them "Independent Associations".

The term "Independent Association" resonates for us because they are independently mapped, whereas FK Associations need no mapping, simply mapping the Entity(Set) is sufficient.

How do I use an "FK Association"?

The real reason so many customers and partners are asking for "FK Associations" is that they significantly simplify some key coding patterns. So much so that we are convinced that for most people "FK Associations" will be an automatic choice going forward.

Some of the things that are hard with "Independent Associations" are trivially easy using FK Associations. Scenarios as diverse as DataBinding, Dynamic Data, Concurrency Control,  ASP.NET MVC Binders, N-Tier etc are all positively impacted.

Let's have a look at some code snippets that will work with "FK Associations":

1) Create a new Product and FK Association to an existing Category by setting the FK Property directly:

using (var context = new Context())
{
    //Create a product and a relationship to a known category by ID
    Product p = new Product
    {
        ID = 1,
        Name = "Bovril",
        CategoryID = 13
    }; 
    //Add the product (and create the relationship by FK value)
    context.Products.AddObject(p);
    context.SaveChanges();
}

This sort of approach works both for insert and update, and is particular useful for things like databinding, where you often have the new Value of the FK in a grid or something but you don't have the corresponding object, and you don't want to wear the cost of a query to pull back the principal object (i.e. the Category).

2) Create a new Product and a new FK Association to an existing Category by setting the reference instead:

public void Create_new_Product_in_existing_Category_by_reference()
{
    using (var context = new Context())
    {
        //Create a new product and relate to an existing category
        Product p = new Product
        {
            ID = 1,
            Name = "Bovril",
            Category = context.Categories
                        .Single(c => c.Name == "Food")
        };
        // Note: no need to add the product, because relating  
        // to an existing category does that automatically.
        // Also notice the use of the Single() query operator 
        // this is new to EF in .NET 4.0 too.
        context.SaveChanges();
    }
}

This programming pattern is not new to the Entity Framework, you could do this in .NET 3.5 SP1. I called it out because it shows you can still write the sort of code you write with Independent Associations even when using FK Associations.

3) Update an existing Product without informing the Entity Framework about the original value of the CategoryID (not supported with Independent Associations):

public void Edit(Product editedProduct)
{
    using (var context = new Context())
    {
        // Create a stand-in for the original entity
        // by just using the ID. Of the editedProduct
        // Note: you don't have to provide an existing Category or CategoryID
        context.Products.Attach(
            new Product { ID = editedProduct.ID });

        // Now update with new values including CategoryID
        context.Products.ApplyCurrentValues(editedProduct);
        context.SaveChanges();
    }
}

In this example "editedProduct" is a product which has been edited somewhere, this is exactly the sort of the code you might write in the Edit method of a ASP.NET MVC controller, and is a great improvement over the code you have to write using Independent Associations.

There is much more you can do with FK Associations, but these 3 samples give you a flavor of the sort of coding patterns FK Associations allow.

Keeping FKs and References in Sync

One thing that hasn't been mentioned so far is that the Entity Framework tries to keep related References and FKs in sync as much as possible.

When this synchronization occurs depends upon when the Entity Framework is notified of changes, which depends upon the type of Entity Classes involved: be they POCO with Proxies, POCO without Proxies, IPOCO or EntityObjects etc.

Another post will cover this in more detail.

How do I create an "FK Association"?

There are a number of ways.

One mainline scenario is when using the tools to infer a model from a database. We have added an option to choose whether "FK Associations" or "Independent Associations" are generated by default. The same is true if you use EdmGen.exe (i.e. our command line tool).

The second mainline scenario for creating FK Associations is creating a model from scratch, aka Model First which is new to .NET 4.0.

Here are the steps involved:

  1. Create two Entities (say Product and Category) that look something like this:

    ProductCategoryStep1
  2. Add a property that will be the FK Property: i.e. CategoryID

    ProductCategoryStep2
  3. Create an association between the two EntityTypes with the correct end multiplicities and NavigationProperty names:

    ProductCategoryStep3
  4. Double Click the line between the two Entities, that represents the Association, and add a Referential Integrity Constraint:

    ProductCategoryStep4
    This is the step that tells the Entity Framework that the CategoryID is an "FK Property" and that the "CategoryProduct" association is an "FK Association".

  5. You end up with something that looks like this:

    ProductCategoryStep5 

And you are done you've set up an FK Association. Notice that this association doesn't need to be mapped you simply need to make sure you map all the properties of both Entities.

Very easy.

Summary

In .NET 4.0 we will add support for FK Properties and FK Associations to the Entity Framework. The existing style Independent Associations will still be possible, but we expect that FK Associations will become most peoples automatic choice because they simplify so many common Entity Framework coding tasks.

As always we'd love to hear your thoughts on this work.

Alex James,
Program Manager, Entity Framework Team, Microsoft.

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Customizing Entity Classes in VS 2010
22 January 09 07:15 PM | efdesign | 21 Comments   

When VS 2010 ships it will include some significant improvements to our code generation story for the Entity Framework. The basic idea is to make use of T4 templates for code generation and ship strong integration into the Entity Framework Designer to make the experience of customizing those templates as seamless as possible.

Below Sanjay from our Tools team outlines what will be possible once VS 2010 is released.

Customize the code generated by the Entity Designer with T4 templates

In Visual Studio 2008 SP1, the ADO.NET Entity Designer generates classes from the CSDL portion of the EDMX file using the EntityClassGenerator APIs. Numerous customers have asked us how to customize the code generation for a variety of scenarios including:

  • Make the generated ObjectContext internal
  • Make the generated ObjectContext and Entity classes implement a user-defined interface
  • Add user-defined CLR attributes to generated ObjectContext and generated Entity classes
  • Influence generated classes based on structural annotations in CSDL
  • Generate the ObjectContext and Entity classes into separate files
  • Generate “proxy classes” for the generated classes
  • Partially or fully change how classes are generated, maybe even generate additional (non code) artifacts in the project
  • Completely replace entity framework code generation with custom code
  • Generate POCO classes from the model to use as a starting point in my applications
  • Generate self-tracking entity classes
  • · ...

While the SingleFileGenerator techniques described here enable some of these scenarios these techniques are generally difficult to implement and install or are tricky to debug and customize.

In Visual Studio 2010, we plan to give customers the ability to use T4 templates to generate their classes, and are also planning deep integration with the Entity Designer and Visual Studio to provide a great end-to-end experience.

Walkthrough

The example below customizes code generation to make the generated classes implement a user-defined interface (e.g. IValidate).

Preconditions:

  • User has installed Visual Studio 2010
  • User is familiar with customizing T4 templates
  • User has a C# or a VB console project that targets FX4.0 with an EDMX file in the project

Walkthrough:

  1. User: opens EDMX file in the Entity Designer. Note that the generated classes are in a child file of the EDMX file (i.e. in Northwind.Designer.cs)

    image
  2. User: right-clicks on an empty area of the designer surface and chooses “Add New Artifact Generation Item…” from the context menu.

    image
  3. Visual Studio: displays the standard “Add New Item” dialog and shows a filtered list of items for the user to select from. As expected, the “Add New Item” dialog only shows a list of item templates specific to the current project target framework version, project type and language.

    image
  4. User: selects “ADO.NET EntityObject Generator”, specifies a file name and clicks “Add"
  5. Visual Studio: checks out the project from Source Control if necessary
  6. Visual Studio: sets the “Custom Tool” property of the EDMX file to empty, which will cause the existing .designer.cs file to be deleted
  7. Visual Studio: the VS item template adds a new .tt file to the project in the same project directory as the EDMX file

    image
  8. Visual Studio: configures the new .tt file to process the selected EDMX file. The EDMX file to process is a replaceable parameter in the .tt file, and VS updates it to point to the selected EDMX file. Thus, the .tt file now knows which EDMX file to generate code for.

    image
    [Screenshot of .tt file open in the Clarius T4 Editor Community edition for VS 2008]
  9. User: double-clicks the .tt file in the project to open it in VS10
  10. User: edits the .tt file in Visual Studio to make every Entity object implement the IValidate interface

    image
    [Screenshot of .tt file open in the Clarius T4 Editor Community edition for VS 2008]
  11. User: saves and closes .tt file
  12. Visual Studio: Transforms the .tt file to produce the generated classes as a child file of the .tt file
  13. User: notes that the generated classes (under the .tt file) implement the IValidate interface as expected
  14. User: switches to the EDMX file in the designer and makes some changes (e.g. add a new entity)
  15. User: saves EDMX file
  16. Visual Studio: Transforms all .tt files in the project that reference the EDMX file. As before, the generated classes are child files of the .tt files

Design

The design affects multiple parts of the Entity Framework and the Entity Designer and the key changes are described below:

Entity Framework Runtime

  1. We started by creating C# and VB T4 templates to generate classes from CSDL and EDMX files
  2. Since the T4 engine is installed with Visual Studio and not with the.NET FX runtime, we preprocess the templates as described here and include the compiled code into System.Data.Entity.Design.dll
  3. We added a new class System.Data.Entity.Design.EntityCodeGenerator (in System.Data.Entity.Design.dll ) that generates code using the preprocessed T4 template
  4. Finally, we enhanced a few runtime metadata APIs and added extension methods to load and iterate over models a lot easier. Our T4 templates make heavy use of these new methods.

Entity Designer

We took the C# and VB T4 templates used by the runtime and changed them to be more “VS friendly” as described below.

  1. We made the EDMX file name referenced in the T4 templates a replaceable parameter
  2. We created a wizard (with no GUI) that is launched by Visual Studio when the item is added to the project. The wizard does 2 things:
    1. updates the EDMX file name referenced in the T4 templates and
    2. sets the “Custom Tool” property of the EDMX file to empty
  3. We rolled up the T4 templates (and wizard) into standard C# and VB Visual Studio item template which are installed with VS 2010; this makes them show up in the “Add New Item” dialog in Visual Studio
  4. We added a new context menu to the designer surface to launch the “Add New Item” dialog and also added code to only show VS item templates whose names start with the prefix ADONETArtifactGenerator_
  5. Our design ensures that user installed VS item templates in %MyDocuments% as well as the VS item templates we ship in box show up in the “Add…New…Item” dialog (per the template name prefix rule above)
  6. We added code to the designer that finds all .tt files related to an EDMX file and automatically transform them when the EDMX file is saved.
  7. Finally, we added a new property called “Process related T4 templates on Save” on the designer surface to let users control whether or not to transform .tt files related to the EDMX file on save.

Multi Targeting considerations

As many of you are aware, VS 2010 allows developers to target FX4.0 as well as FX3.5 (and older runtimes). The T4 templates that we ship in the Visual Studio box generate code that works with the Entity Framework in .NET FX 4.0. This means the VS item templates we ship in the box will not be available for projects that target FX3.5.

However, it is certainly possible (and supported) for users to create new T4 templates (and new VS item templates) that generate code from an EDMX file in projects that target FX3.5 and light up in the same experience. In fact, we might release some new templates ourselves later, on CodeGallery perhaps.

How can 3rd parties plug in?

Since our design is based upon VS item templates that wrap T4 templates, we have opened the door to let 3rd party code generators participate in the end-to-end experience. Our contract is straight forward and easy to implement:

  1. Create a (C# or VB) T4 template to generate code as you desire. The T4 templates we ship in the box can be invaluable points of reference as you roll your own.
  2. Make the EDMX reference in your .tt file a replaceable parameter by calling it $edmxInputFile$ like we do in our .tt file.
  3. Wrap the .tt file in a VS item template. This is pretty easy and well documented on MSDN. You can also use the Export Template wizard included Visual Studio. Also specify additional VS item template settings such as target framework, project type, etc
  4. Prefix the name of your .vstemplate file with ADONETArtifactGenerator_ if you want your VS item template to show up in the “Add New Item” dialog when it is launched from the context menu in the Entity Designer.
  5. You can use the wizard we ship in Visual Studio to do parameter replacement in your VS item template by adding the following to your .vstemplate file:

    <WizardExtension>
      <Assembly>
        Microsoft.Data.Entity.Design, Version=10.0.0.0, Culture=neutral,PublicKeyToken=b03f5f7f11d50a3a
      </Assembly>
      <FullClassName>    Microsoft.Data.Entity.Design.VisualStudio.ModelWizard.AddArtifactGeneratorWizard 
      </FullClassName>
    </WizardExtension>

    Of course, you can also create your own wizard that does custom processing when the item is added to the project as described here.
  6. Create a VSCONTENT file and zip up your custom item template & related files into a Visual Studio Content Installer (.vsi) file
  7. Upload the .vsi file someplace your customers can download it from
  8. Visual Studio already knows how to install .vsi files and everything you need is already available in Visual Studio.

A few key takeaways:

  • This design lets the EF team release new T4 templates outside the usual Visual Studio and .NET FX ship cycles and the experience to consume them is exactly the same.
  • There are several excellent resources on the web that describe advanced T4 template scenarios and our design enables them. In fact, we expect users to leverage these resources while creating new T4 templates or customizing the ones we ship. For example, code in a custom T4 template can use the EnvDTE APIs to access the project system and generate outputs to multiple files.
  • Over time, we expect to blog about some of these advanced scenarios and also work with MVPs, industry experts and Patterns & Practices to provide guidance to customers in a manner similar to the Guidance Automation Toolkit.
  • Custom T4 templates can generate anything you want, not just code. For example, we have an internal prototype of a T4 template that iterates over a model and generates an HTML file with a nicely formatted report of the entities and associations in the model
  • VS item templates can add multiple files to your project (more details here). This means your custom item templates can include all kinds of files (e.g. workflow files, multiple T4 files, pre-canned code files, etc). For example, you can easily imagine a 3rd party VS item template that adds 2 .tt files in your project that process the same EDMX file but generate different kinds of outputs.

We hope this gives you an overview of our design and intent of this feature and we would love to hear your comments.

Sanjay Nagamangalam,
Lead PM, ADO.NET Entity Designer

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Update on Computed Properties
13 January 09 04:27 AM | efdesign | 9 Comments   

A while back I wrote a post that introduced the concept of Computed Properties. Since that time we’ve had a number of conversations with both customers and internal partners and we’ve had some new ideas, that have changed our thinking somewhat. 

  • First off we decided to implement Model Defined Functions, which are significantly more powerful and flexible than Computed Properties, and provide overlapping capabilities.
  • Computed Properties on the other hand, would never have been structurally part of the entity, which lead to some tough choices and probable customer confusion:
    • Do you always pay the cost of materializing the Computed Property even when it is not required?
    • What about if there are lots of Computed Properties on the Entity?
    • Today in order to materialize an Entity and it's properties, computed or otherwise, the Entity Framework needs access to setters. Unfortunately the existence of the Setter, sends an erroneous message that Computed Properties can be altered and persisted.
    • Upon further investigation we discovered that Computed Properties didn't actually address the requirements of Reporting Services, which was one of the original drivers for Computed Properties. Our original design called for computed properties to operate over an instance of an Entity, and it turned out that Reporting Services needed computed properties to operate over both an instance and a set of instances. Model Defined Functions on the other hand are flexible enough to handle both of these scenarios.
  • In the future it may be possible to gain all the benefits of computed properties without any of the above issues, by extending our integration of Model Defined Functions with LINQ so that the [EdmFunction] attribute can be applied to CLR properties too.

    If we did this the first parameter of the specified Model Defined Functions could be mapped to be the CLR type containing the property itself.  This would allow a Computed Property, backed by a Model Defined Function, to be used in a targeted way in LINQ queries, without any of the illusions that this might somehow be set-able or structurally part of the entity.

Given all these points we started to think Computed Properties just weren’t as important.

We think our best course of actions in the short term is 1) to focus on supporting Model Defined Functions and 2) ask for your feedback on the relative importance of Computed Properties given the points listed above.

So what do you think?

Do Model Defined Functions address most of the scenarios for Computed Properties?

Particularly if in the future we can treat CLR properties as stubs for Model Defined Functions?

As always we are very keen to hear what you think.

Alex James
Program Manager
Microsoft

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

NB:
For more background on the origins of [EdmFunction] check out this post.

Filed under: , ,
Model Defined Functions
07 January 09 08:32 PM | efdesign | 21 Comments   

Today the Entity Framework, and more specifically the Entity Data Model, have a limited notion of Functions.

We are currently restricted to Function Imports that allow stored procedures to be invoked, and Canonical / Store Functions for database independent and database specific functions respectively.

Now however we want to support functions defined, not just declared, in the EDM (aka. the CSDL).

An example would be:

<Function Name="GetAge" ReturnType="Edm.Int32">
      <Parameter Name="Person" Type="Model.Person" />
      <DefiningExpression>
            Edm.DateDiff(“y”, GETDATE(), Person.Birthday)
      </DefiningExpression>
</Function>

Here are some things to notice:

  • The DefiningExpression is eSQL.
  • The function can have zero or more parameters.
  • The Function must have a return type.
  • Function Parameters are referenced directly by Name in the DefiningExpression: meaning there is no parameter denoting prefix like @. This means you must be careful to choose parameter names that don't collide with identifiers you need to use in the rest of eSQL expression.
  • Unlike functions in SSDL, functions in CSDL only support In bound Parameters (i.e Mode="In") because otherwise they become non-composable.
  • For this reason the Mode of a parameter cannot be set in CSDL (it is always Mode="In").
  • Functions are declared as Global Items and are declared within the <Schema> element. As such there identity is made up of the Schema's namespace and the function name.
  • The function parameters and return type can be any of the following:
    • A scalar type or collection of scalar types.
    • An entity type or collection of entity types.
    • A complex type or collection of complex types.
    • A row type or collection of row types (See below).
    • A ref type or collection of ref types.
  • Functions with a DefiningExpression do not require mapping, since the eSQL expression is composed out of eSQL fragments that are already mapped.
  • Functions without a DefiningExpression are simply declarations. Today the Entity Framework doesn't complain when loading a CSDL with such a function, but you can't invoke it. In the future these functions might be used to support mapping Table Value Functions in the store to functions in the Conceptual Model.
  • Since it is trivial in eSQL to create arbitrary un-named types, imagine a projection that projects 3 of the properties from an Entity, we now need a mechanism for defining these "RowTypes" inline, so that they can be used when defining Function Parameters and ReturnTypes. For example:

    <Parameter Name="Coordinate">
       <RowType>
          <
    Property Name="X" Type="int" Nullable="false"/>
          <
    Property Name="Y" Type="int" Nullable="false"/>
          <
    Property Name="Z" Type="int" Nullable="false"/>
       </
    RowType>
    </Parameter>
  • Since eSQL is primarily set based, we also need a way of defining parameters and return types that are collections of RowTypes:

    <Parameter Name="Coordinates">
       <CollectionType>
          <RowType>
             <
    Property Name="X" Type="int" Nullable="false"/>
             <
    Property Name="Y" Type="int" Nullable="false"/>
             <
    Property Name="Z" Type="int" Nullable="false"/>
          </
    RowType>
       </CollectionType>
    </Parameter>

Using the Function via eSQL:

It is trivial to use the function via eSQL. For example:

SELECT Namespace.GetAge(p)
FROM Container.People AS P
WHERE P.Firstname = ‘Jim’

Here we get Jim's age, assuming of course there is only one Jim!

It is also possible to compose functions together, you must simply ensure return types and target parameter types are the same (in the case of named types) or structurally equivalent (in the case of row types or collections of row types).

Things get a little trickier when you are dealing with functions that return sets, for example imagine a function that returns someone's friends, used in conjunction with the GetAge function:

SELECT VALUE (F)
FROM Container.People AS P
CROSS APPLY Namespace.GetFriends(P) AS F
WHERE Namespace.GetAge(P) > 21

Here we get all the friends of people older than 21.

As you can see to do this sort of thing you need a crash course in eSQL.

Using the Function via LINQ:

It is also possible to use these functions in LINQ, but this does require a extra step to create an appropriate stub function in the CLR.

This solution is based on the techniques described here, and involves creating a Stub function in the CLR language of your choice, and annotating it something like this:

[EdmFunction("Namespace", "GetAge")]
public static int GetAge(Person p)

    throw new NotSupportedException(…);
}

The Entity Framework uses the signature of the function and the EdmFunction attribute to map calls to this function when encountered to the appropriate Model Defined Function.

Once you have this stub it is then trivial to use it in a LINQ query like this:

var peopleOver21 =
     from p in ctx.People
     where GetAge(p) < 21
     select p;

Indeed if you are familiar with LINQ you will probably find composing functions together a lot easier too:

var friendOfPeopleOver21 =
     from p in ctx.People
     from f in GetFriends(p)
     where
GetAge(p) < 21
     select f;

Notice that the CLR functions don't need to be directly callable, in the example above the CLR stub throws an exception if called directly.

However the existence of the stub allows you to create LINQ expressions that compile correctly, and then at runtime, when used in a LINQ to Entities query, the function call is simply translated by the entity framework into a query that runs in the database.

Summary

As you can see Model Defined Function's are very powerful, and this post has barely scratched the surface of possibilities they open up.

The Entity Framework team would love to hear your comments.

Alex James 
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Pluralization
02 December 08 12:20 AM | efdesign | 13 Comments   

Unfortunately in the current version of the Entity Framework, which ships in .NET 3.5 SP1, we don't make any attempt to Singularize or Pluralize names when reverse engineering a model from the database.

So if, for example, your database has a table called Orders you will get an EntityType called Orders too, which clearly doesn't make for the most read-able code:

Orders order = new Orders(); //? why not just Order ?

LINQ to SQL does a better job here, because it ships with simple but effective pluralization and singuralization services, that for the most part produces names you would expect.

Unfortunately the lack of a similar feature in the Entity Framework, means that generally the first thing you do after you've reverse engineered a model with the Entity Framework is fix-up all the silly names.

Now from the perspective of someone who delivers plenty of Entity Framework demos, this is quite embarrassing! 

But of course the real problem here is that it is time consuming and error prone for all but the most trivial models.

Solution:

In the next version we have added basic pluralization support. This support only really handles English, and here's why:

  1. Creating language and culture aware services is very hard, and we thought it would be better to spend our time working on the core of the Entity Framework.
  2. These sort of services are very general and as such we think that they belong elsewhere, maybe inside Windows itself?
  3. We have made the PluralizationService class public so you are free to write your own.

Our default English language pluralization is then used whenever you generate a model from a database or visa-versa (see model first) to help produce appropriate names.

And for command line users familiar with EdmGen.exe there is a new /pluralize switch also.

Basic Pluralization Rules:

So how does the Entity Framework use this pluralization service?

Well it calls the service to:

  • Singularize EntityType names
  • Singularize NavigationProperty names that point to 0-1 entities
  • Pluralize EntitySet names
  • Pluralize NavigationProperty names that point to 0-* entities

It turns out that if you do this, you get the model you are looking for most of the time.

The Pluralization APIs:

So far we've talked about default behavior, if however you need more control, for example if you want to specify custom pluralization rules or a create a completely custom service, you have to drop down to the API level.

Everything revolves around a newly added public PluralizationService class that looks something like this:

public abstract class PluralizationService

        public static PluralizationService CreateService(CultureInfo culture);

        public abstract string Pluralize(string word);
        public abstract string Singularize(string word);
}

As you can see classes that derive from PluralizationService can singularize and pluralize words.

You can also ask the PluralizationService to create a concrete PluralizationService for you based on Culture (as mentioned previously only "en" based cultures are supported out of the box).

Here is a little code that gets hold of the built-in English Language pluralization services, configures it so "Child" will pluralize to "Children" and then checks that it pluralizes correctly.

PluralizationService pluralizationService = 
    PluralizationService
.CreateService(
        new CultureInfo("en-US"));

ICustomPluralizationMapping mapping = 
  pluralizationService as ICustomPluralizationMapping;

if (mapping != null) // it shouldn't be but just checking
{
   
//Specifying the child pluralizes as children
    mapping.Add("Child", "Children");
}

Debug.Assert(pluralizationService.Pluralize("Child") == "Children");

You can use your pluralization service in Model Generation (i.e. the step that produces CSDL and MSL from SSDL) by handing your pluralization service to the EntitySchemaModelGenerator:

//Create an EDM from SSDL generator
EntityModelSchemaGenerator generator =
   
new EntityModelSchemaGenerator(
        storageModel,  
        "MyNamespace"
,
        "MyContainer", 
        pluralizationService);

//Generate CSDL and MSL (in memory)
generator.GenerateMetadata();

The resulting CSDL will have been pluralized/singularized as described in the earlier rules using the provided pluralization service.

See this for more information on the EntityModelSchemaGenerator.

Summary

So as you can see we've added simple pluralization capabilities, akin to those found in LINQ to SQL, that will reduce the amount of work required to create meaningful models.

And to top it off, the solution we've created is easy to customize and public so you can probably use this for other things too!

As always we are very keen to hear what you think.

Alex James
Program Manager
Microsoft

UPDATE: A number of people have asked whether pluralization can be turned off, because the post doesn't make this clear. The answer is absolutely, pluralization is definitely optional, in Visual Studio this is as simple as checking/unchecking a checkbox in the wizard, and for EDMGen.exe you have to opt-in for pluralization with the /pluralize commandline switch. 

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

N-Tier Improvements for Entity Framework
20 November 08 05:33 AM | efdesign | 47 Comments   

The first version of Entity Framework provides convenient ways to load, manipulate and persist objects and relationships. As with many other O/RMs, Entity Framework has a state manager that tracks every change made. Existing objects are typically loaded first from the database, later modified, and finally the changes are saved back to the store.

Another feature, full graph serialization, makes it very easy for developers to ship around object graphs representing snapshots of the current state of the world, across execution boundaries.

The next version of Entity Framework will also support Persistence Ignorance. Therefore object graphs can now be made of POCO instances, which WCF now also supports.

Nonetheless, there is a task that belongs in any N-Tier application that still requires a great amount of work for developers using EF:  decoding messages that represent state changes performed by the client, meant to be processed or persisted by the service.

The major pain point in the first version is the number of intricate steps a program needs to perform in order to setup the state manager in the appropriate state according to the changes encoded in the message. For instance, for any but the simplest scenarios, in order to use the basic graph manipulations APIs in ObjectContext, like AddObject and AttachTo, it is necessary to “shred” the object graphs contained in the message, which destroys important information about relationships between object instances that therefore has to be kept somewhere else (as an illustration of this approach, see Danny Simmons’s EntityBag in project Perseus).

In addition, given that the message typically contains all the necessary information and that concurrency is usually a concern, the program shouldn’t have to re-query the database. Instead, it should be easier to “tell” the state manager about the original state (that came from the database previously) and about what changes were made.

There are a few approaches to N-Tier that are very popular in the industry. Customers using Entity Framework for the first time often carry the expectation that it will support a similar experience:

DataSet

The original ADO.NET data access technology includes the DataSet, a versatile data cache that can be serialized as XML and be used to transport collections of rows and relationships and, also diffgrams representing changes.

DataSets make many N-Tier scenarios extremely easy to implement, and over time, have proven helpful for many customers. In spite of this, they are not appropriate for some scenarios and architectural patterns:

  1. The lack of a widely accepted standard serialization format has hindered the development of implementations that could produce or consume datasets in other platforms.

  2. Diffgrams represent sets of CREATE, UPDATE and DELETE operations of arbitrary depth, which is not adequate for architectural patterns that conceive operations with more constrained semantics (i.e. PlaceOrder, CreateCustomer).

  3. No simple ways to control the application of batch CUD operations contained in diffgrams makes them unsuitable any time there is a trust boundary between the client and the server.

  4. Also, the data-centered perspective of DataSets leads to an anti-pattern known as the “anemic domain model”, which is characterized by a separation between the data aspect and the behavioral aspect of domain objects.

Data Transfer Objects

The approach typically used in the DDD and SOA communities to address  multi-tier scenarios is to hand-build a service with well defined operation semantics together with the a conceptual message format, usually composed of data transfer objects (DTOs).  Implementing DTOs often also involves writing the logic that translates DTOs back and forth to persistent objects.

Here is a simplified set of rules that characterizes the use of DTOs:

  1. Actual entities are not serialized, just DTOs, which are value-only objects (without behavior) are exposed in the boundaries.

  2. The client does not depend on the types defined on the server. The shapes of the DTOs are the sole contract, and the types used in the client are for the client only.

  3. The context or container is not sent together with the message.

  4. The service decides what to do with the contents of the message. It is not the message who decides (as an extreme counter example, imagine a message that tells the service to delete all the contents of the database).

Those rules are especially important in SOA scenarios, in which the client and the server can be separated by a trust boundary, such as the one that exists between two different companies.

There are other permutations of layered architectures, though, in which the level of decoupling that DTOs provide may not justify the increased complexity, for example:

  1. A multi-tier application in which the client is fully trusted.

  2. A rich internet application in which an infrastructure service is purely used for data persistence (is by design a CRUD-only service) and therefore types exposed do not contain behavior anyway.

  3. Any architecture in which the persistence framework is actually used to provide and manage DTOs and not the actual entities.

REST

ADO.NET Data Services provides a rich framework that developers can easily use to expose any Entity Data Model as a REST-style resource collection. Simple HTTP verbs like GET, PUT, POST and DELETE are used in combination with URIs to represent CRUD operations against the persistence service.

This pattern is suitable for many applications and can easily be combined with service operations with richer semantics in the same application.

ADO.NET Data Services, however, may not be compelling for those customers who want to define their services as operations with richer semantics entirely.

Goals

ADO.NET Data Services goes a long way addressing the needs of customers doing RESTful services, and so there is no need for us to improve Entity Framework experience on that space.

With the N-Tier improvements for Entity Framework, we want to address some of the same problem space as DataSet, but we want to avoid the primary issues with it.

Ideally, we would like to provide building blocks that are appealing for developers building solutions on a wide range of architectures. For instance, we would like to provide fine enough control for DTO proponents, but at the same time reduce the level of pain that those trying to address simpler scenarios experience today.

Design

If we just wanted to provide a DataSet like experience on top of Entity Framework, the most straightforward (although very expensive to test) way would probably be based on full serialization of the state manager. Basically, the internal representation of ObjectStateManager includes collections of objects to be deleted, inserted, and modified, and also relationships to be added and deleted, as well as objects and relationships that should remain unchanged.

Such a model would lead to very fine grain control of the state manager, but you would lose some of the benefits of operating on the domain objects and the higher level APIs. For instance, it would make it extremely easy for the program to set the state manager in a completely invalid state.

An alternative representation for such an experience would be to ship a “change log” containing the history of all operations performed on the client. This representation has the disadvantage of being more verbose than the previous one, but it would make “point in time” rollbacks pretty easy to implement.

Besides these two, there are a few more interesting generic representations for changes in a graph, but in general, they all suffer from the same disadvantage: providing a solution for them does not give the user the level of control that the most complicated scenarios and sophisticated patterns require.

Based on this fact, we decided to adopt the following goal for the design:

Entity Framework won’t define its own unique representation for the set of changes represented in an N-Tier application. Instead, it will provide basic building block APIs that will facilitate the use of a wide range of representations.

This has the desirable consequence that Entity Framework won’t impose a pattern for N-Tier. Both DTO-style and DataSet-like experiences can be built on top of a minimal set of building blocks APIs. It is up to the developer to select the pattern that better suits the application.

Here is a first cut of the new methods that we are proposing to add to the ObjectContext. The new APIs will work with EntityObjects, IPOCO and POCO entities:

public partial class ObjectContext
{
    /// <summary>
    /// Apply property values to original state of the entity
    /// </summary>
    /// <param name="entitySetName">name of EntitySet of root</param>
    /// <param name="original">entity with original values</param>
    public void ApplyOriginalValues(
        string
entitySetName,
        object
original) { }

   
    /// <summary>
    /// Changes the state of the entity and incident relationships
    /// </summary>
    /// <param name="entity">entity to change state of</param>
    /// <param name="state">new state</param>
    public void ChangeObjectState(
        object
entity,
        EntityState
state) { }


    /// <summary>

    /// Changes state of relationship
    /// </summary>
    /// <param name="source">source object of relationship</param>
    /// <param name="target">target object of relationship</param>
    /// <param name="relationshipName">name of relationship</param>
    /// <param name="sourceRole">role of source object</param>
    /// <param name="targetRole">role of target object </param>
    /// <param name="state">new state</param>
    public void ChangeRelationshipState(
        object
source,
        object
target,
        string relationshipName,
        string
sourceRole,
        string
targetRole,
        EntityState state) { }

    /// <summary>
    /// Changes state of relationship represented by navigation
    /// property

    /// </summary>
    /// <param name="source">source object of relationship</param>
    /// <param name="target">target object of relationship</param>
    /// <param name="navigationProperty">navigation property</param>
    /// <param name="state">new state</param>
    public void ChangeRelationshipState(
        object
source,
        object
target,
        string navigationProperty,
        EntityState
state) { }

   
    /// <summary>
    /// Changes state of relationships represented by lambda
    /// expression

    /// </summary>
    /// <typeparam name="TSource">type of source entity</typeparam>
    /// <param name="source">source entity of relationship</param>
    /// <param name="target">target entity of relationship</param>
    /// <param name="selector">property selector expression</param>
    /// <param name="state">new state</param>
    public void ChangeRelationshipState<TSource>(
        TSource source,
        object
target,
        Expression<Func<TSource, object>> selector,
        EntityState state) { }

 
}

ApplyOriginalValues

Similar to the existing ApplyPropertyChanges, but this API keeps the tracked entity intact, only affecting the corresponding original values. This operation only works for entities in Modified, Unchanged or Deleted states. Added entities by definition don’t have original state.

ChangeObjectState

This method will transition the state of a tracked entity passed as argument, but will also have side-effects on incident relationships. In case the new state is modified, it will also mark all properties as modified, regardless of the original and current values.

ChangeRelationshipState

Similar to ChangeObjectState, but sets the relationship between two tracked entities to the provided state. Relationships cannot be in the modified state, but this method can be used to report which relationship should be added, deleted, or unchanged.

The relationship doesn’t need to exist in the context, but the entities do. If the relationship doesn’t exist, it can be created in the new state. For instance, when this code is invoked, a new relationship can be created between customer1 and order1 in the added state:

context.ChangeRelationship(
    customer1,
    order1,
    c=>c.Orders,

    EntityState
.Added);

The relationship is typically represented by the navigation property but the metadata names of the relationship and roles can also be used in a different overload.

Design notes

  • All parameters named “entity”, “source” and “target” accept either entities or EntityKeys.

  • We could choose to deprecate the ApplyPropertyChanges API and replace it with ApplyCurrentValues, which is more consistent name-wise with the new ApplyOriginalValues method.

  • For convenience, we should add similar APIs in other types like ObjectStateEntry, RelatedEnd and others as appropriate. For instance, the ObjectStateEntry will also have a ChangeState  API that will only take the target state parameter.

  • We could choose to put ChangeObjectState() and the various ChangeRelationshipState() in ObjectStateManager rather than in ObjectContext, since this is a lower level API than everything else currently in the ObjectContext.

Scenario tests

To get a sense of how the new API can be used, we have tried several different approaches and representations. We also have plans to release sample code that will show in more detail different ways to use the new API. In the meanwhile, here are a few simple cases.

General purpose Attach with state resolution delegate

In this scenario between a client and mid-tier, the intention is to allow a client to query for a Customer entity with the Customer’s Order entities. The client then makes a variety of changes to the Customer entity graph including modifying the Customer, and adding or removing Orders.  The client would like to send the updated Customer entity graph to the mid-tier so that it can be processed and persisted.

This system will use a simple mechanism to help with change tracking on DTO entities. Each DTO will implement a small interface to help determine the state of the entity using three properties: IsNew, IsModified, and IsDeleted. In this example, original values are not tracked by the client.

public interface IEntityWithChanges
{
    bool IsNew { get; }
    bool IsModified { get; }
    bool IsDeleted { get; }
}

When the client makes changes to the Customer entity graph, the only requirement in this scenario is that when an entity changes, the appropriate property will return true to help determine the state of the entity.

The mid-tier will expose a service method that implements the logic to update the Customer entity graph. Rather than traverse the Customer, her Orders and OrderLines looking for changes, the service method will use a convention based mechanism that knows how to deal with the IEntityWithChanges interfaces. In this scenario, this mechanism takes the form of an extension method to the ObjectContext that will attach an entity graph to the context, and use a delegate to map each entity that implements IEntityWithChanges to an EntityState so that it can be used with the new ChangeObjectState APIs.

First we can define a method to map from an IEntityWithChanges to an EntityState:

/// <summary>
/// A mapping from an IEntityWithChanges to an EntityState
/// </summary>
public EntityState GetEntityState(object entity)
{
    IEntityWithChanges entityWithChanges =
       entity as IEntityWithChanges;

    if (entityWithChanges != null)
    {
        if (entityWithChanges.IsNew)
        {
            return EntityState.Added;
        }
        else if (entityWithChanges.IsModified)
        {
            return EntityState.Modified;
        }
        else if (entityWithChanges.IsDeleted)
        {
            return EntityState.Deleted;
        }
    }
    return EntityState.Unchanged;
}

Next we will define the extension method on an ObjectContext that can attach an entity graph and use a mapping delegate to determine the state of each entity in the graph.

/// <summary>
///
Attach a graph of entities using a supplied mapping function to
/// determine the entity's state.
/// In this example, the state of relationships between entities is
/// determined by the rules of ChangeObjectState:
/// - if one entity in the relationship is Added, the relationship is
///   Added
/// - if an entity is Deleted or Detached, all incident relationships
///   are Deleted or Detached
/// </summary>
///
<param name="entitySetName">name of EntitySet of root</param>
///
<param name="entityGraph">graph of entities to attach</param>
///
<param name="entityStateMap">EntityState resolver</param>
public static void Attach(
    this ObjectContext context,
    string entitySetName,
    object entityGraph,
    Func<object, EntityState> entityStateMap)
{
    context.AttachTo(entitySetName, entityGraph);

    // Create a list of unique entities in the graph
    var allEntities =
        context
        .ObjectStateManager
        .GetObjectStateEntries(EntityState.Unchanged)
        .Where(entry => !entry.IsRelationship)
        .Select(entry => entry.Entity);
       
    // Apply the entityStateMap to each entity
    foreach (object entity in allEntities)
    {
        context.ChangeObjectState(entity, entityStateMap(entity));
    }
}

Finally we can define our service method that can update a Customer entity graph using the mapping delegate and the Attach extension method.

/// <summary>
/// A service method to update a Customer's entity graph which
/// can include modifications to the Customer entity,
/// new Orders, and removal of Orders
/// </summary>
/// <param name="c"></param>
public void UpdateCustomerOrder(Customer customer)
{
    using (NorthwindEntities context = new NorthwindEntities())
    {
        // Call the Attach extenion method with the mapping delegate
        // that goes from IEntityWithChanges to EntityState
        context.Attach("Customers", customer, GetEntityState);
        context.SaveChanges();
    }
}


ChangeOrder Service Operation

In this scenario, the mid-tier exposes a method to perform a very specific operation: changing the Customer that owns an Order. The client can set the Order’s Customer to a new Customer or to one that already exists in the store. In either case, the service method uses the old Customer so that it can remove the relationship between the Order and that Customer.

/// <summary>
///
Updates an Order by changing the relationship between the Order
/// and a Customer in this case
/// </summary>
///
<param name="updatedOrder">order to update</param>
///
<param name="oldCustomer">previous customer</param>
public static void ChangeOrder(
    Order
updatedOrder,
    Customer
oldCustomer)
{
    using (NorthwindEntities context = new NorthwindEntities())
    {
        // Tell the context about the entities to persist
        context.AttachTo("Orders", updatedOrder);
  
        // Change the state of the new customer
        // In this example, this is done by the convention:
        // - Customer.CustomerID is null --> customer is 'Added'
        // - Customer.CustomerID is not null --> it already exists
        if(updatedOrder.Customer.CustomerID == "")
        {
            // When state of customer changes to 'Added', state
            // of the relationship between updatedOrder and
            // pdatedOrder.Customer is also changed to 'Added'
            context.ChangeObjectState(
                updatedOrder.Customer,
                EntityState.Added);
        }
        else
        {
            if (updatedOrder.Customer.CustomerID !=
                oldCustomer.CustomerID)
            {
                // only relationship needs to be marked as 'Added'
                context.ChangeRelationshipState(
                    updatedOrder,
                    updatedOrder.Customer,
                    o => o.Customer,
                    EntityState.Added);

                // Report removal of relationship between
                // pdatedOrder and oldCustomer

                context.ChangeRelationshipState(
                    updatedOrder,
                    oldCustomer,
                    "Customer",
                    EntityState.Deleted);

             }
        }
       
        // Persist the changes to the store
        context.SaveChanges();
    }
}

As is always the case, your feedback on this topic is welcome!

Jeff Derstadt
Jaroslaw Kowalski
Diego Vega
and The Object Services Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Foreign Keys in the Conceptual and Object Models
27 October 08 05:48 PM | efdesign | 36 Comments   

If you are reading this, you have probably heard by now about the so called impedance mismatch between the relational world and the object world – and there are a number of concepts in the relational database that don’t translate easily to corresponding concepts supported by the object oriented paradigm. One of these factors that is particularly interesting (and often controversial) is the concept of Foreign Keys and whether or not they belong in your conceptual/object model.

FKs are used to represent relationships in the database – but with objects, the natural way to represent relationships is through real references between objects. So as an object relational mapping platform, should a product support one or the other, or both?

I think most will agree that there is tremendous value in supporting relationships in the model as first class references between objects.

What are some of the issues with supporting both foreign keys and references in the same model? We could take a real world example i.e. LINQ to SQL and its foreign key support to look at some of the benefits as well as the downsides to having foreign keys.

Foreign Key Support in LINQ to SQL

LINQ to SQL takes the simplistic approach of making foreign keys available to you as a scalar property in your entities. In essence, LINQ to SQL allows you to write code dealing with relationships like so:

Product chai = db.Products.Where(p => p.ProductName == "Chai").Single();
chai.CategoryID = 1;
db.SubmitChanges();

This is possible in LINQ to SQL mainly because foreign keys are somewhat central to the way relationships are implemented and supported.

This particular example achieves something quite powerful, however - here you have essentially changed the category that the product Chai belonged to without ever having to query and materialize the new category that you associated Chai with.

However, LINQ to SQL also allows you to do this:

Product chai = db.Products.Where(p => p.ProductName == "Chai").Single();
Category beverages = db.Categories.Where(c => c.CategoryName == "Beverages").Single();
chai.Category = beverages;

We have been considering for a while about whether or not we want to include support to allow you to expose foreign keys in the model. Let’s see what you gain / lose by having foreign keys in the model.

One of the fundamental issues is that the simple foreign key concept that works so well in the relational model isn’t sufficient to represent relationships in the conceptual model. In the Entity Framework, a relationship is a first class concept that must be mapped to any set of columns in the database – and the important thing here is that these columns don’t have to represent relationships on the database by the means of FKs and constraints. Another point to note is that in the Entity Framework, relationships are always comprised of two ends, and are bi-directional unlike foreign keys.

What does all of this do for the model?  This opens up possibilities, such as Referential Integrity constraints that are represented in the EDM even though they may not necessarily be present on the relational model in the store. Bi-directional nature of the relationships makes it possible for you to navigate in both directions along a relationship in a way that is generally not possible with a simple FK.

Foreign Keys are not without value, however. An interesting challenge for us is to figure out how to bring the best aspects of Foreign Keys into the Entity Framework without compromising the richness and flexibility that relationships bring to the table.  Let’s look at the upside and the downsides of plain relational foreign keys:

Benefits of Foreign Keys

  1. Keeps it simple (for the simple cases)  and allows you to deal with relationship like you deal with them in the database
  2. Technically, you can update relationships without having both ends loaded/materialized. This is however in reality not always interesting since you will likely load both ends but this feature is definitely useful.

Disadvantages of Foreign Keys in the Model

  1. It is a part of the impedance mismatch problem.
  2. It doesn’t allow the concepts that you would expect from relationships in objects (easily getting from one end to the other) for instance.
  3. Having foreign keys as well as object references for relationship navigation presents the problem of two different artifacts representing relationships – this introduces complexity and now you have to make sure that you keep these two in sync.

So where do you stand? Do you like foreign keys or do you think they are evil? Would you like to see Foreign Keys exposed in the model, and if they are available in your object model, would it be sufficient if they were read-only?

Let us know!

Faisal Mohamood 
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

EDM and Store functions exposed in LINQ
08 October 08 09:37 AM | efdesign | 13 Comments   

In this post Colin Meek and Diego Vega delve into some enhancements we are planning for LINQ to Entities, anyway over to them...

Entity Framework v1 customers preferring to write their queries using LINQ often hit a limitation on the range of functions and query patterns supported in LINQ to Entities. For some of those customers, having to resort to Entity SQL, or even to Entity SQL builder methods, feels awkward and reduces the appeal of Entity Framework.

There are two things we want to do in order to address this in future versions:

  • Expand the range of patterns and standard BCL methods we recognize in LINQ expressions.

  • Provide an extensibility mechanism that people can use to map arbitrary CLR methods to appropriate server and EDM functions.

This blog post expands on the second approach:

It is actually possible for us to improve our LINQ implementation so that all functions defined in the EDM and in the store, and even user defined functions, can be mapped to CLR methods with homologous signatures.

Design

Problem space

There are multiple dimensions to the problem space we want to address:

  • Functions can be defined in either the conceptual or the storage space

  • Functions can be defined in either the manifest, or just declared in the model

  • Functions can be mapped to either static CLR methods or to instance methods on the ObjectContext

  • This feature specifically targets composable functions

How it looks like: EdmFunctionAttribute

The basis of the extensibility mechanism is a new method-level attribute that carry function mapping information. Here is the basic signature of the attribute’s constructor:

public EdmFunctionAttribute(string namespaceName, string functionName)

The namespaceName parameter indicates the namespace for the function in metadata (i.e. “EDM” or “SQLSERVER”, or other store provider namespace). The functionName parameter takes the name of the function itself.

The following example could be product code or customer code applying the attribute on an extension method (it could be a regular static function) in order to map it to the standard deviation SQL Server function:

public static class SqlFunctions
{
    [EdmFunction("SqlServer", "stdev")]
    public static double? StandardDeviation(this IEnumerable<int?> source)
    {

        throw EntityUtil.NotSupported(
            System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall);
    }
}

Notice that while this method can’t be called directly it can be used in a query like this:

var query = 
    from p in context.Products
    where !p.Discontinued
    group p by p.Category into g
    select g.Select(each => each.ReorderLevel).StandardDeviation();

The following example shows how the canonical DiffYear function is mapped:

public static class EntityFunctions
{
    [EdmFunction("EDM", "DiffYears")]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}

Usage is:

var query =
    from p in context.Products
    where EntityFunctions.DiffYears(DateTime.Today, p.CreationDate) < 5
    select p;

The following example shows how a user defined function defined in SQL Server can be mapped:

public static class MyCustomFunctions
{
    [EdmFunction("SqlServer", "MyFunction")]
    public static Int32? MyFunction(string myArg)
    {
        throw new NotSupportedException("Direct calls not supported"); 
    }
}

Convention based function name

We can establish that by convention the name of the CLR function defines the value of the functionName parameter. That makes the functionName parameter in the EdmFunctionAttribute optional.

EdmFunctionNamespaceAttribute

To avoid having to always specify the namespaceName for each function, we define a new class-level attribute named EdmFunctionNamespaceAttribute that would define the namespace mapping globally for a given class:

public EdmFunctionNamespaceAttribute(string namespaceName)

Using EdmFunctionNamespaceAttribute and the convention based constructor:

[EdmFunctionNamespace("EDM")]
public static class EdmMethods
{
    [EdmFunction]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(
            System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}

How it works

When a method with the EdmFunction attribute is detected within a LINQ query expression, its treatment is identical to that of a function within an Entity-SQL query. Overload resolution is performed with respect to the EDM types (not CLR types) of the function arguments. Ambiguous overloads, missing functions or lack of overloads result in an exception. In addition, the return type of the method must be validated. If the CLR return type does not have an implicit cast to the appropriate EDM type, the translation will fail.

Instance methods on the ObjectContext will be supported as well. This allows the method to bootstrap itself and trigger direct evaluation, as in the following example (definition of the method and sample query):

public static class MyObjectContext : ObjectContext
{
    // Method definition
    [EdmFunction("edm", "floor")]
    public double? Floor(double? value)
    { 
        return this.QueryProvider.Execute<double?>(Expression.Call(
            Expression.Constant(this),
            (MethodInfo)MethodInfo.GetCurrentMethod(),
            Expression.Constant(value, typeof(double?))));
    }
}



// evaluated in the store!
context.Floor(0.1);

Without the ObjectContext, the function cannot reach the store! To support this style of bootstrapping, the context needs to expose the LINQ query provider. For this reason, we now expose a “QueryProvider” property on the ObjectContext. This provider includes the necessary surface to construct or execute a query given a LINQ expression.

public class ObjectContext
{
    public IQueryProvider QueryProvider { get; }
}

If such a method is encountered inline in another query, then we must validate that the instance argument (MethodCallExpression.Object) is the correct context, but the instance is otherwise ignored:

// positive
var q1 = from p in context.Products select context.Floor(p.Price);

// negative
var q2 = from p in context.Products select context2.Floor(p.Price);

A function proxy can sometimes bootstrap itself without an explicit context, e.g. when an input argument is itself an IQueryable:

public static class SqlFunctions
{
    [EdmFunction("SqlServer", "stdev")]
    public static double? StandardDeviation(this IQueryable<int?> source)
    {
        return source.Provider.Execute<double?>(Expression.Call(
            (MethodInfo)MethodInfo.GetCurrentMethod(),
            Expression.Constant(source)));
    }
}

Nullability considerations

Particularly for functions taking collections, we will need to provide overloads for nullable and non-nullable elements. We don’t want to require awkward constructions like:

var query = (from p in products select (int?)p.ReorderLevel).StandardDeviation();

Tool for Generating the Functions

We created a simple internal tool that generates the classes that represent all the EDM canonical function and the SQL Server store functions. The tool will take the function definitions from Metadata and generate the appropriate function stubs/implementations.

The tool will be outside the product and will be run on demand. We expect to make a version of this tool available for provider writers together with the provider samples.

Naming

The methods will be in the following classes:

Namespace Class name

System.Data.Objects

EntityFunctions

System.Data.Objects.SqlClient

SqlFunctions

Note: The equivalent class in LINQ to SQL is System.Data.Linq.SqlClient.SqlMethods.

The method names will correspond to the name of the EDM/SQL function they represent. The argument names will correspond to the argument names of the EDM/SQL functions as retrieved by the metadata.

The recommendation for provider writers will be to include a similar static class in a namespace of the following form:

System.Data.Objects.[Standard provider namespace].[Standard provider prefix]Functions

Overloads and Implementation

Non-aggregate Functions

For each non-aggregate function we create an overload with all inputs type as nullable of the CLR equivalent of their EDM primitive type, and the return type nullable of the CLR equivalent of their EDM primitive type.

The implementation of the functions (what gets executed if the function is invoked outside an expression tree) will be to throw a NotSupportedException.

Example:

public static class EntityFunctions
{
    [EdmFunction("EDM", "DiffYears")]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}

public
static class SqlFunctions
{
    [EdmFunction("SqlServer", "DiffYears")]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}
Aggregate Functions

For each aggregate function we will provide two overloads, one with IEnumerable<Nullable<T>> and another one with IEnumerable<T>, where T is the CLR equivalent of the EDM primitive type of the input. The implementations of these will check whether the input is IQueryable in which case it will implement the self-bootstrapping.

Example:

[EdmFunction("EDM", "VARP")]
public static double? VarP(IEnumerable<int> arg1)
{
    ObjectQuery<int> objectQuerySource = source as ObjectQuery<int>;
    if (objectQuerySource != null)
    {
        return ((IQueryable)objectQuerySource).Provider.Execute<double?>(Expression.Call(
            (MethodInfo)MethodInfo.GetCurrentMethod(),
        Expression.Constant(source)));
    }
    throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
}

[EdmFunction("EDM", "VARP")]
public static double? VarP(IEnumerable<int?> arg1)
{
    ObjectQuery<int?> objectQuerySource = source as ObjectQuery<int?>;
    if (objectQuerySource != null)
    {
        return ((IQueryable)objectQuerySource).Provider.Execute<double?>(Expression.Call(
            (MethodInfo)MethodInfo.GetCurrentMethod(),
        Expression.Constant(source)));
    }
    throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
}

The Entity Framework team would love to hear your comments.

Alex James 
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Model First
10 September 08 09:55 PM | efdesign | 42 Comments   

One of the most painful omissions from the Entity Framework V1 was Model First, which basically means creating a conceptual 'model first' and then deriving a storage model, database and mappings from that.

People ask for this scenario all the time in the forums.

Well Noam, a Program Manager on the Entity Framework Tools team, outlines what we are considering:

Generating Databases from Models

The next release of the Entity Framework will include the ability to generate database schemas from your model. The main entry point into this feature is via the Designer context menu, to which we will add a new option called “Create Database from Model”.

(Note: User interface elements in this walkthrough are not final user interfaces, as the design is still under review, so please provide feedback...)

image

Selecting this option will bring up the following warning:

clip_image001

We would like users to understand that this feature will regenerate all SSDL and all MSL from scratch.

If you have started from an empty model, you will be asked to specify the target database. This screen is identical to the one shown when reverse engineering a model from a database. The implication here is that you will need an available server and database, which the system will use to determine what “flavor” of DDL to generate.

clip_image002

Once you have selected the target database, you will be presented with a summary screen which will provide a preview of the DDL that will be generated, as well as a tree-view of the objects.

The tree view:

clip_image003

The DDL view:

clip_image004

The DDL in the above screenshot is there merely as a graphic used to show how what the window will look like and is not intended to be representative of the DDL that will be generated by the system when you are actually creating your database (more about this soon). The DDL will however be read-only: Since it is generated by a template, editing of the results should either be done in the template, or in a separate DDL file which will not get regenerated.

Two options will be available:

- Save the DDL (off by default). This option will add the DDL as a dependent file under your EDMX file.

- Deploy the DDL (on by default). This option will deploy the DDL to the specified target database.

The generated DDL will not migrate data or schema - by default your database will be recreated from scratch.

Out of the box, we will support the Table-per-Type mapping strategy, meaning that we will create a table for each of your types and subtypes. For example, for a model like this…

clip_image006

…the following schema will be generated:

clip_image008

Of interest here is that PK-to-PK constraint between the Customer and Persons table. This helps enforce the inheritance relationship and the creation of foreign key columns to represent the various associations. In addition, the engine will create clustered keys on primary keys, and indexes on foreign keys that represent associations.

Under the Hood

The model first process is implemented using a Windows Workflow Foundation workflow that looks like this:

clip_image001[5]

Here is what the stages do:

Stage

Purpose

Stage

 

Purpose

 

CSDLtoSSDL

 

Creates the mappings (MSL) and database store model (SSDL) in the EDMX file.

 

SSDLtoDBSchema

 

Converts the SSDL to the format used by the Microsoft.Data.Schema APIs. This format is used as a “universal” database description format and includes physical information not present in the Entity Frameworks store model, such as indexes.

 

GenerateDDL

 

Uses the Microsoft.Data.Schema APIs to convert the universal format to store-specific DDL. In this release, we will support a minimum of SQL 2005 and SQL 2008. A provider model is in place, however, and we hope to add support for additional databases.

 

SuspendToConfirm

 

This activity pauses the workflow to allow the wizard to display the DDL.

 

DeployToDatabase

 

Deploys the DDL to the target database.

 

OutputDDL

 

Writes the DDL to the file system.

 

These stages are expressed in a XAML file which will be placed underneath your EDMX, to allow for customization: You can add your own steps or replace ones we have created with steps that you write.

Templates

Several of the steps above make use of templates to provide an additional point of control for users, and this is where we expect most of the customization to happen. These templates use the T4 Engine that is included in Visual Studio. We are currently working on making these templates as simple as possible by providing a set of supporting APIs that provide metadata collections that are designed for artifact generation – for example, a collection of all inherited properties for a type, or a collection of both inherited and defined properties. There are three templates:

Template

Purpose

Template

 

Purpose

 

CSDL to SSDL

 

Creates the SSDL for the target database.

 

CSDL to MSL

 

Creates the MSL for mapping the CSDL to the generated SSDL.

 

SSDL to DBSchema

 

Creates the physical database description, including elements such as foreign keys and indexes.

 

So, for example, if you need all table names to start with “_tbl”, you would modify the SSDL generation template to give all tables the appropriate name, and also modify the MSL template to provide the appropriate mappings. The DBSchema template will not need to be modified as it will automatically pick up the changes made to the SSDL which is its input.

As another example, if you wish to change the mapping strategy from Table-per-Type to Table-per-Hierarchy, you would need to change this same pair of templates.

Finally, if you needed to support a database for which no Microsoft.Data.Schema provider is available, you could replace the GenerateDDL step with your own template-driven activity which could them transform the schema “manually” to your store’s DDL format.

We hope this gives you enough information to understand the design and intent of this feature.

We would love to hear your comments.

Alex James
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Structural Annotations - One Pager
12 August 08 08:50 PM | efdesign | 3 Comments   

In V1 of the Entity Framework it is possible to annotate a schema using attributes declared in another XSD.

However XML attributes are a very limited form of annotation. It would be better if we could annotate using full elements.

This is what we are calling Structural Annotations.

This feature will allow both customers and partners like Reporting Services to modify the model so that it includes information important to them which can’t be captured in vanilla EDM format.

1     EDM extensions

While it should be possible to annotate any level in the XML hierarchy, there are however some restrictions.

The general rule is you can annotation any CSDL / SSDL element that has a corresponding  MetadataItem which in practice means everything except <Using>, <Schema>, <Key> and <PropertyRef> elements.

These annotations should be named, so that they can be accessed using the same API as today.

i.e. something like this:

<EntityType Name="Content">
      <Key>
            <PropertyRef Name="ID" />
      </Key>
      <Property Name="ID" Type="Guid" Nullable="false" />
      <Property Name="HTML" Type="String" Nullable="false" MaxLength="Max" Unicode="true" FixedLength="false" />
      <CLR:Attributes>
            <CLR:Attribute TypeName="System.Runtime.Serialization.DataContract"/>
            <CLR:Attribute TypeName="MyNamespace.MyAttribute"/>
      </CLR:Attributes>
      <RS:Security>
            <RS:ACE Principal="S-0-123-1321" Rights="+R+W"/>
            <RS:
ACE Principal="S-0-123-2321" Rights="-R-W"/>
      </
RS:Security>
</EntityType>

1.1    Key points to notice:

  1. In the above example both the CLR and RS namespaces must have been declared somewhere. Probably at the root of the CSDL somewhere, i.e. something like this:
    <Schema xmlns:RS="http://schemas.microsoft.com/RS/2006" xmlns:CLR=http://schemas.microsoft.com/net/3.5
  2. Alternatively the namespace could be in lined in the annotation:
    <Security xmlns="http://schemas.microsoft.com/RS/2006">           
    </
    Security>
  3. In both cases the unique identity of the annotation is in the form {namespace}:{elementname}. So for the RS Security annotation example, irrespective of how the RS namespace is introduced the identity would be:
    http://schemas.microsoft.com/RS/2006:Security 
  4. Structural annotations should always follow all other sub-elements i.e. when structurally annotating an <EntityType> element the annotation element should follow all <Key> <Property> and <NavigationProperty> elements.
  5. Attribute based annotations (as supported in V1 and used for things like CodeGen) are scoped to the same annotation identity namespace. Hence care should be taken to verify that annotations using either approach don’t have colliding identities.
  6. It should be possible to have more than one named “Structural Annotation” per CSDL (or SSDL) element. Indeed element names can collide so long as the combination of Namespace + Element Name is unique for a particular element (or more specifically each MetadataItem). This means for instance you can’t have two <RS:Security> elements under any one MetadataItem.

1.2    Positive and Negative Cases:

This:

<EntityType Name="Content" CLR:Attribute="Blah">
      <CLR:Attributes>
            <CLR:Attribute TypeName="System.Runtime.Serialization.DataContract"/>
      </CLR:Attributes>

…would be invalid, because of the identity collision, likewise this would also be invalid:

<EntityType Name="Content" My:Attribute="Blah">
      <CLR:Attributes>
            <CLR:Attribute TypeName="System.Runtime.Serialization.DataContract"/>
      </CLR:Attributes>
      <CLR:Attributes>
            <CLR:Attribute TypeName="MyNamespace.MyAttribute"/>
      </CLR:Attributes>

where-as this is fine:

<EntityType Name="Content" My:Attribute="Blah">
      <CLR:Attributes>
            <CLR:Attribute TypeName="System.Runtime.Serialization.DataContract"/>
      </CLR:Attributes>
      <RS:Security>
            <RS:ACE Principal="S-0-123-1321" Rights="+R+W"/>
            <RS:
ACE Principal="S-0-123-2321" Rights="-R-W"/>
      </
RS:Security>

… since all the above annotations have unique identities.

1.3    What can an <Annotation> contain?

A structural annotation is simply an XML element. As such it can be considered a root of an XML document that can contain any valid XML structures. These structures are simply ignored by the Entity Framework and Metadata APIs.

1.4    What elements can be annotated?

If an element in the CSDL or SSDL has a corresponding MetadataItem in the Metadata API that element should support structural annotations. Since <Using>, <Schema>, <Key> and <PropertyRef> elements have no corresponding MetadataItem(s) they don’t support structural annotations.

Notice while TypeUsage(s) have no CSDL representation today,they are MetadataItem(s) so they could in theory be annotated in the future. For example if a public mutable metadata API is produced.

2     Metadata API Changes:

In this example we are using the Entity Frameworks Metadata API to get the EdmType for an EntityType called Content. From that we then get the MetadataProperty called http://schemas.microsoft.com/RS/2006:Security:

EdmType contentType = metadataWorkspace.GetEdmType(“MyNamespace.Content”);
if (contentType.MetadataProperties.Contains(
"http://schemas.microsoft.com/RS/2006:Security")
{
    MetadataProperty annotationProperty = contentType.MetadataProperties[ "http://schemas.microsoft.com/RS/2006:Security"]; 
    object annotationValue = annotationProperty.Value;
    //Do something...
}

Notice that the way you access structural annotations is the same as the way you access attribute annotations, i.e. by name/identity, in the MetadataProperties collection.

This of course means that their names can’t collide, name collisions should be picked up when metadata is loaded from persistent formats (like CSDL) or when annotations are created by hand in the any future mutable metadata API.

2.1    MetadataProperty TypeUsage

When creating a MetadataProperty to hold the XElement for the structural annotation, we need to assign a TypeUsage too. For V1 style attribute annotations, the TypeUsage we chose was based on Edm.String. Ideally we should use Edm.XML etc, but since this is not currently available, we will use Edm.String in the meantime. We have filled a bug to change this if/when [j1] XML becomes an EDM type.

3    What is the MetadataProperty.Value?

The Value of the MetadataProperty will be a XElement for structural annotations originating from CSDL.

So if you know that the annotation is a structural annotation you can access it like this:

MetadataProperty annotationProperty =
                     contentType.MetadataProperties["CLR:Attributes"];

XElement myAnnotation = annotationProperty.Value as XElement;

3.1    XElement is rooted at the <Annotation> element

The XElement returned will be the XElement containing everything including the structural annotation itself.

3.2    XElement is mutable

There is nothing to stop the consumer of the Metadata API from modifying the XElement since it is simply a reference, to an instance of a type the Entity Framework doesn’t control.

We could address this by cloning on access to make things a little safer. But this negatively affects the mainline scenario, namely where a customer uses annotations purely for read, and is therefore not required.

The important thing to understand is that should a user modify the XElement all bets are off. We should provide guidance both via API hints:

/// <summary>
/// ....
/// <remarks>
/// Should this return a reference type like an XElement, you should consider
/// this immutable, no guarantees are made about behavior if this reference
/// type is modified in anyway.
/// </remarks>
/// </summary>
public object Value { get {

And via appropriate documentation.

3.3    No universal annotation format

Rather than try to invent a new universal data format for annotations, we decided to be pragmatic.

So we are limiting ourselves to XML for structural annotations loaded from CSDL (an XML format). 

However since the MetadataProperty.Value is of type object, in a world where we have an alternative EDM representation (like in memory using mutable metadata for instance) we could easily assign something other than an XElement to the MetadataProperty.Value, like for example a xaml Serializable CLR type instance.

4     Out of Scope

The current plan is to simply support annotation in the CSDL & SSDL. Meaning annotating MSL is currently out of scope.

If we uncover meaningful important scenarios for MSL annotations, we will undertake that work separately. 

Incidentally supporting structural annotations in MSL touches a completely separate code path, and as a result will require roughly the same amount of work. Additionally in order to make undertaking this work meaningful we would first need to expose a public mapping metadata API.

....

As always we are keen to hear any feedback you have on this item.

Do you think you will find a use for Structured Annotations?

Alex James
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Discussion about API changes necessary for POCO:
01 August 08 05:24 PM | efdesign | 20 Comments   

Evolving an API to support new requirements, like POCO, while maintaining backward compatibility is challenging.

The following design discussion from members of the Object Services team illustrates some of the issues and hard calls involved.

Have a read, and tell us what you think.

In particular are we missing something, or overstating the importance of something? Let us know...

Anyway over to Diego and Mirek...

POCO API Discussion: Snapshot State Management

What is in a snapshot?

In Entity Framework v1 there is a single way for the state manager to learn about changes in entity instances: the change tracking mechanism is set in such a way that the entity instances themselves notify the state manager of any property change.

This works seamlessly if you use default code-generated classes, and it is also part of the IPOCO contract for developers willing to create their own entity class hierarchies.

For version 2, we are currently working on a snapshot-based change tracking mechanism that removes the requirement from classes to send notifications, and enables us to provide full POCO support.

The basic idea with snapshots is that when you start tracking an entity, a copy of its scalar values is made so that at a later point – typically but not always at the moment of saving changes into the database - you can detect whether anything has changed and needs to be persisted.

The challenge

When we created the v1 API, we made a few assumptions that were totally safe with notification-based change tracking but don’t completely prevail in a snapshot change tracking world.

We now need to choose the right set of adjustments for the API for it to gracefully adapt to the new scenarios we want to support.

Mainline scenario: SaveChanges() Method

In notification-based change tracking, by the time SaveChanges is invoked, the required entity state information is ready to use.

With snapshot thought, a property by property comparison needs to be computed for each tracked entity just to know whether it is unchanged or modified.

Once the snapshot comparison has taken place, SaveChanges can proceed normally.

In fact, assuming the process is triggered implicitly on every call to SaveChanges, a typical unit of work in which the program queries and attaches entities, then modifies and adds new entities, and finally persists the changes to the database, works unmodified with POCO classes:

Category category = context.Categories.First();

category.Name = "some new name"; // modify existing entity

Category newCategory = new Category(); // create a new entity

newCategory.ID = 2;

newCategory.Name = "new category";

context.AddObject("Categories", newCategory); // add a new entity

context.SaveChanges(); // detects all changes before saving

Things get more complicated when a user wants to use lower level APIs that deal with state management.

State Manager Public API

ObjectStateManager and ObjectStateEntry classes comprise the APIs that you need to deal with if you want to either get input from, or customize the behavior of Entity Framework’s state management in your own code.

Typically, you use these APIs if you want to:

  • Query for objects that are already loaded into memory
  • Manipulate the state of tracked entities
  • Validate state transitions or data just before persisting to the database
  • Etc.

As the name implies, ObjectStateManager is Entity Framework’s state manager object, which maintains state and original values for each tracked entity and performs identity management.

ObjectStateEntries represent entities and relationships tracked in the state manager. ObjectStateEntry functions as a wrapper around each tracked entity or relationship and allows you to get its current state (Unchanged, Modified, Added, Deleted and Detached) as well as the current and original values of its properties.

Needless to say, the primary client for these APIs is the Entity Framework itself.

ObjectStateEntry.State Property

The fundamental issue with snapshot is exemplified by this property.

With notification-based change tracking, the value for the new state is computed on the fly on each notification, and saved to an internal field. Getting the state later only encompasses reading the state from the internal field.

With snapshot, the state manager no longer gets notifications, and so the actual state at any time depends on the state when the entity began being tracked, whatever state transitions happened, and the current contents of the object.

Example: Use ObjectStateEntry to check the current state of the object.

Category category = context.Categories.First();

category.Name = "some new name"; // modify existing entity

ObjectStateEntry entry = context.ObjectStateManager.GetObjectStateEntry(category);

Console.WriteLine("State of the object: " + entry.State);

Question #1: What are the interesting scenarios for using the state management API in POCO scenarios?

Proposed solutions

In above example there are two possible behaviors and it's not obvious for us which one is better:

Alternative 1: Public ObjectStateManager.DetectChanges() Method

In the first alternative, computation of the snapshot comparison for the whole ObjectStateManager is deferred until a new ObjectStateManager method called DetectChanges is invoked. DetectChanges would iterate through all entities tracked by a state manager and would detects changes in entity's scalar values, references and collections using a snapshot comparison in order to compute the actual state for each ObjectStateEntry.

In the example below, the first time ObjectStateEntry.State is accessed, it returns EntityState.Unchanged. In order to get a current state of the "category" entity, we would need to invoke DetectChanges first:

Category category = context.Categories.First();

category.Name = "some new name"; // modify existing entity

ObjectStateEntry entry = context.ObjectStateManager.GetObjectStateEntry(category);

Console.WriteLine("State of the object: " + entry.State); // Displays "Unchanged"

context.ObjectStateManager.DetectChanges();

Console.WriteLine("State of the object: " + entry.State); // Displays "Modified"

DetectChanges would be implicitly invoked from within ObjectContext.SaveChanges.

Pros:

· User knows when detection of changes is performed

· DetectChanges is a method that user would expect to throw exceptions if some constraint is violated

· This alternative requires minimal changes in Entity Framework current API implementation

Cons:

· Since DetectChanges iterates through all the ObjectStateEntries, it is a potentially expensive method

· User has to remember to explicitly call DetectChanges() before using several methods/properties, otherwise, their result will be inaccurate:

o ObjectStateEntry.State

o ObjectStateEntry.GetModifiedProperties()

o ObjectStateManager.GetObjectStateEntries()

· This alternative implies adding a new method to ObjectStateManager

Alternative 2: Private ObjectStateEntry.DetectChanges() Method

In the second alternative, there is no public DetectChanges method. Instead, the computation of the current state of an individual entity or relationship is deferred until state of the entry is accessed.

Pros:

· User doesn't have to remember explicitly calling DetectChanges to get other APIs to work correctly

· Existing API works as expected in positive cases regardless of notification-based or snapshot tracking

· No new public API is added

Cons:

· The following methods now require additional processing to return accurate results in snapshot:

o ObjectStateEntry.State

o ObjectStateEntry.GetModifiedProperties()

o ObjectStateManager.GetObjectStateEntries()

· Existing API works differently in negative cases in notification-based and snapshot tracking:

o some of the existing methods would start throwing exceptions

or

o would require to introduce a new state of an entry – Invalid state (see below for details)

Question #2: What API pattern is better? Having an explicit method to compute the current state based on the snapshot comparisons or having the state to be computed automatically when accessing the state?

How this affects other APIs

While the State property exemplifies the issue, there are other APIs that would have different behaviors with the two proposed solutions.

ObjectStateEntry.GetModifiedProperties() Method

GetModifiedProperties returns the names of the properties that have been modified in an entity. Similar to the State property, with notification-based change tracking, an internal structure is modified on the fly on each notification. Producing the list later on only encompasses iterating through that structure.

With snapshot, the state manager no longer gets notifications, and at any given point in time, the actual list of modified properties really depends on a comparison between the original value and the current value of each property.

Therefore, for alternative #1, this APIs will potentially return wrong results unless it is invoke immediately after DetectChanges. For alternative #2, the behavior would be always correct.

ObjectStateManager.GetObjectStateEntries()

This is the case in which an implementation detail that was a good idea for notification-based change tracking stops offering performance benefits in snapshot. Internally, ObjectStateManager stores ObjectStateEntires in separate dictionaries depending on their state. But in snapshot, any unchanged or modified entity can become deleted because of a referential integrity constraint, and any unmodified entity can become modified.

In alternative #1, DetectChanges would iterate thought the whole contents of the ObjectSateManager, and thus it would update the internal dictionaries. Once this it done, it becomes safe to do in-memory queries using GetObjectStateEntires the same way it is done today.

In alternative #2, GetObjectStateEntires would need to look in unchanged, modified and deleted storage when asked for deleted entries; also in unchanged and modified storage when asked for modified entities.

Querying with MergeOption.PreserveChanges

In Entity Framework, MergeOptions is a setting that changes the behavior of a query with regards to its effects on the state manager. Of all the possibilities, PreserveChanges requires an entity to be in the Unchanged state in order to overwrite it with values coming from the database. In order for PreserveChanges to work correctly, accurate information on the actual state of entities is needed.

Therefore, with alternative #1, querying with PreserveChanges will not have a correct behavior unless it is done immediately after invoking DetectChanges. For alternative #2, querying with PreserverChanges would behavior correctly at any time.

Referential Integrity constraints

When there is a referential integrity constraint defined in the model, a dependent entity may become deleted if the principal entity or the relationship with the principal entity becomes deleted.

For alternative #1, DetectChanges would also trigger Deletes to propagate downwards throw all referential integrity constraints.

For alternative #2, finding out the current state of an entity that is dependent in a RIC would requires traversing the graph upwards to find if either the principal entity or a relationship has been deleted.

Mixed Mode Change Tracking

At the same time we are working on snapshot change tracking, we are also working on another feature, denominated dynamic proxies. This feature consists of Entity Framework automatically creating derived classes of POCOs that override virtual properties. Overriding properties enables us to inject automatic lazy loading on property getters, and notifications for notification-based change tracking in property setters. This introduces a subtle scenario: when using proxies, it is possible that:

a. Not all properties in the class are declared virtual. The remaining properties need to still be processed using snapshot.

b. Sometimes, non-proxy instances of POCOs and proxy instances of the same type have to coexist in the same ObjectStateManager. For the former, it will have to use snapshot tracking. For thelatter, a combination of snapshot and notifications.

All in all, it becomes clear that the actual state of an entity does not entirely depend on the value of the internal state field, nor on the snapshot comparison, but on a combination of both.

ObjectStateEntry.SetModified()

As with mixed mode change tracking, SetModified() requires a combination of the internal state and the snapshot comparison to return valid results.

Handling of invalid states

When working in notification-based change tracking, Entity Framework throws exceptions as soon it learns that entity key values has been modified. With the default code-generated classes, the exception prevents the change for being accepted.

For alternative #1, DetectChanges can throw exceptions if some model constraint (i.e. key immutability) is violated. It is too late to prevent the value from changing.

Example:

Category category = context.Categories.First();

category.ID = 123;  // Modify a key property. This would throw if category wasn't a POCO class.

context.ObjectStateManager.DetectChanges(); // Throws because key property change was detected.

For alternative #2, reading the state of modified properties from an entity with modified keys could throw an exception:

Example:

Category category = context.Categories.First();

category.ID = 123;  // Modify a key property. This would throw if category wasn't a POCO class.

ObjectStateEntry entry = context.ObjectStateManager.GetObjectStateEntry(category);

Console.WriteLine("State of the object: " + entry.State);

// Throws because key property change was detected.

Getting an exception thrown here would be unexpected. But there is an alternative design that is to define a new EntityState that indicates that an entity is Invalid. This new state would account for the fact that POCO classes per se do not enforce immutable keys.

Since EntityState is a flag enum, Invalid could be potentially combined with other states.

SaveChanges would still need to throw an exception if any entity in the state manager is invalid.

It would be possible to query the state manager for entities in the Invalid state using GetObjectStateEntries method.

Question #3: Is it better to have an Invalid state for entries or should the state manager just throw exceptions immediately every time it finds a change on a key?

Our questions:

1. What are the interesting scenarios for using the state management API in POCO scenarios?

2. What API pattern is better? Having an explicit method to compute the current state based on the snapshot comparisons or having the state to be computed automatically when accessing the state?

3. Is it better to have an Invalid state for entries or should the state manager just throw exceptions immediately every time it finds a change on a key?

---

We really want to hear your thoughts on the above questions.

Alex James
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Using Stored Procedures to load structured data.
18 July 08 06:59 PM | efdesign | 4 Comments   

V1 of the Entity Framework allows you to use stored procedures in two main ways:

  1. Mapping Create, Update and Delete entity operations to appropriate stored procedures.
  2. Doing a FunctionImport that allows you to return an enumeration of Entities*

Now the thing is, in order to return an enumeration of Entities, you have to map the Entity too.

Why?

Well in V1 FunctionImport attaches the returned entities, so you can make and save changes.

But of course sometimes you don't want that functionality, you just want a structured way of moving information around, so it is a shame to be required to do the mapping too.

This leaves us two possibilities for V2.

  1. Allowing FunctionImports to return unattached/untracked Entities.
  2. Allowing FunctionImports to return ComplexTypes.

The ComplexTypes option is the topic of the following one-pager by Asad a Program Manager on the Entity Framework team:

Scenario:

Customer Goal:

The customer wants to do something like this in their .NET code:

public CustomerInfo GetCustomerInfo(int Id)
{
    using (NorthwindEntities ctx = new NorthwindEntities())
    {
        CustomerInfo info = ctx.GetCustomerInfoById(Id).FirstOrDefault();
        return info;
    }
}

Without needing to create mappings etc, as this lowers the level of friction inherent in using stored procedures for queries with the Entity Framework. 

NB: today we only support Collection(Type) as the return type of the FunctionImport, which means it results in an Enumeration, hence the call to FirstOrDefault(). A separate work item is required to support returning just one ComplexType (or EntityType) rather than collections.

Detailed Scenario walk through:

User defines a Stored Procedure in store:

Create Procedure GetCustomerInfoById(
        @Id int
    )
As
    SELECT first_name As Firstname, last_name As Lastname, city As City 
    FROM CustomerTable
    WHERE id = @Id

Complex type definition in EDM:

User wants to map the result set from this stored procedure to a Complex type “CustomerData” defined in Entity Data Model as:

  <ComplexType Name="CustomerData">
    <Property Name="Firstname" Type="String" MaxLength="50" />
    <Property Name="Lastname" Type="String" MaxLength="50" />
    <Property Name="City" Type="String" MaxLength="50" />
  </ComplexType>

The mapping involves following three steps:

Define import function definition in SSDL:

<Function Name="GetCustomerInfoById" Aggregate="false" BuiltIn="false" NiladicFunction="false" IsComposable="false" ParameterTypeSemantics="AllowImplicitConversion" Schema="dbo">
  <Parameter Name="Id" Type="int" Mode="In" />
</Function>

Expose function definition in CSDL schema file

<EntityContainer Name="CustomerEntityContainer"
  <FunctionImport Name="GetCustomerInfoById ReturnType="Collection(Self.CustomerData)">
    
<Parameter Name="Id" Mode="In" Type="Int32" /> 
  
</FunctionImport>
</EntityContainer>

Define mapping (convention based) MSL file:

<EntityContainerMapping StorageEntityContainer="StoreContainer" CdmEntityContainer="CustomerEntityContainer"
  <FunctionImportMapping FunctionImportName="GetCustomerInfoById" FunctionName="StoreNamespace.GetCustomerInfoById" />
</EntityContainerMapping>

Design Details:

-       Mapping between Complex type and result type is by convention.

-       Explicit property to column name mapping is not supported.

-       The result from the stored procedure has to exactly match the shape and the property names of the complex types.

-       In EDM the result is captured in a Complex type therefore no EntitySet definition or mapping is required.

Assumptions/suppositions:

-       A future work item (covered by a separate feature entry) would enable richer mapping capability by allowing column renames. However for now Convention based mapping is the only supported feature.

Looking forward to hearing your feedback.

Alex James
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

*In V1 it is also possible to do a FunctionImport that returns a scalar value, but this FunctionImport is not available in Object Services, if you want to use it you have to drop down to eSQL in the EntityServices layer. 

Transparent Caching Support in the Entity Framework
09 July 08 09:43 PM | efdesign | 14 Comments   

The Entity Framework's provider model makes it possible for it to work over different database's.

The idea being that someone, either the database vendor or a third party, can write a provider that allows the Entity Framework to work with that database.

The Entity Framework asks the provider to convert queries and update operations from a canonical, database agnostic form, into native commands. I.e. T-SQL for SQLServer.

This Provider model however has an interesting side-effect. It makes it possible to write wrapping providers, providers that wrap another provider, layering in additional services.

Examples of possible wrapping providers include: logging providers, auditing providers, security providers, optimizing providers and caching providers. The latter is the subject of the following one-pager put together by Jarek

----

Introduction

Business applications use various kinds of Reference Data that does not change at all during the lifetime of an application or changes very infrequently. Examples may include: countries, cities, regions, product categories, etc.

Applications that present data (such as ASP.NET applications) tend to run the same or similar queries for reference data very often resulting in a significant database load. Developers typically use ASP.NET cache and/or custom caching techniques to minimize number of queries but implementing caching manually adds additional complexity and maintenance cost to an existing solution.

Entity Framework can be extended to handle data caching in a transparent way, so that any application using it can take advantage of caching with little or no modification. In Entity Framework V1 it is possible to implement transparent caching using a custom provider as demonstrated in EFCachingProvider sample (TBD). We are considering adding caching as a first-class concept in Entity Framework V2 so that it will no longer be necessary to use a wrapper provider approach.

Requirements

We are designing a query caching layer for Entity Framework that:

·         Will be transparent (existing code will automatically take advantage of caching without modification other than defining caching policy).

·         Will cache query results, not entity objects so arbitrary ESQL queries can also benefit from it

·         Will be optimized for read-only or mostly-read-only data. Caching of frequently changing data will also be supported, but we are not optimizing for that scenario.

·         Will handle cache eviction automatically on updates.

·         Will be extensible (it should be easy to use with ASP.NET cache or 3rd party caching solutions including local and distributed caches)

Implementation

To implement query caching we need to be able to intercept query and update execution.

All queries in Entity Framework (regardless of their origin: Entity SQL queries, Object Query<T>, LINQ queries or internal queries generated by object layer) are processed in the Query Pipeline which at some point passes Canonical Query Tree (CQT) to the provider to get the result set of a query. We will cache query results in such a way that when the same query is used over and over again (as determined by the CQT and parameter values), the results will be assembled from the cache instead of a database.

Updates are also centralized in Entity Framework (Update Pipeline) and handled in a similar way. At some point update commands (Update CQTs) are sent to the provider. We can add cache maintenance routines at this point that ensure that proper cache invalidation happens each time an update occurs.

Cache entries and dependencies

Query results stored in the cache will be represented by opaque data structures, which are immutable and serializable, so that they can be easily passed over the wire.  An example of such structure may be:

[Serializable]
public class DbQueryResults
{
    public List<object[]> Rows = new List<object[]>();
}

When caching query results care must be taken to make sure that returned data is not stale. To be able to detect that, we associate a list of dependencies with each query result. Whenever any of the dependencies change, the query results should be evicted from cache. In the proposed approach, dependencies will be simply store-level entity set names (tables or views) that are used in the query.

For example:

·         SELECT c.Id FROM Customers AS c is dependent on “Customers” entity sets

·         SELECT c.Id, c.Orders FROM Customers as c is dependent on Customers and Orders entity sets

·         1+2 is not dependent on any entity sets

When adding items to the cache, we will be passing a list of dependent entity sets to the cache provider.  After EF makes changes to the database, it will notify the cache about list of entity sets that have changed. All query results relying on any of those entity sets have to be removed from the cache. Dependency names will be represented as strings and collections of dependent entity sets will be IEnumerable<string>.

In the first implementation we will likely use query text or some derivative of it (such as cryptographic hash) as a cache key, but cache implementations should not rely upon cache keys being meaningful.

Interface

To be able to work with EF, cache must implement the following interface:

public interface ICache
{
    bool TryGetEntryByKey(string key, out object value);
    void Add(string key, object value, 
             IEnumerable<string> dependentEntitySets);
    void Invalidate(IEnumerable<string> entitySets);
}

 

As you can see, the values are passed as objects instead of DbQueryResults.

·         TryGetEntryByKey(key,out value)  tries to get cache entry for a given key. If the entry is found it is returned in queryResults and the function returns true. If the entry is not found, the entry returns false and value of queryResults is not determined.

·         Add (key, value, dependentEntitySets) adds the specified query results to the cache with and sets up dependencies on given entity sets.

·         Invalidate(sets) – will be invoked after changes have been committed to the database. “sets” is a list of sets that have been modified. The cache should evict ALL queries whose dependencies include ANY of the modified sets.

Cache providers will typically define some specific retention policies, limits and automatic eviction policies (using LRU, LFU or some other criteria).  ICache interface does not define that. Based on the user feedback we may want to extend ICache to specify parameters such as retention timeout for each item or add a new interface for configuring cache behavior in an abstract way.

Using the interface defined above, the pseudo-C#-code implementation of query execution operation may look like this:

DbQueryResults GetResultsFromCache(DbCommandDefinition query)
{
    if (CanBeCached(query))
    {
        // calculate cache key
        string cacheKey = GetCacheKey(query);
        DbQueryResults results;
        // try to look up the results in the cache
        if (!Cache.TryGetEntryByKey(cacheKey, out results)) 
        {
            // results not found in the cache - perform a database query
            results = GetResultsFromDatabase(query);
            // add results to the cache
            Cache.Add(cacheKey, results, GetDependentEntitySets(query));
        }
        return results;
    }
    else
    {
        return GetResultsFromDatabase(query);
    }
}

 

----

As always we are keen to hear your comments.

Alex James
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

More Posts Next page »
Page view tracker