The information in this post is out of date.

Visit msdn.com/data/ef for the latest information on current and past releases of EF.


The Entity Framework (EF) in .NET 4.0 introduces the ability to interact with entities with persistence ignorance (POCOs). While this favors separation of concerns, it also means that the entities will do less because they cannot talk directly to the data access layer. Another new concept in EF 4.0 is the idea of dynamically generated proxies for such entities. Opting into proxies is an easy way to get additional capabilities like lazy loading and change tracking, while still keeping the entity class definition and implementation simple.

This is the first in a two-part series that describes how proxies work in the Entity Framework. This first post describes the kinds of proxies available in EF 4.0, how they work, and the API for dealing with them. Although proxy behaviors such as lazy loading are quite easy to understand, the internal mechanics that make them work can be complicated. You do not need to understand these details to use proxies, but we wanted to share our approach and get feedback. The second post will describe some of the best practices for using proxies in serialization scenarios.

Proxies in a Nutshell

Proxies are a form of dependency injection where additional functionality is injected into the entity. In the case of POCO entities, they have no dependency on the EF so cannot create or execute queries or tell the state manager about changes. In the EF, a proxy is a dynamically generated derived type of an entity that overrides various properties so that additional code can be run. In the case of lazy loading, this is the code to perform a navigation property load. In the simplified example below, the EF creates the derived type “EFCustomerProxy” and overrides the Orders property. In the getter for Orders, an extra method call is performed to DoLazyLoad() to perform the load of the Orders collection if it is not already loaded.

// POCO Type
public class Customer
{
    public virtual ICollection<Order> Orders { get; set; }
}

// Dynamically Generated Type
public class EFCustomerProxy : Customer
{
    public override ICollection<Order> Orders
    {
        get
        {
            DoLazyLoad();
            return base.Orders;
        }
        set
        {
            base.Orders = value;
        }
    }
}

The EF generates proxy types on the fly using the AssemblyBuilder and TypeBuilder in the .NET framework. All the code to perform lazy loading and change tracking is added to the dynamically generated types using the Reflection.Emit APIs. Proxy types are created once per .NET type for each instance of metadata (more about this later), and these types are cached so they can be reused in the same AppDomain.
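
To make the mechanics a little more concrete, here is a minimal sketch of generating a derived type with Reflection.Emit. It is not the EF's actual code: the names ProxySketch, NoteSetterCall, and Customer_SketchProxy are hypothetical, and the emitted override only counts setter calls before delegating to the base setter, whereas the real proxies emit lazy loading and change tracking logic. The sketch assumes a runtime where the static AssemblyBuilder.DefineDynamicAssembly method is available.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// POCO type: public, not sealed, with a virtual property
public class Customer
{
    public virtual string Name { get; set; }
}

public static class ProxySketch
{
    // Incremented by the generated setter override (hypothetical hook)
    public static int SetterCalls;

    public static void NoteSetterCall() { SetterCalls++; }

    public static Type BuildProxyType()
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("SketchProxies"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("SketchProxies");

        // Derive from Customer; a default constructor that calls the base
        // parameterless constructor is supplied automatically.
        TypeBuilder type = module.DefineType(
            "Customer_SketchProxy",
            TypeAttributes.Public | TypeAttributes.Class,
            typeof(Customer));

        MethodInfo baseSetter = typeof(Customer).GetProperty("Name").GetSetMethod();
        MethodBuilder setter = type.DefineMethod(
            baseSetter.Name,
            MethodAttributes.Public | MethodAttributes.Virtual |
            MethodAttributes.HideBySig | MethodAttributes.SpecialName,
            typeof(void),
            new[] { typeof(string) });

        ILGenerator il = setter.GetILGenerator();
        il.Emit(OpCodes.Call, typeof(ProxySketch).GetMethod("NoteSetterCall")); // injected code
        il.Emit(OpCodes.Ldarg_0);           // this
        il.Emit(OpCodes.Ldarg_1);           // value
        il.Emit(OpCodes.Call, baseSetter);  // non-virtual call to base.set_Name
        il.Emit(OpCodes.Ret);

        type.DefineMethodOverride(setter, baseSetter);
        return type.CreateType();
    }
}

// Usage:
//   var proxy = (Customer)Activator.CreateInstance(ProxySketch.BuildProxyType());
//   proxy.Name = "Contoso";   // runs NoteSetterCall, then the base setter
```

Note that the generated type lives in a separate dynamic assembly, which is why the requirements listed next insist on a public, unsealed class.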

For the EF to be able to create a dynamically derived type, a couple of things need to be done:

  1. Proxy creation must be turned on. It is on by default, but there is a setting to turn it on or off:
    context.ContextOptions.ProxyCreationEnabled = true;
  2. It must be possible to create a derived .NET type in a separate assembly:

    a. The class must be public and not sealed.

    b. The class must have a public or protected parameter-less constructor.

    c. The class must have public or protected properties.

    d. Certain properties must be virtual – depending on which properties are virtual, you can opt into more behaviors (see the sections on kinds of proxies).
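
As a rough aid, the checklist above can be approximated with plain reflection. This is a sketch under our own assumptions, not the check the EF runs internally: ProxyRequirementsSketch and LooksProxyable are hypothetical names, and the rule for item d is simplified to "at least one public or protected property that a derived type can override".

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class ProxyRequirementsSketch
{
    // Approximates requirements a-d; the real EF checks are internal
    // and also take the model's metadata into account.
    public static bool LooksProxyable(Type type)
    {
        // a. The class must be public and not sealed.
        if (!type.IsClass || !type.IsVisible || type.IsSealed)
            return false;

        // b. The class must have a public or protected parameterless constructor.
        ConstructorInfo ctor = type.GetConstructor(
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic,
            null, Type.EmptyTypes, null);
        if (ctor == null || !(ctor.IsPublic || ctor.IsFamily || ctor.IsFamilyOrAssembly))
            return false;

        // c and d (simplified). At least one overridable public or protected property.
        return type
            .GetProperties(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
            .Any(IsOverridable);
    }

    private static bool IsOverridable(PropertyInfo property)
    {
        MethodInfo getter = property.GetGetMethod(true);
        return getter != null
            && (getter.IsPublic || getter.IsFamily || getter.IsFamilyOrAssembly)
            && getter.IsVirtual
            && !getter.IsFinal;
    }
}

// Sample types used to exercise the check
public class GoodCustomer { public virtual string Name { get; set; } }
public sealed class SealedCustomer { public string Name { get; set; } }
public class NoVirtualCustomer { public string Name { get; set; } }
```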

Kinds of Proxies

The Entity Framework supports two kinds of proxies: proxies that can perform lazy loading, and proxies that also can perform change tracking.

Lazy Loading Proxy

Lazy loading is the ability for an entity to perform a query for related entities whenever a navigation property is accessed. In the following example, when lazy loading is enabled a query for the customer’s Orders is executed as soon as c.Orders is accessed:

using (var ctx = new NorthwindEntities())
{
    Customer c = ctx.Customers.Single(x => x.Id == 1);
    Console.WriteLine(c.Orders.Count);
}

In addition to the requirements for proxies in general, two things need to happen for lazy loading to be enabled for entities with persistence ignorance:

  1. Lazy loading must be enabled
    context.ContextOptions.LazyLoadingEnabled = true;

    In the runtime, lazy loading is not on by default. However, the default code generator for the Entity Framework includes an annotation that instructs code generation to add a line to your model’s derived ObjectContext constructor to turn lazy loading on – so it can appear that lazy loading is on by default.

  2. Navigation properties must be declared virtual
    public class Customer
    {
        public virtual ICollection<Order> Orders { get; set; }
    }
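
To show why the navigation property has to be virtual, here is a hand-written stand-in for what a lazy loading proxy does. It is only a sketch: CustomerLazyProxySketch, its loader delegate, and LoadCount are hypothetical, and the delegate stands in for the query the EF would run against the database.

```csharp
using System;
using System.Collections.Generic;

public class Order
{
    public int Id { get; set; }
}

public class Customer
{
    // Virtual so that a derived type can intercept the getter
    public virtual ICollection<Order> Orders { get; set; }
}

// Hand-written stand-in for the dynamically generated proxy
public class CustomerLazyProxySketch : Customer
{
    private readonly Func<ICollection<Order>> _loadOrders; // stands in for the EF query
    private bool _loaded;

    public int LoadCount { get; private set; }

    public CustomerLazyProxySketch(Func<ICollection<Order>> loadOrders)
    {
        _loadOrders = loadOrders;
    }

    public override ICollection<Order> Orders
    {
        get
        {
            // Load on first access only; later reads use the loaded collection
            if (!_loaded)
            {
                _loaded = true;
                LoadCount++;
                base.Orders = _loadOrders();
            }
            return base.Orders;
        }
        set
        {
            base.Orders = value;
        }
    }
}
```

If Orders were not virtual, the derived type could not override the getter and the first access would simply return whatever the POCO already holds.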

For collection navigation properties such as “Orders” above, the type of the property also can be important. In general, lazy loading proxies do not care what the type of the property is, whether it is an ICollection<Order>, a List<Order>, a HashSet<Order>, or a MyFancyCollection<Order> as long as there is a way to create that collection type (this is the case in general for POCO collections). The following collection types are supported:

  • All concrete class property types (List<Order> for example).
  • Interface property types to which a List<> or a HashSet<> can be assigned (so ICollection<>, IList<>, and ISet<> all will work).
  • Entities that supply the collection instance in the getter (so IMyFancyCollection<> will work as long as the getter returns an instance and the EF doesn’t have to create one).

What is not supported is exposing an interface for a property type that requires a specific concrete type. For example, if the property type is IMyFancyCollection<Order>, the EF does not know which kinds of collection types are assignable to an IMyFancyCollection<T>.
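
The rules above can be pictured as a small decision procedure. The following is a sketch of that reasoning only, not the EF's implementation: CollectionFactorySketch and CreateCollection are hypothetical names, and returning null stands in for "the EF cannot create this collection and needs the entity to supply one".

```csharp
using System;
using System.Collections.Generic;

// A custom collection interface that neither List<T> nor HashSet<T> implements
public interface IMyFancyCollection<T> : ICollection<T> { }

public static class CollectionFactorySketch
{
    // Returns a new collection instance for the property type, or null when
    // no suitable concrete type is known.
    public static object CreateCollection(Type propertyType, Type elementType)
    {
        // Concrete class property types can be created directly
        // (assumes a public parameterless constructor).
        if (propertyType.IsClass && !propertyType.IsAbstract)
            return Activator.CreateInstance(propertyType);

        // Interfaces that a List<T> can be assigned to
        Type listType = typeof(List<>).MakeGenericType(elementType);
        if (propertyType.IsAssignableFrom(listType))
            return Activator.CreateInstance(listType);

        // Interfaces that a HashSet<T> can be assigned to
        Type setType = typeof(HashSet<>).MakeGenericType(elementType);
        if (propertyType.IsAssignableFrom(setType))
            return Activator.CreateInstance(setType);

        // For example IMyFancyCollection<T>: no known concrete type
        return null;
    }
}
```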

Change Tracking Proxy

Change tracking is important to any data access layer because it controls which INSERT, UPDATE, and DELETE commands are sent to the database. Entities with full persistence ignorance do not track any changes, but instead rely on a persistence framework such as EF to figure out what changes have been made to each entity. This is typically done via a snapshot mechanism. When a query is done for an entity, a copy of all the entity’s scalar and relationship properties is made and stored in the framework. At certain “sync points”, a diff is done between the current values of the entity and the copy. This can be expensive both in the memory needed to store all the snapshots and in the time needed to perform the diff. This cost can be especially high if there are a large number of entities in a context. Change tracking proxies help with this cost by injecting the ability for an entity to report changes directly back to the state manager, so no snapshot or diff is necessary.

In the following example, a query is done for a Customer c and the Name property is updated.

using (var ctx = new NorthwindEntities())
{
    Customer c = ctx.Customers.Single(x => x.Id == 1);
    c.Name = "New Name";
    ctx.SaveChanges();
}

When no proxy is used, the query will cache a copy of the entity before returning the result. When the Name property is updated, the ObjectContext ctx does not know about the change until SaveChanges is called (which calls DetectChanges, the EF’s mechanism to perform a diff, by default). If change tracking proxies are being used, the ObjectContext knows not to create the copy of the entity, and when the change is made to the Name property, the entity is able to notify the ObjectContext’s ObjectStateManager immediately.
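
The snapshot approach used for plain POCOs can be sketched in a few lines. This is a simplified illustration with hypothetical names (SnapshotTrackerSketch, Track), tracking a single property; the EF's state manager snapshots all scalar and relationship properties.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public string Name { get; set; }
}

public class SnapshotTrackerSketch
{
    // Entity -> copy of its Name taken when tracking started
    private readonly Dictionary<Customer, string> _originalNames =
        new Dictionary<Customer, string>();

    // Called when a query returns an entity: store the snapshot
    public void Track(Customer customer)
    {
        _originalNames[customer] = customer.Name;
    }

    // The "sync point": diff current values against the snapshots
    public List<Customer> DetectChanges()
    {
        var modified = new List<Customer>();
        foreach (KeyValuePair<Customer, string> pair in _originalNames)
        {
            if (!string.Equals(pair.Key.Name, pair.Value, StringComparison.Ordinal))
                modified.Add(pair.Key);
        }
        return modified;
    }
}
```

The tracker learns nothing when Name is set; it only finds out when DetectChanges walks every tracked entity, which is the cost that change tracking proxies avoid.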

Change tracking proxies work by intercepting the calls that update an entity. These are all of:

  • Scalar and complex property setters
    c.Name = "New Name";
  • Reference navigation property setters
    c.Detail = new CustomerDetail { ID = c.ID };
  • Collection navigation properties
    ICollection<Order> collection = c.Orders;

To ensure that no changes are lost for change tracking proxies, you are required to allow the proxy to intercept all of these calls. This poses additional restrictions around your entity beyond those of lazy loading proxies:

  1. All mapped properties must be declared virtual and be public or protected
    public class Customer
    {
        public virtual string Name { get; set; }
        public virtual CustomerDetail Detail { get; set; }
        public virtual ICollection<Order> Orders { get; set; }
    }
  2. Collection navigation properties must be of type ICollection<T> and must have a getter and a setter

Change tracking proxies are stricter about the type of collection properties. This is because a collection is a separate instance from the entity so changes can be made without the proxy noticing. The Entity Framework solves this by setting an EntityCollection<T> instance as the collection property; EntityCollection<T> already knows how to report changes to the collection back to an ObjectContext. This collection is always set when a proxy instance is created, either through a query or through the CreateObject<T> method on the ObjectContext. This is why a setter is required for collection properties. However, once that EntityCollection<T> instance is set, it cannot be changed and the proxy ensures this by overriding the setter and throwing an exception if you try. You can still set a collection instance to the backing field of the property if you need it to be non-null for testing purposes or non-proxy use.
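
Put together, the behavior described above looks roughly like the following hand-written stand-in. It is a sketch only: CustomerChangeTrackingProxySketch is a hypothetical name, the Action<string> callback stands in for the ObjectStateManager, a plain collection stands in for EntityCollection<T>, and InvalidOperationException is our assumption for the exception type.

```csharp
using System;
using System.Collections.Generic;

public class Order { }

public class Customer
{
    public virtual string Name { get; set; }
    public virtual ICollection<Order> Orders { get; set; }
}

// Hand-written stand-in for a generated change tracking proxy
public class CustomerChangeTrackingProxySketch : Customer
{
    private readonly Action<string> _reportChange; // stands in for the state manager

    public CustomerChangeTrackingProxySketch(
        Action<string> reportChange,
        ICollection<Order> trackedOrders)
    {
        _reportChange = reportChange;
        // The tracked collection is assigned once, when the proxy is created
        base.Orders = trackedOrders;
    }

    public override string Name
    {
        get { return base.Name; }
        set
        {
            base.Name = value;
            _reportChange("Name"); // report immediately, no snapshot needed
        }
    }

    public override ICollection<Order> Orders
    {
        get { return base.Orders; }
        set
        {
            // Replacing the tracked collection would lose change notifications
            throw new InvalidOperationException(
                "The tracked collection cannot be replaced.");
        }
    }
}
```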

How to tell?

When you have an instance of an entity type, how do you tell whether you are working with a proxy or not? The ObjectContext contains several APIs for working with proxies, and you can use these to determine if you have a proxy or not. Here is a method you can write to tell if you have a proxy or not:

    public bool IsProxy(object entity)
    {
        // For a proxy, GetObjectType returns the POCO type, which differs from the runtime type
        return entity != null && ObjectContext.GetObjectType(entity.GetType()) != entity.GetType();
    }

The question of whether you have a proxy or not is interesting, but the more important question is whether it matters that you are working with a proxy or not. In many cases it should not: the entity’s definition is still the same. However, the behavior of your entity is changing and this is significant. With lazy loading, you no longer have to check whether a relationship navigation property is loaded or not, so developers can bake in certain assumptions (my collection will always be non-null and loaded), which can be a nice simplification and lead to a better separation of concerns.

There are cases in which you might want to verify that lazy loading and change tracking still work after you have made changes to your entity types. In EF 4.0, the proxy mechanism is internal, mainly because we did not have enough time to come up with a good public mechanism and fully test it given our other priorities, but it is our intention to do this. There are ways to determine proxy behavior that rely on internal EF details (such as interfaces the proxies implement), but it is best to avoid these in case these internal mechanisms change when proxies become more extensible. The best way to determine the kind of proxy you have is to write a unit test like this:

    public void TestCustomerIsProxy()
    {
        using (NorthwindEntities ctx = new NorthwindEntities())
        {
            Customer c = ctx.CreateObject<Customer>();
            c.CustomerID = "AAAAA";
            c.CompanyName = "A";
            ctx.Customers.Attach(c);
            
            // Determine if the instance is a proxy
            // If it is, proxy supports lazy loading
            bool isLazyLoading = IsProxy(c);
            
            // Make a change, see if it is detected
            // If it is, proxy supports change tracking
            c.CompanyName = "B";
            bool isChangeTracking = ctx.ObjectStateManager
                                      .GetObjectStateEntry(c)
                                      .State == EntityState.Modified;
    
            Assert.IsTrue(isLazyLoading && isChangeTracking, 
                          "Wanted a change tracking proxy, but didn't get one");
        }
    }

Enforce Using Proxies

If you want to enforce using proxies throughout your application, one thing you can do is declare a protected parameterless constructor on your POCO class. This prevents code elsewhere in your stack from instantiating your POCO classes directly with the “new” operator and requires all entity creation to go through the CreateObject<T> method on the ObjectContext class.

Proxy Equivalence

Another interesting point about proxies is their type equivalence: if you have a proxy for a Customer you retrieved with ObjectContext instance context1, can it be used later with a different ObjectContext instance context2? As mentioned earlier, in EF 4.0 a proxy type is generated for each combination of .NET type and the metadata it is used with. The point about metadata is important because metadata is what tells the Entity Framework which properties are the scalars and which are navigation properties from all the properties on the entity. The scenarios where this really matters are where you are re-using your entity types with different models, which is common when you have a large model that you have partitioned into several smaller models.

This means that for a .NET type, Customer, used with the metadata in NorthwindEntities, a single proxy type is created and cached per AppDomain so it can be reused. If that .NET type, Customer, is used with another model such as MyOtherModelEntities, a new proxy type will be created. To differentiate between these types, the pertinent metadata (the EntityType and the EdmProperties) is hashed using SHA-256 and the hash is incorporated into the name of the type. This makes the type name look something like this:

System.Data.Entity.DynamicProxies.Customer_448F4B60EA59F871D551849C8E17777AED971E19438A72A130FF2D53E75C865C

As long as the .NET type is the same and the metadata contains the same information (same EntityType and EdmProperties), this name will be the same regardless of which machine the proxy type is created on. We’ll see why this is important later when we discuss binary serialization.
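
The naming scheme can be illustrated with a short sketch. The exact way the EF serializes the EntityType and EdmProperties before hashing is internal, so the metadataDescription string below is purely an assumption, as are the names ProxyNameSketch and BuildProxyTypeName; only the general idea (type name plus a SHA-256 hash of the metadata) comes from the description above.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ProxyNameSketch
{
    // metadataDescription is a made-up textual stand-in for the
    // EntityType and EdmProperties that the EF actually hashes.
    public static string BuildProxyTypeName(string clrTypeName, string metadataDescription)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(metadataDescription));

            // 32 bytes -> 64 uppercase hexadecimal characters
            var hex = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
                hex.Append(b.ToString("X2"));

            return clrTypeName + "_" + hex.ToString();
        }
    }
}
```

Because the hash depends only on its input, the same type and metadata always yield the same name, while the same type used with different metadata yields a different one.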

Change tracking of complex type properties

Similar to entity types, in Entity Framework complex types are mapped classes that have their own properties. However, complex types don’t have their own unique identity as entity types do. Typically, complex types appear in the model as properties on entities that contain nested members. The canonical example is the Address complex type property in the Customer entity type, where Address may have additional properties such as Street, City, Country, etc.

A change tracking proxy is capable of detecting when the complex type instance in a property gets completely replaced, like in this case:

    customer.Address = new Address() { Street = "8th street", 
                                       City = "Redmond", 
                                       State = "WA" };

But when a nested property is changed in the complex type instance, a snapshot comparison is used to detect the change instead.

var customer = ctx.Customers.SingleOrDefault(c => c.CustomerID == 1);
customer.Address.Street = "";
// DetectChanges runs the snapshot comparison that notices the nested change
ctx.DetectChanges();
Assert.AreEqual(
    EntityState.Modified,
    ctx.ObjectStateManager.GetObjectStateEntry(customer).State);

In the code above, for instance, a call to DetectChanges is necessary in order to detect that the Address.Street property has been modified. By default, DetectChanges is invoked implicitly in SaveChanges, so invoking DetectChanges shouldn’t be necessary unless you need things like the ObjectStateEntry.State property to return the right value before saving.
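
The reason for this difference is easy to see in a hand-written stand-in for the proxy. In this sketch (CustomerProxySketch and ReportedChanges are hypothetical names), the proxy can only intercept the Address setter on the entity; a write to Address.Street goes to the plain Address instance and never passes through the proxy.

```csharp
using System;

// Complex type: a plain class with no proxy of its own
public class Address
{
    public string Street { get; set; }
}

public class Customer
{
    public virtual Address Address { get; set; }
}

// Hand-written stand-in for a change tracking proxy
public class CustomerProxySketch : Customer
{
    // Counts the changes the proxy was able to observe
    public int ReportedChanges { get; private set; }

    public override Address Address
    {
        get { return base.Address; }
        set
        {
            base.Address = value;
            ReportedChanges++; // replacing the whole instance is intercepted
        }
    }
}
```

Replacing customer.Address increments ReportedChanges, while a later customer.Address.Street = "" leaves it unchanged, which is why a snapshot comparison is still needed for nested members.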

Proxy Specific API

While one of the goals of POCO Proxies is to transparently substitute regular POCO instances, in some circumstances it is necessary to recognize their presence and to deal with the proxy types themselves. For those situations, the Entity Framework includes a few new APIs that help deal with proxies.

ContextOptions.ProxyCreationEnabled

As explained previously, the ProxyCreationEnabled flag controls whether the Entity Framework runtime will attempt to create proxy instances. The setting governs the behavior of query results materialization as well as of the CreateObject<T> method.

CreateObject<T>

When proxy creation is enabled and a particular entity type fulfills the requirements to create proxies, queries with that result type will automatically return POCO proxy instances. However, if you use the “new” operator to create a new entity of the same type, you will obtain a regular POCO instance.

In order to obtain a new proxy instance without querying the database, for instance to attach it or add it to the ObjectContext, you can use the CreateObject<T> factory method.

The generic argument of this method can be any concrete reference type – abstract classes, interfaces and value types are not supported.

When an arbitrary CLR type that is not mapped as an entity is passed to CreateObject<T>, the method will attempt to use any parameterless constructor to create and return a new instance of it. If, however, T is a POCO type that matches an entity type in the metadata of the ObjectContext instance, CreateObject<T> will look at the state of ContextOptions.ProxyCreationEnabled to determine whether to return an instance of T or to attempt to create a proxy.

Note: CreateObject<T> will not throw an exception if the CLR type used in the generic argument doesn’t comply with proxy requirements. Instead, it will simply try to create and return an instance of the original CLR type.

CreateObject<T> is meant to be used in application code as a uniform way to obtain entity instances. If your application always uses CreateObject<T> instead of the “new” operator, then proxies will be created as appropriate depending on the current configuration of ProxyCreationEnabled and on whether the POCO type meets the proxy requirements.

In addition to the version of CreateObject<T> in ObjectContext, there are two overloads of the method on ObjectSet<TEntity> that are more restrictive.

A non-generic overload of CreateObject exists on ObjectSet<TEntity>, which will simply attempt to return a proxy instance of type TEntity. A generic overload, CreateObject<T>, on ObjectSet<TEntity> can be used to return instances of types derived from TEntity, which comes in handy in inheritance scenarios.

As an example, the following two calls are equivalent:

var b1 = context.CreateObject<Blog>();
var b2 = context.Blogs.CreateObject();

Note: CreateObject<T> will only create an entity instance; it will not add it or attach it automatically to the ObjectContext. If you want to add or attach the instance, you have to do so explicitly:

var b2 = context.Blogs.CreateObject();
context.Blogs.AddObject(b2);

GetObjectType

This static method defined in ObjectContext returns the actual entity type for a given proxy type. Similar to CreateObject<T>, GetObjectType works with any arbitrary type, but it has special behavior when a proxy type is passed as the input. For any arbitrary non-proxy type, the method returns the same type it receives as an input. When a proxy type is recognized, however, the method will return the corresponding original POCO type. In the following example, the method will return the type Blog:

var blog = context.Blogs.CreateObject();
var blogType = BloggingContext.GetObjectType(blog.GetType());
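
To summarize how CreateObject<T> and GetObjectType relate, here is a toy model of their documented behavior. It is not how the EF implements them: ContextSketch and BlogProxySketch are hypothetical, and a fixed dictionary stands in for the proxy types that the EF generates and caches at run time.

```csharp
using System;
using System.Collections.Generic;

public class Blog
{
    public virtual string Title { get; set; }
}

// Stands in for a dynamically generated proxy type
public class BlogProxySketch : Blog { }

public class ContextSketch
{
    // POCO type -> proxy type (the EF builds and caches these itself)
    private static readonly Dictionary<Type, Type> ProxyTypes =
        new Dictionary<Type, Type> { { typeof(Blog), typeof(BlogProxySketch) } };

    public bool ProxyCreationEnabled = true;

    // Returns a proxy when enabled and available, otherwise a plain instance
    public T CreateObject<T>() where T : class
    {
        Type proxyType;
        if (ProxyCreationEnabled && ProxyTypes.TryGetValue(typeof(T), out proxyType))
            return (T)Activator.CreateInstance(proxyType);

        // Falls back to the type's own public parameterless constructor
        return Activator.CreateInstance<T>();
    }

    // Maps a proxy type back to its POCO type; other types pass through
    public static Type GetObjectType(Type type)
    {
        foreach (KeyValuePair<Type, Type> pair in ProxyTypes)
        {
            if (pair.Value == type)
                return pair.Key;
        }
        return type;
    }
}
```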

CreateProxyTypes

Normally, POCO Proxy types are generated lazily at run time: it is not until some query asks for results of a particular POCO type that the corresponding proxy type is created. Nonetheless, in some situations it may be necessary to create the proxy types in advance.

The CreateProxyTypes method can be used to trigger the creation of a set of proxy types without actually producing any instance.

This method can be useful when serializing an object graph across AppDomain boundaries, to make sure that the proxy types actually exist in the target environment. If the types didn’t exist, deserialization would fail.

This method will create the corresponding proxy types for the entity types passed as parameter, according to the metadata loaded in the ObjectContext. We will explain the usage of the method in more detail when we cover binary serialization of proxies. In the meantime, here is a simple example of how to use this API:

context.CreateProxyTypes(new Type[] { typeof(Customer), typeof(Order) });

GetKnownProxyTypes

This static method returns an enumeration containing all the proxy types that have been created so far in the AppDomain. In serialization scenarios, the method can be used to obtain all the types that the target environment should already contain and the serializer itself should recognize. Usage is as follows:

var knownTypes = BloggingContext.GetKnownProxyTypes();

In Summary

In this post we looked at lazy loading proxies and change tracking proxies as well as the tradeoffs you face when opting into using these capabilities. The next post in this series will be about serializing proxies and some of the issues you might face when trying to accomplish this as not all serialization technologies are built with proxies in mind. If you have any feedback on the general experience of using proxies, or more questions on our implementation, we’d appreciate hearing from you.

 

Jeff Derstadt & Diego Vega
Entity Framework Team