
Transparent Lazy Loading for Entity Framework – part 2

This post is part of a series that describes the EFLazyLoading library.

As I promised last time, I would like to present the result of a little experiment in implementing transparent lazy loading for Entity Framework. You can download the sample code here; the rest of this post tries to explain how it all works.

Requirements

I set myself some goals:

a) Objects should be code-generated in a way similar to the standard Entity Framework code generation and the resulting code’s public surface should be similar. There will be some differences in the way collections and references are handled.

b) Collections should be represented by classes that implement ICollection<T> and should always be ready to use without “IsLoaded/Load”.

c) EntityReference<T> and EntityCollection<T> should be completely hidden from the user.

d) Each (N-to-0..1) reference should be represented solely by a property whose type is the target object type (no EntityReference<T> properties in the object layer).

e) We don’t want to materialize the object at the other end of the relationship just to see whether it is null or not:

Order o = ...;

if (o.Customer != null)
{
    Console.WriteLine("We have a customer!");
}

f) We don’t want to materialize the object if we don’t care about its properties. For example, changing the “Customer” navigation property on “o” does not require the Customer object to be loaded at all (today we can use EntityKeys to achieve a similar thing):

Order o = ...;
Order o2 = ...;
o.Customer = o2.Customer;

g) Each object must be able to live in one of two states, loaded or unloaded, and must be able to load itself on first access to a property. Unloaded objects that haven’t been accessed are really just wrappers for the EntityKey; objects that have been touched hold actual data:

Order o = ...;

if (o.Customer != null)
{
    // loads o.Customer on-demand
    Console.WriteLine(o.Customer.Address.City);
}

h) An object in the unloaded state should be as cheap to create as possible.

Implementation

Because each object has to be delay-loadable and cheap to create, we represent a single entity as a pair of objects. One is the “shell”, which has all the properties and navigation properties of the entity plus the EntityKey; the other holds the actual data (minus the key). Property getters and setters on the shell class delegate read/write operations to the data class, which is created lazily (to conserve memory when it is not needed).

The following pseudo-code demonstrates the idea (_data management is not shown here; in the real code the _data reference and the entity key are held in the base class):

// shell class - has no fields to hold actual data, just
// a reference to the lazy-initialized data object; this will not compile
public class Order
{
    private EntityKey _key; // each shell has an identity 
    private OrderData _data; // reference to lazy-initialized data

    public int OrderID 
    {
        get { return _key.Something; }
        set { _key.Something = value; }
    }

    public DateTime OrderDate
    { 
        get { return _data.OrderDate; } 
        set { _data.OrderDate = value; }
    }

    public string ShipTo
    {
        get { return _data.ShipTo; }
        set { _data.ShipTo = value; }
    }

    public string BillTo
    {
        get { return _data.BillTo; }
        set { _data.BillTo = value; }
    }

    public Customer Customer { get; set; } // details not shown
    public ICollection<OrderLine> Lines { get; }
}

// data class - just a bunch of fields
internal class OrderData
{
    internal DateTime OrderDate;
    internal string ShipTo;
    internal string BillTo;
}

For objects in the “unloaded” state there is just one object (Order); for loaded objects “OrderData” is initialized, so property accesses actually work. The first time the user calls a property getter or setter while _data is null, the data is fetched from the store.

When the user navigates a {one,many}-to-one relationship, we create a shell object that has only its primary key initialized, attach it to the context, and return it to the user. The data object is not created at all and the “_data” reference is null. When a property is accessed for the first time, the data is initialized by calling ObjectContext.Refresh() with RefreshMode.StoreWins, which brings all properties and relationships into memory.
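To make the stub idea concrete, here is a minimal sketch that runs without Entity Framework. It is not the library's code: Customer, CustomerData, the Order constructor and the fetch delegate are hypothetical names, and the delegate merely stands in for the round trip that the real implementation makes through the object context.

```csharp
using System;

// Data half of the pair: plain fields, allocated only on first property access.
public sealed class CustomerData
{
    public string City;
}

// Shell half: carries the key and a lazily filled reference to the data.
public sealed class Customer
{
    private readonly Func<int, CustomerData> _fetch; // hypothetical stand-in for the store query
    private CustomerData _data;                      // stays null while this object is a stub

    public Customer(int customerId, Func<int, CustomerData> fetch)
    {
        CustomerID = customerId;
        _fetch = fetch;
    }

    public int CustomerID { get; private set; }      // the key lives in the shell; reading it never loads
    public bool IsLoaded { get { return _data != null; } }

    public string City
    {
        get { return EnsureData().City; }
        set { EnsureData().City = value; }
    }

    // Loads the data object the first time any non-key property is touched.
    private CustomerData EnsureData()
    {
        if (_data == null) _data = _fetch(CustomerID);
        return _data;
    }
}

public sealed class Order
{
    private readonly int? _customerId;               // foreign key value, null when there is no customer
    private readonly Func<int, CustomerData> _fetch;
    private Customer _customer;

    public Order(int? customerId, Func<int, CustomerData> fetch)
    {
        _customerId = customerId;
        _fetch = fetch;
    }

    // Returns null when there is no foreign key, otherwise a stub; never hits the store.
    public Customer Customer
    {
        get
        {
            if (_customer == null && _customerId.HasValue)
                _customer = new Customer(_customerId.Value, _fetch);
            return _customer;
        }
    }
}
```

With these types, a check such as order.Customer != null only creates the shell; the fetch delegate runs once, on the first read of order.Customer.City.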

Collections are rather simple – all we have to do is return a wrapper over EntityCollection<T> that does Load() under the hood when the data is actually needed (for example in foreach()).
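A load-on-demand collection wrapper can be sketched roughly as follows. This is an illustration only: LazyCollection<T> and its loader delegate are hypothetical, and the delegate stands in for the Load() call that the real wrapper makes on the underlying EntityCollection<T>. For simplicity this version also loads before Add() or Remove(), which may differ from what the library does.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical wrapper that defers loading until the contents are actually needed.
public sealed class LazyCollection<T> : ICollection<T>
{
    private readonly Func<IEnumerable<T>> _load; // stand-in for loading the related entities
    private List<T> _items;                      // null until first use

    public LazyCollection(Func<IEnumerable<T>> load)
    {
        _load = load;
    }

    public bool IsLoaded { get { return _items != null; } }

    // Every member goes through this property, so the first real use triggers the load.
    private List<T> Items
    {
        get { return _items ?? (_items = new List<T>(_load())); }
    }

    public int Count { get { return Items.Count; } }
    public bool IsReadOnly { get { return false; } }
    public void Add(T item) { Items.Add(item); }
    public void Clear() { Items.Clear(); }
    public bool Contains(T item) { return Items.Contains(item); }
    public void CopyTo(T[] array, int arrayIndex) { Items.CopyTo(array, arrayIndex); }
    public bool Remove(T item) { return Items.Remove(item); }
    public IEnumerator<T> GetEnumerator() { return Items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

Because callers only see ICollection<T>, a foreach over the collection looks the same whether or not the data has been loaded yet.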

Implementation details

The implementation takes advantage of the fact that Entity Framework supports IPOCO. We introduce a base class called LazyEntityObject that all code-generated objects derive from. It implements all the interfaces required by Entity Framework (IEntityWithKey, IEntityWithChangeTracking, IEntityWithRelationships) plus a new interface, ILazyEntityObject. The interfaces are implemented explicitly, which means that none of their members appear on the public surface of the actual entity objects (not even EntityKey).

In the actual implementation (compared to the pseudo-code) the data class is an inner private class of each entity class and property getters and setters are implemented through statically declared Data Properties – a concept similar to WPF dependency properties. They are statically initialized with delegates that get/set actual data but perform all the needed operations under the hood (such as change tracking and lazy initialization). As a result everything is type-safe and there is no need to use reflection. Thanks to Colin for the idea!

With this in place the code generated for each property getter/setter is a simple one-liner, whether it is a simple property, a reference or a collection:

[EdmScalarPropertyAttribute(EntityKeyProperty=false, IsNullable=false)]
public Single Discount
{
    get { return Data.DiscountProperty.Get(this); }
    set { Data.DiscountProperty.Set(this, value); }
}

[EdmRelationshipNavigationPropertyAttribute("NorthwindEFModel", "Order_Details_Order", "Order")]
public Order Order
{
    get { return Data.OrderProperty.Get(this); }
    set { Data.OrderProperty.Set(this, value); }
}

The Data class itself is also clean (just a bunch of fields + static data properties) and all the hard work is done in the implementation of Data Property classes.

private class Data : ILazyEntityObjectData
{
    private Decimal UnitPrice;
    private Int16 Quantity;
    private Single Discount;

    // primary key
    public static DataKeyProperty<OrderDetail,Int32> OrderIDProperty = 
                  new DataKeyProperty<OrderDetail,Int32>(c => c.OrderID, (c, v) => c.OrderID = v, "OrderID");
    public static DataKeyProperty<OrderDetail,Int32> ProductIDProperty = 
                  new DataKeyProperty<OrderDetail,Int32>(c => c.ProductID, (c, v) => c.ProductID = v, "ProductID");
    // non-key properties
    public static DataProperty<OrderDetail,Data,Decimal> UnitPriceProperty = 
                  new DataProperty<OrderDetail,Data,Decimal>(c => c.UnitPrice, (c, v) => c.UnitPrice = v, "UnitPrice");
    public static DataProperty<OrderDetail,Data,Int16> QuantityProperty = 
                  new DataProperty<OrderDetail,Data,Int16>(c => c.Quantity, (c, v) => c.Quantity = v, "Quantity");
    public static DataProperty<OrderDetail,Data,Single> DiscountProperty = 
                  new DataProperty<OrderDetail,Data,Single>(c => c.Discount, (c, v) => c.Discount = v, "Discount");
    // references
    public static DataRefProperty<OrderDetail,Data,Order> OrderProperty = 
                  new DataRefProperty<OrderDetail,Data,Order>("NorthwindEFModel.Order_Details_Order","Order","Order");
    public static DataRefProperty<OrderDetail,Data,Product> ProductProperty = 
                  new DataRefProperty<OrderDetail,Data,Product>("NorthwindEFModel.Order_Details_Product","Product","Product");
}

Data Properties Explained

Each data property is statically initialized in the data class and has two methods: Get() and Set().

  • Get() takes a single argument – the shell object and returns the property value
  • Set() takes two arguments: shell object and new property value. It sets the property to the value provided.

There are 4 types of data properties:

  1. Simple properties (DataProperty class) that are responsible for getting and setting non-key, non-navigation properties
  2. Key properties (DataKeyProperty) that are responsible for getting and setting properties that are part of the primary key (the values are stored in the shell class itself)
  3. Collection properties (DataCollectionProperty) that manage object collections
  4. Reference properties (DataRefProperty) that are responsible for getting and setting reference properties

A simple property (implemented in DataProperty.cs) makes sure that the data object has been initialized on demand and delegates to ObjectContext.Refresh() to fetch object values and relationships. When setting property values, it calls ReportPropertyChanging and ReportPropertyChanged so that object state is properly tracked.

Key properties do nothing more than call ReportPropertyChanging/ReportPropertyChanged in addition to getting and setting the actual key values in the shell object.
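The following sketch shows one way a delegate-based simple property could work. It is an assumption-laden illustration, not the contents of DataProperty.cs: Shell<TData>, SimpleProperty, OrderDetailData, the Fill() method and the Log list are all hypothetical. Fill() stands in for the store fetch, and the log entries stand in for the change-tracking notifications.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shell base class; the library's LazyEntityObject does far more.
public abstract class Shell<TData> where TData : class, new()
{
    internal TData DataObject;                             // null while unloaded
    public readonly List<string> Log = new List<string>(); // stand-in for change tracking

    // Stand-in for fetching values from the store.
    protected abstract void Fill(TData data);

    internal TData EnsureData()
    {
        if (DataObject == null)
        {
            DataObject = new TData();
            Fill(DataObject);
            Log.Add("loaded");
        }
        return DataObject;
    }
}

// One static instance per property; the delegates keep access type-safe without reflection.
public sealed class SimpleProperty<TOwner, TData, TValue>
    where TOwner : Shell<TData>
    where TData : class, new()
{
    private readonly Func<TData, TValue> _get;
    private readonly Action<TData, TValue> _set;
    private readonly string _name;

    public SimpleProperty(Func<TData, TValue> get, Action<TData, TValue> set, string name)
    {
        _get = get;
        _set = set;
        _name = name;
    }

    public TValue Get(TOwner owner)
    {
        return _get(owner.EnsureData());
    }

    public void Set(TOwner owner, TValue value)
    {
        TData data = owner.EnsureData();
        owner.Log.Add("changing " + _name); // where ReportPropertyChanging would be called
        _set(data, value);
        owner.Log.Add("changed " + _name);  // where ReportPropertyChanged would be called
    }
}

public sealed class OrderDetailData
{
    internal float Discount;

    public static readonly SimpleProperty<OrderDetail, OrderDetailData, float> DiscountProperty =
        new SimpleProperty<OrderDetail, OrderDetailData, float>(
            d => d.Discount, (d, v) => d.Discount = v, "Discount");
}

public sealed class OrderDetail : Shell<OrderDetailData>
{
    protected override void Fill(OrderDetailData data) { data.Discount = 0.05f; }

    public float Discount
    {
        get { return OrderDetailData.DiscountProperty.Get(this); }
        set { OrderDetailData.DiscountProperty.Set(this, value); }
    }
}
```

The entity's getter and setter stay one-liners, while loading and notification logic live in a single place per property kind.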

Collection properties take care of initializing relationships in the RelationshipManager and wrapping the results with LazyEntityCollection<T> for load-on-demand functionality.

Reference properties are probably the most interesting ones, because they deal with stub objects. Whenever the user navigates a relationship that has not yet been initialized, a new stub object (that is just a shell without data) is created and attached to the object context. There is a little additional complication with handling polymorphic objects, because we need to know the concrete subtype to create based just on the EntityKey, but that is a story for a separate article.

Usage

The code generation application (the EFLazyClassGen project in the sample solution) emits code that is meant to be a drop-in replacement for designer-generated code (namespaces and class names are the same). Just invoke it with two parameters:

EFLazyClassGen input.[csdl,edmx] output.cs

Only simple code generation is supported at this point (multiple schemas, for example, are not), and I’ve only tested this against the NorthwindEF and AdventureWorksXLT schemas.

Generated classes have a public interface similar to the one generated by EdmGen. Some notable differences are:

  1. EntityKey and EntityState members are not publicly exposed (you can still get to them by casting to IEntityWithKey)
  2. Serialization is not supported (no serialization-specific code is generated). If you want to serialize lazy objects, you have to do so using DTOs (Data Transfer Objects)
  3. There is no *Reference property on many-to-one relationships. This means there is no way to control the "loaded" state of the related end, but that should not be a problem since everything appears to be loaded.

LazyObjectContext derives from ObjectContext and adds two new events, which can be used to trace the internal workings of EFLazyLoading:

  1. LazyObjectContext.StubCreated - occurs whenever a new stub object is created
  2. LazyObjectContext.ObjectLoaded - occurs whenever a delayed load occurs

See the samples for more information. There are also new LazyObjectContext methods:

  1. Reset(ILazyEntityObject) - detaches and releases the data object from a shell object while keeping the shell attached to the context.
  2. ResetAllUnchangedObjects() - does the same for all unchanged objects in the context; each object will be demand-loaded the next time any of its properties is accessed.

In the ZIP file there is a help file (CHM) which has auto-generated API documentation (using Sandcastle). I hope this will be useful.

Lessons learned

The first and foremost lesson learned is that it is quite possible to have transparent lazy loading working with Entity Framework. Being able to write your own entity classes (provided that they adhere to the IPOCO specification) that add functionality under the hood opens up a whole new world of possibilities.

Possible applications of this technique include cross-ObjectContext object sharing and caching, which may actually be very simple: you can easily share “Data” objects, provided you make them read-only and copy them on write.

In the next post I will explain the object type cache (for managing EntityKey to concrete type mapping) and introduce additional extension methods that make it possible to write LINQ and Entity SQL queries that return stubs of objects.

Published Monday, May 12, 2008 1:49 PM by jkowalski

Comments

# Jaroslaw Kowalski : Transparent Lazy Loading for Entity Framework – part 1

# Entity Framework and Lazy Loading

Tuesday, May 13, 2008 2:19 AM by Matthieu MEZIL

When I talk about the Entity Framework, people very often criticize it for its lack of Lazy Loading.

# re: Transparent Lazy Loading for Entity Framework – part 2

Tuesday, May 13, 2008 5:15 PM by daveblack

Thank you so much for providing this "extension" to EF! It makes my code much simpler and eases my worries about performance.

# Entity Framework and Lazy Loading

Monday, May 19, 2008 3:26 PM by Hot Topics

There have been a lot of discussions lately about Entity Framework and Lazy Loading as well as some solutions

# Transparent Lazy Loading for Entity Framework

Tuesday, May 20, 2008 1:49 PM by x2develop: Jiří Činčura

Jaroslaw Kowalski wrote some nice posts on how to "build" transparent lazy loading in EF and also prepared

# re: Transparent Lazy Loading for Entity Framework – part 2

Friday, May 23, 2008 4:06 AM by jamal@mavadat.net

Brilliant job!

I loved the way you mixed generics and lambda expressions. It just reminds me of those C++ template library designs, and yep, after a few years we've got the basis for "beautiful" coding!!!

Jamal Mavadat

# Transparent Lazy Loading for Entity Framework – part 3 – Anatomy of a Stub

Wednesday, May 28, 2008 12:38 PM by Jaroslaw Kowalski

In two previous articles (part 1 and part 2) I have introduced EFLazyLoading – a framework for

# Extended Entity Framework

Sunday, June 08, 2008 8:07 AM by SiTox.NET

With some new technologies it happens that they get released and show off some new "features", but in practice they

# More on Lazy Loading in Entity Framework

Friday, June 13, 2008 8:15 AM by Hot Topics

Jarek Kowalski continues his series on implementing Transparent Lazy Loading in the EF. Part 1

# Risks with Transparent Lazy Loading for EF

Monday, June 23, 2008 8:24 PM by WardB

I fear there are at least two problems with your approach, the first of which is most severe.

Risk of wrong result: If the foreign key id for the customer of an order is 42 but there is no customer with id=42, anOrder.Customer reports that the customer exists when, in fact, it does not. The moment you write anOrder.Customer.Name you will blow up when you go to fetch the customer and get nothing.

[aside: you and Entity Framework have chosen not to use the NULL OBJECT pattern so you have to litter your code with tests for null entities. In this case, you postpone the day of reckoning.]

Inconsistency with EF: In Entity Framework, anOrder.Customer.Load returns null because it does the lookup .. and finds no match. So you are proposing post-load behavior inconsistent with EF's. That's a troublesome design choice.

Inconsistency within your framework: You are offering a virtual proxy for "reference" properties (order.Customer) but not collection properties (order.Details).

So what did I miss?

# Transparent Lazy Loading for Entity Framework

Sunday, July 13, 2008 4:57 PM by Jiří {x2} Činčura

Jarek Kowalski wrote some nice posts on how to "build" transparent lazy loading in EF and also prepared a small program
