Diego Vega

Entity Framework news and sporadic epiphanies

March, 2008

Posts
  • Diego Vega

    Unit Testing Your Entity Framework Domain Classes


    One interesting question that customers who practice TDD often ask is how to unit test with the Entity Framework using mock objects. That is, how to test only the domain logic part of the model, without ever touching persistence logic or round-tripping to the database. The usual reasons for wanting to do this include:

    • Test performance
    • Size of the database
    • Avoid test side effects

    The saving grace for this approach is that persistence is a separate concern from the business logic in your domain, and so it should be tested separately.

    Also, we test the Entity Framework a lot here at Microsoft. So, for customers using our code, it should be more cost effective to test their own code :)

    How easy it is to apply this practice to the Entity Framework depends heavily on how your code is factored. There are a few things to consider:

    Explicitly separate concerns

    If you want to unit test your domain classes (either IPOCO or code-generated classes fleshed out with domain logic, since EF v1 does not support pure POCO classes), the first step is to push out of the picture all the code paths that define and execute queries against the database.

    That means that all code that deals with IQueryable<T>, ObjectQuery<T>, and IRelatedEnd.Load() needs to be encapsulated in a separate DAL component.

    I can envision a pattern in which such a component exposes fixed-function methods that produce entire object graphs based on specific parameters.

    As a simple example, we can specify an interface with all the necessary methods to get Northwind entities:

    Edit: I changed the name of the interface from INorthwindContext to INorthwindStore to show that it is not necessarily something you implement in your typed ObjectContext.

        interface INorthwindStore
        {
            IEnumerable<Product> GetProducts
                    (int? productID, int? categoryID);
            IEnumerable<Customer> GetCustomers
                    (string customerID, string customerName);
            IEnumerable<Order> GetOrdersWithDetailsAndProducts
                    (int orderID);
            ...
        }

    Once defined, the interface can be implemented as methods that hydrate the object graphs from the database, but also as a mock that hydrates pre-built object graphs for your tests.
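
    As a rough sketch of the mock side, an in-memory implementation could look like the following. The entity classes are hand-written stand-ins (with EF v1 they would be generated or IPOCO classes), the names InMemoryNorthwindStore and OrderDetail are made up for illustration, the interface is repeated with only the three methods shown above, and treating a null argument as "do not filter on this value" is an assumption about what the parameters mean.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hand-written stand-ins for the Northwind entities (assumption: the real
// classes would be EF-generated or IPOCO classes with many more members).
public class Product
{
    public int ProductID { get; set; }
    public int CategoryID { get; set; }
    public string ProductName { get; set; }
}

public class Customer
{
    public string CustomerID { get; set; }
    public string CompanyName { get; set; }
}

public class OrderDetail
{
    public Product Product { get; set; }
    public short Quantity { get; set; }
}

public class Order
{
    public Order() { Details = new List<OrderDetail>(); }
    public int OrderID { get; set; }
    public List<OrderDetail> Details { get; private set; }
}

// Only the three methods that appear in the post.
public interface INorthwindStore
{
    IEnumerable<Product> GetProducts(int? productID, int? categoryID);
    IEnumerable<Customer> GetCustomers(string customerID, string customerName);
    IEnumerable<Order> GetOrdersWithDetailsAndProducts(int orderID);
}

// Hypothetical test double: serves pre-built object graphs from memory.
public class InMemoryNorthwindStore : INorthwindStore
{
    private readonly List<Product> products;
    private readonly List<Customer> customers;
    private readonly List<Order> orders;

    public InMemoryNorthwindStore(
        IEnumerable<Product> products,
        IEnumerable<Customer> customers,
        IEnumerable<Order> orders)
    {
        this.products = products.ToList();
        this.customers = customers.ToList();
        this.orders = orders.ToList();
    }

    public IEnumerable<Product> GetProducts(int? productID, int? categoryID)
    {
        // Assumption: a null argument means "no filter on this value".
        return products.Where(p =>
            (!productID.HasValue || p.ProductID == productID.Value) &&
            (!categoryID.HasValue || p.CategoryID == categoryID.Value));
    }

    public IEnumerable<Customer> GetCustomers(string customerID, string customerName)
    {
        return customers.Where(c =>
            (customerID == null || c.CustomerID == customerID) &&
            (customerName == null || c.CompanyName == customerName));
    }

    public IEnumerable<Order> GetOrdersWithDetailsAndProducts(int orderID)
    {
        // The graphs were fully built by the test, so there is nothing to load.
        return orders.Where(o => o.OrderID == orderID);
    }
}
```

    A test can build the store from a handful of objects and hand it to the code under test wherever an INorthwindStore is expected.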

    Why not IQueryable<T> properties?

    There is a case for exposing IQueryable<T> properties directly (or ObjectQuery<T> properties, as typed ObjectContexts do) instead of fixed-function methods: the ability to compose queries in LINQ comprehensions gives a lot of flexibility and is very attractive.

    However, not all IQueryable implementations are created equal, and the differences among them are only apparent at runtime.

    There are a number of functions that LINQ to Objects supports and LINQ to Entities doesn’t. Also, there are some query capabilities that in EF v1 are only available in ESQL and not in LINQ.

    Moreover, there is no way to execute ESQL queries against in-memory objects.

    Finally, query span behavior (i.e. the ObjectQuery<T>.Include(string path) method) would be too difficult to reproduce for in-memory queries.

    By implementing our query methods as fixed function points, we are drawing a definite boundary at a more appropriate level of abstraction.

    The good news is that it is relatively easy to get IEnumerable<T> results from either a LINQ or an ESQL query, and doing so does not imply losing the streaming behavior of IQueryable<T>.

    You can simply return query.AsEnumerable() or (in C#) write a foreach loop that “yield returns” each element.
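
    The two options can be sketched as follows. QueryBoundary and its method names are hypothetical, and any IQueryable<T> will do as input (an in-memory list wrapped with AsQueryable() is enough to try it out). Both versions stay deferred: nothing is enumerated until the caller iterates the result.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper showing two ways to hand out IEnumerable<T>
// from a DAL method that builds an IQueryable<T> internally.
public static class QueryBoundary
{
    // Option 1: AsEnumerable() only changes the static type. The same
    // query object is returned, so a caller could still cast it back
    // to IQueryable<T>.
    public static IEnumerable<T> Expose<T>(IQueryable<T> query)
    {
        return query.AsEnumerable();
    }

    // Option 2: an iterator block. The query is enumerated one element
    // at a time as the caller iterates, and the query object itself is
    // hidden behind the compiler-generated iterator.
    public static IEnumerable<T> Stream<T>(IQueryable<T> query)
    {
        foreach (T item in query)
        {
            yield return item;
        }
    }
}
```

    The iterator version is slightly more code, but it keeps callers from composing further query operators against the underlying provider, which is the boundary this post is arguing for.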

    What happens with lazy loading?

    When I say that a method must produce entire graphs, the real constraint is that once the method is invoked, client code can safely assume that all necessary objects are going to be available. In theory, that constraint can be satisfied with either eager or automatic lazy loading.

    EF v1 codegen classes only support explicit loading, but if you implement your own IPOCO classes or you manipulate code generation, you can get automatic lazy loading working.

    Still, the mock implementation is better off populating full graphs in one shot.

    Edit: All this is said assuming you know that you want lazy loading even if this is at the risk of in-memory inconsistencies and extra round-trips. See here for an implementation of transparent lazy loading for Entity Framework.

    How to deal with ObjectContext?

    As Danny explains in a recent post, ObjectContext provides a number of important services to entity instances through their lifecycle, and so it is generally a good idea to keep a living ObjectContext around and to keep your entity instances (at least the ones you expect to change) attached to it.

    There are a few approaches that would work:

    1. Encapsulate ObjectContext in your DAL component.
    2. Pass an ObjectContext instance in each method invocation to your DAL component.
    3. Maintain some kind of singleton instance available that all the code can share.

    For the mocking implementation, it is possible to initialize a context with a “metadata-only” EntityConnection:

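    // Metadata-only connection: mapping files and provider are named, but
    // there is no "provider connection string", so no database to reach.
    var conn = new EntityConnection(
        @"metadata=NW.csdl|NW.ssdl|NW.msl;
        provider=System.Data.SqlClient;");
    var context = new NorthwindEntities(conn);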

    This will provide enough information for all but the persistence related functions of ObjectContext to work.

    One common concern about keeping an ObjectContext around is that it will keep a database connection alive too. However, ObjectContext contains connection management logic that automatically opens the connection when it is needed and then closes it as soon as it is not being used.

    What about CUD operations and SaveChanges()?

    Besides providing a launch point for queries, ObjectContext implements the Unit of Work pattern for EF. Most of the behavioral differences that result from having your entities attached to an ObjectContext only show up when you perform CUD operations (Insert, Update or Delete) or invoke the SaveChanges() method. This is when changes are tracked and saved, and it is also when concurrency control is enforced.

    Invoking AddObject() or DeleteObject(), or changing property values on your entities, from within your test cases should work without changes.

    In order for the mock DAL component not to hit the database every time SaveChanges() is invoked, we should redirect SaveChanges() to AcceptAllChanges().
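
    One way to sketch that redirection is to put the commit step behind a small interface of your own. IUnitOfWork, DatabaseUnitOfWork and MockUnitOfWork are hypothetical names, not EF types. Only ObjectContext, SaveChanges() and AcceptAllChanges() come from the Entity Framework, and the code needs a reference to System.Data.Entity.

```csharp
using System.Data.Objects;

// Hypothetical commit boundary owned by the DAL component.
public interface IUnitOfWork
{
    void Commit();
}

// Real implementation: persists tracked changes to the database.
public class DatabaseUnitOfWork : IUnitOfWork
{
    private readonly ObjectContext context;

    public DatabaseUnitOfWork(ObjectContext context)
    {
        this.context = context;
    }

    public void Commit()
    {
        context.SaveChanges();
    }
}

// Mock implementation: marks tracked changes as accepted without
// sending anything to the store.
public class MockUnitOfWork : IUnitOfWork
{
    private readonly ObjectContext context;

    public MockUnitOfWork(ObjectContext context)
    {
        this.context = context;
    }

    public void Commit()
    {
        context.AcceptAllChanges();
    }
}
```

    Business logic and tests then call Commit() and never call SaveChanges() directly, so swapping the implementation is all it takes to keep a test run away from the database.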

    Most operations will work as expected whether the ObjectContext is fully connected or “metadata-only”. But to make things more complicated, there are some additional side effects we need to take care of:

    • SaveChanges() may trigger the refresh of store generated values.
    • EntityKeys on entities and EntityReferences may have different values after SaveChanges().

    To mitigate these issues, no code outside your persistence layer should rely on those side effects. A simple rule of thumb that satisfies this requirement is to start anew with a fresh ObjectContext every time you finish your unit of work.

    Also, EntityKeys should be dealt with only in persistence code or serialization code, not in business logic.

    Conclusion?

    It is actually premature to use the word “conclusion”. Mixing EF and TDD in the same pan is something I am only starting to think about. This is a set of scenarios that I want to see among our priorities for future versions.

    In order to come to a real conclusion, I need to at least develop a sample application in which I apply and distill the approaches I am suggesting in this post. I hope I will find the time to do it soon.

  • Diego Vega

    Entity Framework Extensions (EFExtensions) Project available in CodeGallery


    When I announced the start of the Entity Framework Toolkits & Extensions section in CodeGallery, Colin already had a big chunk of what he is now making available in the works. And so I had it in my mind when I defined the Entity Framework Toolkits & Extensions as a collection of source code and tools to augment EF's capabilities and extend its reach to new scenarios.

    It took Colin 2 months to get some free time (readying a product for release is no easy task), to write some extra functionality (the custom materializer was first introduced a couple of weeks ago) and to get his code properly reviewed, etc.

    I recommend that you download EFExtensions from the project page at CodeGallery, and then enjoy Colin's first blog post explaining some of the things the current EFExtensions are good for.

    By the way, one of my favorite parts of the project is the EntitySet class and its GetTrackedEntities() method :)

    Alex already introduced Colin as one super smart colleague. In fact, I cannot stress enough how smart Colin is. He is the uber developer. But I must add that intelligence is not his only quality!

    Please, send us feedback on this. The most straightforward way is to use the Discussion tool in CodeGallery, but feel free to use the email links in our blogs.

    And expect some really cool new EF Tools & Extensions from Colin and other members of the team. I know what I am talking about! :)

  • Diego Vega

    Then, should I write a data access layer or not?


    Danny and I appear to be giving inconsistent advice in this regard in our recent weekend posts.

    In reality, I think we had different scenarios in mind.

    Danny is talking about the general case, and he is absolutely right that the benefits of creating an encapsulated data access layer have diminished dramatically because of the Entity Framework. EF now provides a complete abstraction layer that isolates application code from the store and from schema differences.

    For many applications, ObjectContext is going to be the data access layer.

    But in my opinion, the benefits of the TDD approach and of Persistence Ignorance are reasons you may still want to go the extra mile. Also, there is an argument for avoiding having code that depends on a certain persistence technology all over the place.

    As for whether the guidelines I am suggesting are enough, I would say they are a work in progress. Roger already noticed some inconsistencies in them (I wish I knew what the inconsistencies are in his opinion!).

    Moreover, Danny and I have participated in some conversations on ways to "have your cake and eat it too" when it comes to seamless use of TDD on EF.

    Edit: Added some context and corrections.

  • Diego Vega

    A different kind of sample


    Samir, a developer in the Data Programmability Team, started blogging today.

    He also published a sample application with a unique feature: It can switch between Entity Framework and LINQ to SQL for persistence. He actually uses a Strategy Pattern (my beloved one) to isolate the business logic from the persistence concern. He describes how it works here.

    If that is not novel enough, his application is a graphics editor named SketchPad...

    If you haven't already, look for other samples and extensions in the ADO.NET Entity Framework and LINQ to Relational Data Portal in CodeGallery.

    The people I work with never cease to amaze me!
