Diego Vega

Entity Framework news and sporadic epiphanies

  • Diego Vega

    Stretching myself on the wrong axis


    Something you need to learn as a Program Manager at Microsoft is how to scale. This means that you need to drive issues, multitask, excel at doing it, choose your fights, etc.

    Last week I tried a different approach that kind of worked when I was younger: stretching on the time axis. I found that it doesn't work for me as well as it used to.

    So, this is my word of advice: if you build up a big backlog and you are executing below your expectations, don't stop sleeping. Two reasons:

    1. The more you sleep, the clearer your mind is when you are awake.

    2. The real problem is that you are doing something wrong: either you are spending too much time solving the wrong problems, or the expectations are too high.

    So, if you are lucky enough to work in a place like Microsoft (with thousands of talented people around), do yourself a favor: Raise your hand, ask for help.

  • Diego Vega

    EFContrib: An Entity Framework Community Contribution Project


    I got the news today that Ruurd Boeke, a member of the developer community, has created an Entity Framework Contrib project in CodePlex. The project home is:


    The initial goal sounds like a good idea, but overall I am just very happy to see an EF Contrib project starting, and I hope it will be very successful and, why not, famous :)

    Evidently, I cannot speak for the owner of the project, but since in his own blog post he is inviting people to contact him or leave comments:

    If you are another member of the community and you planned or wished to contribute your ideas on how to extend the Entity Framework capabilities in such a community project, I encourage you to check it out and contact Ruurd and see what happens.

  • Diego Vega

    Then, should I write a data access layer or not?


    Danny and I appear to be giving inconsistent advice in this regard in our recent weekend posts:

    In reality, I think we had different scenarios in mind.

    Danny is talking about the general case, and he is absolutely right that the benefits of creating an encapsulated data access layer have diminished dramatically because of the Entity Framework. EF now provides a complete abstraction layer that isolates application code from the store and from schema differences.

    For many applications, ObjectContext is going to be the data access layer.

    But in my opinion, the benefits of the TDD approach and of Persistence Ignorance are reasons you may still want to go the extra mile. Also, there is an argument for avoiding having code that depends on a certain persistence technology all over the place.

    Whether the guidelines I am suggesting are enough, I would say it is a work in progress. Roger already noticed some inconsistencies in them (I wish I knew what the inconsistencies are in his opinion!).

    Moreover, Danny and I have participated in some conversations on ways to "have your cake and eat it too" when it comes to seamless use of TDD on EF.

    Edit: Added some context and corrections.

  • Diego Vega

    A different kind of sample


    Samir, a developer in the Data Programmability Team, started blogging today.

    He also published a sample application with a unique feature: It can switch between Entity Framework and LINQ to SQL for persistence. He actually uses a Strategy Pattern (my beloved one) to isolate the business logic from the persistence concern. He describes how it works here.
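
    As a rough illustration of the Strategy Pattern in this role, here is a minimal sketch. All type names are hypothetical and the two stores are in-memory stand-ins; none of this is taken from Samir's SketchPad sample.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names throughout: none of these types come from the SketchPad sample.
public class Shape
{
    public int Id { get; set; }
    public string Kind { get; set; }
}

// The strategy interface: business logic depends only on this contract.
public interface IShapeStore
{
    string Name { get; }
    void Save(Shape shape);
    IList<Shape> LoadAll();
}

// Stand-in for an Entity Framework-backed strategy (kept in memory here).
public class EntityFrameworkShapeStore : IShapeStore
{
    private readonly List<Shape> _shapes = new List<Shape>();
    public string Name { get { return "Entity Framework"; } }
    public void Save(Shape shape) { _shapes.Add(shape); }
    public IList<Shape> LoadAll() { return _shapes; }
}

// Stand-in for a LINQ to SQL-backed strategy (kept in memory here).
public class LinqToSqlShapeStore : IShapeStore
{
    private readonly List<Shape> _shapes = new List<Shape>();
    public string Name { get { return "LINQ to SQL"; } }
    public void Save(Shape shape) { _shapes.Add(shape); }
    public IList<Shape> LoadAll() { return _shapes; }
}

// Business logic that never names a persistence technology.
public class Drawing
{
    private readonly IShapeStore _store;

    public Drawing(IShapeStore store) { _store = store; }

    // Adds a shape and returns the identifier assigned to it.
    public int AddShape(string kind)
    {
        var shape = new Shape { Id = _store.LoadAll().Count + 1, Kind = kind };
        _store.Save(shape);
        return shape.Id;
    }
}
```

    Swapping persistence technologies then amounts to passing a different IShapeStore to the Drawing constructor; the Drawing class itself does not change.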

    If that is not novel enough, his application is a graphics editor named SketchPad...

    If you haven't already, look for other samples and extensions in the ADO.NET Entity Framework and LINQ to Relational Data Portal in CodeGallery.

    The people I work with never cease to amaze me!

  • Diego Vega

    Entity Framework Extensions (EFExtensions) Project available in CodeGallery


    When I announced the start of the Entity Framework Toolkits & Extensions section in CodeGallery, Colin already had a big chunk of what he is now making available in the works. And so I had it in my mind when I defined the Entity Framework Toolkits & Extensions as a collection of source code and tools to augment EF's capabilities and extend its reach to new scenarios.

    It took Colin 2 months to get some free time (readying a product for release is no easy task), to write some extra functionality (the custom materializer was first introduced a couple of weeks ago) and to get his code properly reviewed, etc.

    I recommend that you download EFExtensions from the project page at CodeGallery, and then enjoy Colin's first blog post explaining some of the stuff the current EFExtensions are good for.

    By the way, one of my favorite parts of the project is the EntitySet class and its GetTrackedEntities() method :)

    Alex already introduced Colin as one super smart colleague. In fact, I cannot stress enough how smart Colin is. He is the uber developer. But I must add that intelligence is not his only quality!

    Please, send us feedback on this. The most straightforward way is to use the Discussion tool in CodeGallery, but feel free to use the email links in our blogs.

    And expect some really cool new EF Tools & Extensions from Colin and other members of the team. I know what I am talking about! :)

  • Diego Vega

    Lazy loading in Entity Framework


    Recently, I wrote this little article that got published in the new Insights sidebar in MSDN Magazine. In it, I mention one of the fundamental tenets of ADO.NET: 

    *Network roundtrips should not be hidden from the developer*

    But guess what... It is not always the case that there is a network (or even a process boundary) between your application and your database. Also, there are many scenarios in which you know that most of your data seldom changes or that you don't care if things change a bit while your application is running (think of a cache). In those circumstances, implicit lazy loading just makes sense.

    We have been sending out the message that you can get implicit lazy loading by changing the standard code generation process in Entity Framework.
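
    To make the idea of implicit lazy loading concrete, here is a minimal sketch. It assumes hypothetical Customer and Order classes and uses Lazy<T> with a loader delegate as a stand-in for the query that would fetch related rows; it is not the code Entity Framework generates.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration; not code generated by Entity Framework.
public class Order
{
    public int Id { get; set; }
}

public class Customer
{
    private readonly Lazy<List<Order>> _orders;

    // The loader delegate stands in for the query that would fetch related rows.
    public Customer(Func<List<Order>> loadOrders)
    {
        _orders = new Lazy<List<Order>>(loadOrders);
    }

    // Implicit lazy loading: the first read of this property triggers the load.
    public List<Order> Orders { get { return _orders.Value; } }
}

public static class LazyLoadingDemo
{
    // Reads Orders the given number of times and returns how often the loader ran.
    public static int CountLoads(int reads)
    {
        int loads = 0;
        var customer = new Customer(() =>
        {
            loads++; // each increment represents one hidden roundtrip
            return new List<Order> { new Order { Id = 1 } };
        });

        for (int i = 0; i < reads; i++)
        {
            if (customer.Orders.Count != 1)
            {
                throw new InvalidOperationException("Unexpected order count.");
            }
        }
        return loads;
    }
}
```

    The caller never asks for a load explicitly, which is exactly the convenience and the hidden cost discussed above: the roundtrip happens on first access, and later reads reuse the cached list.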

    My colleague Jarek went much further and created an experimental set of entity classes that completely replaces the default code-generated classes in EF. And his implementation actually includes some very cool ideas that go beyond lazy loading.

    Take a look at his post. And you can find the bits (compatible with the just released Visual Studio 2008 SP1 Beta) from our portal in Code Gallery.

    Update: Just wanted to add some relevant links to customers asking for lazy loading in EF:

  • Diego Vega

    Exposing EDM and database server functions to LINQ


    Alex published today a description Colin and I wrote on a new feature the team has been working on for LINQ to Entities.

    Beyond all technicalities, it is a very simple and attribute-based way of exposing any arbitrary server-side function to LINQ. It goes beyond what LINQ to SQL does with SqlMethods and it leverages our metadata system so that you don't have to specify the full mapping of parameters in the attribute.

    The post itself may be a little boring ;), but the scenarios it enables are quite impressive.

    Read more here.

    Update: I remembered today that Kati Dimitrova and Sheetal Gupta also contributed to the document.

  • Diego Vega

    Beth Massi on Entity Framework + WPF


    I haven’t met Beth in person but I noticed her awesome blog posts and videos focused on using Entity Framework with WPF. Very useful stuff!

  • Diego Vega

    Server queries and identity resolution


    I answered a Connect issue today that deals with a very common expectation for users of systems like Entity Framework and LINQ to SQL. The issue was something like this:

    When I run a query, I expect entities that I have added to the context and that are still not saved but match the predicate of the query to show up in the results.

    Reality is that Entity Framework queries are always server queries: all queries, LINQ or Entity SQL based, are translated to the database server’s native query language and then evaluated exclusively on the server.

    Note: LINQ to SQL actually relaxes this principle in two ways:

    1. Identity-based queries are resolved against the local identity map. For instance, the following query does not need to hit the data store if the customer is already in the identity map:

    var customer = context.Customers
        .Single(c => c.CustomerID == "ALFKI");

    2. The outermost projection of the query is evaluated on the client. For instance, the following query will create a server query that projects CustomerID and will invoke a client-side WriteLineAndReturn method as code iterates through results:

    var q = context.Customers
        .Select(c => WriteLineAndReturn(c.CustomerID));

    But this does not affect the behavior explained in this post.

    In sum, Entity Framework does not include a client-side or hybrid query processor.

    MergeOption and Identity resolution

    Chances are that you have seen unsaved modifications in entities included in the results of queries. This is due to the fact that for tracked queries (i.e. if the query’s MergeOption is set to a value different from NoTracking) Entity Framework performs “identity resolution”.

    The process can be simply explained like this:

    1. The identity of each incoming entity is determined by building the corresponding EntityKey.
    2. The ObjectStateManager is searched for an entity already present that has a matching EntityKey.
    3. If an entity with the same identity is already being tracked, the data coming from the server and the data already in the state manager are merged according to the MergeOption of the query.
    4. In the default case, MergeOption is AppendOnly, which means that the data of the entity in the state manager is left intact and is returned as part of the query results.
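
    The steps above can be modeled with a small in-memory sketch. TinyStateManager and its members are hypothetical names, a plain integer stands in for EntityKey, and only AppendOnly-style behavior is modeled; this is not Entity Framework's actual implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of the steps above; not Entity Framework's real implementation.
public class Customer
{
    public int Id { get; set; }
    public string LastName { get; set; }
}

public class TinyStateManager
{
    // Tracked entities keyed by identity (an int stands in for EntityKey).
    private readonly Dictionary<int, Customer> _tracked = new Dictionary<int, Customer>();

    public void Track(Customer customer) { _tracked[customer.Id] = customer; }

    // AppendOnly-style merge: a tracked instance wins over incoming server data.
    public List<Customer> Materialize(IEnumerable<Customer> serverRows)
    {
        var results = new List<Customer>();
        foreach (var row in serverRows)
        {
            Customer existing;
            if (_tracked.TryGetValue(row.Id, out existing))
            {
                results.Add(existing);      // already tracked: local state is left intact
            }
            else
            {
                _tracked.Add(row.Id, row);  // not tracked yet: start tracking the new row
                results.Add(row);
            }
        }
        return results;
    }
}

public static class IdentityResolutionDemo
{
    // Returns the LastName seen in the results for a locally modified customer.
    public static string Run()
    {
        var manager = new TinyStateManager();
        var local = new Customer { Id = 1, LastName = "Zebra" }; // unsaved local edit
        manager.Track(local);

        // The server still holds the old value for the same identity.
        var serverRows = new[] { new Customer { Id = 1, LastName = "Tiger" } };
        var results = manager.Materialize(serverRows);
        return results[0].LastName;
    }
}
```

    In this model the row comes back because the server says it matches, but the instance handed to the caller is the tracked one, so the unsaved local value is what the caller sees.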

    However, membership of an entity in the results of a given query is decided exclusively based on the state existing on the server. In this example, for instance, what will the query return?

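    // Assumed setup: customers 1 ("Tiger") and 2 ("Zombie") already exist in the database.
    var customer1 = Customer.CreateCustomer(1, "Tiger");
    var customer2 = Customer.CreateCustomer(2, "Zombie");
    context.AttachTo("Customers", customer1);
    context.AttachTo("Customers", customer2);
    customer1.LastName = "Zebra";               // modified locally, not saved
    context.DeleteObject(customer2);            // deleted locally, not saved
    var customer3 = Customer.CreateCustomer(100, "Zorro");
    context.AddObject("Customers", customer3);  // added locally, not saved
    var customerQuery = context.Customers
        .Where(c => c.LastName.StartsWith("Z"));
    foreach (var customer in customerQuery)
    {
        // Print whatever the server-side query returned.
        Console.WriteLine(customer.LastName);
    }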

    The answer is:

    1. The modified entity customer1 won’t show up in the query because its LastName is still Tiger on the database.
    2. The deleted entity customer2 will be returned by the query, although it is a deleted entity already, because it still exists in the database.
    3. The new entity customer3 won’t make it, because it only exists in the local ObjectStateManager and not in the database.

    This behavior is by design and you need to be aware of it when writing your application.

    Put another way, if the units of work in your application follow a pattern in which they query first, then make modifications to entities and finally save them, discrepancies between query results and the contents of the ObjectStateManager cannot be observed.

    But as soon as queries are interleaved with modifications, there is a chance that the state manager contains an entity that does not yet exist on the server and that would match the predicate of the query. Those entities won’t be returned as part of the query.

    Notice that the chance of this happening depends on how long-lived the Unit of Work is in your application (i.e. how long it takes from the initial query to the call to SaveChanges).

    Hope this helps,

  • Diego Vega

    Third post about POCO, first post about Code Only


    It is always busy here with all the improvements we are doing in Entity Framework to make your code work better with it. That is why I haven’t been posting to my blog much in the last months. Today however, there are two important posts from people that sit very close to me, so I am going to link to them.

    Faisal posted the third part in a series on the POCO experience with EF4. His post delves into the details of how snapshot change tracking compares with notification based change tracking and on some of the API considerations for it.

    Alex, who sits in my office (although he likes to think I sit in his :)) made the first post about the Code Only experience we are working on. I like to think of Code Only as “POCO on steroids”, because it not only gives you the right level of decoupling between your domain classes and the persistence framework, but it also puts mapping artifacts out of the way. I am especially fond of the way you can customize mapping using LINQ queries, although that feature is not going to be included in the first preview.

    Please go read the posts, play with the bits (you will need to wait a few weeks to play with code-only) and tell us what you think!

  • Diego Vega

    Entity Framework and Data Services Teams are Hiring


    Just a quick note on this: Our team is hiring!

    If you think you have the skills and the will to improve how developers around the world deal with data in their applications, then this is a great opportunity to be in the forefront of the industry and also to become part of a nice group of geeks :)

    Click here and here to read the descriptions of the positions available, both in the role of Software Design Engineer in Test.

  • Diego Vega

    What would you like to see in Entity Framework vNext?


    With Visual Studio 2010 and .NET 4.0 very close to RTM, many of us in the team are spending more and more time brainstorming about the features and experiences that we would like to include in the next release of EF. I don’t think I need to tell you how exciting that is :)

    During the development of the first two versions, one of the main sources of customer feedback has been the bugs and suggestions in Microsoft Connect. Up until now, whenever you filed a bug in Microsoft Connect for Entity Framework, it would typically take a couple of days for it to be routed to our own area in our internal TFS database. But today I heard the good news that we are getting our own page in Microsoft Connect!

    This will not only make our feedback channel more agile, but over time it will also make it possible for you to find all the feedback related to Entity Framework in a single place, and more easily vote for the features and capabilities that you care the most about.

    Looking forward to hearing from you!


  • Diego Vega

    Standard generated entity classes in EF4


    A customer recently asked if there is still any advantage in using the entities that Entity Framework 4 generates by default instead of POCO classes.

    Another way to look at this is: why are non-POCO classes that inherit from System.Data.Objects.DataClasses.EntityObject and use all sorts of attributes to specify the mapping of properties and relationships still the default in EF4?

    This perspective makes the question more interesting for me, especially given that a great portion of the investment we made in EF 4 went into adding support for Persistence Ignorance, and also given that using POCO is my personal preference, and I am aware of all the advantages this has for the evolvability and testability of my code.

    So, let’s take a closer look and see what we find.

    Moving from previous versions

    If you simply diff the entity code generated using the first version with what we generate nowadays, the first thing you will notice is that they haven’t changed much.

    The fact that there aren’t many changes is actually a nice feature for people moving from the previous version to the new version. If you started your project with Visual Studio 2008 SP1 and now you decide to move it to Visual Studio 2010 (i.e. the current beta), it is a good thing that you don’t have to touch your code to get your application running again.

    It is worth mentioning that many of the improvements in the new version of EF (e.g. lazy loading) were designed to work with all kinds of entities, so they didn’t really require changes to the code we generate.

    Even if you later decide to regenerate your model to take advantage of new features (e.g. singularization and foreign key support), you might need to do some renaming, and some things may be simplified, but most of what your code does will remain the same.

    New code generation engine

    As soon as you look under the hood though, you will notice that we actually changed the whole code generation story to be based on T4 templates. This opens lots of possibilities, from having our customers customize the code to suit their needs, to having us release new templates for entity types optimized for particular scenarios. This last idea is exemplified in the work we have been doing in the POCO template and the Self-Tracking Entities Template included in the Feature CTP 1.

    At this point, we don't have plans to include templates for generating entities of other kinds in Visual Studio 2010, so the default, EntityObject-based template is the only one that is included “in the box”.

    Update: The Self-Tracking Entities Template will also be in the box in RTM of Visual Studio 2010. Current thinking about the POCO Template is that it's going to be available as an add-in in the Visual Studio Extension Manager.

    Change tracking and relationship alignment

    It is also important that default entities enjoy the highest level of functionality in Entity Framework. To begin with, they participate in notification-based change tracking, which is the most efficient. Also, navigation properties on default entities are backed by the same data structures Entity Framework uses to maintain information about relationships, meaning that any change you make is reflected immediately on the navigation properties on both sides.

    By comparison, plain POCO objects do not notify Entity Framework of changes on them, and relationships are usually represented by plain object references and collections that are not synchronized automatically. To work well with POCO, Entity Framework needs to compare snapshots of property values and reconcile changes in linked navigation properties at certain points of a transaction. To that end, we introduced a new DetectChanges method that allows user code to control explicitly when that change detection happens, and we also added an implicit call to it in SaveChanges.
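
    Snapshot-based change detection can be sketched in a few lines. The SnapshotTracker class below is hypothetical and only compares scalar property values through reflection; it is an illustration of the technique, not Entity Framework's actual code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of snapshot comparison; not Entity Framework's real code.
public class Product
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public class SnapshotTracker
{
    private readonly Product _entity;
    private readonly Dictionary<string, object> _snapshot;

    // Takes the snapshot of original values when tracking starts.
    public SnapshotTracker(Product entity)
    {
        _entity = entity;
        _snapshot = Capture(entity);
    }

    // Copies every public property value into a name/value dictionary.
    private static Dictionary<string, object> Capture(Product product)
    {
        var values = new Dictionary<string, object>();
        foreach (var property in typeof(Product).GetProperties())
        {
            values[property.Name] = property.GetValue(product, null);
        }
        return values;
    }

    // Compares current values with the snapshot and returns the changed property names.
    public List<string> DetectChanges()
    {
        var changed = new List<string>();
        foreach (var pair in Capture(_entity))
        {
            if (!object.Equals(pair.Value, _snapshot[pair.Key]))
            {
                changed.Add(pair.Key);
            }
        }
        return changed;
    }
}
```

    Nothing is noticed until DetectChanges runs: the POCO itself stays free of any tracking code, at the price of a comparison pass over every tracked property.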

    As an alternative, we also introduced POCO Proxies, which inject most of the change tracking and relationship management capabilities of default entities into POCO types by means of inheritance. Basically, these proxies are only created if you make all properties virtual in the POCO class and, when you need to create a new instance, you invoke the new ObjectContext.CreateObject<T> method.
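
    The reason the properties need to be virtual is easier to see with a hand-written stand-in for a proxy. In this sketch PersonProxy and TinyContext are hypothetical names, and the proxy is written by hand, whereas Entity Framework would generate the derived type at runtime.

```csharp
using System;
using System.Collections.Generic;

// A plain POCO: the virtual property lets a derived proxy intercept writes.
public class Person
{
    public virtual string Name { get; set; }
}

// Hand-written stand-in for a proxy type that would be generated at runtime.
public class PersonProxy : Person
{
    private readonly List<string> _changedProperties;

    public PersonProxy(List<string> changedProperties)
    {
        _changedProperties = changedProperties;
    }

    public override string Name
    {
        get { return base.Name; }
        set
        {
            base.Name = value;
            _changedProperties.Add("Name"); // notify the tracker immediately
        }
    }
}

// Hypothetical factory playing the role that CreateObject<T> plays in the text.
public class TinyContext
{
    public readonly List<string> ChangedProperties = new List<string>();

    // Returns the proxy type, typed as the POCO, so changes are reported as they happen.
    public Person CreatePerson()
    {
        return new PersonProxy(ChangedProperties);
    }
}
```

    An instance created with "new Person()" bypasses the proxy, so nothing is notified; that is why new instances have to come from the factory for notification-based tracking to apply.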

    Again, why is non-POCO still the default?

    To summarize:

    a. Default code-gen classes provide the easiest path for people moving from the previous version

    b. When creating a model from scratch or from the database, you don’t even need to write the code for the entities themselves

    c. You never need to worry about invoking DetectChanges or about making sure your code always uses POCO Proxies

    d. Finally, if you really care the most about writing entities yourself, we make it very easy for you to opt-out of code generation and start writing your own POCO classes.

    I hope this information is useful. So, now what kind of entity classes are you going to use?

  • Diego Vega

    Colin explains a simple LINQ to Relational materializer


    Just a short note about this: You can find his article here. I had the chance to see his presentation before he went to DevConnections in Orlando. Very much recommended stuff!

  • Diego Vega

    Entity Framework Extensions Project Update


    Just a couple of links:

    Colin posted a refresh today that is compatible with .NET 3.5 SP1 Beta and includes some optimizations for the materializer using dynamic methods. Here is his post about it.

  • Diego Vega

    EntityDataSource's flattening of complex type properties


    I explained a few days ago the rules of wrapping in this blog post. But why do we wrap after all?

    Julie asked for some details today in the forums. I think the answer is worth a blog post.

    In ASP.NET there are different ways of specifying which property a databound control binds to: Eval(), Bind(), BoundField.DataField, ListControl.DataTextField, etc. In general, they behave differently.

    The flattening we did on EntityDataSource is an attempt to make the properties that are exposed by EDM entities available for 2-way databinding in most of those cases.

    For instance, for a customer that has a complex property of type Address, we provide a property descriptor for customer.Address, and also for customer.Address.Street, customer.Address.Number, etc.

    At runtime, when a control binds to Eval(“Address.Street”) on a customer, Eval will use the property descriptor corresponding to Address and then drill down into it to extract the value of its Street property.

    A grid column of a BoundField-derived type with DataField = “Address.Street” will work differently: it will just look in the data item for a property descriptor named “Address.Street”. In fact, EntityDataSource is the first DataSource control that I know of that provides such a thing.

    Bind(“Address.Street”) will work in a similar fashion to Eval() when reading the properties into the data bound control, but will act a little bit more like BoundField when sending back changes to the DataSource.
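
    The difference between drilling down and looking up a flattened name can be sketched as follows. The Flattening helper and the entity classes are hypothetical, and a dictionary stands in for the set of property descriptors; this is not how EntityDataSource is implemented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types; a sketch of the idea, not EntityDataSource's implementation.
public class Address
{
    public string Street { get; set; }
    public int Number { get; set; }
}

public class Customer
{
    public string Name { get; set; }
    public Address Address { get; set; }
}

public static class Flattening
{
    // Eval-style access: walk the dotted path one property at a time.
    public static object DrillDown(object item, string path)
    {
        object current = item;
        foreach (var part in path.Split('.'))
        {
            current = current.GetType().GetProperty(part).GetValue(current, null);
        }
        return current;
    }

    // BoundField-style access expects a single entry literally named "Address.Street",
    // so the complex property is flattened ahead of time.
    public static Dictionary<string, object> Flatten(Customer customer)
    {
        var flat = new Dictionary<string, object>();
        flat["Name"] = customer.Name;
        flat["Address"] = customer.Address;
        foreach (var property in typeof(Address).GetProperties())
        {
            flat["Address." + property.Name] = property.GetValue(customer.Address, null);
        }
        return flat;
    }
}
```

    Exposing both the "Address" entry and the "Address.Street" style entries is what lets either lookup style find what it expects on the same data item.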

    There are a few cases in which the behavior is not any of the above and hence you end up with a control that cannot have access to a complex type’s properties. You can expect us to work closely with the ASP.NET team in making the experience smoother in future versions. But for the time being, what you can do is create an explicit projection of the properties. For instance, in Entity SQL:

    SELECT c.ContactId, c.Address.Street AS Street
    FROM   Northwind.Customers AS c

    I think it is worth mentioning:

    • Remember that flattening of complex properties only happens under certain conditions (see wrapping).
    • We worked very closely with the ASP.NET Dynamic Data team in this release to enable their technology to work with EDM through the EntityDataSource. I think it is well worth trying.

    Hope this helps.

  • Diego Vega

    Entity Framework Sample Provider Updated for SP1 Beta


    Just to get the news out: The updated version of the Entity Framework Sample Provider that is compatible with .NET 3.5 SP1 Beta is now available in our Code Gallery page. From the description:

    The Sample Provider wraps System.Data.SqlClient and demonstrates the new functionality an ADO.NET Provider needs to implement in order to support the ADO.NET Entity Framework

      • Provider Manifest
      • EDM Mapping for Schema Information
      • SQL Generation

    Update: for more details on provider API changes since the Beta3 release, you can read Jarek's post here.

  • Diego Vega

    Sample Entity Framework Provider for Oracle now Available


    This new sample builds on top of System.Data.OracleClient and showcases some techniques a provider writer targeting databases different from SQL Server can use.

    The code is not meant for production; it is just a sample directed at provider writers. It also has a few limitations related both to the SP1 beta bits and to types not supported in OracleClient.

    For more details, read Jarek's post.

    You can download the source code from our home page in Code Gallery.
