Diego Vega

Entity Framework news and sporadic epiphanies

November, 2007

Posts
  • Diego Vega

    Entity SQL Non-Quirkiness

    • 8 Comments

    Zlatko has been posting about one LINQ to Entities new feature included in the upcoming Beta 3, so I decided to take revenge and make a 100% Entity SQL post. Here is something I ran against the other day:

    Let's assume we need to retrieve the Order with the maximum OrderID, which is a really awful way to get the ID of the order you just inserted! :)

    In your everyday store-specific SQL, you can use a MAX() aggregate function in a subquery as a WHERE predicate. In Transact SQL, it should look like this:

    SELECT *
    FROM   Products AS p
    WHERE
      p.ProductID =(SELECT MAX(p2.ProductID
                        
    FROM   Products as p2);

    So far, so good. If you have been playing a little with Entity SQL, you will probably guess how the equivalent Entity SQL would look like:

    SELECT VALUE p
    FROM   Products AS p
    WHERE  p.ProductID =(SELECT MAX(p2.ProductID
                        
    FROM   Products as p2);

    But if you run this query, what you get is this interesting exception:

    System.Data.QueryException: Argument types 'Edm.Int32' and 'Transient.collection[Transient.rowtype[(_##groupAggMax2,Edm.Int32(Nullable=True,DefaultValue=))](Nullable=True,DefaultValue=)]' are incompatible for this operation, near WHERE predicate, line 1, column 60.

    The subquery is actually returning a Transient.collection of a Transient.rowtype... Those are internal things, so for illustration purposes, let's turn to the LINQ perspective of life:

    var query = from p in context.Products
               
    select new { p.ProductID };

    int
    productID = query;

     

    (Argh, this post is no longer 100% Entity SQL!)

    No surprisingly, what you get is a compile-time exception:

    Cannot implicitly convert type 'System.Linq.IQueryable<AnonymousType#1>' to 'int'.

    Both exceptions are homologous, and for a text-based query language, Entity SQL happens to be very type-safe at its core. Standard SQL makes the basic assumption that it is ok to implicitly convert single-item collections of single-column projections to discrete scalars. We don't.

    The basic theme in Version 1.0 of the Entity Framework is to build a solid foundation for the future. As a consequence, one thing we avoid doing is "magic" type conversions except when they make perfect sense (think union of projection queries with exactly the same shape). The motive: magic conversions tend to mine the stability and composability of the language.

    That said, this buys us freedom to hand-pick certain implicit behavior in the future, if we find enough feedback and proof that it makes sense.

    That's enough on the rationale. Now, how do I make it work? There are two approaches.

    First:

    SELECT VALUE p
    FROM
       Products AS p
    WHERE
      p.ProductID = MAX(SELECT VALUE p2.ProductID
                            
    FROM   Products AS p2);

    This one works because:

    a) The SELECT VALUE returns the scalar itself, instead of a projection (rowtype) of the scalar.

    b) MAX() operates on the collection of scalars returned by the subquery, returning a single maximum value that will be directly comparable (same type) as ProductID.

    Second:

    SELECT VALUE p
    FROM
       Products AS p
    WHERE
      p.ProductID = ANYELEMENT(
              
    SELECT VALUE MAX(p2.ProductID)  
               FROM   Products AS p2);

    This works because:

    a) The subquery will return a single-item collection of a scalar value.

    b) ANYELEMENT will retrieve a single element (in this case, the only one) contained in the collection. That element will be directly comparable with ProductID.

    In case you are wondering now how efficient this is, don't worry. Entity SQL is still a functional language. So, while understanding the type reasoning is interesting and useful, these queries still express "what you want to get" rather than "how you want the job done".

    As a matter of fact, with our current SqlClient implementation, these queries will be translated to some simple, yet unexpected Transact-SQL. But I'll leave that to you as an exercise...

  • Diego Vega

    Choosing an Entity Framework API

    • 1 Comments

    Last month, a question was asked in the ADO.NET Prerelease forum that went more or less like this:

    Considering that there are many APIs you can use (Entity SQL, ObjectQuery<T>, LINQ to Entities), is there any guidance that could help me decide when to use each?

    The best I could do based on my knowledge at the time:

    It is matter of taste.

    While my answer was partially correct and had the great quality of being easy to look at, I immediately realized I should do a better job in helping people choose the appropriate API for each of their scenarios.

    I won’t pretend here to give the definitive and detailed answer, just a head start. You will find more information in our docs and I am sure this topic alone will easily fill a few chapters in upcoming books about the product.

    Service Layers and Query languages

    We basically support two distinct programming layers and two different query languages your applications can use:

    Service layers and query languages supported

    Query language

     

    Entity SQL

    LINQ Queries

    Service layer

    Entity Services

    Yes

     

    Object Services

    Yes

    Yes

    For those coming from the Object/Relational Mapping world, one easy way to look at our stack is to understand that we have two mapping tools layered one on top of the other:

    1. An Entity/Relational Mapper known as Entity Services.
    2. An Object/Entity Mapper named Object Services.

    Of course, once you have mapped your relational tables to entities and your entities to objects, what you get is a fully functional O/R Mapper.

    But as it is usual in our profession, adding a level of indirection uncovers a lot of power and flexibility :)

    First Service Layer: Entity Services

    The public surface of this layer is the EntityClient component, which is a new type of ADO.NET provider that gives you access to a store agnostic entity-relationship model of your data called Entity Data Model (EDM), and decouples your code from the store specific relational model that lives underneath.

    Besides a pair of new classes, the EntityClient contains most of the same types as previous providers: Connection, Command, DataReader, Parameter, Adapter, Transactions and a ProviderFactory.

    To be able to use this layer, you typically need three elements:

    1. ADO.NET provider that is specific to your database engine and has been extended to work with the Entity Framework. Basically, the extensions involve the inclusion of a detailed provider manifest,support for command objects consisting of command trees and the ability to generate store specific SQL from those command trees. An appropriate provider for SQL Server will be included with the Entity Framework, and various provider writers are working right now to give you access to non-Microsoft relational databases.
    2. Mapping information in the form of SSDL, CSDL, and MSL files that describe your storage model, your application’s conceptual model and the mapping among the two. More recently we have added EDMX, a format that packages all the mapping information in a single file at design-time.
    3. Queries expressed in Entity SQL (eSQL), which is a new dialect of SQL that delivers the power of the Entity Framework. Typically, the EntityClient will take a string containing eSQL everywhere your regular provider would accept a string containing store specific SQL.

    One advantage of programming against this layer is that being the first public surface intended for application development, it is also the most lightweight.

    Moreover, at this level you use full eSQL queries to obtain data readers and not actual entity classes. For this reason, we call EntityClient our “value” oriented programming interface. Neither the columns included in your rows, nor the source of your rows, nor the filtering, grouping or sorting criteria, are fixed at compile time. The query is just a string that we parse at run-time, and the results are just EntityDataReaders.

    All this makes Entity Services suitable for applications that today typically exploit the flexibility of writing dynamic SQL queries, like reporting, ad-hoc querying, etc.

    Notice however, that even when the EntityClient closely follows the traditional ADO.NET connected object model, you cannot get an ADO.NET DataSet on top. There are two main reasons for this:

    1. The DataSet does not have the necessary constructs to represent the variety of relationships the EDM can support.
    2. The EntityClient does not support the metadata protocols used to create the DataSet schema.

    Moreover, the Entity Framework currently lacks a string based data manipulation language, so you cannot directly express UPDATE, INSERT and DELETE operations in eSQL. Given this, our EntityAdapter is hardly any similar to the previous DataAdapters. We do not even derive it from the DbDataAdapter class!

    Second Service Layer: Object Services

    Object Services lives immediately on top of the EntityClient, and provides your application an Object Oriented view your data. Many public classes live in this space, but the two most important are ObjectContext and ObjectQuery<T>.

    ObjectContext

    This object’s main role is to encapsulate the underlying EntityConnection, and serve as a porthole for objects performing CRUD operations.

    When you choose to use our code generation, you get a type-safe ObjectContext that incorporates some methods specific to your data model.

    ObjectQuery<T>

    ObjectQuery<T> and its builder methods let you create queries in an completely object oriented way. It also provides a type-safe way to create queries. Most of the time, the shape and source of your data, the filtering, grouping and sorting criteria are known at compile time. So we call this our object-oriented programming interface.

    You can still use fragments of eSQL with many builder methods, but the idea here is that you typically use ObjectQuery<T> in an early-bound manner to build queries that get compiled in your application. Even more important, the results of those queries can be full entity classes or new types created for projections.

    First Query Language: Entity SQL

    Entity-SQL is a text based query language that currently gives you the most expressiveness over the Entity Framework stack on late-bound scenarios. You can use Entity-SQL to get collections of rows in the Entity Services layer, but also instances of entity classes, when used with Object Services.

    I highly recommend reading Zlatko Michailov’s Entity SQL post for a head start on the language and on its main differences with traditional SQL.

    Second Query Language: LINQ Query Comprehensions

    The Language Integrated Query is a set of strategic language extension Microsoft is including both in C# and VB that facilitate the creation of query expressions using a terse syntax familiar to anyone who has used SQL.

    LINQ is very powerful, and it is broadly applicable since it aims to solve the problem of querying any data source, including objects in memory, databases and XML files while maintaining a consistent, object-oriented and type-safe programming interface.

    For the Entity Framework, ObjectQuery<T> is the center of our LINQ implementation. This class implements the necessary interfaces to fully support the creation and deferred execution of queries comprehensions against our stack.

    We have invested a great amount of work in correctly mapping CLR features that can be useful in queries to our EDM and query capabilities. Still, LINQ and the Entity Framework are built and optimized against different goals and assumptions, and some concepts of LINQ and the Entity Framework simply do not map one-to-one.

    We certainly plan to continue investing in better alignment. But right now the reality is that there are some things you can do with Entity SQL that still cannot be expressed in LINQ, and there are a few things you can do with LINQ that still we cannot be translated or compose over in our LINQ implementation.

    Conclusion

    My original answer stays correct: Using one or other API to create your applications also has to do with a matter of taste. This is specially true thanks to the flexibility of ObjectQuery<T>, which allows you to mix and match start with query building methods that take eSQL fragments inside or LINQ queries. Just be aware that you could run into some corners scenarios in which we cannot completely go from one model to the other and back.

    Edit: The assertion that you can mix and match LINQ and ESQL was incorrect. Once you started one way, you have to keep in that route in ObjectQuery<T>.

  • Diego Vega

    Entity Framework FAQ

    • 2 Comments

    One of the best learning resources I have found since I joined my team at Microsoft is this page Danny Simmons has just published. Look here for his post explaining it. If you have any further questions that are not covered or you feel are not clear enough, feel free to add your questions in comments here or even email me.

    Danny already answers lots of questions everyday in the forums, and I like "learning by explaining", so give me a chance :)

  • Diego Vega

    Stretching myself on the wrong axis

    • 1 Comments

    Something you need to learn as a Program Manager at Microsoft is how to scale. This mean that you need to drive issues, multitask, excel at doing it, choose your fights, etc.

    Last week I tried a different approach that kind of worked when I was younger: stretching on the time axis. I found that it doesn't work for me as well as it used to do.

    So, this is my word of advise: If you build a big backlog, and you are executing under your expectations, don't stop sleeping. Two reasons:

    1. The more you sleep, the more clear your mind is when you are awake.

    2. The problem is you are doing something wrong. Either you are spending much time solving the wrong problems or the expectations are too high.

    So, if you are lucky enough to work in a place like Microsoft (with thousands of talented people around), do yourself a favor: Raise your hand, ask for help.

  • Diego Vega

    This is what happened since 10/23

    • 1 Comments

    I was in the kitchen close to my office having a coffee and mulling about what great thing I could do next about the blog. Then I came here and I saw this comment by Guillaume (a good friend in disguise), that translated from Spanish means "so, what happened since 10/23?". I guess I will try to lower the bar a bit and just tell what is happening:

    Well, I have been mostly learning what it means to be a PM at Microsoft, and in particular in Data Programmability. I have been sharing my days with this group of super smart people that is working on shipping this great product (we are close to beta 3). The atmosphere is very interesting and it is easy to be overwhelmed with the heaps of information I get exposed to everyday. Not only I am learning about the Entity Framework stack but also about our LINQ implementation, about our procedures, about our branching strategies, thread modeling, driving features as a PM, filing bugs, status reports, etc.

    I am waiting until I feel more confident with the stack before I do some more technical posting. I am ok with going with the basics, but I just don't want to post any inaccurate information.

    In the meanwhile, life here is nice. The family is adapting very well. We have most of our stuff solved, have a car, rent an apartment on the east side. I am riding The Connector everyday to work, so I can start early and finish late with email... Which is a good thing, isn't it?

    By the way, I see in Outlook that right now I have 1223 messages in my inbox and 443 in my sent items folder. Seems a little unbalanced :)

Page 1 of 1 (5 items)