Correction...

Greg Young pointed out that I inadvertently perpetuated the canonical model stuff when I said the following:

To be clear, one misconception is that we were trying to deliver on the old promise of canonical schemas. When we said one common model we meant a single shape for a given concern in the org (say Customer). Specifically what we mean is a common representation or metamodel through which one can reason about the various concepts. For example, if an ecosystem of services (say Reporting Services, Sync, Analysis Services, Integration Services, Workflow) knew how to reason about a common representation for an entity and a relationship then custom apps, packaged apps and database instances that exposed the metadata about their concepts in terms of this metamodel could play in terms of tooling, integration and the basic offerings of these services. The canonical example that we use is the idea of something like MS CRM or SharePoint exposing their metadata as EDM. If they could do so and if there was a mechanism for accessing these stores in terms of the EDM (say an ADO.NET provider like Entity Client) then one could get a reporting, synchronization and ETL experience over these solutions that would be consistent with what one could get over a SQL Server database.

I spaced on that paragraph. What I should have said is a common metamodel (entities and relationships) with which one could describe shapes. An example of a shape that could be described would be a Customer in terms of a Customer Entity. There may be "n" Customer Entity Types defined in a given enterprise; the shape of Customer is not the thing that needs to be common. What is interesting is the notion of an Entity and a common means for reasoning about Entities.

Thanks Greg... for pointing this out.

Tim M

Posted by timmall | 5 Comments

A Brief History in ENTITIES (... or what the Heck is this EF Thing?) - part I

So, it seemed like it would be useful to talk a bit about why we built the EF and the EDM, what we think it is and where we think it is going.

A lot of docs exist on the EF and EDM. One can find many opinions on both all over the web. I wanted to take time to talk about them from my perspective as a person who has been on the team since its inception. This is merely my perspective and rationale, not intended to be spin. Hopefully it will clarify some things, hopefully it will promote more candid feedback and debate.

Starting at the beginning  (kind of) -> the making of the EDM

Back in the days when we were working on the Whidbey release (VS 2005), I was on a team called MBF (Microsoft Business Framework). We were building an application framework for ISVs writing LOB applications. At the same time the ObjectSpaces and WinFS teams were working on technologies that had one common overlap: MBF, ObjectSpaces and WinFS all had some form of object persistence.

There was an attempt to rationalize these technologies (it is probably worthwhile to add that none of them ended up shipping). One of the artifacts of the rationalization effort, however, was the Entity Data Model. The Entity Data Model (previously referred to as the Common Data Model) was an attempt to align the WinFS Item Data Model, the MBF data model and asks from a series of partners (internal and external). The end goal of the Common Data Model was to unify the data models that were emerging across the company. For example, just within the SQL Server division there were data models for Reporting Services (SMDL), Analysis Services (UDM), System Management (SMO) and WinFS (IDM). These data models all shared one characteristic: they tended to be one level higher than the logical model that people were defining for the store. Where DBAs were building logical models and deploying them (resulting in a particular physical data model), these new data models were somewhat higher level and represented a more domain-specific transformation of the shapes defined in the "authoritative" logical model.

The desire was to provide a common data model that could unify the core concepts of these different data models, so that the sets of services and our investments in tooling could all be based on a single, higher level model. Furthermore, there was an expectation that if one could define a common data model then other services like ETL (via SQL Server Integration Services) and Sync (via our Sync Framework) could operate against it. In fact, you could start building out a horizontal platform for data programmability in terms of this common data model.

To be clear, one misconception is that we were trying to deliver on the old promise of canonical schemas. When we said one common model, we meant a single set of concepts with which one could describe nouns in an application (say Customer). Specifically what we mean is a common representation or metamodel through which one can reason about the various concepts. For example, if an ecosystem of services (say Reporting Services, Sync, Analysis Services, Integration Services, Workflow) knew how to reason about a common representation for an entity and a relationship then custom apps, packaged apps and database instances that exposed the metadata about their concepts in terms of this metamodel could play in terms of tooling, integration and the basic offerings of these services. The canonical example that we use is the idea of something like MS CRM or SharePoint exposing their metadata as EDM. If they could do so and if there was a mechanism for accessing these stores in terms of the EDM (say an ADO.NET provider like Entity Client) then one could get a reporting, synchronization and ETL experience over these solutions that would be consistent with what one could get over a SQL Server database.

So now you have the desire for a common data model -> shouldn't that be the CLR type system?

The most common question was why there needed to be a new data model: why couldn't you just model everything in the CLR? The objections to using the CLR ended up along the following lines:

  • Not all services discussed would be for the managed (.NET) platform
  • Most of the data models that were being aligned were relational models
  • Many of the teams did not need an object persistence solution... they needed the ability to expose metadata about their storage in a common representation
  • The concrete models that were defined were typically projections of a relational model and preserved relational semantics with a more domain specific shape

OK so you don't want to use the CLR as the core model, why did you have to introduce things like first class relationships?

The existence of first class relationships was a hotly contested topic during the formative days of the EDM; many of the architects behind the EDM were rooted in the relational world. At the same time, as we looked at the models that we were trying to align, they all had notions of first class relationships that carried particular semantics and to which one could ascribe constraints. When one "binds to the CLR", for example by using the Entity Framework to retrieve objects from a persistent store, the existence of relationships should not matter. Relationships can be surfaced in the CLR realization of an EDM model as navigation properties or collections on a class, and thus one gets the experience of references and collections.

EDM Overview

From a developer’s perspective the Entity Data Model can be thought of as a way to define a model for a given application or system. For example, if one were building an application for a video library, the domain model may look something like:

Type: Video {ID, Title, Description, PublishDate, Actors*}
Type: Person {ID, FirstName, LastName }
Type: Actor: Person {Gender, Bio} 
Type: Customer: Person {Address}
Type: Rental {ID, Video, Customer, RentalDate, DueDate}

In the above case, these items look much like the “nouns” in the system. Many application developers would build such a domain model using UML or, in Visual Studio, some may start with the Class Diagram tool. Other developers like to start by designing the database or leveraging an existing database and then building a data access layer that surfaces the data up to their application in whatever shape they desire.

Note that both models represent the same concepts, but in very different ways. The class diagram reflects an application developer’s view of the abstractions; the database model represents a model that is better suited to data persistence.

Even though the shapes of the models are different, the basic concepts used to define the models are the same. There is some notion of a “Thing” (class, table) and a notion of a relationship between “things” (associations). The Entity Data Model builds on these concepts to allow a developer to define a domain model that can map to classes and tables and that can be rationalized with other models. The basic building blocks of the Entity Data Model are Entity Types (analogous to the “thing”) and Relationships (which relate Entity Types). In the next few sections we will work through the various concepts in the Entity Data Model. The Entity Data Model, in version 1.0 of the product, can be represented in an XML form or can be edited using the Entity Framework Tools. For the purpose of this discussion we will show both forms.

Entity Types

Entity Types represent the first class nouns in the data model. Entity Types have the following characteristics:

Members:

An Entity Type has two types of members: properties and navigation properties.

Properties

Properties are first class members of the type. Consider the example of the Video Type from the video library:

Type: Video {ID, Title, Description, PublishDate, Actors*}

The ID, Title, Description, PublishDate are all properties of the type. These properties can be primitive types or complex types (inline types analogous to structs) but cannot be other Entity Types or Collections of primitive or complex types.

<Property Name="ID" Type="Int32" />
<Property Name="Title" Type="String" />
<Property Name="Description" Type="String" />
<Property Name="PublishDate" Type="DateTime" />

Navigation Properties

Navigation properties are syntactic helper properties on Entity Types that surface the ends of relationships. Consider the types Video and Actor:

Type: Video {ID, Title, Description, PublishDate, Actors*}
Type: Actor: Person {Gender, Bio}

The Actors member on the Video type represents a collection of related Actor entities. In the Entity Framework, this set of related actor entities is defined by a relationship. The Navigation Property “Actors” allows someone to reason about the relationship from the perspective of the Enclosing type.

<NavigationProperty Name="Actors" Relationship="VideoLibModel.ActorVideo" FromRole="Video" ToRole="Actor" />

Distinct Identity:

Much like Primary Keys in a database, Entity Types have a distinct identity which is represented by members of the type. The identity of an Entity Type is represented by an EntityKey which identifies the properties that make up the identity. In the case of the Video type the identity is represented by the ID property.

<Key>
    <PropertyRef Name="ID" />
</Key>

The entire specification for the video entity type can be represented as follows:

<EntityType Name="Video">
    <Key>
        <PropertyRef Name="ID" />
    </Key>

    <Property Name="ID" Type="Int32" />
    <Property Name="Title" Type="String" />
    <Property Name="Description" Type="String" />
    <Property Name="PublishDate" Type="DateTime" />

    <NavigationProperty Name="Actors" Relationship="VideoLibModel.ActorVideo" FromRole="Video" ToRole="Actor" />
</EntityType>

 

Relationships

One of the primary concepts of the Entity Data Model is the notion of first class relationships. Whereas one can surface the relationship between two types as a NavigationProperty, the actual model concept is a relationship. In the first version of the Entity Data Model the only type of relationship that can be defined is an Association. Consider the relationship between Video and Actor. One could express this as:

Association{Video[*]:Actor[*]}

This is an Association between two Entity Types, Video and Actor, each of which can have a multiplicity of many; hence a many-to-many relationship between Actor and Video.

<Association Name="ActorVideo">
    <End Type="VideoLibModel.Actor" Role="Actor" Multiplicity="*" />
    <End Type="VideoLibModel.Video" Role="Video" Multiplicity="*" />
</Association>

The EDM supports the following multiplicities on each side of the relationship:

  • Zero or One (0..1)
  • Exactly One (1)
  • Zero or More (*)
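As an illustration of a non-symmetric multiplicity, the Rental type from the video library could be related to Video with a one-to-many association. This is a sketch only: the association name RentalVideo and the role names are assumptions made for the example, not part of an actual model.

```xml
<!-- Hypothetical association: each Rental refers to exactly one Video,
     while a Video can appear in any number of Rentals. -->
<Association Name="RentalVideo">
    <End Type="VideoLibModel.Rental" Role="Rental" Multiplicity="*" />
    <End Type="VideoLibModel.Video" Role="Video" Multiplicity="1" />
</Association>
```

Changing the multiplicity on the Video end to 0..1 would allow a Rental that has no associated Video.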

Complex Types

Complex types provide a named structural representation much like an Entity Type. The difference between a Complex Type and an Entity Type is principally that Complex Types do not have an explicit identity, cannot reference instances of EntityTypes and are only “reachable” via dereference from an EntityType.
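For example, the Address member of Customer in the video library could be modeled as a Complex Type. The sketch below assumes an Address made up of Street, City and PostalCode (these member names are invented for illustration) and assumes that a complex-typed property is declared non-nullable.

```xml
<!-- Hypothetical complex type: there is no Key element because it has no identity of its own. -->
<ComplexType Name="Address">
    <Property Name="Street" Type="String" />
    <Property Name="City" Type="String" />
    <Property Name="PostalCode" Type="String" />
</ComplexType>

<!-- Customer derives from Person and carries the complex type inline as a property. -->
<EntityType Name="Customer" BaseType="VideoLibModel.Person">
    <Property Name="Address" Type="VideoLibModel.Address" Nullable="false" />
</EntityType>
```

An Address instance defined this way can only be reached through the Customer that contains it.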

Sets

Defining the core types is the first step. In order to take these concepts and actually build a model where one can reason about storage of instances we need to introduce the concepts of sets. Once one has a concept of sets it is often useful to define some construct that describes the closure of meaningful sets. Within the EDM these concepts are the Entity & Relationship Sets and the Entity Container.

Entity Sets, as the name implies, define the storage for instances of entity types. Entity Sets are nominal and typed, in other words an Entity Set has a name and declares the type of instances that can be contained within the set. Entity Sets are also polymorphic which means that a given Entity Set can store instances of its declared type and any derived types.

<EntitySet Name="Videos" EntityType="VideoLibModel.Video"/>

The above statement declares an Entity Set with the name Videos and of Type VideoLibModel.Video… if Video had any derived types these would be legal members of this EntitySet.
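To make the polymorphism concrete, here is a sketch using the Person and Actor types from the video library, where derivation is expressed through the BaseType attribute. The property types chosen for Gender and Bio are assumptions.

```xml
<!-- Base type: declares the key that derived types inherit. -->
<EntityType Name="Person">
    <Key>
        <PropertyRef Name="ID" />
    </Key>
    <Property Name="ID" Type="Int32" Nullable="false" />
    <Property Name="FirstName" Type="String" />
    <Property Name="LastName" Type="String" />
</EntityType>

<!-- Derived type: adds members and does not redeclare the key. -->
<EntityType Name="Actor" BaseType="VideoLibModel.Person">
    <Property Name="Gender" Type="String" />
    <Property Name="Bio" Type="String" />
</EntityType>

<!-- A set typed as Person can hold instances of Person and of any type derived from it. -->
<EntitySet Name="People" EntityType="VideoLibModel.Person" />
```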

In the EDM there is no single Entity Set for instances of a type. If one desired, one could create multiple Entity Sets where one would store different instances. So, for example if one wanted to create an Entity Set for Cartoons and an Entity Set for Dramas one could do so:

<EntitySet Name="Cartoons" EntityType="VideoLibModel.Video"/>
<EntitySet Name="Dramas" EntityType="VideoLibModel.Video"/>

As relationships are first class concepts in the EDM, one must declare a relationship set for a relationship. The relationship set is the declaration which associates instances of a given set with instances of another set… it is done purely in terms of the storage (sets) as opposed to the types:

<AssociationSet Name="ActorVideo" Association="VideoLibModel.ActorVideo">
    <End Role="Actor" EntitySet="People" />
    <End Role="Video" EntitySet="Videos" />
</AssociationSet>

The above is the definition of the relationship set corresponding to the relationship between Actors and Videos. Note this presupposes that there exists an Entity Set called People and an Entity Set called Videos.

The construct that demarcates the closure around Entity Sets and Relationship Sets is the Entity Container. An Entity Container is merely a named “thing” through which one can reason about or dereference a group of Entity Sets and Relationship Sets.
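Putting the pieces together, a container for the video library might be declared along the following lines. The container name VideoLibContainer is an assumption; the sets are the Videos and People sets used in the examples in this section.

```xml
<!-- Hypothetical container grouping the entity sets and the relationship set. -->
<EntityContainer Name="VideoLibContainer">
    <EntitySet Name="Videos" EntityType="VideoLibModel.Video" />
    <EntitySet Name="People" EntityType="VideoLibModel.Person" />
    <AssociationSet Name="ActorVideo" Association="VideoLibModel.ActorVideo">
        <End Role="Actor" EntitySet="People" />
        <End Role="Video" EntitySet="Videos" />
    </AssociationSet>
</EntityContainer>
```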

On the use of XML

The examples are in XML because V1.0 of the Entity Framework uses XML as its basis for representing the EDM. As we move forward with the EDM and the Entity Framework we expect that there will be different representations of the EDM. For example one should be able to describe an EDM model in the CLR by convention and extend/specialize it with configuration (attributes or external). We also expect that particular partners will maintain their models in metadata repositories.

Posted by timmall | 4 Comments

Look Mom... no XML

We just wrapped up our first iteration of V2.

We are shooting to get another iteration in before PDC and are still working on how we can get early bits out to customers outside of the rhythms of the CTP's and such.

One of the nifty things in our first iteration, though, was some of the work that we did around POCO. There is a lot more to be done with POCO; we still need to deliver lazy load, value objects and more. With these bits, however, it is possible to write basic POCO code now.

For a simple experiment I decided to see if I could use these bits to provide a code-first experience with none of the XML artifacts we have in a typical EF application. The support for doing this without XML artifacts on disk is all implemented via public surface from V1....

Here is the end experience:

            Northwind northwind = ContextFactory.CreateContext<Northwind>(@"Data Source=.\sqlexpress;Initial Catalog=Northwind;Integrated Security=True");

            var prods = from p in northwind.Products where p.UnitPrice > 50 select p;

            foreach (Product p in prods)
            {
                Console.WriteLine(String.Format("[{0}] : {1} - {2}", p.ProductID,p.ProductName, p.UnitPrice ) );
            }

The Context and the Product Class are as follows:

    public class Product
    {
        public int ProductID { get; set; }
        public string ProductName { get; set; }
        public Decimal UnitPrice { get; set; }
    }

    public class Northwind : ObjectContext
    {
        public Northwind(EntityConnection conn) : base(conn) { }

        public ObjectQuery<Product> Products
        {
            get
            {
                if (null == _products)
                {
                    _products = base.CreateQuery<Product>("Northwind.Products");
                }
                return _products;
            }
        }

        private ObjectQuery<Product> _products;
    }

Note that the entity class above (Product) has no dependency on the EF; only the context class derives from an EF type.

The ObjectContext is nice to have as the session of interaction with the store... providing the ObjectQuery properties yields a target for formulating the LINQ queries.

The entry point for being able to use this code without artifacts is the ContextFactory. The ContextFactory reflects over the Context type that one wrote by hand. It looks for all properties of type ObjectQuery<T> and then uses these to define an in-memory representation of the models (conceptual, store, mapping) which are then passed to the metadata infrastructure.

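    // Namespaces used by the ContextFactory (EF V1, .NET 3.5 SP1).
    using System;
    using System.Data.EntityClient;     // EntityConnection
    using System.Data.Mapping;          // MappingItemCollection, StorageMappingItemCollection
    using System.Data.Metadata.Edm;     // MetadataWorkspace and the item collections
    using System.Data.Objects;          // ObjectContext
    using System.Data.SqlClient;        // SqlConnection
    using System.IO;                    // MemoryStream
    using System.Reflection;            // ConstructorInfo
    using System.Xml;                   // XmlReader
    using System.Xml.Serialization;     // XmlSerializer

    public class ContextFactory
    {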
        public static T CreateContext<T>(string connectionString) where T : ObjectContext
        {
            MetadataWorkspace workspace = CreateMetadataWorkspace<T>();
            SqlConnection storeConn = new SqlConnection(connectionString);
            EntityConnection entityConn = new EntityConnection(workspace, storeConn);
            ConstructorInfo contextConstructor = typeof(T).GetConstructor(new Type[] { typeof(EntityConnection)});            
            return (T)contextConstructor.Invoke(new Object[] { entityConn});            
        }


        protected static MetadataWorkspace CreateMetadataWorkspace<T>() where T : ObjectContext
        {
            MetadataWorkspace workspace = new MetadataWorkspace();
            //--- build collections
            EdmItemCollection edmCollection = CreateEdmItemCollection<T>();
            StoreItemCollection storeCollection = CreateStoreItemCollection<T>();
            MappingItemCollection mappingCollection = CreateMappingCollection<T>(edmCollection, storeCollection);
            //--- register collections            
            workspace.RegisterItemCollection(edmCollection);
            workspace.RegisterItemCollection(storeCollection);
            workspace.RegisterItemCollection(mappingCollection);
            workspace.RegisterItemCollection(new ObjectItemCollection());
            workspace.LoadFromAssembly(typeof(T).Assembly);
            //--- done
            return workspace;
        }

        protected static EdmItemCollection CreateEdmItemCollection<T>() where T : ObjectContext
        {
            CodeFirst.CSDL.TSchema schema = new CsdlBuilder().BuildSchema<T>();
            EdmItemCollection edmCollection = new EdmItemCollection(new XmlReader[]{GetXmlStream(typeof(CodeFirst.CSDL.TSchema),schema)});
            return edmCollection;
        }

        protected static StoreItemCollection CreateStoreItemCollection<T>() where T : ObjectContext
        {
            CodeFirst.SSDL.TSchema schema = new SsdlBuilder().BuildSchema<T>();
            StoreItemCollection storeCollection = new StoreItemCollection(new XmlReader[] { GetXmlStream(typeof(CodeFirst.SSDL.TSchema), schema) });
            return storeCollection;
        }

        protected static StorageMappingItemCollection CreateMappingCollection<T>(EdmItemCollection edmCollection,StoreItemCollection storeCollection) where T : ObjectContext
        {
            TMapping schema = new MslBuilder().BuildMapping<T>();
            StorageMappingItemCollection mslCollection = new StorageMappingItemCollection(edmCollection,storeCollection,new XmlReader[] { GetXmlStream(typeof(TMapping), schema) });
            return mslCollection;
        }

        protected static XmlReader GetXmlStream(Type schemaType, Object schema)
        {
            XmlSerializer schemaSerializer = new XmlSerializer(schemaType);
            MemoryStream schemaStream = new MemoryStream();            
            schemaSerializer.Serialize(schemaStream, schema);
            schemaStream.Position = 0;
            XmlReader reader = XmlReader.Create(schemaStream);           
            return reader;
        }
    }

The "builder" classes (SsdlBuilder, CsdlBuilder, MslBuilder) merely use the Type that is supplied to infer a model and return an in-memory representation which can be serialized to our XML representation. We are looking at public mutable API's and code-first surface in V2 which would allow people to do this without having to roll a separate OM.
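The output of the builders is not shown in this post. As a rough idea of what a conceptual schema inferred from the Northwind context and the Product class could look like once serialized, consider the sketch below. The schema namespace NorthwindModel and the choice of ProductID as the key are assumptions; the container and set names follow the "Northwind.Products" string used in the context.

```xml
<!-- Hypothetical CSDL for the hand-written Northwind context; not actual builder output. -->
<Schema Namespace="NorthwindModel" Alias="Self"
        xmlns="http://schemas.microsoft.com/ado/2006/04/edm">
    <EntityContainer Name="Northwind">
        <EntitySet Name="Products" EntityType="NorthwindModel.Product" />
    </EntityContainer>
    <EntityType Name="Product">
        <Key>
            <PropertyRef Name="ProductID" />
        </Key>
        <Property Name="ProductID" Type="Int32" Nullable="false" />
        <Property Name="ProductName" Type="String" />
        <Property Name="UnitPrice" Type="Decimal" Nullable="false" />
    </EntityType>
</Schema>
```

The store schema and the mapping would be produced in the same spirit, which is why none of the three XML artifacts needs to exist on disk.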

As soon as we get the latest EF bits out in the wild I can share the sample so that people can play with it - it is hacky PM code but it illustrates the usage of the public surface to have an alternative representation of the requisite metadata. It is also a good exercise to illustrate that for the ORM developer scenario, the EDM and related artifacts can be perceived as implementation detail.

Posted by timmall | 5 Comments

Newsflash: EF V1.0 was not intended to be a NHibernate compete

I spent some time this week with Scott Bellware. He and Greg Young have been in town talking to the EF team about Test Driven Development, Behavioral Driven Design and Domain Driven Design.

On Tuesday night we went out to dinner and had a long chat. Despite the desire on both of our parts to insult each other every few minutes, I think the conversation was largely productive. The events of the evening plus my preparation for the Advisory Council made me think about going into some more detail around the EF, where we are today, where we want to go and rationale therein.

This is not intended to be "spin" - a word I have heard a lot recently. This is the opinion of a PM on the team that has been at this since pretty much the beginning.

Back to the point of this post:

We did not set out to build the NHibernate compete product in V1.0.

We did set out to execute on the first part of a longer term strategy around the EDM and the set of services that could be delivered therein. ORM happens to be one of the scenarios and the one that you tend to most closely approximate as you build out the foundation.

One of the major points that Scott repeated at dinner was that he and his compatriots feel that people would just blindly adopt the EF because it was a part of the Framework and that this would set them back because they have moved on with their approach to software development and the EF does not meet their requirements. Having the EF in the market creates noise for them because it becomes a technology choice between a Microsoft product that does not yet address this school of software development and open source solutions that do.

There are schools of software development that will do just fine with the EF. There are others where the approaches at hand would require developers to compromise their abstractions and their approaches. People should just be intentional and not try to use the EF as a wonder hammer to hit all potential projects. As with any other technology, it should be evaluated in the context of the project constraints and attributes.

I shall attempt to make a series of posts leading up to and following the advisory council where I go into a bit of the history and future. This will not yield a decision matrix for a developer but it may be interesting for folks.

Posted by timmall | 8 Comments

The Great Entity Smack down

There is a wonderful debate going on right now on the entities wiki. You need only read some of the comments on the front page to see that different people are sharing their perspectives on application architecture, methodologies and patterns.

For a couple weeks now I have been stewing over a wacky idea that I think the time is right for. I would like to propose that we should create "The Great Entity Smack Down". My thought is that we could go get a bunch of folks that represent different perspectives in the community in a public forum. We put together a fictitious development project (oh let's say pet shop for grins) and we get representatives of different camps to discuss/debate how they would approach the project.

I am going to shake the trees here in Redmond to see if we can host the event. I would love to start a discussion on what people think of the idea, what groups should be represented, what the format should be and how we go about doing it. I think hosting it in front of a .NET user group and advertising it fairly broadly to different communities would be an interesting venue.

Chime in... provide some ideas... let's get it on.

Tim M

Posted by timmall | 5 Comments

Alex James is on point for the EF Design Blog

Although we want all folks from the team to be able to post content as they see fit on the EF Design Blog, we wanted someone on point to do the actual care & feeding of it. Alex James, one of the PMs on the team, is that person. Alex just posted a 1-pager on computed properties on the design blog.

The computed properties post is interesting because it is an example of some of the non-ORM'ish work that we are doing in V2. The funny thing is that we have a somewhat schizophrenic existence. On the one hand we solve a number of ORM scenarios - to be fair, we are largely about solving ORM-like scenarios today. On the other hand, however, we are really trying to build something different. In the fullness of time we are trying to align the conceptual models and infrastructure that a lot of the data services (Reporting & Analytics, Sync, ETL...) use.

Suspend the perspective of today for a second, ignore the debates on reuse of models or applicability of an app's OLTP model for decision support. Here is the thing: if Microsoft can provide a single representation for these services with common tooling and integration then we provide a better developer platform. Sometimes we get caught up in the technology snapshot of today and it is hard to see the forest for the trees. Sometimes we have to go out on a limb and say this is not something we can do in just one release; it requires the laying of a foundation and then alignment across multiple teams and release cycles to provide the real value. I am really excited that we have the first part of the foundation down. I have been on this team for almost 4 years now and it is great to see how far we have come and how our partnerships internally are starting to round out the overall platform vision. As mentioned in previous posts, we are working to round out the rest of the foundation plus work on the core developer (ORM, N-Tier...) scenarios and we should be putting out more pieces on the design blog as we progress. Look for more posts from Alex and the team in the weeks/months to come.

Posted by timmall | 2 Comments

New Wiki

So, 

I just threw up a new wiki to collect patterns and practices from the developer community that can help inform our understanding of real world developer scenarios.

Where did this come from? ... I had a good thread with Scott Bellware where he was providing some feedback on his perspective. A bunch of us on the team also exchanged some mail with Greg Young about the POCO feature that we pushed out on the EF design blog. After these conversations I was wondering how we could catalog a bunch of the concepts that we were discussing. The concepts were really reflective of the ways that people apply practices like DDD and TDD in the real world today. Sure, we could continue to read books, blogs and such and engage with customers in the way that we have. We could also start thinking about other ways to engage with the community.

Why not just mine other sites? ... well we could, I figured it would be an interesting exercise to create a more directed, intentional aggregation of patterns and scenarios specifically targeting how developers want to interact with data and how this surfaces in the ways that they build apps. I would love people to share thinking around how they build data access layers, how they define domain models, how they test, how they expose data services and provide solutions for things like data aggregation, synchronization and offline scenarios.

I have invited a number of folks to come in and start contributing, hopefully they shall. I would love all interested folks to come on by.

Posted by timmall | 4 Comments

To Lazy Load or not to Lazy load?

I just exchanged email with Martin Fowler about the term Lazy load. The interpretation that we on the Entity Framework team had about Lazy Loading was that on a given query we would not "eagerly" load an entire graph (i.e. load a customer, their orders, order lines and products...) but instead would, by default, retrieve a shallow version of the queried instances.

 

I believe this definition holds true but is incomplete, according to feedback from Martin. Per our exchange, Martin indicated that the notion of lazy load expects that one does not have to do anything beyond dereference operations to retrieve the related instances. As a result this coding pattern should work:

            using (MSPetShop4Entities context = new MSPetShop4Entities())
            {
                var prod1 = (from p in context.Product select p).FirstOrDefault();
                Console.WriteLine(prod1.Category.Name);                
            }

In the Entity Framework we were concerned that people using our framework would not be aware that the call prod1.Category.Name above would result in a query to the store. As a result, we require the developer to be explicit about making the call to the store before doing the dereference: to retrieve the related instances, one calls the ".Load" method on the reference or collection to indicate that one is indeed willing to execute a subsequent query to the store.

            using (MSPetShop4Entities context = new MSPetShop4Entities())
            {
                var prod1 = (from p in context.Product select p).FirstOrDefault();
                prod1.CategoryReference.Load();
                Console.WriteLine(prod1.Category.Name);                
            }

I already mentioned in an earlier post how this can cause a leaky abstraction for people trying to abstract EF. We are looking at this pattern in V2. The interesting thing is whether the second example is lazy loading or not. We had been calling this "explicit lazy loading" but, to Martin's point it is an overload of a clear term and we should not do that. I think we will attempt to refer to this as Explicit Loading without introducing the "lazy" overload to try and be clear moving forward.

Posted by timmall | 8 Comments

POCO Prototype Video...

We just threw up a screencast on the EF design site to complement the POCO feature design notes. This was a prototype that Mirek (one of the developers on the team) has been doing around a full POCO enabled state manager.

I am hoping that we can get early builds out of some of this work in the next month or so. We are still trying to figure out how to drop early and often outside of the CTP rhythms and are looking at what the ASP.NET team has been doing with their bits.

Posted by timmall | 1 Comments

First Thing on the EF Design Blog

We just pushed the first piece of content to the EF Design Blog. This one is a "Feature Design" posting about the POCO feature that we are working on for V2. Sometime today we hope to get a quick screen cast of a prototype up. We will actively try and get more content either at the feature definition or design spec level throughout the project lifecycle.

Posted by timmall | 1 Comments

Vote of No Confidence

So,

It's been a long, long time since I have posted anything on my blog. Reality is I tried to maintain a blog where I thought I could come up with wonderfully profound things to share with the world but clearly that was not the case. Having said that a few events have happened that prompted me to start blogging again.

1: We are about to ship the first version of EF
2: We have an interesting thread going on about a "vote of no confidence" about our product
3: We are starting work on our next version of the product

Seems like this would be a good time to throw myself out there. First, to respond to the general thread in the community and secondly to be more available for conversations around the current and future versions of the product.

Anyway... I will attempt to be more available and present. People can always just ping me at timmall@microsoft.com directly about any of this.

In response to the community threads I will paste a response here that I made, yesterday, on an internal thread on the topic....

The unfortunate reality is that these are scenarios that we care deeply about but do not fully support in V1.0. I can go into some more detail here. One point to note is that the choices on these features were heavily considered, but we had to deal with the tension between trying to add more features vs. trying to stay true to our initial goal which was to lay the core foundation for a multiple-release strategy for building out a broader data platform offering. Today, coincidentally, marked the start of our work on the next version of the product, and we are determined to address this particular developer community in earnest while still furthering the investment in the overall data platform. Here is my take on the points below and things we are thinking about moving forward:

INORDINATE FOCUS ON THE DATA ASPECT OF ENTITIES LEADS TO DEGRADED ENTITY ARCHITECTURES

In v1.0 we chose to support default scenarios where one would have a distinguished base class (our EntityObject) and we would provide a mapping solution from these classes to our EDM model and from the EDM model to the database. The reason was that the investment we were making was in terms of the EDM – a new conceptual data model that allows people to describe the shape of their data in terms of their domain instead of the layout in the database. Our goal is to allow people, in the fullness of time, to have a set of common services in terms of the Entity Data Model. Today they have query (ESQL) and REST-based services (Astoria describes its metadata using the EDM); they also get a first-generation Object Persistence facility with the Object Layer component in the Entity Framework. The Object Layer, of course, also comes with our first-class LINQ implementation.

There are a couple of interesting things here…

1: We expose the start of a pure value-based API using ADO.NET (Entity Client) so that you can issue ESQL queries against the database in terms of your domain model, in a store-agnostic manner (i.e. using ESQL instead of TSQL or PLSQL…). There are many partners in the building for whom this is the only way they would want to leverage the EF and the EDM. Partners that fall in this camp tend to be interested in queries and presentation, data transformation or stores where the schema is frequently changing (Universal Table implementations) so that relying on CLR classes becomes intractable.

2: The focus on the EDM for the ORM developer can appear to be a red herring, as we heard from customers. The EDM describes the shape of the domain objects but does not allow for encapsulation of behavior and provides limited constraints and no specific grammar for action-semantics. In order to define a domain model where one wanted to leverage the EDM and EF today one would have to do one of the following:

  • Define an EDM model, code-gen classes and implement logic in partial classes.
  • Define an EDM model and use this as a pure DAO with a parallel set of domain objects – you could abstract the EF via many patterns… the common one we hear is the repository pattern (gotchas will be pointed out in the discussion of points below).
  • Define an EDM model for query and update and then use LINQ to project into Domain Objects… You get simpler query expressions and use LINQ as a projection facility to retrieve the EDM objects and transform them into the true Domain Objects.
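The third option above can be sketched roughly as follows. This is a minimal illustration, not EF code: CustomerEntity stands in for a class code-generated from an EDM model, Customer is a hand-written domain object, and CustomerProjection.LoadPreferred is a made-up helper. All three names are assumptions introduced only for this example, and an in-memory IQueryable can stand in for a query against the object context.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a class code-generated from an EDM model.
public class CustomerEntity
{
    public int CustomerId { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public decimal CreditLimit { get; set; }
}

// Hypothetical hand-written domain object that owns the behavior.
public class Customer
{
    public Customer(int id, string displayName, decimal creditLimit)
    {
        Id = id;
        DisplayName = displayName;
        CreditLimit = creditLimit;
    }

    public int Id { get; private set; }
    public string DisplayName { get; private set; }
    public decimal CreditLimit { get; private set; }

    // Business rule lives on the domain object, not on the entity class.
    public bool CanPlaceOrder(decimal orderTotal)
    {
        return orderTotal <= CreditLimit;
    }
}

public static class CustomerProjection
{
    // Filter in terms of the entity shape, then project into domain objects.
    // AsEnumerable() marks the point where the rest of the pipeline runs in
    // memory, so the constructor call is never handed to a query provider.
    public static List<Customer> LoadPreferred(
        IQueryable<CustomerEntity> source, decimal minimumLimit)
    {
        return source
            .Where(c => c.CreditLimit >= minimumLimit)
            .AsEnumerable()
            .Select(c => new Customer(
                c.CustomerId,
                c.FirstName + " " + c.LastName,
                c.CreditLimit))
            .ToList();
    }
}
```

Calling CustomerProjection.LoadPreferred(entities.AsQueryable(), 100m) on an in-memory list exercises the same shape of code. The trade-off is that the domain objects are not change-tracked, so updates still have to be mapped back onto the entity classes.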

I think there are a couple of things we are trying to do in the next version that can help here, and we actively want the feedback of folks in the community to help make sure we do the right thing:

  • Code-First scenarios…  In the code first scenarios we allow people to define their own classes (POCO) which they can work with as they see fit. We then infer a model from the classes. For situations where one desires to have a model that cannot be inferred directly from the classes, developers will be able to use new CLR attributes or an external mapping specification.
  • Code-Gen Pipeline… For some framework scenarios it is desirable to start with an EDM model (possibly with custom annotations) and then generate either the DB, the classes or both. We are looking at a general pipeline that developers can plug into that will give a lot of flexibility in generating classes and databases as well as in the way a model can be derived from the database, CLR classes or some other form (maybe a model repository built by a third party, a UML tool, or some other artifact like SharePoint).
  • Adding action-semantics to the EDM...  We have looked at things we can do here. There are things that would be quite nice but likely they would come at a cost of addressing particular feedback we have heard or our other investments. We will share thoughts and designs on these and see what people think regardless.

EXCESS CODE NEEDED TO DEAL WITH LACK OF LAZY LOADING

The EF, today, does support lazy loading. It does not support IMPLICIT lazy loading… this distinction is subtle but important. In EF we, by default, do not load ends of references or collections; we wait until asked to load these. Today, however, one must explicitly ask by calling a .Load method. We took a fairly conservative approach in v1.0, because we wanted developers to be aware of when they were asking the framework to make a roundtrip to the database… our take on “boundaries are explicit”. When one builds out a repository pattern or starts to abstract the EF in some other form of data access abstraction, this can become an issue: the need to call an explicit method to perform a lazy load operation causes the EF abstractions to bleed up, and now you are struggling with an intersection of concerns as opposed to a clean separation of concerns.

We have heard this feedback and are looking at supporting optional implicit lazy loading as well as other strategies around eager loading (such as general LoadOptions à la LINQ to SQL) in a future release of the EF. Until then, it is possible to build implicit lazy loading on top of the explicit mechanisms which the EF supports in v1, and one of our team members has published to his blog and code gallery a sample which demonstrates how this can be done.
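To make the idea concrete, here is a rough sketch of layering an implicit load over an explicit one. It is not the published sample mentioned above and it does not use the EF itself: ExplicitReference<T> is a hypothetical stand-in for a reference end that only holds its related object after Load() has been called, with member names chosen to resemble IsLoaded, Load and Value. Product and Category here are plain illustrative classes.

```csharp
using System;

// Hypothetical stand-in for a reference end that requires an explicit Load().
public class ExplicitReference<T> where T : class
{
    private readonly Func<T> fetch;   // stands in for the query to the store
    private T value;

    public ExplicitReference(Func<T> fetch)
    {
        this.fetch = fetch;
    }

    public bool IsLoaded { get; private set; }

    // The explicit roundtrip: nothing is fetched until this is called.
    public void Load()
    {
        value = fetch();
        IsLoaded = true;
    }

    // Returns null until Load() has been called.
    public T Value
    {
        get { return value; }
    }
}

public class Category
{
    public string Name { get; set; }
}

public class Product
{
    private readonly ExplicitReference<Category> categoryReference;

    public Product(ExplicitReference<Category> categoryReference)
    {
        this.categoryReference = categoryReference;
    }

    // Implicit lazy load: the first dereference performs the explicit Load(),
    // later dereferences reuse the already loaded instance.
    public Category Category
    {
        get
        {
            if (!categoryReference.IsLoaded)
            {
                categoryReference.Load();
            }
            return categoryReference.Value;
        }
    }
}
```

With a wrapper like this, code that reads product.Category.Name never calls Load itself, which keeps the loading concern out of the caller. The cost is the one described above: the roundtrip to the store is no longer visible at the call site.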

SHARED, CANONICAL MODEL CONTRADICTS SOFTWARE BEST PRACTICES

There seems to be confusion around this topic. We are not recommending that folks return to the days where we were evangelizing the use of XSD for “canonical schemas”. I don’t believe that people think that this is tractable. What we do believe, however, is that it is desirable to have a single meta-model (EDM if you will) with which you can describe many domain models and that by having a single grammar we can provide a set of common services on any given domain model. For example, consider an application that is to be written against a database with 600 tables. Do I believe that this app should have a single model with 600 Entity Types in it? No… Furthermore, do I believe that any given domain entity (say Customer) has only one shape in that app and that this shape must be the canonical shape for the entire Enterprise?… Heck no.

I would expect, however, that with a common way to describe these models I could do some interesting things in the fullness of time…

  • I could use reporting services over any instance of one of these models and define reports in terms of the domain entities instead of the underlying tables.
  • I could perform ETL tasks between two stores in terms of entities.
  • I could write sync services between a local store and a remote store where the sync contract and programming model on both sides are in terms of EDM.
  • If I invested in learning how to build a model to use with some tool like reporting services, I would not have to learn a new tool and model description language when it comes time to build a model for one of the other tools.

This is the world that we are working towards… We are not there yet, but a number of these scenarios are what we are aggressively pursuing now.

LACK OF PERSISTENCE IGNORANCE CAUSES BUSINESS LOGIC TO BE HARDER TO READ, WRITE, AND MODIFY, CAUSING DEVELOPMENT AND MAINTENANCE COSTS TO INCREASE AT AN EXAGGERATED RATE

Agreed. There are different developer segments for whom PI is either important or not. We do not have a good story about PI today. We are working on one. Based on customer feedback from over a year ago, we started work on PI, but we said from the first that we would be unable to complete that work in v1 given the other demands of the release.  We made some initial steps for v1, and then began planning future steps for the next release.  At the last MVP summit we presented an early look at one of the techniques we were investigating which involved performing IL rewrite tricks to support these scenarios. We got strong feedback from many of the signatories on the “vote of no confidence” letter that this was not the right approach, and as a result we are not doing the IL rewrite in the next release. Instead we are doing a full POCO implementation that we hope to get feedback from the community on.

Tomorrow we will post some of the initial thinking on the new “EF Design Blog” including a little video of some of the initial dev work we are doing here. It is worthwhile pointing out that this work is a direct response to the feedback we have gotten from this community.

EXCESSIVE MERGE CONFLICTS WITH SOURCE CONTROL IN TEAM ENVIRONMENT

At the end of the day, this is feedback that we have heard, tried to address and are actively addressing moving forward. We are taking steps that we think should help in general:

1: We intend to have a more transparent design process.
2: We have a new advisory council to be more proactive in getting thoughts on our work and our directions.  This advisory council has some key community folks participating:

    Eric Evans - http://www.domainlanguage.com/about/ericevans.html
    Stephen Forte - http://www.stephenforte.net/
    Martin Fowler - http://martinfowler.com/
    Pavel Hruby - http://www.phruby.com/
    Jimmy Nilsson - http://jimmynilsson.com/

3: We intend to drop more frequent interim builds. These builds will be unsupported interim builds where people can see what we have rolled into the product earlier and more often. We are still working the logistics of this out.
4: For our internal development we will be doing significantly shorter iterations.  We had quite long milestones in Orcas which made it more difficult to respond to customer feedback, and we hope that the transparent design process, the advisory council and the frequent drops will give us more feedback, and that with shorter iterations we will be more capable of responding.

Here is the thing though, even though we attempt to address the feedback, we know we won’t get everything.  From a product ownership perspective it pains me greatly to read the “no confidence” letter.  Especially as we have spent so much of the last couple of months working internally on how we can get better.  It is ironic that we find out about this letter the same week that we were already planning to roll out our goals for our engineering process for this next release, but such is life. Many of the signatories have given us great feedback, and I only hope that they will continue to do so as we proceed.

 

 

Posted by timmall | 36 Comments

Databinding with ADO.NET Entity Framework

So,

 

I am a self-admitted “value layer” bigot. I spend a lot of my coding time in ADO.NET Entities building stuff on top of our value layer (in other words working directly against Entity Client without using our object abstractions). Recently however I have been having a lot of fun writing apps at the Object Layer. One thing that I found I needed to do to make my life easier is have a more general purpose data binding solution until we have our production data binding support in place. For what it’s worth our next CTP should have data binding in place so my post today should not be relevant after that.

 

Anyway… I decided that what I would do is derive from BindingList<T> and have a trivial binding list implementation that was aware of our Entities and our ObjectContext to make my life a little easier… not much code but here goes:

 

    // Namespaces for the BCL types used below; the Entity, EntityState and
    // ObjectContext types come from the ADO.NET Entities CTP assemblies.
    using System.Collections.Generic;
    using System.ComponentModel;

    /// <summary>
    /// Trivial extension of BindingList<T> for working with Entities
    /// </summary>
    public class EntityBindingList<T> : BindingList<T> where T : Entity
    {
        /// <summary>
        /// Reference to an ObjectContext so that we can do saves
        /// </summary>
        private ObjectContext context;

        /// <summary>
        /// Constructor that takes an ObjectContext as an argument
        /// </summary>
        /// <param name="context">The current ObjectContext being used to interact with the persistent entities</param>
        public EntityBindingList(ObjectContext context)
            : base()
        {
            this.context = context;
        }

        /// <summary>
        /// Remove an item from the list. If the item is already attached to
        /// the object context, call DeleteObject so that when we save the context
        /// this object is deleted from the store. If it is in the Detached state,
        /// just remove it from the list.
        /// </summary>
        /// <param name="index">Zero-based index of the item to remove</param>
        protected override void RemoveItem(int index)
        {
            Entity itemToRemove = this.Items[index];
            if (itemToRemove.EntityState != EntityState.Detached)
            {
                context.DeleteObject(itemToRemove);
            }
            base.RemoveItem(index);
        }

        /// <summary>
        /// A save changes method that allows one
        /// to save the changes in the list to the underlying
        /// store.
        /// The assumption is that entities that have been retrieved
        /// from the store were retrieved in a mode that stores them
        /// in the state manager and that they are being change-tracked.
        /// As a result, all we need to do is add any detached entities (new rows)
        /// to the context and then invoke save changes.
        /// </summary>
        public void SaveChanges()
        {
            foreach (Entity entity in this.Items)
            {
                if (EntityState.Detached == entity.EntityState)
                {
                    context.AddObject(entity);
                }
            }
            context.SaveChanges();
        }

        /// <summary>
        /// Helper method for loading a binding list from an IEnumerable<T>.
        /// T for this class is constrained to be an Entity or specialization.
        /// </summary>
        /// <param name="entities">The entities we want loaded into the binding list</param>
        public void Load(IEnumerable<T> entities)
        {
            foreach (T entity in entities)
            {
                this.Add(entity);
            }
        }
    }

 

With this class I can now use object data sources with little additional coding… for example consider the following code:

                               

            //--- Create a new EntityBindingList, passing it the instance of our object context (in this case, model)
            this.customers = new EntityBindingList<Customer>(model);

            //--- Populate the binding list by executing the Customers query (a property on my generated object context).
            //--- Note that when I execute I use the MergeOption "AppendOnly": I have a new context instance and
            //--- want to add the retrieved instances to the state manager for state tracking and identity resolution.
            this.customers.Load(model.Customers.Execute(MergeOption.AppendOnly));

            //--- Set the DataSource property of my CustomersBindingSrc to my new binding list
            this.CustomersBindingSrc.DataSource = customers;

 

To wire up “save” functionality from a binding navigator, I provide a click event handler for the save button that saves the changes:

 

        private void saveToolStripButton_Click(object sender, EventArgs e)
        {
            this.customers.SaveChanges();
        }

 

That’s about it…               

Posted by timmall | 5 Comments

ADO.NET Entities in Orcas CTP

The latest Orcas CTP has ADO.NET Entities available and we have posted some samples.

The bits available with the Orcas CTP do not have LINQ support or out-of-the-box tooling. If you want LINQ support, you should pull our August CTP and Tools.

 

Posted by timmall | 1 Comments

ADO.NET CTP

We shipped our CTP last week (ADO.NET August CTP). Pretty cool. I guess I should blog some on things like EDM, Metadata, Mapping and the like. For now I will point out some interesting blogs:

Murali made a large post on queries
There's the Channel 9 video if you have not seen it yet
There's the ADO.NET Forum where people are asking questions

Posted by timmall | 0 Comments

New post and screencast from Shyam...

Shyam Pather, a dev lead on the ADO.NET effort, just posted on the Data Access Blog; he has done some screencasts with our current bits:

  • Part 1
  • Part 2

    enjoy...

Posted by timmall | 5 Comments