Structural Annotations - One Pager
12 August 08 08:50 PM | efdesign | 2 Comments   

In V1 of the Entity Framework it is possible to annotate a schema using attributes declared in another XSD.

However XML attributes are a very limited form of annotation. It would be better if we could annotate using full elements.

This is what we are calling Structural Annotations.

This feature will allow both customers and partners like Reporting Services to modify the model so that it includes information important to them which can’t be captured in vanilla EDM format.

1     EDM extensions

While it should be possible to annotate any level in the XML hierarchy, there are however some restrictions.

The general rule is you can annotation any CSDL / SSDL element that has a corresponding  MetadataItem which in practice means everything except <Using>, <Schema>, <Key> and <PropertyRef> elements.

These annotations should be named, so that they can be accessed using the same API as today.

i.e. something like this:

<EntityType Name="Content">
      <Key>
            <PropertyRef Name="ID" />
      </Key>
      <Property Name="ID" Type="Guid" Nullable="false" />
      <Property Name="HTML" Type="String" Nullable="false" MaxLength="Max" Unicode="true" FixedLength="false" />
      <CLR:Attributes>
            <CLR:Attribute TypeName="System.Runtime.Serialization.DataContract"/>
            <CLR:Attribute TypeName="MyNamespace.MyAttribute"/>
      </CLR:Attributes>
      <RS:Security>
            <RS:ACE Principal="S-0-123-1321" Rights="+R+W"/>
            <RS:
ACE Principal="S-0-123-2321" Rights="-R-W"/>
      </
RS:Security>
</EntityType>

1.1    Key points to notice:

  1. In the above example both the CLR and RS namespaces must have been declared somewhere. Probably at the root of the CSDL somewhere, i.e. something like this:
    <Schema xmlns:RS="http://schemas.microsoft.com/RS/2006" xmlns:CLR=http://schemas.microsoft.com/net/3.5
  2. Alternatively the namespace could be in lined in the annotation:
    <Security xmlns="http://schemas.microsoft.com/RS/2006">           
    </
    Security>
  3. In both cases the unique identity of the annotation is in the form {namespace}:{elementname}. So for the RS Security annotation example, irrespective of how the RS namespace is introduced the identity would be:
    http://schemas.microsoft.com/RS/2006:Security 
  4. Structural annotations should always follow all other sub-elements i.e. when structurally annotating an <EntityType> element the annotation element should follow all <Key> <Property> and <NavigationProperty> elements.
  5. Attribute based annotations (as supported in V1 and used for things like CodeGen) are scoped to the same annotation identity namespace. Hence care should be taken to verify that annotations using either approach don’t have colliding identities.
  6. It should be possible to have more than one named “Structural Annotation” per CSDL (or SSDL) element. Indeed element names can collide so long as the combination of Namespace + Element Name is unique for a particular element (or more specifically each MetadataItem). This means for instance you can’t have two <RS:Security> elements under any one MetadataItem.

1.2    Positive and Negative Cases:

This:

<EntityType Name="Content" CLR:Attribute="Blah">
      <CLR:Attributes>
            <CLR:Attribute TypeName="System.Runtime.Serialization.DataContract"/>
      </CLR:Attributes>

…would be invalid, because of the identity collision, likewise this would also be invalid:

<EntityType Name="Content" My:Attribute="Blah">
      <CLR:Attributes>
            <CLR:Attribute TypeName="System.Runtime.Serialization.DataContract"/>
      </CLR:Attributes>
      <CLR:Attributes>
            <CLR:Attribute TypeName="MyNamespace.MyAttribute"/>
      </CLR:Attributes>

where-as this is fine:

<EntityType Name="Content" My:Attribute="Blah">
      <CLR:Attributes>
            <CLR:Attribute TypeName="System.Runtime.Serialization.DataContract"/>
      </CLR:Attributes>
      <RS:Security>
            <RS:ACE Principal="S-0-123-1321" Rights="+R+W"/>
            <RS:
ACE Principal="S-0-123-2321" Rights="-R-W"/>
      </
RS:Security>

… since all the above annotations have unique identities.

1.3    What can an <Annotation> contain?

A structural annotation is simply an XML element. As such it can be considered a root of an XML document that can contain any valid XML structures. These structures are simply ignored by the Entity Framework and Metadata APIs.

1.4    What elements can be annotated?

If an element in the CSDL or SSDL has a corresponding MetadataItem in the Metadata API that element should support structural annotations. Since <Using>, <Schema>, <Key> and <PropertyRef> elements have no corresponding MetadataItem(s) they don’t support structural annotations.

Notice while TypeUsage(s) have no CSDL representation today,they are MetadataItem(s) so they could in theory be annotated in the future. For example if a public mutable metadata API is produced.

2     Metadata API Changes:

In this example we are using the Entity Frameworks Metadata API to get the EdmType for an EntityType called Content. From that we then get the MetadataProperty called http://schemas.microsoft.com/RS/2006:Security:

EdmType contentType = metadataWorkspace.GetEdmType(“MyNamespace.Content”);
if (contentType.MetadataProperties.Contains(
"http://schemas.microsoft.com/RS/2006:Security")
{
    MetadataProperty annotationProperty = contentType.MetadataProperties[ "http://schemas.microsoft.com/RS/2006:Security"]; 
    object annotationValue = annotationProperty.Value;
    //Do something...
}

Notice that the way you access structural annotations is the same as the way you access attribute annotations, i.e. by name/identity, in the MetadataProperties collection.

This of course means that their names can’t collide, name collisions should be picked up when metadata is loaded from persistent formats (like CSDL) or when annotations are created by hand in the any future mutable metadata API.

2.1    MetadataProperty TypeUsage

When creating a MetadataProperty to hold the XElement for the structural annotation, we need to assign a TypeUsage too. For V1 style attribute annotations, the TypeUsage we chose was based on Edm.String. Ideally we should use Edm.XML etc, but since this is not currently available, we will use Edm.String in the meantime. We have filled a bug to change this if/when [j1] XML becomes an EDM type.

3    What is the MetadataProperty.Value?

The Value of the MetadataProperty will be a XElement for structural annotations originating from CSDL.

So if you know that the annotation is a structural annotation you can access it like this:

MetadataProperty annotationProperty =
                     contentType.MetadataProperties["CLR:Attributes"];

XElement myAnnotation = annotationProperty.Value as XElement;

3.1    XElement is rooted at the <Annotation> element

The XElement returned will be the XElement containing everything including the structural annotation itself.

3.2    XElement is mutable

There is nothing to stop the consumer of the Metadata API from modifying the XElement since it is simply a reference, to an instance of a type the Entity Framework doesn’t control.

We could address this by cloning on access to make things a little safer. But this negatively affects the mainline scenario, namely where a customer uses annotations purely for read, and is therefore not required.

The important thing to understand is that should a user modify the XElement all bets are off. We should provide guidance both via API hints:

/// <summary>
/// ....
/// <remarks>
/// Should this return a reference type like an XElement, you should consider
/// this immutable, no guarantees are made about behavior if this reference
/// type is modified in anyway.
/// </remarks>
/// </summary>
public object Value { get {

And via appropriate documentation.

3.3    No universal annotation format

Rather than try to invent a new universal data format for annotations, we decided to be pragmatic.

So we are limiting ourselves to XML for structural annotations loaded from CSDL (an XML format). 

However since the MetadataProperty.Value is of type object, in a world where we have an alternative EDM representation (like in memory using mutable metadata for instance) we could easily assign something other than an XElement to the MetadataProperty.Value, like for example a xaml Serializable CLR type instance.

4     Out of Scope

The current plan is to simply support annotation in the CSDL & SSDL. Meaning annotating MSL is currently out of scope.

If we uncover meaningful important scenarios for MSL annotations, we will undertake that work separately. 

Incidentally supporting structural annotations in MSL touches a completely separate code path, and as a result will require roughly the same amount of work. Additionally in order to make undertaking this work meaningful we would first need to expose a public mapping metadata API.

....

As always we are keen to hear any feedback you have on this item.

Do you think you will find a use for Structured Annotations?

Alex James
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Discussion about API changes necessary for POCO:
01 August 08 05:24 PM | efdesign | 16 Comments   

Evolving an API to support new requirements, like POCO, while maintaining backward compatibility is challenging.

The following design discussion from members of the Object Services team illustrates some of the issues and hard calls involved.

Have a read, and tell us what you think.

In particular are we missing something, or overstating the importance of something? Let us know...

Anyway over to Diego and Mirek...

POCO API Discussion: Snapshot State Management

What is in a snapshot?

In Entity Framework v1 there is a single way for the state manager to learn about changes in entity instances: the change tracking mechanism is set in such a way that the entity instances themselves notify the state manager of any property change.

This works seamlessly if you use default code-generated classes, and it is also part of the IPOCO contract for developers willing to create their own entity class hierarchies.

For version 2, we are currently working on a snapshot-based change tracking mechanism that removes the requirement from classes to send notifications, and enables us to provide full POCO support.

The basic idea with snapshots is that when you start tracking an entity, a copy of its scalar values is made so that at a later point – typically but not always at the moment of saving changes into the database - you can detect whether anything has changed and needs to be persisted.

The challenge

When we created the v1 API, we made a few assumptions that were totally safe with notification-based change tracking but don’t completely prevail in a snapshot change tracking world.

We now need to choose the right set of adjustments for the API for it to gracefully adapt to the new scenarios we want to support.

Mainline scenario: SaveChanges() Method

In notification-based change tracking, by the time SaveChanges is invoked, the required entity state information is ready to use.

With snapshot thought, a property by property comparison needs to be computed for each tracked entity just to know whether it is unchanged or modified.

Once the snapshot comparison has taken place, SaveChanges can proceed normally.

In fact, assuming the process is triggered implicitly on every call to SaveChanges, a typical unit of work in which the program queries and attaches entities, then modifies and adds new entities, and finally persists the changes to the database, works unmodified with POCO classes:

Category category = context.Categories.First();

category.Name = "some new name"; // modify existing entity

Category newCategory = new Category(); // create a new entity

newCategory.ID = 2;

newCategory.Name = "new category";

context.AddObject("Categories", newCategory); // add a new entity

context.SaveChanges(); // detects all changes before saving

Things get more complicated when a user wants to use lower level APIs that deal with state management.

State Manager Public API

ObjectStateManager and ObjectStateEntry classes comprise the APIs that you need to deal with if you want to either get input from, or customize the behavior of Entity Framework’s state management in your own code.

Typically, you use these APIs if you want to:

  • Query for objects that are already loaded into memory
  • Manipulate the state of tracked entities
  • Validate state transitions or data just before persisting to the database
  • Etc.

As the name implies, ObjectStateManager is Entity Framework’s state manager object, which maintains state and original values for each tracked entity and performs identity management.

ObjectStateEntries represent entities and relationships tracked in the state manager. ObjectStateEntry functions as a wrapper around each tracked entity or relationship and allows you to get its current state (Unchanged, Modified, Added, Deleted and Detached) as well as the current and original values of its properties.

Needless to say, the primary client for these APIs is the Entity Framework itself.

ObjectStateEntry.State Property

The fundamental issue with snapshot is exemplified by this property.

With notification-based change tracking, the value for the new state is computed on the fly on each notification, and saved to an internal field. Getting the state later only encompasses reading the state from the internal field.

With snapshot, the state manager no longer gets notifications, and so the actual state at any time depends on the state when the entity began being tracked, whatever state transitions happened, and the current contents of the object.

Example: Use ObjectStateEntry to check the current state of the object.

Category category = context.Categories.First();

category.Name = "some new name"; // modify existing entity

ObjectStateEntry entry = context.ObjectStateManager.GetObjectStateEntry(category);

Console.WriteLine("State of the object: " + entry.State);

Question #1: What are the interesting scenarios for using the state management API in POCO scenarios?

Proposed solutions

In above example there are two possible behaviors and it's not obvious for us which one is better:

Alternative 1: Public ObjectStateManager.DetectChanges() Method

In the first alternative, computation of the snapshot comparison for the whole ObjectStateManager is deferred until a new ObjectStateManager method called DetectChanges is invoked. DetectChanges would iterate through all entities tracked by a state manager and would detects changes in entity's scalar values, references and collections using a snapshot comparison in order to compute the actual state for each ObjectStateEntry.

In the example below, the first time ObjectStateEntry.State is accessed, it returns EntityState.Unchanged. In order to get a current state of the "category" entity, we would need to invoke DetectChanges first:

Category category = context.Categories.First();

category.Name = "some new name"; // modify existing entity

ObjectStateEntry entry = context.ObjectStateManager.GetObjectStateEntry(category);

Console.WriteLine("State of the object: " + entry.State); // Displays "Unchanged"

context.ObjectStateManager.DetectChanges();

Console.WriteLine("State of the object: " + entry.State); // Displays "Modified"

DetectChanges would be implicitly invoked from within ObjectContext.SaveChanges.

Pros:

· User knows when detection of changes is performed

· DetectChanges is a method that user would expect to throw exceptions if some constraint is violated

· This alternative requires minimal changes in Entity Framework current API implementation

Cons:

· Since DetectChanges iterates through all the ObjectStateEntries, it is a potentially expensive method

· User has to remember to explicitly call DetectChanges() before using several methods/properties, otherwise, their result will be inaccurate:

o ObjectStateEntry.State

o ObjectStateEntry.GetModifiedProperties()

o ObjectStateManager.GetObjectStateEntries()

· This alternative implies adding a new method to ObjectStateManager

Alternative 2: Private ObjectStateEntry.DetectChanges() Method

In the second alternative, there is no public DetectChanges method. Instead, the computation of the current state of an individual entity or relationship is deferred until state of the entry is accessed.

Pros:

· User doesn't have to remember explicitly calling DetectChanges to get other APIs to work correctly

· Existing API works as expected in positive cases regardless of notification-based or snapshot tracking

· No new public API is added

Cons:

· The following methods now require additional processing to return accurate results in snapshot:

o ObjectStateEntry.State

o ObjectStateEntry.GetModifiedProperties()

o ObjectStateManager.GetObjectStateEntries()

· Existing API works differently in negative cases in notification-based and snapshot tracking:

o some of the existing methods would start throwing exceptions

or

o would require to introduce a new state of an entry – Invalid state (see below for details)

Question #2: What API pattern is better? Having an explicit method to compute the current state based on the snapshot comparisons or having the state to be computed automatically when accessing the state?

How this affects other APIs

While the State property exemplifies the issue, there are other APIs that would have different behaviors with the two proposed solutions.

ObjectStateEntry.GetModifiedProperties() Method

GetModifiedProperties returns the names of the properties that have been modified in an entity. Similar to the State property, with notification-based change tracking, an internal structure is modified on the fly on each notification. Producing the list later on only encompasses iterating through that structure.

With snapshot, the state manager no longer gets notifications, and at any given point in time, the actual list of modified properties really depends on a comparison between the original value and the current value of each property.

Therefore, for alternative #1, this APIs will potentially return wrong results unless it is invoke immediately after DetectChanges. For alternative #2, the behavior would be always correct.

ObjectStateManager.GetObjectStateEntries()

This is the case in which an implementation detail that was a good idea for notification-based change tracking stops offering performance benefits in snapshot. Internally, ObjectStateManager stores ObjectStateEntires in separate dictionaries depending on their state. But in snapshot, any unchanged or modified entity can become deleted because of a referential integrity constraint, and any unmodified entity can become modified.

In alternative #1, DetectChanges would iterate thought the whole contents of the ObjectSateManager, and thus it would update the internal dictionaries. Once this it done, it becomes safe to do in-memory queries using GetObjectStateEntires the same way it is done today.

In alternative #2, GetObjectStateEntires would need to look in unchanged, modified and deleted storage when asked for deleted entries; also in unchanged and modified storage when asked for modified entities.

Querying with MergeOption.PreserveChanges

In Entity Framework, MergeOptions is a setting that changes the behavior of a query with regards to its effects on the state manager. Of all the possibilities, PreserveChanges requires an entity to be in the Unchanged state in order to overwrite it with values coming from the database. In order for PreserveChanges to work correctly, accurate information on the actual state of entities is needed.

Therefore, with alternative #1, querying with PreserveChanges will not have a correct behavior unless it is done immediately after invoking DetectChanges. For alternative #2, querying with PreserverChanges would behavior correctly at any time.

Referential Integrity constraints

When there is a referential integrity constraint defined in the model, a dependent entity may become deleted if the principal entity or the relationship with the principal entity becomes deleted.

For alternative #1, DetectChanges would also trigger Deletes to propagate downwards throw all referential integrity constraints.

For alternative #2, finding out the current state of an entity that is dependent in a RIC would requires traversing the graph upwards to find if either the principal entity or a relationship has been deleted.

Mixed Mode Change Tracking

At the same time we are working on snapshot change tracking, we are also working on another feature, denominated dynamic proxies. This feature consists of Entity Framework automat