EDM and Store functions exposed in LINQ
08 October 08 09:37 AM | efdesign | 9 Comments   

In this post Colin Meek and Diego Vega delve into some enhancements we are planning for LINQ to Entities, anyway over to them...

Entity Framework v1 customers preferring to write their queries using LINQ often hit a limitation on the range of functions and query patterns supported in LINQ to Entities. For some of those customers, having to resort to Entity SQL, or even to Entity SQL builder methods, feels awkward and reduces the appeal of Entity Framework.

There are two things we want to do in order to address this in future versions:

  • Expand the range of patterns and standard BCL methods we recognize in LINQ expressions.

  • Provide an extensibility mechanism that people can use to map arbitrary CLR methods to appropriate server and EDM functions.

This blog post expands on the second approach:

It is actually possible for us to improve our LINQ implementation so that all functions defined in the EDM and in the store, and even user defined functions, can be mapped to CLR methods with homologous signatures.

Design

Problem space

There are multiple dimensions to the problem space we want to address:

  • Functions can be defined in either the conceptual or the storage space

  • Functions can be defined in either the manifest, or just declared in the model

  • Functions can be mapped to either static CLR methods or to instance methods on the ObjectContext

  • This feature specifically targets composable functions

What it looks like: EdmFunctionAttribute

The basis of the extensibility mechanism is a new method-level attribute that carries function mapping information. Here is the basic signature of the attribute’s constructor:

public EdmFunctionAttribute(string namespaceName, string functionName)

The namespaceName parameter indicates the namespace for the function in metadata (i.e. “EDM” or “SQLSERVER”, or other store provider namespace). The functionName parameter takes the name of the function itself.

The following example could be product code or customer code applying the attribute on an extension method (it could be a regular static function) in order to map it to the standard deviation SQL Server function:

public static class SqlFunctions
{
    [EdmFunction("SqlServer", "stdev")]
    public static double? StandardDeviation(this IEnumerable<int?> source)
    {

        throw EntityUtil.NotSupported(
            System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall);
    }
}

Notice that while this method can’t be called directly it can be used in a query like this:

var query = 
    from p in context.Products
    where !p.Discontinued
    group p by p.Category into g
    select g.Select(each => each.ReorderLevel).StandardDeviation();

The following example shows how the canonical DiffYears function is mapped:

public static class EntityFunctions
{
    [EdmFunction("EDM", "DiffYears")]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}

Usage is:

var query =
    from p in context.Products
    where EntityFunctions.DiffYears(DateTime.Today, p.CreationDate) < 5
    select p;

The following example shows how a user defined function defined in SQL Server can be mapped:

public static class MyCustomFunctions
{
    [EdmFunction("SqlServer", "MyFunction")]
    public static Int32? MyFunction(string myArg)
    {
        throw new NotSupportedException("Direct calls not supported"); 
    }
}

Convention based function name

We can establish that by convention the name of the CLR function defines the value of the functionName parameter. That makes the functionName parameter in the EdmFunctionAttribute optional.

EdmFunctionNamespaceAttribute

To avoid having to always specify the namespaceName for each function, we define a new class-level attribute named EdmFunctionNamespaceAttribute that would define the namespace mapping globally for a given class:

public EdmFunctionNamespaceAttribute(string namespaceName)

Using EdmFunctionNamespaceAttribute and the convention based constructor:

[EdmFunctionNamespace("EDM")]
public static class EdmMethods
{
    [EdmFunction]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(
            System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}
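
The lookup implied by the two conventions above can be sketched roughly as follows. This is only an illustration under stated assumptions: SketchEdmFunctionAttribute, SketchEdmFunctionNamespaceAttribute and FunctionNameResolver are hypothetical stand-ins written for this example, not the proposed product types, and the real resolution logic may differ.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the proposed method-level attribute (assumption).
[AttributeUsage(AttributeTargets.Method)]
public sealed class SketchEdmFunctionAttribute : Attribute
{
    public SketchEdmFunctionAttribute() { }

    public SketchEdmFunctionAttribute(string namespaceName, string functionName)
    {
        NamespaceName = namespaceName;
        FunctionName = functionName;
    }

    public string NamespaceName { get; private set; }
    public string FunctionName { get; private set; }
}

// Hypothetical stand-in for the proposed class-level attribute (assumption).
[AttributeUsage(AttributeTargets.Class)]
public sealed class SketchEdmFunctionNamespaceAttribute : Attribute
{
    public SketchEdmFunctionNamespaceAttribute(string namespaceName)
    {
        NamespaceName = namespaceName;
    }

    public string NamespaceName { get; private set; }
}

public static class FunctionNameResolver
{
    // Returns "Namespace.Function" for a marked method, or null if it is unmarked.
    public static string Resolve(MethodInfo method)
    {
        var fn = (SketchEdmFunctionAttribute)Attribute.GetCustomAttribute(
            method, typeof(SketchEdmFunctionAttribute));
        if (fn == null)
        {
            return null;
        }

        // Namespace: the method-level value wins, else fall back to the class-level attribute.
        string ns = fn.NamespaceName;
        if (ns == null)
        {
            var classAttr = (SketchEdmFunctionNamespaceAttribute)Attribute.GetCustomAttribute(
                method.DeclaringType, typeof(SketchEdmFunctionNamespaceAttribute));
            if (classAttr == null)
            {
                throw new InvalidOperationException("No function namespace specified.");
            }
            ns = classAttr.NamespaceName;
        }

        // Function name: by convention, default to the CLR method name.
        string name = fn.FunctionName ?? method.Name;
        return ns + "." + name;
    }
}

[SketchEdmFunctionNamespace("EDM")]
public static class EdmMethodsSketch
{
    [SketchEdmFunction]
    public static int? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw new NotSupportedException("Direct calls not supported");
    }
}
```

With these definitions, FunctionNameResolver.Resolve(typeof(EdmMethodsSketch).GetMethod("DiffYears")) yields "EDM.DiffYears": the namespace comes from the class-level attribute and the function name from the method itself.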

How it works

When a method with the EdmFunction attribute is detected within a LINQ query expression, its treatment is identical to that of a function within an Entity-SQL query. Overload resolution is performed with respect to the EDM types (not CLR types) of the function arguments. Ambiguous overloads, missing functions or lack of overloads result in an exception. In addition, the return type of the method must be validated. If the CLR return type does not have an implicit cast to the appropriate EDM type, the translation will fail.
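
As a rough illustration of the detection step only (not of overload resolution or type validation), a translator could walk the expression tree and pick out calls to attributed methods. The sketch below uses the public System.Linq.Expressions.ExpressionVisitor class; MappedFunctionAttribute, MappedFunctionFinder, ProxyFunctions and FinderDemo are hypothetical names made up for this example and are not part of the design.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical marker attribute standing in for the proposed EdmFunctionAttribute.
[AttributeUsage(AttributeTargets.Method)]
public sealed class MappedFunctionAttribute : Attribute
{
    public MappedFunctionAttribute(string namespaceName, string functionName)
    {
        NamespaceName = namespaceName;
        FunctionName = functionName;
    }

    public string NamespaceName { get; private set; }
    public string FunctionName { get; private set; }
}

// Walks an expression tree and records every call to a method carrying the marker.
public sealed class MappedFunctionFinder : ExpressionVisitor
{
    public readonly List<string> Found = new List<string>();

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        var attr = (MappedFunctionAttribute)Attribute.GetCustomAttribute(
            node.Method, typeof(MappedFunctionAttribute));
        if (attr != null)
        {
            Found.Add(attr.NamespaceName + "." + attr.FunctionName);
        }
        return base.VisitMethodCall(node); // keep walking into the arguments
    }
}

public static class ProxyFunctions
{
    [MappedFunction("EDM", "DiffYears")]
    public static int? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw new NotSupportedException("Direct calls not supported");
    }
}

public static class FinderDemo
{
    public static List<string> Run()
    {
        // The lambda is only inspected as a tree; DiffYears is never invoked.
        Expression<Func<DateTime?, bool>> predicate =
            d => ProxyFunctions.DiffYears(DateTime.Today, d) < 5;
        var finder = new MappedFunctionFinder();
        finder.Visit(predicate);
        return finder.Found;
    }
}
```

FinderDemo.Run() returns a list with the single entry "EDM.DiffYears". A real translator would go on to resolve the overload against EDM types at this point rather than just recording the name.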

Instance methods on the ObjectContext will be supported as well. This allows the method to bootstrap itself and trigger direct evaluation, as in the following example (definition of the method and sample query):

public class MyObjectContext : ObjectContext
{
    // Method definition
    [EdmFunction("edm", "floor")]
    public double? Floor(double? value)
    { 
        return this.QueryProvider.Execute<double?>(Expression.Call(
            Expression.Constant(this),
            (MethodInfo)MethodInfo.GetCurrentMethod(),
            Expression.Constant(value, typeof(double?))));
    }
}



// evaluated in the store!
context.Floor(0.1);

Without the ObjectContext, the function cannot reach the store! To support this style of bootstrapping, the context needs to expose the LINQ query provider. For this reason, we now expose a “QueryProvider” property on the ObjectContext. This provider includes the necessary surface to construct or execute a query given a LINQ expression.

public class ObjectContext
{
    public IQueryProvider QueryProvider { get; }
}

If such a method is encountered inline in another query, then we must validate that the instance argument (MethodCallExpression.Object) is the correct context, but the instance is otherwise ignored:

// positive
var q1 = from p in context.Products select context.Floor(p.Price);

// negative
var q2 = from p in context.Products select context2.Floor(p.Price);
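
One possible way to perform that instance check is sketched below. It is an assumption about how such validation might be done, not the actual implementation: SketchContext is a hypothetical stand-in for a derived ObjectContext, and the check simply evaluates the instance sub-expression and compares it by reference with the context that owns the query.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for a derived ObjectContext exposing a function proxy.
public class SketchContext
{
    public double? Floor(double? value)
    {
        throw new NotSupportedException("Direct calls not supported in this sketch");
    }
}

public static class ContextInstanceCheck
{
    // True when the call's instance argument evaluates to expectedContext.
    // Assumes the instance sub-expression does not depend on lambda parameters,
    // which holds for a captured context variable.
    public static bool TargetsContext(MethodCallExpression call, object expectedContext)
    {
        if (call.Object == null)
        {
            return false; // static method: there is no instance argument to check
        }

        Func<object> getInstance = Expression
            .Lambda<Func<object>>(Expression.Convert(call.Object, typeof(object)))
            .Compile();
        return object.ReferenceEquals(getInstance(), expectedContext);
    }

    public static bool[] Demo()
    {
        var context = new SketchContext();
        var context2 = new SketchContext();

        Expression<Func<double?, double?>> positive = p => context.Floor(p);
        Expression<Func<double?, double?>> negative = p => context2.Floor(p);

        return new[]
        {
            TargetsContext((MethodCallExpression)positive.Body, context),
            TargetsContext((MethodCallExpression)negative.Body, context)
        };
    }
}
```

In Demo, the first check passes because the call is made on the same context, and the second fails because the call is made on context2.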

A function proxy can sometimes bootstrap itself without an explicit context, e.g. when an input argument is itself an IQueryable:

public static class SqlFunctions
{
    [EdmFunction("SqlServer", "stdev")]
    public static double? StandardDeviation(this IQueryable<int?> source)
    {
        return source.Provider.Execute<double?>(Expression.Call(
            (MethodInfo)MethodInfo.GetCurrentMethod(),
            Expression.Constant(source)));
    }
}

Nullability considerations

Particularly for functions taking collections, we will need to provide overloads for nullable and non-nullable elements. We don’t want to require awkward constructions like:

var query = (from p in products select (int?)p.ReorderLevel).StandardDeviation();

Tool for Generating the Functions

We created a simple internal tool that generates the classes representing all the EDM canonical functions and the SQL Server store functions. The tool takes the function definitions from metadata and generates the appropriate function stubs/implementations.

The tool will be outside the product and will be run on demand. We expect to make a version of this tool available for provider writers together with the provider samples.

Naming

The methods will be in the following classes:

Namespace | Class name
System.Data.Objects | EntityFunctions
System.Data.Objects.SqlClient | SqlFunctions

Note: The equivalent class in LINQ to SQL is System.Data.Linq.SqlClient.SqlMethods.

The method names will correspond to the name of the EDM/SQL function they represent. The argument names will correspond to the argument names of the EDM/SQL functions as retrieved by the metadata.

The recommendation for provider writers will be to include a similar static class in a namespace of the following form:

System.Data.Objects.[Standard provider namespace].[Standard provider prefix]Functions

Overloads and Implementation

Non-aggregate Functions

For each non-aggregate function we create an overload in which every input is typed as the nullable form of the CLR equivalent of its EDM primitive type, and the return type is likewise the nullable form of the CLR equivalent of its EDM primitive type.

The implementation of the functions (what gets executed if the function is invoked outside an expression tree) will be to throw a NotSupportedException.

Example:

public static class EntityFunctions
{
    [EdmFunction("EDM", "DiffYears")]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}

public static class SqlFunctions
{
    [EdmFunction("SqlServer", "DiffYears")]
    public static Int32? DiffYears(DateTime? arg1, DateTime? arg2)
    {
        throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall); 
    }
}

Aggregate Functions

For each aggregate function we will provide two overloads, one taking IEnumerable<Nullable<T>> and another taking IEnumerable<T>, where T is the CLR equivalent of the EDM primitive type of the input. Each implementation will check whether the input is queryable (an ObjectQuery in the example below), in which case it will bootstrap itself and execute in the store.

Example:

[EdmFunction("EDM", "VARP")]
public static double? VarP(IEnumerable<int> arg1)
{
    // Self-bootstrap only when the input is an ObjectQuery, i.e. it can reach the store.
    ObjectQuery<int> objectQuerySource = arg1 as ObjectQuery<int>;
    if (objectQuerySource != null)
    {
        return ((IQueryable)objectQuerySource).Provider.Execute<double?>(Expression.Call(
            (MethodInfo)MethodInfo.GetCurrentMethod(),
            Expression.Constant(arg1)));
    }
    // Any other sequence cannot be evaluated in the store.
    throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall);
}

[EdmFunction("EDM", "VARP")]
public static double? VarP(IEnumerable<int?> arg1)
{
    // Same pattern as above, for nullable elements.
    ObjectQuery<int?> objectQuerySource = arg1 as ObjectQuery<int?>;
    if (objectQuerySource != null)
    {
        return ((IQueryable)objectQuerySource).Provider.Execute<double?>(Expression.Call(
            (MethodInfo)MethodInfo.GetCurrentMethod(),
            Expression.Constant(arg1)));
    }
    throw EntityUtil.NotSupported(System.Data.Entity.Strings.ELinq_EdmFunctionDirectCall);
}

The Entity Framework team would love to hear your comments.

Alex James 
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Model First
10 September 08 09:55 PM | efdesign | 30 Comments   

One of the most painful omissions from the Entity Framework V1 was Model First, which basically means creating a conceptual 'model first' and then deriving a storage model, database and mappings from that.

People ask for this scenario all the time in the forums.

Well Noam, a Program Manager on the Entity Framework Tools team, outlines what we are considering:

Generating Databases from Models

The next release of the Entity Framework will include the ability to generate database schemas from your model. The main entry point into this feature is via the Designer context menu, to which we will add a new option called “Create Database from Model”.

(Note: User interface elements in this walkthrough are not final user interfaces, as the design is still under review, so please provide feedback...)

[Screenshot: Designer context menu showing the “Create Database from Model” option]

Selecting this option will bring up the following warning:

[Screenshot: warning dialog]

We would like users to understand that this feature will regenerate all SSDL and all MSL from scratch.

If you have started from an empty model, you will be asked to specify the target database. This screen is identical to the one shown when reverse engineering a model from a database. The implication here is that you will need an available server and database, which the system will use to determine what “flavor” of DDL to generate.

[Screenshot: target database selection screen]

Once you have selected the target database, you will be presented with a summary screen which will provide a preview of the DDL that will be generated, as well as a tree-view of the objects.

The tree view:

[Screenshot: tree view of the objects]

The DDL view:

[Screenshot: DDL preview]

The DDL in the above screenshot is only there to show what the window will look like; it is not intended to be representative of the DDL that the system will generate when you actually create your database (more about this soon). The DDL will, however, be read-only: since it is generated by a template, the results should be edited either in the template or in a separate DDL file that will not get regenerated.

Two options will be available:

- Save the DDL (off by default). This option will add the DDL as a dependent file under your EDMX file.

- Deploy the DDL (on by default). This option will deploy the DDL to the specified target database.

The generated DDL will not migrate data or schema - by default your database will be recreated from scratch.

Out of the box, we will support the Table-per-Type mapping strategy, meaning that we will create a table for each of your types and subtypes. For example, for a model like this…

[Diagram: example model]

…the following schema will be generated:

[Diagram: generated schema]

Of interest here are the PK-to-PK constraint between the Customer and Persons tables, which helps enforce the inheritance relationship, and the creation of foreign key columns to represent the various associations. In addition, the engine will create clustered keys on primary keys, and indexes on foreign keys that represent associations.

Under the Hood

The model first process is implemented using a Windows Workflow Foundation workflow that looks like this:

[Diagram: Model First workflow]

Here is what the stages do:

Stage | Purpose
CSDLtoSSDL | Creates the mappings (MSL) and database store model (SSDL) in the EDMX file.
SSDLtoDBSchema | Converts the SSDL to the format used by the Microsoft.Data.Schema APIs. This format is used as a “universal” database description format and includes physical information not present in the Entity Framework’s store model, such as indexes.
GenerateDDL | Uses the Microsoft.Data.Schema APIs to convert the universal format to store-specific DDL. In this release, we will support a minimum of SQL 2005 and SQL 2008. A provider model is in place, however, and we hope to add support for additional databases.
SuspendToConfirm | This activity pauses the workflow to allow the wizard to display the DDL.
DeployToDatabase | Deploys the DDL to the target database.
OutputDDL | Writes the DDL to the file system.

These stages are expressed in a XAML file which will be placed underneath your EDMX, to allow for customization: You can add your own steps or replace ones we have created with steps that you write.

Templates

Several of the steps above make use of templates to provide an additional point of control for users, and this is where we expect most of the customization to happen. These templates use the T4 Engine that is included in Visual Studio. We are currently working on making these templates as simple as possible by providing a set of supporting APIs that provide metadata collections that are designed for artifact generation – for example, a collection of all inherited properties for a type, or a collection of both inherited and defined properties. There are three templates:

Template | Purpose
CSDL to SSDL | Creates the SSDL for the target database.
CSDL to MSL | Creates the MSL for mapping the CSDL to the generated SSDL.
SSDL to DBSchema | Creates the physical database description, including elements such as foreign keys and indexes.

So, for example, if you need all table names to start with “_tbl”, you would modify the SSDL generation template to give all tables the appropriate name, and also modify the MSL template to provide the appropriate mappings. The DBSchema template will not need to be modified as it will automatically pick up the changes made to the SSDL which is its input.

As another example, if you wish to change the mapping strategy from Table-per-Type to Table-per-Hierarchy, you would need to change this same pair of templates.

Finally, if you needed to support a database for which no Microsoft.Data.Schema provider is available, you could replace the GenerateDDL step with your own template-driven activity which could then transform the schema “manually” to your store’s DDL format.

We hope this gives you enough information to understand the design and intent of this feature.

We would love to hear your comments.

Alex James
Program Manager,
Entity Framework Team

This post is part of the transparent design exercise in the Entity Framework Team. To understand how it works and how your feedback will be used please look at this post.

Structural Annotations - One Pager
12 August 08 08:50 PM | efdesign | 2 Comments   

In V1 of the Entity Framework it is possible to annotate a schema using attributes declared in another XSD.

However XML attributes are a very limited form of annotation. It would be better if we could annotate using full elements.

This is what we are calling Structural Annotations.

This feature will allow both customers and partners like Reporting Services to modify the model so that it includes information important to them which can’t be captured in vanilla EDM format.

1     EDM extensions

While it should be possible to annotate any level in the XML hierarchy, there are however some restrictions.

The general rule is that you can annotate any CSDL / SSDL element that has a corresponding MetadataItem, which in practice means everything except <Using>, <Schema>, <Key> and <PropertyRef> elements.

These annotations should be named, so that they can be accessed using the same API as today.

i.e. something like this:

<EntityType Name="Content">
      <Key>
            <PropertyRef Name="ID" />
      </Key>
      <Property Name="ID" Type="Guid" Nullable="false" />
      <Property Name="HTML" Type="String" Nullable="false" MaxLength="Max" Unicode="true" FixedLength="false" />
      <CLR:Attributes>
            <CLR:Attribute TypeName="System.Runtime.Serialization.DataContract"/>
            <CLR:Attribute TypeName="MyNamespace.MyAttribute"/>
      </CLR:Attributes>
      <RS:Security>
            <RS:ACE Principal="S-0-123-1321" Rights="+R+W"/>
            <RS:ACE Principal="S-0-123-2321" Rights="-R-W"/>
      </RS:Security>
</EntityType>

1.1    Key points to notice:

  1. In the above example both the CLR and RS namespaces must have been declared somewhere, probably at the root of the CSDL, i.e. something like this:
    <Schema xmlns:RS="http://schemas.microsoft.com/RS/2006" xmlns:CLR="http://schemas.microsoft.com/net/3.5">
  2. Alternatively the namespace could be inlined in the annotation:
    <Security xmlns="http://schemas.microsoft.com/RS/2006">
    </Security>
  3. In both cases the unique identity of the annotation is in the form {namespace}:{elementname}. So for the RS Security annotation example, irrespective of how the RS namespace is introduced the identity would be:
    http://schemas.microsoft.com/RS/2006:Security 
  4. Structural annotations should always follow all other sub-elements i.e. when structurally annotating an <EntityType> element the annotation element should follow all <Key> <Property> and <NavigationProperty> elements.
  5. Attribute based annotations (as supported in V1 and used for things like CodeGen) are scoped to the same annotation identity namespace. Hence care should be taken to verify that annotations using either approach don’t have colliding identities.
  6. It should be possible to have more than one named “Structural Annotation” per CSDL (or SSDL) element. Indeed element names can collide so long as the combination of Namespace + Element Name is unique for a particular element (or more specifically each MetadataItem). This means for instance you can’t have two <RS:Security> elements under any one MetadataItem.
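
The identity rule in the points above can be illustrated with a small check written against System.Xml.Linq. This is a sketch under stated assumptions rather than the Entity Framework's actual validation: AnnotationIdentityCheck is a hypothetical helper, it treats any namespace-qualified attribute or any child element outside the parent's own namespace as an annotation, and it compares the namespace and local name literally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AnnotationIdentityCheck
{
    // Returns the identities ("{namespace}:{name}") that occur more than once on one
    // element, counting both annotation attributes and annotation child elements.
    public static List<string> FindCollisions(XElement element)
    {
        XNamespace own = element.Name.Namespace;
        var identities = new List<string>();

        foreach (XAttribute attribute in element.Attributes())
        {
            // Skip xmlns declarations and unqualified attributes such as Name.
            if (attribute.IsNamespaceDeclaration || attribute.Name.Namespace == XNamespace.None)
            {
                continue;
            }
            identities.Add(attribute.Name.NamespaceName + ":" + attribute.Name.LocalName);
        }

        foreach (XElement child in element.Elements())
        {
            // Children in the parent's own namespace (<Key>, <Property>, ...) are not annotations.
            if (child.Name.Namespace == own)
            {
                continue;
            }
            identities.Add(child.Name.NamespaceName + ":" + child.Name.LocalName);
        }

        return identities
            .GroupBy(id => id)
            .Where(group => group.Count() > 1)
            .Select(group => group.Key)
            .ToList();
    }

    public static List<string> Demo()
    {
        // "urn:example:csdl" is a placeholder namespace used only for this sketch.
        string xml =
            "<EntityType Name='Content' xmlns='urn:example:csdl' " +
            "xmlns:CLR='http://schemas.microsoft.com/net/3.5'>" +
            "<CLR:Attributes/><CLR:Attributes/>" +
            "</EntityType>";
        return FindCollisions(XElement.Parse(xml));
    }
}
```

Demo() reports the single colliding identity http://schemas.microsoft.com/net/3.5:Attributes, because the same namespace-qualified element name appears twice under one element.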

1.2    Positive and Negative Cases:

This:

<EntityType Name="Content" CLR:Attribute="Blah">
      <CLR:Attributes>
            <CLR:Attribute TypeName="System.Runtime.Serialization.DataContract"/>
      </CLR:Attributes>

…would be invalid, because of the identity collision, likewise this would also be invalid:

<EntityType Name="Content" My:Attribute="Blah">
      <CLR:Attributes>
            <CLR:Attribute TypeName="System.Runtime.Serialization.DataContract"/>
      </CLR:Attributes>
      <CLR:Attributes>
            <CLR:Attribute TypeName="MyNamespace.MyAttribute"/>
      </CLR:Attributes>

whereas this is fine:

<EntityType Name="Content" My:Attribute="Blah">
      <CLR:Attributes>
            <CLR:Attribute TypeName="System.Runtime.Serialization.DataContract"/>
      </CLR:Attributes>
      <RS:Security>
            <RS:ACE Principal="S-0-123-1321" Rights="+R+W"/>
            <RS:ACE Principal="S-0-123-2321" Rights="-R-W"/>
      </RS:Security>

… since all the above annotations have unique identities.

1.3    What can an <Annotation> contain?

A structural annotation is simply an XML element. As such it can be considered a root of an XML document that can contain any valid XML structures. These structures are simply ignored by the Entity Framework and Metadata APIs.

1.4    What elements can be annotated?

If an element in the CSDL or SSDL has a corresponding MetadataItem in the Metadata API that element should support structural annotations. Since <Using>, <Schema>, <Key> and <PropertyRef> elements have no corresponding MetadataItem(s) they don’t support structural annotations.

Notice that while TypeUsage(s) have no CSDL representation today, they are MetadataItem(s), so they could in theory be annotated in the future, for example if a public mutable metadata API is produced.

2     Metadata API Changes:

In this example we are using the Entity Framework’s Metadata API to get the EdmType for an EntityType called Content. From that