The official source of information on Managed Providers, DataSet & Entity Framework from Microsoft
The information in this post is out of date.
Visit msdn.com/data/ef for the latest information on current and past releases of EF.
Visual Studio 2008 and .Net 3.5 SP1 come with two tools that are capable of supporting a database-first application design scenario: the EdmGen command-line tool and the New ADO.Net Entity Model item template. These two tools produce Entity Framework metadata artifacts from a database instance, but the generated model and mapping have a one-to-one correspondence with the database. Any inheritance or partitioning relationships that may be naturally present in the schema or data of that instance are not represented in the generated model.
The new version of EdmGen2, recently released on Code Gallery, includes an option “RetrofitModel”. This extension is the result of collaboration with the database team at Microsoft Research, notably summer intern Ankit Malpani. When using this option, EdmGen2 uses data mining techniques to determine if there are TPH, TPT, or vertical partitioning relationships that are inherent in the instance. It then generates a suitable model and mapping in the form of csdl, msl, ssdl, and edmx files. The following figure illustrates one of the patterns that EdmGen2 looks for in an instance, and the resulting model:
To use the tool, launch EdmGen2 with the following arguments:
/RetrofitModel "connection string" "provider" "model name"
For instance, the following will connect to a local AdventureWorks database, and create files and an EDM instance named AVWorks:
EdmGen2 /RetrofitModel "Server=(local); Integrated Security=true; Initial Catalog=AdventureWorks;" "System.Data.SqlClient" "AVWorks"
There is an optional final parameter that can be added. The parameter must be a numeric value between 0 and 1. This parameter represents a threshold ratio of elements in a generated child entity to the number of elements in its parent in a hierarchy. If the ratio of any generated inheritance falls below this threshold, the child entity is effectively removed and merged back into the parent. The tool may identify potential inheritance relationships on an insignificant portion of an entity; this parameter allows the user to define what “insignificant” means. If no parameter is specified, a default value of 0.05 is used.
Internally, EdmGen2 uses a sequence of rules that mine the database for patterns in data and schema that provide evidence of inheritance or partitioning relationships. We first use the System.Data.Entity.Design API to generate a default model. We then identify any entities that represent many-to-many associations and turn them into associations. Finally, we run the following rules, in order, to iteratively refine the model:
We have defined rules that can identify TPC relationships and horizontal partitioning as well. These rules were excluded from this release to ensure that all generated models could be viewable in the Visual Studio design surface. In the future, if there is sufficient demand, we will add this functionality to the release. We are also interested to see if there are additional patterns that signify inheritance hierarchies in live production databases that are not covered by these rules, and if so, how we can generalize them for automatic identification.
- James Terwilliger, Post Doc researcher, MSR