Update Model – a Question of Identity – Part 2 of 3

Published 22 July 08 09:23 AM | dpblogs 

This article is Part 2 of 3 – please see http://blogs.msdn.com/adonet/archive/2008/07/21/update-model-a-question-of-identity-part-1-of-3.aspx for Part 1.

In the last article I explained the principle that the C-side updates would be non-destructive.

The “Update Model” code needs to decide when we should add new C-side EntityTypes, new C-side Properties and new C-side Associations. This brings us to the idea of identity. Bear with me – this can get pretty complicated.

Suppose you have a database with two tables in it. They have the same name but are in different schemas, e.g. schema1.Customer and schema2.Customer. All S-side EntitySets (which are the way the EDM represents a database table or view) in a given model must have unique names. So if you attempt to import both of the above tables then one of them will be represented as an EntitySet with name ‘Customer’ and one with name ‘Customer1’, each with a matching EntityType. Which one will be which? Well, it depends on the order in which you import them and sometimes on what else you import at the same time. The important point is that the Name attribute of the EntitySet is not always the same as the name of the underlying database object.

So we cannot rely on the Name attribute of a given EntitySet to identify the underlying database object. But for “Update Model from Database” we do need to identify the true underlying database object in order to decide whether a given one is “new”. So instead we construct a “database identity” for each S-side EntitySet, which consists of its underlying schema name and its underlying table/view name. If the EntitySet’s Name attribute does not match the underlying table/view name then either a Table attribute or the special attribute store:Name (where ‘store’ is a prefix defined as the XML namespace “http://schemas.microsoft.com/ado/2007/12/edm/EntityStoreSchemaGenerator”) is defined on that EntitySet. The value of this attribute is the true underlying database object name. Similarly, the Schema or store:Schema attribute represents the true underlying database schema name (and there’s a similar story for Function elements, which are how the EDM represents stored procedures).

This combination of the true database schema name and the true database object name is the “database identity” of that EntitySet. “New” S-side EntitySets are defined as those whose “database identity” does not appear in the S-side of the existing model.
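To make the “database identity” idea concrete, here is a minimal Python sketch. It is not the actual “Update Model” code. It assumes a simplified S-side fragment (no surrounding Schema element or SSDL namespace), and the helper names database_identity, entity_sets and new_entity_sets are made up for illustration. The order in which the Table and store:Name attributes are consulted, and the exact (case-sensitive) comparison, are also assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace the article gives for the 'store' prefix.
STORE_NS = "http://schemas.microsoft.com/ado/2007/12/edm/EntityStoreSchemaGenerator"


def database_identity(entity_set):
    """Return (schema name, object name) for one S-side EntitySet element."""
    attrs = entity_set.attrib
    # Assumed precedence: explicit override attributes first, then Name.
    name = attrs.get("Table") or attrs.get("{%s}Name" % STORE_NS) or attrs.get("Name")
    schema = attrs.get("Schema") or attrs.get("{%s}Schema" % STORE_NS)
    return (schema, name)


def entity_sets(ssdl_text):
    """Find every EntitySet element, whatever XML namespace it lives in."""
    root = ET.fromstring(ssdl_text)
    return [el for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "EntitySet"]


def new_entity_sets(existing_ssdl, imported_ssdl):
    """Names of imported EntitySets whose identity is absent from the existing model."""
    known = {database_identity(es) for es in entity_sets(existing_ssdl)}
    return [es.get("Name") for es in entity_sets(imported_ssdl)
            if database_identity(es) not in known]


# Existing model: schema1.Customer was imported first, so it got the name 'Customer'.
EXISTING = """<EntityContainer Name="StoreContainer" xmlns:store="%s">
  <EntitySet Name="Customer" EntityType="Self.Customer" Schema="schema1" />
</EntityContainer>""" % STORE_NS

# Fresh import in a different order: the names are swapped, the identities are not.
IMPORTED = """<EntityContainer Name="StoreContainer" xmlns:store="%s">
  <EntitySet Name="Customer1" EntityType="Self.Customer1" Schema="schema1" Table="Customer" />
  <EntitySet Name="Customer" EntityType="Self.Customer" Schema="schema2" />
</EntityContainer>""" % STORE_NS

print(new_entity_sets(EXISTING, IMPORTED))
```

In this sketch only the schema2 table counts as “new”, even though its EntitySet happens to carry the name ‘Customer’ that the existing model already uses for the schema1 table. Comparing Name attributes alone would have got this exactly backwards.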

For every imported, new S-side EntitySet we create a single, matching C-side EntitySet and EntityType and the mappings necessary to map them through to the S-side EntitySet. If it is available, we will use the existing S-side name for the matching C-side EntityType, but that name may already be in use in which case we will try appending 1, 2, 3 … until we find a name that is not in use.
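The naming rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: unique_name is a hypothetical helper, and names_in_use stands in for whatever set of C-side names the real tool checks against.

```python
def unique_name(preferred, names_in_use):
    """Return preferred if it is free, else preferred + 1, 2, 3, ... (first free one)."""
    if preferred not in names_in_use:
        return preferred
    suffix = 1
    # Keep counting up until we find a candidate nobody is using.
    while "%s%d" % (preferred, suffix) in names_in_use:
        suffix += 1
    return "%s%d" % (preferred, suffix)


# A C-side 'Customer' already exists, so the new type becomes 'Customer1'.
print(unique_name("Customer", {"Customer", "Order"}))
```

Note that the result depends on which names are already taken at the moment of the import, which is one more reason the name cannot serve as the identity.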

So as an example, if a C-side EntityType, say ‘Customer’, was originally mapped to the S-side ‘Customer’ EntityType but has been manually unmapped, and “Update Model from Database” is run, then the above rules will _not_ result in a “duplicate” C-side EntityType called ‘Customer1’, because “Update Model” knows that there was already a ‘Customer’ S-side EntityType. We assume the user knew what they were doing when they unmapped the C-side EntityType. However, if a _new_ S-side EntityType called ‘Customer’ is added and there happens to exist a C-side EntitySet already called ‘Customer’, then the new, matching C-side EntityType would be called ‘Customer1’ and it will be up to the user to rename the C-side EntityTypes as needed.

In the next article I’ll explain how this concept is extended to C-side Properties and Associations.

Lawrence Jones
Software Design Engineer, Entity Framework Tools

Comments

# Frans Bouma said on July 23, 2008 3:29 AM:

I fail to see why you didn't solve this by adding the concept of 'grouping' of entitysets: you can have catalogs, catalogs have schemas, schemas have tables, views. The example you give above therefore doesn't give problems in that case because they're not in the same group, so an entity mapped on catalog1.schema1.customer knows it's not mapped on catalog2.schema4.customer

Some db's don't have catalogs, just schemas, but you can use a default name for 'catalog', and also if they don't support schemas, a default name for the schema.

The beauty of this is that you can immediately support multi-catalog projects as well, or multi-schema on oracle. This is how we do it in LLBLGen Pro and it works for oracle, sqlserver etc. and we can migrate a mapping layer just fine (including inheritance) to multiple catalogs being refreshed, simply because the mappings don't use cooked up names, but actually refer to elements in the meta-data which have unique names.

What I also find a little odd is that no-one at the EF team apparently has found it necessary to add an option to this refresher to make it the choice of the user if the model refresh logic will add new entities or update entities automatically. People who work from the DB want just 1 thing: refresh the catalog, regen the code and recompile their project so they can move on. Ideally in a one-shot way, like pushing a single button or run a single command line tool.

Because, a developer working on a project isn't interested in doing everything manually, he has a deadline to catch: if a tool can fix things, all the better, and if the tool has to make a choice it can make: pop up a dialog or enlist the tasks to perform (e.g. two tables are renamed, both have the same field layout etc.)

Unless you get these kind of things solved, the EF designer will never be what it could have been and as the EF is presented to the developer through its designer and usage patterns, it will be a tough battle to sell this to the average developer who doesn't want to waste time clicking in tiny windows to fix errors which could have been solved for him  by the tool he uses, and definitely the manager of these developers doesn't want his / her team members to waste time on things like that.

But of course, your council of 5 clever people will have told you all about these kind of things by now, I'm sure ;).

# VB said on July 29, 2008 2:29 PM:

When I select "Update Model from DB" it shows me the pop-up window with the "Add" and "Update" options for tables, views and SPs. When I select "Update" it throws a message:

"You cannot select or deselect....."

Hence I end up with all my DB tables again in my EDMX. If I want to update and get only 1 or 2 DB tables, I still end up getting all the tables again.

Is there any way out?????

# JB said on July 31, 2008 2:15 PM:

I'd just like to say thanks for the EF. I know the concept has been around for awhile, but it's nice to finally not have to use custom tools, etc.

I'd also like to point out that companies that want to compete, as well as developers that don't want to be left behind on scalable technologies, would see a great benefit in having these collaborative tools. And they don't share the same opinion as the board troll Frans Bouma.

Regards,

# Kosher said on August 2, 2008 5:22 AM:

The column and table identity issues were resolved with datasets in the past by using annotated columns and tables.  But I'm sure you knew that.  I like the naming (1,2,3) etc... if the name exists.  Go with it =)

I hope these "entities" are modeled in the designer like datasets but with the added benefit that each "entity" is stored in a separate class file.  I'm not sure if this is totally related to your issue with identities but it could provide a way to use partial classes to add a feature where two tables with colliding names are combined with partial classes.  That old concept of a master schema for relating entities across catalogs, ya know?  Either that or you simply allow the user to specify an "intermediate" class name that pulls the schema from both tables and allows the users to specify which table has "master" status.  Of course in this intermediate table, the users might also need to annotate columns to resolve name collisions.

The DataTableAdapter is what this "Update Model" feature sounds like to me.  When the underlying database model changed in the past, I would run through the wizard and choose the tables/columns that I wanted to append.  That shouldn't change much.

I am new to this whole "Entity Framework" stuff.  I am interested in how it addresses serialization?  The old days of passing a datatable (entity) schema (with data) over a web service and then having the ability to consume the web service with visual studio's nice web service wizard made life good.

# Mathias said on August 15, 2008 9:09 AM:

@JB, don't think that everybody is a board troll if he doesn't pray for EF...

I also have to use EF and am searching for answers to simple questions... and at this point I'm hitting the first limitations in implementing all the patterns given by our architect.

And we are trying to use the EF in a domain-driven way, not starting from an existing database...

but it seems to fail...

# Günter Zöchbauer said on September 1, 2008 3:32 PM:

IMHO it's mandatory to let the developer decide if the C-Side should be modified - even before the smallest change you can think of.

It's horrible to correct or remove all automatically made additions manually after each update.

Anyway, I don't understand why MS didn't ship a designer optimized for forward engineering before playing with reverse engineering.

This would be stupid simple to develop and would do the job for a vast majority of developers.

Reverse engineering is fine, but if it doesn't work fluently it's better to not use it at all or maybe just for a first import before customizing the C-Side.

The missing designer for the S-Side is a burden as well.

If I don't want to use update schema from DB due to its destructiveness (yes, destructiveness, even if it only adds things and doesn't remove anything) the S-Side has to be maintained by hand.

But the EDMX-Syntax is so verbose that it's hard to maintain using the XML-editor.

My conclusion:

The designer is absolutely useless as it is currently and maintenance of the EDMX by hand is a pain (not least due to the combination of C-Side, S-Side and CSM in one file).

So, another one or two years of waiting for a possibly usable ORM from MS.

# Fred Morrison said on December 11, 2008 3:52 AM:

Did you ever publish Part 3 of 3? Please post a link with either the article or the reason the last part has not yet been published.
