Recently I shared this post which has some code that demonstrates how to create an object matching the original state of an entity tracked in the ObjectStateManager. While this is useful, it’s more interesting to create the entire original value graph. As I thought about how to explain that code, it seemed pretty obvious to me that in order to understand it, you need to have a fairly deep understanding of how the EF models relationships both in entity data classes and in the ObjectStateManager. So, this post is about those concepts.
People who work with me will tell you that I’m a very visual guy, and I’m practically unable to speak without a whiteboard marker in my hand. So, my first step in trying to explain this stuff was to create a diagram. I’m no award-winning illustrator, but maybe this will help get across some of the ideas:
The first piece of the overall puzzle is to realize that the EF conquers the problem of Object/Relational mapping by dividing it into two parts. One, EntityClient, handles transforming the shape of data between a schema designed for efficient storage and a different schema, or model, designed around the way we think about the problem when coding, but it does this shape transformation entirely in terms of values rather than objects. The other, Object Services, addresses differences between the value representation (generally DbDataRecords) and an object-oriented view of the data. The great thing about divide and conquer is that it simplifies the problem and avoids forcing every part of the code to understand or opt in to the whole solution. The challenge, though, is that something needs to bridge the gap between the parts. In the EF, the ObjectStateManager is that bridge: it lets the framework (and your code) reason about data in terms of objects when that makes sense and in terms of values when that’s more appropriate. Specifically, when it comes to relationships, the object representation is a collection or reference, while the value representation is a record with an EntityKey for each end of the relationship and metadata indicating which relationship the record represents.
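To make the two representations concrete, here is a minimal sketch in Python. It is a toy model, not the EF API: the EntityKey and RelationshipRecord classes, the single-integer key, and the relationship name "FK_Order_Customer" are all assumptions made up for illustration. It shows the same relationship once as a collection and a reference on objects, and once as flat records that carry only keys plus a relationship name.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class EntityKey:
    """Toy stand-in for an entity key: entity set name plus a single key value."""
    entity_set: str
    key_value: int


@dataclass(frozen=True)
class RelationshipRecord:
    """Value view of one relationship instance: metadata plus a key for each end."""
    relationship_name: str  # hypothetical name identifying the relationship
    end1: EntityKey
    end2: EntityKey


@dataclass(eq=False)  # identity equality, so the circular graph is never compared deeply
class Customer:
    key: EntityKey
    orders: List["Order"] = field(default_factory=list)  # object view: a collection


@dataclass(eq=False)
class Order:
    key: EntityKey
    customer: Optional[Customer] = None  # object view: a reference


def to_relationship_records(customer: Customer) -> List[RelationshipRecord]:
    """Flatten the object view of a customer's orders into key-only records."""
    return [
        RelationshipRecord("FK_Order_Customer", order.key, customer.key)
        for order in customer.orders
    ]


alice = Customer(EntityKey("Customers", 1))
for order_id in (10, 11):
    alice.orders.append(Order(EntityKey("Orders", order_id), customer=alice))

records = to_relationship_records(alice)
```

Notice that the records never hold an object, only keys. That property is what lets the value side describe a relationship even when one of the entities is not in memory.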
The second concept which is important to understand is how the entity classes enable code to sometimes work with relationships in terms of strong types known in advance and other times in a late-bound fashion. So, for example, an object model built around an order with order lines, a customer, a sales person, etc. would have a collection of order lines on the order as well as a reference from the order to the customer. This would work great for code implementing a business process specific to that object model, but a general-purpose component would not be written in terms of those types. It could be written using reflection, but that might be difficult, and it would likely come with significant perf issues. To address these problems the EF has the RelationshipManager, which gives late-bound access to all the relationships in which an entity participates, and the IRelatedEnd interface, which provides a common abstraction across collections and references for many scenarios.
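Here is a rough sketch of that idea, again as a toy model in Python rather than the real EF types. The class names echo RelationshipManager and IRelatedEnd, but every member shown (get_all_related_ends, target_role, and so on) and the relationship names are assumptions for illustration only. The point is that describe_relationships knows nothing about Order, Customer or OrderLine, yet can still walk every relationship of whatever entity it is handed.

```python
from typing import Dict, Iterator, List


class RelatedEnd:
    """Common abstraction over collections and references (toy analogue of IRelatedEnd)."""

    def __init__(self, relationship_name: str, target_role: str) -> None:
        self.relationship_name = relationship_name
        self.target_role = target_role

    def __iter__(self) -> Iterator[object]:
        raise NotImplementedError


class EntityCollection(RelatedEnd):
    """The 'many' side: holds any number of related entities."""

    def __init__(self, relationship_name: str, target_role: str) -> None:
        super().__init__(relationship_name, target_role)
        self.items: List[object] = []

    def __iter__(self) -> Iterator[object]:
        return iter(self.items)


class EntityReference(RelatedEnd):
    """The 'one' side: holds at most one related entity."""

    def __init__(self, relationship_name: str, target_role: str) -> None:
        super().__init__(relationship_name, target_role)
        self.value = None

    def __iter__(self) -> Iterator[object]:
        return iter(() if self.value is None else (self.value,))


class RelationshipManager:
    """Late-bound access to every related end of one entity."""

    def __init__(self, *ends: RelatedEnd) -> None:
        self._ends = list(ends)

    def get_all_related_ends(self) -> List[RelatedEnd]:
        return list(self._ends)


class Customer:
    def __init__(self, name: str) -> None:
        self.name = name


class OrderLine:
    def __init__(self, product: str) -> None:
        self.product = product


class Order:
    def __init__(self) -> None:
        self._lines = EntityCollection("Order_OrderLines", "OrderLines")
        self._customer = EntityReference("Order_Customer", "Customer")
        self.relationship_manager = RelationshipManager(self._lines, self._customer)

    # Strongly typed view, for code written against this specific object model.
    @property
    def order_lines(self) -> List[object]:
        return self._lines.items

    @property
    def customer(self):
        return self._customer.value

    @customer.setter
    def customer(self, value) -> None:
        self._customer.value = value


def describe_relationships(entity) -> Dict[str, int]:
    """General-purpose code: counts related entities per role without knowing the types."""
    ends = entity.relationship_manager.get_all_related_ends()
    return {end.target_role: len(list(end)) for end in ends}


order = Order()
order.customer = Customer("Alice")
order.order_lines.append(OrderLine("Widget"))
order.order_lines.append(OrderLine("Gadget"))
summary = describe_relationships(order)
```

Both views are backed by the same related-end objects, so a change made through the typed properties is immediately visible to the late-bound code, and no reflection is involved.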
The last key idea is the “stub” ObjectStateEntry, which is used in the state manager when the system is aware of a relationship but the entity for the other side of that relationship has not yet been loaded into memory. These cases primarily show up for EntityReferences because the system often needs to reason about relationships where the model requires that there be exactly one related entity; for example, the model might require that every order have a sales person. In this case, object services queries for an Order entity are automatically rewritten to also return the EntityKey of the related SalesPerson, because the update system needs to reason about the relationship between the Order and the SalesPerson in some scenarios, like when deleting the Order. The state manager will then contain an entry for the relationship record, which has the key of the Order and the key of the SalesPerson, as well as an entry which represents the SalesPerson but contains only its key, since the full entity is not yet in memory. This state entry which contains only an EntityKey is what we call a stub. Its corresponding representation in the object graph is a non-null value on the EntityKey property of the EntityReference while the reference itself is null.
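A small sketch of what the state manager holds after such a query might look like this. It is a toy model in Python under stated assumptions: StateEntry, RelationshipEntry, load_order, and the "Order_SalesPerson" name are all invented for illustration and do not mirror the real ObjectStateEntry API. Loading one Order produces three things: a full entry for the Order, a relationship entry holding both keys, and a stub entry for the SalesPerson.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class EntityKey:
    entity_set: str
    key_value: int


@dataclass
class StateEntry:
    """Toy entity entry; an entry with no entity attached is a stub."""
    key: EntityKey
    entity: Optional[object] = None

    @property
    def is_stub(self) -> bool:
        return self.entity is None


@dataclass(frozen=True)
class RelationshipEntry:
    """Toy relationship entry: just a name and the key for each end."""
    relationship_name: str
    end1: EntityKey
    end2: EntityKey


class EntityReference:
    """Object-graph side: the key can be known while the value is still None."""

    def __init__(self) -> None:
        self.entity_key: Optional[EntityKey] = None
        self.value: Optional[object] = None


class Order:
    def __init__(self, key: EntityKey) -> None:
        self.key = key
        self.sales_person_reference = EntityReference()


def load_order(entries: Dict[EntityKey, StateEntry],
               relationships: List[RelationshipEntry],
               order_key: EntityKey,
               sales_person_key: EntityKey) -> Order:
    """Simulate materializing an Order whose query also returned the SalesPerson key."""
    order = Order(order_key)
    order.sales_person_reference.entity_key = sales_person_key
    entries[order_key] = StateEntry(order_key, order)
    # Only create a stub if nothing is tracked under that key yet.
    entries.setdefault(sales_person_key, StateEntry(sales_person_key))
    relationships.append(
        RelationshipEntry("Order_SalesPerson", order_key, sales_person_key)
    )
    return order


entries: Dict[EntityKey, StateEntry] = {}
relationships: List[RelationshipEntry] = []
order = load_order(entries, relationships,
                   EntityKey("Orders", 10), EntityKey("SalesPeople", 7))
```

After the call, the SalesPerson entry reports is_stub as true, and the order's reference has its entity_key set while its value is still None, which is the object-graph shape described above.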
These stubs are not only important for the update system to do its work, but they also have an impact on the way the object graph functions, particularly in disconnected scenarios. If the ObjectStateManager contains a stub with a particular EntityKey, and the full entity which has that key is queried or attached to the state manager, then the stub entry is replaced with a full entry, and the object graph is fixed up to reflect the relationship. So in our example, if you were to receive an Order entity as well as a set of OrderLines from a remote location (maybe serialized over a web service) and if the OrderLines’ Order reference had its EntityKey property set, then as each OrderLine was attached, a relationship entry and a stub entry would also be created. Later, when the Order was attached, it would replace the stub and the graph would be fixed up so that you would effectively have serialized and remoted the graph even though you actually only sent a set of separate, shallow entities.
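The attach-then-fix-up sequence can be sketched as follows. This is a toy state manager in Python, not the EF implementation: ToyStateManager, its methods, and the use of None to mark a stub are all assumptions chosen to keep the example short. It attaches two shallow OrderLines first, which leaves a stub for the Order, and then attaches the Order itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class EntityKey:
    entity_set: str
    key_value: int


class Order:
    def __init__(self, key: EntityKey) -> None:
        self.key = key
        self.order_lines: List["OrderLine"] = []


class OrderLine:
    def __init__(self, key: EntityKey, order_key: EntityKey) -> None:
        self.key = key
        self.order_key = order_key           # plays the role of the reference's EntityKey
        self.order: Optional[Order] = None   # the reference itself, unset until fix-up


class ToyStateManager:
    def __init__(self) -> None:
        # EntityKey -> entity, or None when only a stub is tracked.
        self.entities: Dict[EntityKey, Optional[object]] = {}
        # (order line key, order key) pairs standing in for relationship entries.
        self.relationships: List[Tuple[EntityKey, EntityKey]] = []

    def attach_order_line(self, line: OrderLine) -> None:
        self.entities[line.key] = line
        self.relationships.append((line.key, line.order_key))
        if line.order_key not in self.entities:
            self.entities[line.order_key] = None   # create the stub
        target = self.entities[line.order_key]
        if target is not None:
            self._connect(line, target)            # order already tracked: fix up now

    def attach_order(self, order: Order) -> None:
        self.entities[order.key] = order           # replaces the stub, if there was one
        for line_key, order_key in self.relationships:
            if order_key == order.key:
                self._connect(self.entities[line_key], order)

    @staticmethod
    def _connect(line: OrderLine, order: Order) -> None:
        line.order = order
        if line not in order.order_lines:
            order.order_lines.append(line)


manager = ToyStateManager()
order_key = EntityKey("Orders", 10)
lines = [OrderLine(EntityKey("OrderLines", n), order_key) for n in (1, 2)]
for line in lines:
    manager.attach_order_line(line)

stub_before = manager.entities[order_key] is None   # only a stub so far

order = Order(order_key)
manager.attach_order(order)
```

Once attach_order runs, the stub is gone, every line's order reference points at the attached Order, and the Order's collection contains both lines, even though no graph was ever sent, only separate shallow entities carrying keys.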
So there you go, a nice, big picture and three important concepts to understand about how the EF manages relationships. Enjoy!