(with apologies to Billy Joel <grin>)
Last summer I wrote a few articles for MSDN Magazine about N-Tier patterns with the Entity Framework (Entity Framework: Anti-Patterns To Avoid In N-Tier Applications, Entity Framework: N-Tier Application Patterns, and Building N-Tier Apps with EF4). One of the most fun parts of the project has been the emails I’ve gotten from readers with follow-up questions and interesting discussions. Today I responded to one of those messages, and after writing up the email I thought the discussion might be of interest to more than just that one reader, so I’m repurposing the content here. Further thoughts and discussion are welcome.
The message was specifically about Anti-Pattern #3: Mishandled Concurrency from the first article. Essentially, the question was how to handle concurrency given the design principle that a good service should not assume it can trust its clients. The contention was that, in order to avoid trusting the client, the service should do one of two things: either re-load the data before doing an update, so it can check what really changed and validate consistency with business rules, or digitally sign the original version of the entity, send it to the client, and verify the signature when the result comes back before relying on that data.
Here’s my response:
I agree that trust is a significant issue, but I'm not sure that your two options are the only ones or even that they are the preferable ones in many situations. The critical questions, I think, are: What are we trying to protect? And what kinds of things are we protecting from?
First off, we have the issue of whether or not the client is who we think it is (a question of authentication). If someone with evil intent can call the service while pretending to be someone else, then the service might allow that person to accomplish a task that should not be allowed. That would of course be a major issue, but it's a concern of other parts of the solution, not what we're talking about when we look at concurrency. If a request is modified between the client and the service (a man-in-the-middle attack or something like that), we have the same kind of problem, but again that's less a matter for the persistence or service implementation part of the application than for the messaging substrate: WCF or the like.
So more relevant to this case we have the issue of whether or not the operation being requested by the client should be allowed (authorization / validation). It seems to me that this has three parts:
1) Is the client allowed to change a particular value or not? Most of the time this kind of check is relatively simple / static, and unless I'm missing something it's never really affected by a question of concurrency. In any case we're not really looking at whether or not the client has really changed something compared to what's in the database--it's just a matter of whether the client is allowed to change the values at all.
2) Do the requested changes validate? Are they self-consistent? Do they violate any of our business rules? Again, this is just a matter of whether or not the request makes sense. It's not really a matter of whether the client is lying about what it did or didn't change. The exception is a business rule that depends on the original value, something like "the client is allowed to increase the value of their insurance coverage by up to 5% but no more than that." If that were the case, then of course we would not be able to trust the original value of the insurance coverage sent by the client. We would need to re-query the database, or round-trip signed data, or something similar. But this kind of rule is much less common, and again, the question is not one of concurrency so much as it is about reference data. (I'll also point out that for this kind of business rule we'd probably want the limit to apply over some particular time period rather than per request, or else we'd be open to other kinds of attacks, such as a client simply making many small requests.)
3) What about concurrency? Finally we get to the heart of the matter. Can we trust the client to supply the correct original value for the concurrency token? What if an evil or buggy client gets past the authentication checks and then calls the service with a concurrency token that does not match the original value it was sent? If the value is modified, there are two possibilities. Either the value is some random thing that doesn't match what's in the database, in which case the request will fail, or someone else has modified the database in the meantime and the value happens to match the new concurrency token, so the update goes through when it shouldn't have. (Keep in mind that the EF uses the concurrency token when it attempts to update the database, so the token is checked against the current value in the DB before any changes go through. Because that check is part of the update statement, it is simply more efficient than re-loading the data first: it doesn't require an extra round trip.) The second possibility is the only case we really have to worry about, but we do have to keep it in perspective. We're talking about a request that passed the other checks. It's something the client should be allowed to do, just not if someone else has modified the data between when it was read and when the update went through. Further, the client has to correctly anticipate the next value of the concurrency token as well as how many times it has been updated, etc.
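To make the "check is part of the update statement" point concrete, here is a minimal sketch in Python with SQLite rather than the EF itself. The `policy` table, the integer `version` column standing in for the concurrency token, and the `update_coverage` helper are all assumptions made up for this illustration; a real EF model would typically use something like a rowversion column and let the framework generate the SQL.

```python
import sqlite3


def update_coverage(conn, policy_id, new_coverage, original_version):
    """Apply the update only if the row still has the version the client read.

    The concurrency token sent by the client goes into the WHERE clause, so
    the check and the write happen in one statement with no extra query.
    (Hypothetical helper; table and column names are assumptions.)
    """
    cur = conn.execute(
        "UPDATE policy SET coverage = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_coverage, policy_id, original_version),
    )
    conn.commit()
    # rowcount is 0 when the token did not match: report a concurrency conflict.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE policy (id INTEGER PRIMARY KEY, coverage REAL, version INTEGER)"
)
conn.execute("INSERT INTO policy VALUES (1, 1000.0, 1)")
conn.commit()

# A client that read version 1 updates successfully; the token becomes 2.
first = update_coverage(conn, 1, 1020.0, original_version=1)
# A second client still holding version 1 is now stale and is rejected.
stale = update_coverage(conn, 1, 1050.0, original_version=1)
# A client sending a random token is rejected the same way.
bogus = update_coverage(conn, 1, 9999.0, original_version=42)

final_row = conn.execute(
    "SELECT coverage, version FROM policy WHERE id = 1"
).fetchone()
```

In this sketch the only way for a stale or tampered request to succeed is for the client to send a token equal to the current `version`, which is exactly the narrow case discussed above.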
Yes, I can imagine some cases where that last case could cause a problem: if my data is very sensitive, or if my service is exposed on the internet instead of an intranet and I allow public access, or something like that. Someone might write a client that randomly tries things as a vandalism or denial-of-service type thing. In that case I might go to the length of signing the things sent to the client, or of caching the original values so that I can use them instead of what I get back from the client. But I'd say that in the majority of cases exploiting this kind of concurrency issue isn't realistic, and trusting the original value of the concurrency token sent from the client is a good, pragmatic approach.
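For the cases where signing is worth the trouble, the idea might look roughly like the following Python sketch. It uses an HMAC (a keyed message authentication code) rather than a public-key digital signature, which is enough when the service is both the signer and the verifier. The key, the field names, and the `sign_original`, `verify_original`, and `validate_request` helpers are all hypothetical, and the 5% check is the per-request form of the insurance rule from earlier, which as noted would probably need a time window in practice.

```python
import hashlib
import hmac
import json

# Hypothetical key for the sketch; a real service would load it from secure
# configuration and never send it to clients.
SERVICE_KEY = b"demo-key-known-only-to-the-service"


def _canonical(original):
    # Serialize deterministically so the same values always sign the same way.
    return json.dumps(original, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_original(original):
    """Compute the tag the service sends to the client with the original values."""
    return hmac.new(SERVICE_KEY, _canonical(original), hashlib.sha256).hexdigest()


def verify_original(original, tag):
    """Check that the original values coming back are the ones the service sent."""
    return hmac.compare_digest(sign_original(original), tag)


def validate_request(original, tag, new_coverage, max_increase=0.05):
    """Reject tampered originals, then apply the 5% rule to the trusted value."""
    if not verify_original(original, tag):
        return False
    return new_coverage <= original["coverage"] * (1 + max_increase)


original = {"id": 1, "coverage": 1000.0, "version": 1}
tag = sign_original(original)

within_limit = validate_request(original, tag, 1040.0)
over_limit = validate_request(original, tag, 1100.0)

# A client that inflates the "original" coverage cannot produce a valid tag.
tampered = dict(original, coverage=2000.0)
forged = validate_request(tampered, tag, 2050.0)
```

The trade-off is the one described above: the service avoids a re-query and does not have to cache original values, at the cost of larger messages and key management, so it seems worth it only when the threat is realistic.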
P.S. Happy New Year! May we all have a blessed 2010.
Danny.. great stuff. Catching up on your n-tier EF blog posts, and it's great to see when an MS dude employs a mix of effective & pragmatic (real-world) solutions grounded in solid architectural fundamentals (n-tier fundamentals in this case) without overselling the technology.
Keep up the great work, it's a great service for all the architects & leads trying to design, build & deploy this stuff in real-world enterprises. You may want to consider writing an advanced EF / n-tier book for architects (if you haven't already) -- it's sorely needed.