I no longer work at Microsoft, so please don't bother leaving a comment here or trying to contact me through my MSDN blog.
You can find my new blog at http://www.technologytoolbox.com/blog/jjameson. My new site also provides copies of all posts from my MSDN blog.
First off, my apologies to Paul Simon regarding the title of this blog post -- but I simply couldn't resist ;-)
When architecting and building solutions for customers, I tend to make heavy use of typed DataSets.
I believe I used them on my very first .NET project almost ten years ago, because I still clearly remember Mike Pizzo (one of the original architects on the ADO.NET team) explaining to us how we should think of DataSets as a "scratch pad" for disconnected data (as opposed to the old DAO, RDO, and ADO objects -- which thankfully are a thing of the past for most organizations). [Okay, I suppose that, technically, you could disconnect an ADO Recordset from the underlying database, but that's not how it was frequently used.]
The tooling for typed DataSets has always been very good (at least in my opinion). You can create a typed DataSet very quickly using the designer in Visual Studio and restructure an existing typed DataSet with ease. I find this is almost always faster than writing the equivalent C# code directly.
I also love the way typed DataSets can be somewhat complex -- in terms of the business rules they encapsulate -- yet at the same time very easy to understand for developers who are not familiar with your solution (e.g. new team members or developers responsible for maintaining the solution).
For example, consider the typed DataSet shown below.
What can you tell just by looking at the diagram? What are the business rules that are enforced simply by the structure of the tables, relationships, and columns?
What about the primary keys on the various tables?
If you were to right-click on the ScorecardId column in the ScorecardItem table in Visual Studio and then click Edit key..., you would see that there is a unique constraint on (ScorecardId, ClientSiteId, KpiName). In other words, each scorecard can only specify one scorecard item for a given site and KPI (e.g. we don't want to allow "Site1" to have two scorecard items that refer to "KeyPerformanceIndicator1").
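To make that rule concrete, here is a minimal sketch that uses a plain (untyped) DataTable as a stand-in for the designer-generated ScorecardItem table. The column types, the constraint name "UQ_ScorecardItem", and the TryAddItem helper are assumptions made for illustration; the real typed DataSet would generate its own strongly typed equivalents.

```csharp
using System;
using System.Data;

public static class ScorecardItemConstraintSketch
{
    // Builds an untyped stand-in for the ScorecardItem table.
    // Column types are assumed; the post only names the columns.
    public static DataTable CreateScorecardItemTable()
    {
        var table = new DataTable("ScorecardItem");
        DataColumn scorecardId = table.Columns.Add("ScorecardId", typeof(int));
        DataColumn clientSiteId = table.Columns.Add("ClientSiteId", typeof(int));
        DataColumn kpiName = table.Columns.Add("KpiName", typeof(string));

        // One scorecard item per (scorecard, site, KPI) combination.
        // The constraint name is hypothetical.
        table.Constraints.Add(new UniqueConstraint(
            "UQ_ScorecardItem",
            new[] { scorecardId, clientSiteId, kpiName }));

        return table;
    }

    // Hypothetical helper: returns false instead of throwing when the
    // unique constraint rejects the new row.
    public static bool TryAddItem(
        DataTable table, int scorecardId, int clientSiteId, string kpiName)
    {
        try
        {
            table.Rows.Add(scorecardId, clientSiteId, kpiName);
            return true;
        }
        catch (ConstraintException)
        {
            return false;
        }
    }

    public static void Main()
    {
        DataTable items = CreateScorecardItemTable();

        // First item for site 1 and this KPI is accepted.
        Console.WriteLine(TryAddItem(items, 1, 1, "KeyPerformanceIndicator1")); // True

        // Same scorecard, same site, same KPI: rejected by the constraint.
        Console.WriteLine(TryAddItem(items, 1, 1, "KeyPerformanceIndicator1")); // False

        // Same KPI on a different site is fine.
        Console.WriteLine(TryAddItem(items, 1, 2, "KeyPerformanceIndicator1")); // True
    }
}
```

The point of the sketch is that the rule lives in the structure of the table itself, so any code that adds rows gets the check for free.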
Also note that since a primary key on a table must be unique, the constraint on the KpiStatus table ensures that a scorecard item (i.e. a KPI) is only allowed to specify one KPI status for a particular time period. It just doesn't make sense that "KeyPerformanceIndicator1" could be both "Green" and "Red" for, say, the "2010 Q1" time period -- it has to be one or the other (or "Yellow", I suppose).
We can also see that there is a unique constraint on the ClientSiteName column in the ClientSite table. Here's where things get slightly more complicated. Suppose that we wanted to retrieve data from this DataSet by filtering on the ClientSiteName. In other words, we want to only show KPI information for a particular site. Seems reasonable, right?
What if two completely different clients each had a site named "Headquarters"? That could definitely be a problem, because we certainly wouldn't want to show one client's data to a different client.
However, in order for that to actually occur, we would first have to populate the DataSet with information from two different clients. Since that wasn't how I intended this DataSet to be used, I chose to add a unique constraint on the ClientSiteName column. If, at some later point in time, new scenarios were added and I needed to use this typed DataSet to store data from multiple clients, then I would probably add a Client table and adjust the relationships and constraints accordingly.
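As a rough sketch of that trade-off, the following uses an untyped DataTable in place of the ClientSite table. The ClientSiteId column, the column types, the constraint name, and the FindSitesByName helper are all assumptions for illustration. With the unique constraint in place, a filter on ClientSiteName can never match more than one row, and an attempt to load a second "Headquarters" fails right away instead of silently mixing data.

```csharp
using System;
using System.Data;

public static class ClientSiteFilterSketch
{
    // Untyped stand-in for the ClientSite table. ClientSiteId and the
    // column types are assumed; only ClientSiteName is named in the post.
    public static DataTable CreateClientSiteTable()
    {
        var table = new DataTable("ClientSite");
        table.Columns.Add("ClientSiteId", typeof(int));
        DataColumn name = table.Columns.Add("ClientSiteName", typeof(string));

        // Site names must be unique within the DataSet (hypothetical name).
        table.Constraints.Add(new UniqueConstraint("UQ_ClientSiteName", name));
        return table;
    }

    // Hypothetical helper: filters the table by site name.
    public static DataRow[] FindSitesByName(DataTable sites, string siteName)
    {
        // Double any single quotes so the name is safe inside the filter.
        string escaped = siteName.Replace("'", "''");
        return sites.Select("ClientSiteName = '" + escaped + "'");
    }

    public static void Main()
    {
        DataTable sites = CreateClientSiteTable();
        sites.Rows.Add(1, "Headquarters");
        sites.Rows.Add(2, "Site1");

        // The filter matches exactly one row.
        Console.WriteLine(FindSitesByName(sites, "Headquarters").Length); // 1

        try
        {
            // A second client's "Headquarters" is rejected up front.
            sites.Rows.Add(3, "Headquarters");
        }
        catch (ConstraintException)
        {
            Console.WriteLine("Duplicate site name rejected");
        }
    }
}
```

If the multi-client scenario ever did arise, this constraint is the piece that would need to change, most likely to a unique constraint that also includes a client identifier.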
Okay, so where exactly are we -- after all this verbiage?
The point I was hoping to make so far is that -- without looking at a single line of code or any documentation (which many developers don't like to read anyway) -- we know quite a bit about the solution, including some of the business rules, just by looking at the typed DataSet.
In my next post, I'll discuss some smart ways we can interact with a typed DataSet when actually building a solution.
[I know the content of this post is very elementary, but bear with me, I promise things will get more interesting in subsequent posts ;-) ]