Welcome to MSDN Blogs Sign in | Join | Help

Sometimes you run into a bug in DataSet, or some other product. At this point, you have a couple of choices. You can work around the bug, or you can call CSS/PSS. Most people take the former route unless they can't figure out a workaround. When you call PSS/CSS, you sometimes get a patch (hotfix), either one that exists, or a new one produced to fix the issue that you have.

Wouldn't it be nice if you could cut out the middle step, and just download the patch? Now you can... See Soma's post for more information.

http://blogs.msdn.com/somasegar/archive/2008/04/23/hotfix-access-on-code-gallery.aspx

So if you are running into a bug, take a look and see if there isn't a hotfix that might help. Let me know if you find patches that have worked, or if the search was fruitless.

Erick

I'm currently down in sunny Orlando, getting ready for Tech Ed (or is it TechEd? That dot between letters is a very odd thing). I'm going to be presenting on LINQ to DataSet, and Understanding the Entity Framework. As usual, I have far too much material for the time that I have. So I ask you, what aspects of LINQ to DataSet or the Entity Framework would you like me to focus on? Are there particular areas that are confusing, or that deserve attention?

As well as the sessions, I will also be doing a couple of TEO panels, which will be a new experience for me. The topics are LINQ in the UI Layer, and VB XML Literals for C# developers. Both panels should be great fun, and I'm told that they are highly interactive, so come with your questions!

If you see me there, come by and say hi. You'll know me by my broken foot, which I got fighting ninjas.

Thanks,
Erick

Index usage

One of the features of the DataView that often gets overlooked is its ability to very quickly find a DataRow (or multiple DataRows) by an index value.  The index is built on the sort criteria, so if you sort based on the last name column, the index will also be on the last name. In the 2.0 version of the DataSet, this was restricted to a column value. Now, with LinqDataView, this can be the result of a functional evaluation, that is, anything at all, which really opens up new possibilities.

Why an Index?

The nature of an index means that we can use it to find specific rows very quickly, based on the key of the index, or the value on which the sort occurs. We can see how this works with LinqDataView by performing a number of searches. Let’s say that we retrieve and cache a DataSet that contains our customer list, and we need to find the CustomerID from the Account number. The quick and dirty way to do this would be to have a method like the following.

private int FindCustomerID(string accountNum)

{

    var query = from cust in adventureWorksDS1.Customer

                where cust.AccountNumber == accountNum

                select cust;

 

    if (query.Count() > 0)

    {

        return query.First().CustomerID;

    }

    else

    {

        throw new Exception("Account Number not found!");

    }

}

This method works fine, and it might be ok if your load isn’t too high, and the results arrive quickly enough. However, you are repeatedly doing the same search so you can speed things up quite a bit by using an index. I am going to refer to the DataView as an index, but be aware that it’s not a special DataView, it’s just a regular old DataView (or LinqDataView).

static DataView index;

private int FindCustomerID(string accountNum)

{

    if (index == null)

    {

        var query = from cust in adventureWorksDS1.Customer

                    orderby cust.AccountNumber

                    select cust;

        index = query.AsDataView();

    }

 

 

    if (index.FindRows(accountNum).Count() > 0)

    {

        return index.FindRows(accountNum).First().Row.Field<int>("CustomerID");

    }

    else

    {

        throw new Exception("Account Number not found!");

    }

}

While this code is a lot longer, it will perform quite a bit better. Without an index, finding a particular row value will cost O(N) (as we might have to go through every DataRow) With the index, the cost to find the DataRow with the matching keys is O(log N). This is a huge gain, especially if your DataSet is fairly large.

How high is too high?

When should you use an index, as opposed to a much simpler query? Like most important questions in life – it depends. You will need to measure, and see if your code will speed up. For this simple example, the gains would likely be very small, and would likely not benefit from an index. However, if your sort criteria was something complex (long running calculations, etc), then the benefit shows up very quickly.

If you’d like to see more on this topic, please let me know. Stress and performance are complex beasts, and the usage of DataSet can have a big impact on your overall application.

Thanks,
Erick

 

Let's say you want to have a DataSet with an Integer column, and this column can have no values, which for argument's sake we'll call NULL values (there is a whole argument about NULL really means, but we'll punt on that). The DataSet has been around since the first version of the .NET Framework, and this problem has always existed, as has it's solution, DBNull. This works ok, but leads to a bunch of ugly code like the following.

int value = -1; // or some const value that represents null 

if (dataRow.IsNull("myIntColumn") == false)

{

    value = (int)dataRow["myIntColumn"];

}

Fast forward to the current day. We've had nullable types for a couple of years now, so you can now write the following code.

int? value = null;

if (dataRow.IsNull("myIntColumn") == false)

{

    value = (int?)dataRow["myIntColumn"];

}

While we have nullable Integers, we’re still left with a bunch of ugly code that we shouldn’t need to write. However, if you are using Visual Studio 2008, you do have a new option, the Field<T> method that lives in the new System.Data.DataSetExtensions assembly. With this method, you can specify the type, and for nullable types, we handle the null conversions for you.

int? value = dataRow.Field<int?>("myIntColumn");

Does that mean we’re done, and we don’t need true nullable columns? I don’t think so, but we just finished Visual Studio 2008, so please stay tuned, and if this is an important feature to you, please let me know.

 

It's been a while - but don't worry, I didn't drop off the face of the planet, or get eaten by huge worms. In the time since my last post, I've taken on some new responsibilities, namely a number of XML technologies, including those that are shipping in Silverlight 2! So look forward to new posts on this exciting new technology as well as on DataSet and the Entity Framework.

Thanks,
Erick

If you've been playing with LINQ to DataSet, you may have noticed the EnumerableRowCollection type. For example, if you have a query like the following, the type of query is EnumerableRowCollection.

var query = from row in ds.Tables[0].AsEnumerable()

            where row.Field<string>("name") == "Bob"

            select row;

So what is this type and what is it used for?

As you know, the return type for any particular call must exist. While LINQ works with IEnumerable<T>, there still needs to be an implementation type. EnumerableRowCollection is that type. The EnumerableRowCollection type has always existed, but you would never have cause to see it before Beta2. What would happen is that the AsEnumerable call would return am EnumerableRowCollection (it used to have a different name – EnumerableDataTable, but that was just a refactor). Once you made your first LINQ call, that type would be “lost”. That is, the return type from the LINQ call is IEnumerable<T>, so the actual implementation class is no longer part of the signature.

In order for LinqDataView to work without using IQueryable (which we didn’t want to do for performance reasons), we have to hold on to the implementation class. As long as are working with a EnumerableRowCollection instance instead of a “something” which implements IEnumerable we can capture each LINQ operator into a state bag which allows us to recreate the effects of the LINQ query in a DataView. Once you call a LINQ operator that we can’t support in LinqDataView (e.g., Group By), the return type is then IEnumerable, and all further calls execute in the LINQ library, not LINQ to DataSet, and the return type will then show up as IEnumerable. If there is interest I can go into further detail.

This is something that doesn’t really have an impact on the usage of LINQ to DataSet, but knowledge is never a bad thing.

Thanks,
Erick

LINQ To DataSet Data Binding – LinqDataView restriction joins

In previous episodes

In the last post I made on LINQ To DataSet Binding, I talked about how to create a LinqDataView. I’m going to spend some time today talking about some of the interesting applications that the LinqDataView allows. This is far from a comprehensive list, and I’m sure that people will even more creative ways to use LinqDataView in the future!

Our old friend, the AdventureWorks Schema

All the examples that I’m working with today are based on the AdventureWorks database, in particular, the SalesOrderHeader, SalesOrderDetail and Customer tables. The examples use a Typed DataSet with this structure, but all these techniques are equally usable with both Typed and untyped Datasets.

Join – where art thou?

One of the most common requests for the DataView is to allow joins, yet the LinqDataView doesn’t allow the Join operator, so what gives? We wanted the LinqDataView to usable in all the same places and ways as the original DataView, and this required that we maintain the same capabilities. However, all is not lost. The most common reason for needed a join is to perform a restriction join, which is something that the LinqDataView can easily do. What do I mean by a restriction join?

There are a lot of situations where the joined table represents a lookup table, for example US states. A restriction join is where you need to filter the primary based on the values of a joined table. In this case, the US states. An example from our schema is Orders and Order Details. You may want to filter Orders based on some query against the Order Details, say when a particular quantity in one of the associated Order Details is over some threshold value.

Because the LinqDataView uses lambda methods (which eventually become delegates) for the Where clause, we can do whatever we want to in the delegate, including the join!

Parent Joins

The ability to have any code in the Where clause can be used in combination with the GetParentRow method (including the Typed DataSet versions, which are simply the name of the parent name) to perform a restriction join. For the DataSet that we are using, you might need to find all the orders for a particular customer. This can be done by doing a query on the SalesOrderHeader table, with a parent restriction join to the Customer table. Such a query might look like the following.

var query = from order in adventureWorksDS1.SalesOrderHeader

            where order.CustomerRow.AccountNumber == "AW00000015"

            orderby order.OrderDate descending

            select order;

 

DataView view = query.AsDataView();

 

Child Joins

Sometimes you need to restrict a table by values in a child table. This is something that was quite difficult to do in the past, but with LINQ to DataSet it becomes much easier. You can put anything in the Where clause, including another query! This, in combination with the GetChildRows method (including the Typed DataSet versions, which is the name of the child DataTable), allows you to write powerful child restriction joins.

There is one little wrinkle. The GetChildRows methods return an array of DataRows from the child table. This is where some of the more exotic LINQ operators really come in handy. For example, the Any operator takes a lambda predicate, and returns true if any of the source elements match. To use Any (and most of the other operators I’ll be talking about) you need to use the lambda syntax – you can’t use the SQL like LINQ queries that you might be used to.

Say we want to find all the Orders with an Order Detail that has a quantity greater than 20, perhaps to send them an entire package or pallet. This query would look like the following

var query = from order in adventureWorksDS1.SalesOrderHeader

            where order.GetSalesOrderDetailRows().Any(row => row.OrderQty > 20)

            orderby order.OrderDate descending

            select order;

 

DataView view = query.AsDataView();

If you don’t mind writing a sub-query, you can convert the above query into the following.

var query = from order in adventureWorksDS1.SalesOrderHeader

            where (from od in order.GetSalesOrderDetailRows()

                   where od.OrderQty > 20

                   select od).Count > 0

            orderby order.OrderDate descending

            select order;

 

DataView view = query.AsDataView();

 

There are a number of other operators that really come in handy here. These operators include Any, All and Contains. These are all operators that work over a list (list the Child rows of a relation) and return a Boolean that can be used as the criteria for the join.

Summary

I’ve shown a few different ways that LinqDataView can be used, and some of the scenarios where it makes sense. It’s a powerful way to get data binding into your application, and is especially useful where your application is already based around DataViews.

As you work with LinqDataView, you might find additional scenarios that it works well for. I would love to hear about these – a creative use of technology is something I love to hear about.

In the next post, I’m going to talk about another data binding option with LINQ to DataSet – CopyToDataTable. There will also be a post on fast index usage in the mix somewhere.

BTW, I’d love to hear what parts of these posts are the most useful for you.

Thanks,
Erick

 

It's long overdue, but I finally was able to finish a few more posts on this series. Let me know if you find it helpful, and what you would like to see more of! 

Why DataView?

As mentioned in this previous post, there are numerous options for data binding with LINQ to DataSet. The post is going to talk about what I feel is the most powerful and flexible option, LinqDataView.

LinqDataView?

While I will be talking about LinqDataView as a new object, it is in fact the same old DataView that we know and love. The difference is how the DataView is created – so to keep confusion to a minimum, I will talk about DataView and LinqDataView. However, please be aware that if you look for a LinqDataView object, you will not be able to find it! It is an internal-only class that inherits from DataView that is not directly creatable by you, which is OK as you create it by way of a LINQ to DataSet query. Once you create the LinqDataView, you work with it the same way as you would a DataView, because that is what it is.

DataView – Features and benefits

What are some of the advantages of using the LinqDataView? To answer that, we need to understand some of the benefits of the DataView, as the LinqDataView is simply that, a new type of DataView.

Live

The DataView is a filtered and/or sorted version of your live data. If you update the DataTable on which the DataView is based, the changes will appear in the DataView. The same goes if you modify the data in the DataView – the underlying DataTable will be updated.

Sortable

The DataView can be sorted, and any changes to the data, or the sort, are immediately applied to the DataView. You can either specify the columns used to sort, or for LinqDataView, use an expression for the sorting.

Filterable

The DataView can also be filtered, and the filtering also benefits from the live nature of the DataView. When the data changes such that certain row no longer should be included in the DataView, or if they now should, the DataView immediately reflects this. This filtering can occur via a string expression or when using LINQ to DataSet, a lambda method.

Fast

Finally, the DataView is a concrete representation of an index, which speeds up a number of operations against the DataView. I’ll go into more detail on these operations in a later post.

When to use DataView

DataViews are generally used in one of two places. It is by no means restricted to these scenarios, but these are the main uses.

UI Binding

When a DataTable needs to be bound to UI elements, especially a rich control like a grid, this usually occurs via a DataView. When using a DataView for binding, you gain the ability to sort and filter (think of clicking on a column to sort by that column). Due to the two-way nature of the DataView, you are also able to make edits directly on the DataView and have those changes be reflected in the DataTable, and vice versa.

Fast lookup

A DataView is represented internally by an index, with the index key being the sort criteria. You can use this index to find DataRows within the DataView by using the Find methods. These methods allow you to leverage the index to increase performance of DataRow lookups.

DataView and how LinqDataView differs

The DataView has two key features, filtering and sorting. In versions previous to 3.5, these two features were configured by way of a string. You could specify the filter using the expression language, and specify the column for the sorting.

LinqDataView contains the same operations – filtering and sorting. The difference is that instead of a string based expression or a column name, you can write C# or VB code to do the filtering and sorting. This opens up a whole range of possibilities as to what you can do.

How to create a LinqDataView

To create a LinqDataView, you simply need to write a LINQ to DataSet query. Then, simply call AsDataView on your query, and you get a DataView!

var query = from order in adventureWorksDS1.SalesOrderHeader

            where order.AccountNumber.StartsWith("10-4030")

            orderby order.OrderDate descending

            select order;

DataView view = query.AsDataView();

 

However, there are some caveats. You can’t just write any LINQ to DataSet query. Because the LinqDataView is a DataView, you can only use it when the query has same structure as a DataView . What does that mean? That means you can filter and sort the query – any other LINQ operators will cause the query to be non-convertible to a DataView. The list of supported LINQ operators is shown below, along with any restrictions.

Where

No restrictions

Order By

No restrictions

Then  By

No restrictions

Cast

Can only cast to DataRow

Select

Can only be identity select

 

The wonders of languages

So how do you know when you can make a DataView representation of a query? Try it! If you can’t see the AsDataView method, then you’ve used a LINQ operator that isn’t supported. Generally, if you can call AsDataView, it should just work.

Where do we go from here?

Now that I’ve shown you how to create a LinqDataView, I’m going to spend some time talking about what you can do with it. My next post will talk about restriction joins and leveraging the index for fast lookups.

For an additional example of what you can do with the LinqDataView, check out my webcast on LINQ to DataSet, in particular the last section.

Have you ever wondered exactly what is going on inside of ADO.NET? Is Reflector the first thing you install after Visual Studio? Then you'll be interested in that fact that ScottGu just announced that the source for the .NET framework is going to be released. Not just released, but integreated with Visual Studio! Check out the announcement below.

http://weblogs.asp.net/scottgu/archive/2007/10/03/releasing-the-source-code-for-the-net-framework-libraries.aspx 

I know I've been remiss in posting the remainder of my data binding blog series, but rest assured that it's on the way. PASS and some personal issues kept me extremely busy over the last few weeks, but I'm back in the saddle for the posts. :)

Erick

I'm currently in Denver for the SQL Server PASS conference. I'll be giving a talk about the Entity Framework for DBAs tomorrow (Wednesday) afternoon. If you're there, please come by. Even if you can't make my session, stop by and say hello!

Thanks,
Erick

With the introduction of LINQ to DataSet there finally exists a full featured query language for the DataSet. Now your ability to query your data is limited only by CLR, which is no small thing! For an introduction on LINQ to DataSet, please see this post.

This is great new functionality, but for most applications, query is only half of the story. Once you have sliced and diced your data into the form that you want, you need to do something with this data. This might be something as simple as spitting it out to the console, or as complex as binding it to a read/write hierarchical DataGrid. With this wide range of needs comes a wide range of capabilities.

This blog post will talk about how to do data binding with LINQ to DataSet from a high level perspective. Future posts will go into more details on each particular technology, and how to use it in a variety of situations.

Basic data binding

If you simply bind the results of a LINQ to DataSet query to a grid (or some other control), you will see the values from your query results. However, the results will be simple values – you will not get the benefits of the DataSet, such as change tracking, transparent updating when the DataSet is updated, etc. Some features may work some of the time, but your experience will be less than stellar.

Why is this? When a LINQ to DataSet query is executed, the result is an IEnumerable<T>. How data binding will work depends on the type of the enumeration (the T). There are a number of possibilities, each with its own problem. If the query is a projection of something, such as anonymous types, then the results are “torn off” from the source DataTable. This means that when you edit the values, then source never gets updated –in essence no data binding is occurring. If the projection is a DataRow, then you will get tracking for those DataRows, but because you are not bound to the DataTable, you end up not being able to add/delete DataRows!

Using your query results with DataSet – three options

If basic binding does not always work, what are the other options?

DataView

There already exists a way to do data binding on the DataSet – the DataView.  The DataView provides a dynamic view of the data in your DataTable, with a filter and sort order. In versions prior to Visual Studio 2008, this filter used the string expression language, and the sorting was performed on a column. For details on this functionality, see the DataView documentation.

With Visual Studio 2008, you can now use the power of LINQ to DataSet to create more expressive filters and sorts with your DataView, and continue to get all the benefits of the DataView.

var query = from order in adventureWorksDS1.SalesOrderHeader

            where order.IsCreditCardIDNull() == false

            orderby order.OrderDate descending

            select order;

 

dataGridView1.DataSource = query.AsDataView();

However, this ability to create a DataView from a LINQ to DataSet query is not the end all be all of data binding. There are a strong set of restrictions as to what kind of queries can be converted into DataViews. I will go into these details, as well as ways to work around some of these restrictions, in the future LinqDataView post.

CopyToDataTable

LINQ has a whole host of operators that you can use to write queries. These queries can become quite complex, with grouping, projections, etc. Sometimes you need to bind to the results of these complex queries, but you still want the benefits of working with the DataSet. Features like change tracking are what makes the DataSet so useful. So if you want to use DataSets for binding, but have a more complex query than can be used with DataView, you can use CopyToDataTable.

var query1 = from product in adventureWorksDS1.Product

            where !product.IsColorNull() && product.Color == "Black"

            select product;

 

var query2 = from product in adventureWorksDS1.Product

             from subcat in adventureWorksDS1.ProductSubcategory

             where product.ProductSubcategoryID == subcat.ProductSubcategoryID

             where subcat.ProductCategoryRow.Name == "Bikes"

             select product;

 

var finalQuery = query1.Union(query2);

 

DataTable bindingTable = finalQuery.CopyToDataTable();

 

dataGridView1.DataSource = bindingTable;

CopyToDataTable’s only restriction is that you must project DataRows from the outer Select statement. This is because the way that it works is by loading the DataRows from your LINQ to DataSet into a DataTable. Once the query has been moved into the DataTable, you can bind it, update it, etc. In a future post, I’ll go over the exact scenarios where you want to use this approach.

BYOC

And finally, there is the old standby – Bring Your Own Code. Ultimately, a LINQ to DataSet query is an IEnumerable<T>, and so if you have an idea of how you need to map from a T into your DataTable, you can simply copy the results in!

var query = from employee in adventureWorksDS1.Employee

            where employee.BirthDate.DayOfYear == DateTime.Now.DayOfYear

            where employee.IsManagerIDNull() == false

            select new { ManagerEmail = employee.EmployeeRowParent.ContactRow.EmailAddress, employee.ContactRow.EmailAddress };

 

foreach (var result in query)

{

    bindingTable.Rows.Add(new object[] { result.ManagerEmail, result.EmailAddress });

}

There are some other neat things that can be done with a little bit of code, and I will show you some of them in a future post, including a nice little code snippet that will you to copy any IEnumerable<T> into a DataTable!

DataView

LINQ in general and LINQ to DataSet in particular are opening up a whole new world of query, giving you the ability to execute some really powerful queries. However, because they are so flexible, you lose some of the abilities to put together an end-to-end data binding story out the box. Part of the reason for this is that the flexibility of LINQ in query needs to be paired with flexibility in data binding.

This post gives you a little taste of how you can bind LINQ to DataSet query results, and I’ll be digging into more detail soon. Until then, please let me know what kind of questions you’d like to see answered, and I will do my best to answer them.

Thanks,

Erick

I was recently asked what a high level list of good starting points for the Entity Framework is. This is a tough question as there are a lot of areas to cover. I've put together a list that I think makes sense from my perspective, which is tilted towards gaining an understanding of the model and the relational mapping, as opposed to getting up and running quickly. Most of the links refer to other blog posts, as these are the resources that I am familiar with.

Understanding the model
The core the Entity Framework is the Entity Data Model. This is what makes the Entity Framework a data platform, as opposed to another ORM mapper. So it follows that the key to understanding the Entity Framework is to understand EDM. It's not always an easy model to understand, especially when you have to look at XML files (keep your eye out for a better way soon), but learning the raw does have benefits. A lot of the time the model will simply use relational ideas, but there is a lot more possible with this model. In particular, with the way that associations are defined, we can actually do some very things that require associations to be a first class idea. These two posts are a great place to start.  
Entity Data Model 101 : Part 1
Entity Data Model 101 : Part 2

Mapping and Metadata
The first version of the Entity Framework is really a system by which the EDM can interact with a database. This is where the ORMness of the EF comes in. To understand how this works, another one of the 101 posts is a good place to start. However, read these as a introduction, not a bible, as with Beta 2 of the Entity Framework, a lot will be changing with what you can do.
Mapping 101 : Part 1
Mapping 101 : Part 2

Querying
Finally, in order to play with the bits, you need a way to query the EF. There are a couple of different ways that you can do this, but from a pure model understanding perspective, I would recommend starting with Entity Client and Entity SQL.

What's missing? 
From a intro course in the Entity Framework, this is where I would start. So what about Object Services, LINQ to Entities, and actually using this technology in an application? After all, not many people want to use Entity Client all the time, and one of the cool points of the EF is the ability to actually do OOP with a database. Does this mean that I recommend using Entity SQL over LINQ to Entities? Not at all; in fact I think that most people will use LINQ to Entities in combination with Object Services. However, I think that details and issues surrounding things like code generation and ObjectQueries are actually distractions when you are trying to understand the model and mapping. Remember, the EDM is the middle layer, between your object layer and your database, and so by looking at the EDM down you get a better view then if you have look from above the object layer.

Do you have a resource that you found was really good? If so, please let me know!

Thanks,
Erick

 

There are a host of new technologies coming out, and among them are some ORM type of systems. I want to spend some time exploring how a DBA will work with these, and if they are good or bad. Given that I am on the Entity Framework team, that is the place that I want to start.

 

 

Entity Framework for DBAs

There are two new data access technologies coming out of Microsoft in the coming year that have particular interest for DBAs: the Entity Framework and LINQ to SQL. Both of these technologies are going to radically change the access patterns that you see as a DBA, and the way that you work. What is the impact of these technologies on a DBA? Are they a step in the positive direction?

 

For this post, I am going to focus on the Entity Framework, and look at some of its features and capabilities as they relate to being a DBA. If you would like an introduction to the Entity Framework as a whole, this post is a great place to start.

 

Fear and Loathing

When most DBAs first hear about LINQ and the Entity Framework, the first reaction is revulsion. In a previous life, I wore a DBA hat, so my initial reaction was not too different. Why?

 

Take a look at a LINQ to Entities statement like the following.

 

var query = from Cust in Customers

            where Cust.LastName == “Smith”

            select Cust.ID;

 

Ok, that is not so bad, and not too different from something you might see in a stored procedure. Now, take a look at this.

 

var query = Customers.GroupBy(c => c.Region)

    .Select(g => new

                 {

                     Region = g.Key, FreightTotal = g

                    .SelectMany(c2 => c2.Orders)

                    .Sum(o => o.Freight)

                 });

 

What exactly is this going to do to your database? How many table scans are going to occur? It might be a great query, or it might bring your server to its knees. What happens when you need to normalize the Customers table for performance? What is it that we gain for this uncertainty?

 

The Journey Forward

The journey from embedded SQL to where we are today has been long and arduous. We have passed a lot of mileposts: T-SQL, views, stored procedures, etc. With each step, DBAs have been able to exercise more control over the database, both in terms of administration and in usage, while allowing the application to develop in its own way. This has allowed productivity to increase, and systems to become less fragile.

 

The introduction of the Entity Framework furthers this tradition, and has the benefit of giving the DBA more power and control over the system.

 

Control

In my mind, the key attribute of a DBA is that the power of control rests with them. The DBA is the guardian of performance, security and reliability. None of these can be achieved with the ability to control what goes on; what SQL is executed, what the database structure looks like, who can run what, etc.

 

The above LINQ query looks a lot like something you would write in T-SQL (odd syntax notwithstanding), but actually, you are writing to the conceptual model. What does this mean? This means that you now have more control over the database!

 

How does this work? Now, you do not need to care how ugly that LINQ query looks, as it all works against the logical model. So, you have more control over the database, because as long as you make sure that the logical model remains consistent, you can do whatever you need to do in order to make the system work!

 

Model

Database models are nothing new. In fact, they are very old. In the beginning, there was EF Codd, who brought to bear the power of relational algebra. There was also Chen, who created the Entity Relational Model (ERM). From these, the basic database model was born. But like most things in computer science, a little seasoning was in order. And nothing seasons like a well chosen abstraction.

 

The evolution of views

First there were joins, which allowed for normalization to occur. However, as databases became larger and more complex, queries started to look like the tax code. To help, another tasty abstract layer arrives: views. Views allowed DBAs to consolidate a set of joins into a logical “table”, and, with later versions of SQL Server, even allowed for some level of updating (if you were willing to write instead-of triggers).

 

By abstracting the object on which you query data from the way in which you store data, views gave DBAs more control and flexibility over the structure of the database. Tables could be changed, normalized, partitioned, and as long the compensating changes were made in the view, life was good, if a little boring with all the CRUD SQL that needed to be written.

 

Stored Procedures – an alternative evolution

Views are not the only way to abstract the structure of a database; another common approach is to use stored procedures. This gives the DBA the ultimate in flexibility. The entire API for the database can be explicitly defined, but it is flat, and you cannot compose them, so all that powerful SQL is no longer usable by the developer.

 

In addition, you still need to maintain all those nasty novel sized SQL statements, but at least the developer cannot see them!

 

The Entity Framework

With the Entity Framework (EF) and the Entity Data Model, the DBA/Architect now has a system that has, and expands on, the benefits of both views and stored procedures. This framework currently resides outside of the database, but it can be thought of as a part of the data layer.

 

Now, instead of having to hand manage views (including those oh-so-fun instead of triggers), the Entity Framework will do the heavy lifting for you. The Entity Framework contains some very intelligent view mapping technology, so if you can declare the mapping, you can update it!

 

Now, instead of having to write thousands of stored procedures to control the API of a database, you can create a logical model that becomes the API. And when you need the benefit of stored procedures for performance or logic, you can still plug them in.

 

Does this mean that the Entity Framework will replace views and stored procedures? NO!

 

Stored procedures and views are still very powerful and useful technologies. Performance and security are two key benefits of stored procedures and views, and there is no reason to throw the baby out with the bathwater! Myself, I would not give up indexed views for the world; those things are very cool.

 

Before the Entity Framework, the API between the database and the application was like a pond. Any time you touched anything, the mud would rise in “clouds”. Likewise, any change to the database might affect the API, and potentially cause things to break.

 

 Muddy API

 

After the Entity Framework, the API is now explicit. It can still use the same technologies and techniques on the database side, but it is not cloudy any more. It gives you the flexibility to change things, plus it allows for some cool benefits for the application developer, like ORM, LINQ, and Entity SQL.

 

 Clean API

 

So the Entity Framework lets you continue to use the technologies that you love (stored procedures, views, etc) and even those you have to tolerate (dynamic SQL). Except that now you don’t need to jump through hoops to support an implicit API for accessing the database; you can now worry about more important things.

 

We now live in the land of milk and honey, right? Remember, there is no such thing as a free lunch. So what do you give up? Some measure of query control. But the gain in database control and flexibility far outweighs the loss of query control, in my opinion.

 

The best use of your time

We now have a way to cleanly define a logical API for developers, one that does not require weeks and weeks of CRUD development, and a long involved process with each change. That alone is worth the price of admission!

 

The Entity Framework is going to you as a DBA the ability to make better use of your time, and to spend some on making the system better, instead of changing random SQL queries all day. Like any version 1 of a product, I am sure that the Entity Framework will cause some problems for DBAs, but the upside is huge, and that is why my DBA side has come around to seeing the benefit of the Entity Framework.

 

 

What do you think? Am I full of hot air? Something else? What are you concerns as a DBA about the Entity Framework? I'd like to use this blog as a way to have a conversation with you, to help make a better product.

 

Thanks,

 

Erick Thompson

Progra