Introduction

The LINQ framework offers a basic pattern of sequence operators, allowing for primitive behavior such as filtering, grouping, and joining over collections and types. The support of the high level query comprehension syntax in C# 3.0 is paramount to developing an application quickly and allows for transparent development practices for developers where querying an in-memory collection or a remote database works the same.

While LINQ to SQL provides necessary services for radically simplifying how developers write applications, it has some shortcomings.

Why use LINQ to SQL?

To provide a sustainable model for future support, an object model was created to match the business entities used by the application. Having dependencies on tables and views in our relational database, the objects required “mapping” from the relational world to the object world, which was achieved using LINQ to SQL.

LINQ to SQL provides a simple mapping between tables, columns, and relationships to classes, fields, and collection properties (for relationships) that allow for “ease of use” programming at the cost of composability. Composability was lost due to the nature of databases and the limitations of LINQ to SQL such that relational databases relate tables via foreign key relationships where individual entities need not concern themselves with relationships (such as Customers to Orders); however, when mapping tables to classes the relationships are defined in advance.

Further, the use of classes mapped to tables allow for database abstraction and suit the needs of rapid application development such that the framework ensures correct database create, read, update, and delete statements are issued after changes are made to objects through the use of change tracking mechanisms.

LINQ to SQL Architecture Limitations

LINQ to SQL does not have a good story with distribution and concurrency such that new application design patterns for “cloud computing” do not translate well. Entities can be detached, attached, and moved around various layers in applications but are not truly portable (Strahl, 2007) and are really only for single-tier or partitioned multitier-esque solutions.

Multi-threading

LINQ to SQL (or LINQ for that matter) does not natively support distributed parallel queries. LINQ queries are translated to SQL statements and objects are materialized when required from the data stream. By design the LINQ to SQL DataContext class is not thread-safe and was not meant to be shared, however, compiled queries are thread safe and can be used with separate data contexts.

Forward only (reference) data

It is not uncommon for applications to include varying amounts of reference data used to assist the user in making a decision. This can commonly be seen in contact card scenarios where address details are bound to combo boxes (such as states or provinces and countries or regions).

The general guidance for using LINQ to SQL is to use short lived data contexts, where the data context is discarded after each unit of work operation. The fundamental issue encountered with this approach is entities that have data stored in the database and are materialized cannot be attached or associated to a new data context and therefore must be loaded with each new data context. The approach is valid and introduces a trivial amount of overhead when loading small or moderate amounts of data. However, in several instances our application can have large amounts of data loaded and bound, producing a noticeable delay when loading records.

To work around the short data context lifetime an object pool was created to store the values loaded from the database. All classes that used the reference data were modified to exclude the specific 1:1 foreign key attributes (effectively making a simple property) for LINQ to SQL. The contents of the property retrieved the appropriate value from the object pool, based on the foreign key identity, during retrieval and likewise updated the foreign key value during a set operation.

Concurrency

Through the use of time stamps on rows in our relational database, the system makes use of optimistic concurrency. When LINQ to SQL detects a concurrency conflict, an exception is raised where the list of conflicting objects can be retrieved.

Issues with time stamps

While using time stamps an issue arose that caused significant issues with the application:

1. Multiple entities with time stamp properties are updated in a context

2. During change submission conflicts occur for some of the entities but not all

3. The raised change conflict exception is caught and the conflicts are resolved

4. Change submission is requested now that conflicts are resolved and an exception is raised indicating the generated time stamp has been updated

The fundamental issue is one buried inside the LINQ to SQL framework—after processing the first change submission the properties are updated on the entities that are not in conflict. When the exception is raised the time stamp property is not reset to its original (due to the failure on submission) and left in a modified state. (Parra, 2008)

There are multiple workarounds for this issue:

1. Do not use time stamps for concurrency or disable automatic synchronization. Automatic synchronization can be disabled for update scenarios (or disabled completely) by changing a parameter on the property’s attribute. If synchronization is disabled for update operations, the issue is avoided; however, the time stamp property is uncoordinated with the database (Kulkarni, 2008), (Parra, 2008). This can be resolved in several different ways:

a. Resolve the concurrency conflict by merging the entity. A concurrency conflict exception is raised on the next call to submit changes and can be handled by merging the changed values with the database values (effectively refreshing the time stamp while keeping the user changes).

b. Proactively refresh the entity after the update operation. Once the update operation has completed issue a refresh for the entity to coordinate the time stamp values (Parra, 2008).

c. Discard the DataContext that materialized the entity. After the changes submit successfully the data context can be disposed. Future queries for the entity are performed in another context and will contain the correct value from the database (Kulkarni, Lifetime of a LINQ to SQL DataContext, 2008).

2. Manually update the entity’s time stamps to their original values. After the conflict exception is caught and the entities in conflict are resolved, the entities that were also not in conflict must have their time stamp properties updated.

Refreshing entities

LINQ to SQL offers a succinct means to refresh materialized entities, either singularly or in a collection; however, some behavioral anomalies were uncovered that lead to issues with the application—specifically issues with “deep refresh” or refresh of related entities:

1. The refresh method only works for objects that have already been materialized. If additional foreign keys have been added to the database they are not loaded on a refresh call--membership is fixed once the related entities are loaded (Kulkarni, LINQ to SQL Tips 6: How to refresh given DataContext won't stomp on your objects, 2008).

2. To load additional foreign keys a new select statement must be issued. New entities are added to the collection during a new selection; however, existing entities are not updated due to LINQ to SQL object pooling. A refresh is required of the entities to ensure they are up to date (Kulkarni, LINQ to SQL Tips 9: Understanding DataContext's internal caching, 2008).

This became an issue in our application where we have a single entity that contains a dozen foreign key relationships. Performing a deep refresh (having refreshed all data from all related entities from the database) issues dozens of queries to the database (specifically it is the number of related entities multiplied by two).

Query Throughput and Scalability

Our application employs the use of several static queries to retrieve data in various portions of the application (examples: work in progress, recently completed work, etc.). Upon execution of the query the results are bound to a list where the user can interact with them. The fundamental issue we had was with large amounts of objects being materialized on query execution (greater than 500). The queries took large amounts of time to complete (due to materialization overhead) and either timed out or the clients ran out of memory. In instances where objects were successfully materialized the users would encounter out of memory errors due to a memory leak with the DataContext object after two to four hours of regular usage.

To resolve this issue the application used stored procedures to return the required data quickly and bind to the list box for interaction, invoking object materialization when the user navigated to the item for interaction. The resulting read only list did not provide all of the rich interaction features the entities provided (such as updating data for multiple items in the list when it is updated in a detail screen), however, the resulting query performance was much more acceptable and netted a big usability gain. The issue with memory leaking still persisted but under normal usage conditions was not noticeable.

Concluding Remarks

LINQ to SQL provides the ability to expose database objects as they exist in the relational world as .NET objects without layers of abstraction. Providing an easy mechanism to expose LOB data at the cost of some performance, developers must focus on balancing materialized objects and their lifetimes. Our LOB tools were able to effectively accomplish this by minimizing the amount of tracked entities materialized by leveraging stored procedures for bulk data retrieval and LINQ to SQL for specific data edits resulting in a seamless experience to the end user.

Works Cited

Kulkarni, D. (2008, April 27). Lifetime of a LINQ to SQL DataContext. Retrieved from Dinesh's Cyberstation: http://blogs.msdn.com/dinesh.kulkarni/archive/2008/04/27/lifetime-of-a-linq-to-sql-datacontext.aspx

Kulkarni, D. (2008, May 23). LINQ to SQL Tips 6: How to refresh given DataContext won't stomp on your objects. Retrieved from Dinesh's Cyberstation: http://blogs.msdn.com/dinesh.kulkarni/archive/2008/05/23/linq-to-sql-tips-6-how-to-refresh-given-datacontext-won-t-stomp-on-your-objects.aspx

Kulkarni, D. (2008, July 1). LINQ to SQL Tips 9: Understanding DataContext's internal caching. Retrieved from Dinesh's Cyberstation: http://blogs.msdn.com/dinesh.kulkarni/archive/2008/07/01/linq-to-sql-tips-9-understanding-datacontext-s-internal-caching.aspx

Parra, S. (2008, June 2). Feedback: Linq. Conflict resolving records with timestamps. Retrieved from Microsoft Connect: http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=348901

Strahl, R. (2007, August 15). LINQ to SQL and Disconnected Entities Follow-up. Retrieved December 7, 2007, from Rick Strahl's Web Log: http://www.west-wind.com/weblog/posts/135659.aspx

Technorati Tags: , ,