DLinq (Linq to SQL) Performance (Part 3)

 

I’d like to start with a little housekeeping. Some readers asked me how I made the nifty table in part 2 that showed the costs broken down by major area.

It was actually pretty easy to create that table using our profiler. I did 500 iterations of the test case in sampled mode, which gave me plenty of samples. I could see which callstacks ended in mscorjit.dll, even without symbols (which I never bothered using), and that gave me a good idea of how much jit time there was. I could see the data-fetching functions from the underlying provider being called, the same ones that appear in the no-linq version of the code (GetInt32, GetString and so forth), so I knew what the costs of actually getting the data were. I could see the path that creates the expression tree for the query and I could see stub dispatch functions. So I added up the related ones, broke them into 5 categories and then showed one more line for the bits that didn’t fit into any of those categories. Then I scaled the numbers up so that the part of the benchmark I cared about was 100% (there was other junk in my harness not relevant to the benchmark). That’s it :)
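To make the scaling step concrete, here is a minimal sketch of it. The class name, method name, and category labels are hypothetical, and any counts you feed it are your own; none of this reproduces the actual profile data. It takes the sample counts for the benchmark-only categories (harness samples already excluded) and rescales them so they sum to 100%.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProfileBreakdown
{
    // Converts raw sample counts per category into percentages of the
    // benchmark-only total. Harness samples are assumed to have been
    // dropped before this is called.
    public static Dictionary<string, double> ToPercentages(
        IDictionary<string, int> benchmarkSamples)
    {
        double total = benchmarkSamples.Values.Sum();
        if (total == 0)
            throw new ArgumentException("No samples to scale.", nameof(benchmarkSamples));

        return benchmarkSamples.ToDictionary(
            kv => kv.Key,
            kv => 100.0 * kv.Value / total);
    }
}
```

For example, made-up counts of 30 samples under "jit" and 10 under "data" come out as 75% and 25%.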

One more gotcha, though: when I talked to Matt about this we realized that I had reported the breakdown from an internal build that included one change he had made for me relative to the May 2006 CTP. The breakdown you would see if you did this experiment again, on a May 2006 CTP build, would have reflection costs instead of jitting costs. I'll discuss that more when I go into the specific improvements we made, which are coming soon...

But, meanwhile, let's move on to the real topic of Part 3…

Per Row Costs

In this area there were two things that I worried about.

Entity/Identity Management

The data binding problem used to be at the top of my list but entity creation costs started to worry me more. This query is sort of an example of a troublesome one:

var q = from o in nw.Orders
select o;

Doing foreach over the query above is much less efficient than doing it over the one below, and the astonishment factor for that might be pretty high.

var q = from o in nw.Orders
select new {o.everything …};

In general all these “read a bunch of data” queries are likely to suffer significant overhead because of temporary allocations and the mid-to-long life of the entities being enumerated. In the first formulation, the objects have to be stored because they might be modified as you foreach over them (or later). In the second formulation no object management is required because you’re newing up a synthetic object not associated with a table.
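The difference between the two formulations can be sketched with a toy model. This is an illustration under assumptions, not the DLinq implementation: TrackingContext, Projection, and the Order shape here are hypothetical names, and the only bookkeeping modeled is the "store every entity by key" part described above.

```csharp
using System;
using System.Collections.Generic;

public sealed class Order
{
    public int OrderId { get; set; }
    public string ShipCity { get; set; }
}

// Toy stand-in for the per-row bookkeeping of a tracking context.
public sealed class TrackingContext
{
    // Every entity handed out stays reachable from here, so it lives
    // as long as the context does.
    private readonly Dictionary<int, Order> identityMap = new Dictionary<int, Order>();

    public int TrackedCount => identityMap.Count;

    public Order Materialize(int id, string city)
    {
        // The same key must always yield the same instance.
        if (identityMap.TryGetValue(id, out Order existing))
            return existing;

        var entity = new Order { OrderId = id, ShipCity = city };
        identityMap.Add(id, entity);
        return entity;
    }
}

public static class Projection
{
    // Projection path: build a throwaway value and retain nothing.
    public static (int OrderId, string ShipCity) Materialize(int id, string city)
        => (id, city);
}
```

In the tracking path each row costs a dictionary lookup plus an insert and the object survives the loop; in the projection path the value can die as soon as the loop body is done with it.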

But there is a strong expectation that this code is just a forward-only read operation with at most one temporary business object created per iteration, especially if you don’t modify the objects in the foreach loop, that is, if the loop is read-only. That expectation also implies a nifty solution.

Since the Connection is a property of the context you could reasonably have multiple contexts associated with one connection. In particular a read-only context could be just the thing to avoid all this entity creation. Importantly the read-only context can be connected to the same Connection which means isolation concerns go away – it looks like one logical view of the database. This is important because otherwise a thread owning two DataContext objects could self-deadlock.

So I recommend creating some kind of read-only context to avoid the overhead of object management when it is not necessary.
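For reference, the released System.Data.Linq library has a switch with this effect: DataContext.ObjectTrackingEnabled, which must be set before the first query runs and which makes the context read-only. The sketch below assumes a Northwind-style Orders table with OrderID and ShipCity columns; the Order mapping class and the ReadOnlyContext helper are hypothetical names introduced only for illustration.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.Linq;
using System.Data.Linq.Mapping;

// Hypothetical minimal mapping for a Northwind-style Orders table.
[Table(Name = "Orders")]
public class Order
{
    [Column(IsPrimaryKey = true)] public int OrderID;
    [Column] public string ShipCity;
}

public static class ReadOnlyContext
{
    // Builds a context over an existing connection with tracking
    // switched off. The flag has to be set before any query executes.
    public static DataContext Create(IDbConnection connection)
    {
        var context = new DataContext(connection);
        context.ObjectTrackingEnabled = false;
        return context;
    }

    // Forward-only read: entities are handed out but not tracked.
    public static List<string> ShipCities(IDbConnection connection)
    {
        var cities = new List<string>();
        using (DataContext context = Create(connection))
        {
            foreach (Order o in context.GetTable<Order>())
                cities.Add(o.ShipCity);
        }
        return cities;
    }
}
```

Because Create takes the connection as a parameter, a read-only context and a regular tracking context can be built over the same connection, which is the arrangement described above.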

Data Binding

The second item is overhead associated with extracting the data. In the then-current implementation there were several virtual calls between the MoveNext/GetCurrent on the enumerator and the actual data field fetches.

However, if we do query compilation as described in the last installment we have the opportunity to significantly limit the dispatches – as low as one delegate call.

To do this we need to pre-build the necessary helper for getting and storing the fields so that there is effectively straight line code calling the underlying provider. Basically the same code you would have to write if you were doing the data access manually.

Pre-compiling it also means we don’t have to jit the helper every time the query runs – we have a place to store the compiled helper function that stores the data.
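A minimal sketch of that idea, assuming a two-column result and using expression trees to generate the helper: OrderRow, MaterializerCache, and the string cache key are hypothetical, and this is not the code DLinq itself generates. The helper is built and jitted once per query shape, stored, and afterwards each row costs one delegate call that goes straight to GetInt32 and GetString on the provider.

```csharp
using System;
using System.Collections.Concurrent;
using System.Data;
using System.Linq.Expressions;

public sealed class OrderRow
{
    public int OrderId { get; set; }
    public string ShipCity { get; set; }
}

public static class MaterializerCache
{
    private static readonly ConcurrentDictionary<string, Func<IDataRecord, OrderRow>> cache =
        new ConcurrentDictionary<string, Func<IDataRecord, OrderRow>>();

    // Returns the compiled helper for a query shape. The ordinals are
    // only used the first time a given key is seen; later calls reuse
    // the stored delegate, so nothing is rebuilt or re-jitted.
    public static Func<IDataRecord, OrderRow> GetOrBuild(
        string queryKey, int idOrdinal, int cityOrdinal)
        => cache.GetOrAdd(queryKey, _ => Build(idOrdinal, cityOrdinal));

    private static Func<IDataRecord, OrderRow> Build(int idOrdinal, int cityOrdinal)
    {
        ParameterExpression record = Expression.Parameter(typeof(IDataRecord), "record");

        // record.GetInt32(idOrdinal) and record.GetString(cityOrdinal).
        // No DBNull handling here; a real helper would need it.
        Expression readId = Expression.Call(
            record, typeof(IDataRecord).GetMethod("GetInt32"), Expression.Constant(idOrdinal));
        Expression readCity = Expression.Call(
            record, typeof(IDataRecord).GetMethod("GetString"), Expression.Constant(cityOrdinal));

        // new OrderRow { OrderId = ..., ShipCity = ... }
        Expression body = Expression.MemberInit(
            Expression.New(typeof(OrderRow)),
            Expression.Bind(typeof(OrderRow).GetProperty("OrderId"), readId),
            Expression.Bind(typeof(OrderRow).GetProperty("ShipCity"), readCity));

        return Expression.Lambda<Func<IDataRecord, OrderRow>>(body, record).Compile();
    }
}
```

A caller fetches the delegate once before the read loop and invokes it on the reader for every row, which is the same straight-line code you would write by hand against the provider.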

At this point we had a pretty good idea what kinds of actions to take to make things a whole lot better.

Published Friday, June 29, 2007 9:01 AM by ricom

Comments

# LINQ to SQL: Go Rico Go!

Rico has his third installment on LINQ to SQL performance up on his site and he finally lets us in on what he thinks the problems are/were.

Friday, June 29, 2007 3:43 PM by The Wayward WebLog

# re: DLinq (Linq to SQL) Performance Part 3

Does someone have an example of such a read-only datacontext?

Friday, June 29, 2007 4:23 PM by Ridge

# re: DLinq (Linq to SQL) Performance (Part 3)

Rico, I ran tests with Linq in Orcas Beta 1 and then developed a cache of compiled queries, which improved performance more and more. My approach was to create the entities only on demand, using an extension method called ToReader that intercepts the execution and gets the internal Reader. What do you think?

Sunday, July 01, 2007 1:24 PM by Sidney

# re: DLinq (Linq to SQL) Performance (Part 3)

I think the reason the 'select o' is slower than the 'select new { o...' version is the extra support for tracking changes.  The LINQ to SQL runtime is wiring up events, etc. and this infrastructure is the expensive part.

Either query approach can (and should if it doesn't) use a forward-only, firehose cursor to get the data into memory.

As for connection pooling, the SqlConnection provides this by default.  If multiple connections are opened using exactly the same connection string by the same user, the connections are pooled by default, giving the associated perf gains for free.

Monday, July 02, 2007 8:27 AM by Bill Schulz

# Quick LINQ link list

Some quick links about LINQ: Articles about extension methods by the Visual Basic team Third-party LINQ

Wednesday, July 04, 2007 5:40 PM by Fabrice's weblog

# Quick LINQ link list

Some quick links about LINQ: Articles about extension methods by the Visual Basic team Third-party LINQ

Wednesday, July 04, 2007 5:40 PM by Linq in Action News

# LINQ to SQL performance

One of the things I get asked quite often is "How does LINQ to SQL affect performance compared to writing

Thursday, July 05, 2007 5:31 PM by Public Sector Developer Weblog

# LINQ to SQL performance

One of the things I get asked quite often is "How does LINQ to SQL affect performance compared to

Thursday, July 05, 2007 5:55 PM by Noticias externas

# Dynamic conditional query methods in linq to sql

Dynamic conditional query methods in linq to sql

Saturday, July 07, 2007 4:15 AM by neuhawk

# LINQ to SQL performance optimizations

Rico Mariani did a very good job analyzing performance implications of LINQ to SQL queries. He is currently

Monday, July 09, 2007 8:17 PM by Marco Russo

# LINQ to SQL and (micro) Performance

Also been catching up on Rico Mariani's notes on improvements to LINQ to SQL performance between the...

Tuesday, July 17, 2007 5:18 AM by Mike Taulty's Blog

# Community Convergence XXIX

There are several good new blogs from members of the community team. Nevertheless, the most important

Monday, August 13, 2007 2:36 AM by Charlie Calvert's Community Blog

# LINQ to SQL - compiled queries (with working example for beta 2)

I've been meaning to dig into LINQ performance for some time (actually since it came up during one of

Friday, August 24, 2007 1:03 PM by Ronan Geraghty's Weblog

# LINQ to SQL - compiled queries (with working example for beta 2)

I've been meaning to dig into LINQ performance for some time (actually since it came up during one of

Friday, August 24, 2007 1:03 PM by MSDN Ireland Blog

# Resources on Linq to SQL

Resources on Linq to SQL

Monday, August 27, 2007 3:58 AM by jankyBlog

# LINQ to SQL : Some of the best BLOGs

Some of the best blogs on LINQ to SQL I found are available for great learning, Scott Guthrie The Famous

Thursday, November 01, 2007 1:31 PM by Wriju's BLOG

# DLinq (Linq to SQL) Performance (Part 1)

[ By popular demand, here are links for all 5 parts in the series Part 1 , Part 2 , Part 3 , Part 4 ,

Wednesday, November 14, 2007 8:21 PM by Rico Mariani's Performance Tidbits

# 10 Tips to Improve your LINQ to SQL Application Performance

10 Tips to Improve your LINQ to SQL Application Performance

Sunday, May 04, 2008 9:43 PM by History Viewer

# LINQ to SQL vs. ADO.NET – A Comparison

ADO.NET is our contemporary data access component and now we have written many applications. Now there

Wednesday, July 16, 2008 6:08 PM by Wriju's BLOG

# 10 Tips to Improve your LINQ to SQL Application Performance

Hey there, back again. In my first post about LINQ I tried to provide a brief (okay, bit detailed) in...

Tuesday, November 11, 2008 12:07 AM by 吴明浩