DLinq (Linq to SQL) Performance (Part 2)

Published 25 June 07 05:10 PM | ricom 

So after getting some high level times I started digging into the particulars of the costs more broadly and I ended up studying a very simple query like the below one. 

Northwinds nw = new Northwinds(conn);

var q = from o in nw.Orders
           where o.OrderId == orderid
           select o;

foreach (Orders o in q)
{
    …
}

It was the per-query costs that seemed to be the greatest trouble spot in the then-current profiles. Those costs would be the most problematic for complex queries which return comparatively few rows – all too common in business logic.  For small numbers of rows the rough bucketization of costs looked like this:

Category    Time
Total Benchmark 100.00%
   Query Build 24.40%
   Query Enumeration 74.55%
     Dispatch Glue 7.34%
     Jitting Costs 18.07%
     Data Reading 49.14%
   Misc 1.06%

In short the problem is that the basic Linq construction (we don’t really have to reach for a complex query to illustrate) results in repeated evaluations of the query if you ran the query more than once.

Each execution builds the expression tree, and then builds the required SQL. In many cases all that will be different from one invocation to another is a single integer filtering parameter. Furthermore, any databinding code that we must emit via lightweight reflection will have to be jitted each time the query runs. Implicit caching of these objects seems problematic because we could never know what good policy is for such a cache – only the user has the necessary knowledge.

But all is not lost... the usual parameterized query model seems to be helpful here without unduly complicating everything. You could imagine a sequence something like this:

Func<Northwinds, IQueryable<Orders>, int> q =
        CompiledQuery.Compile<Northwinds, int, IQueryable<Orders>>
                ((Northwinds nw, int orderid) =>
                            from o in nw.Orders 
                            where o.OrderId == orderid
                            select o );

Northwinds nw = new Northwinds(conn);

foreach (Orders o in q(nw, orderid))
{
     ...
}

The important thing here now is that q is a durable thing that can be applied to different data contexts and we've identified the orderid paramater.  You'll have to forgive my syntax I don't think there's a compiler in existance (old or new) that compiles precisely the above but hopefully you'll get the idea.

Importantly, upon compilation, the query can be reduced to some kind of prepared statement. At this time any helper methods that need to be code-generated are also created. Upon binding we do the minimal query formatting for the constants and no jitting. The compiled query has lifetime specified by the user, so it lives exactly as long as it needs to.

These operations would drastically reduce per query overhead while simultaneously giving us a good place to hang state with suitably lifetime – compiled queries.

That seemed to get us forward progress on the per-query costs but what about the per-row costs?

We had a couple of different ideas to help with those as well.

Stay tuned for part 3.  :)

P.S. Keep your eye on Matt Warren's Weblog as he'll likely comment on what I'm saying as the series evolves,  it was his hand that actually made the changes I'm talking about here.

Filed under: ,

Comments

# The Wayward WebLog said on June 25, 2007 10:53 PM:

Rico continues his series on LINQ to SQL performance. In this post he takes a look at the breakdown of

# Bill Graziano said on June 26, 2007 12:51 AM:

I think what you're describing is called a stored procedure in the DBA community.   :)

# Frans Bouma said on June 26, 2007 3:47 AM:

What I'd like to know is which tool is used to get the list of numbers you show? When I profile our o/r mapper framework I in general do that on debug build code to see what the true costs are of the algorithms and the code used, but when I profile the release build, I'll never be able to get e.g. JIT costs...

It might be interesting to know how exactly one should profile code to get a good overview what's REALLY going on. Profiling, like debugging, is IMHO a black art: a novice is likely stare at the profiler report for hours without knowing where to begin optimizing.

# Pablo Castilla said on June 26, 2007 9:25 AM:

could you repeat the benchmarks with orcas to see if there is any improvement?

# Anders Borum said on June 26, 2007 1:52 PM:

Converting regular LINQ expressions to compiled queries quickly becomes a trivial task, and I hope that most LINQ to SQL developers realize, that this is a pattern they should look into (it's a prime pattern candidate the LINQ to SQL team to focus on when writing docs and samples).

Looking forward to the 3rd part.

# rogerj said on June 27, 2007 9:45 PM:

Rico,

I've run some performance comparisons between LINQ to SQL parameterized and compiled queries with varying numbers of returned rows in http://oakleafblog.blogspot.com/2007/06/rico-mariani-dlinq-performance-tips.html.

--rj

# rogerj said on June 29, 2007 10:28 AM:

There's actual data that compares parameterized and compiled query performance with some updates on compiled queries and DataContext objects at http://oakleafblog.blogspot.com/2007/06/rico-mariani-dlinq-performance-tips.html.

--rj

# Rico Mariani's Performance Tidbits said on June 29, 2007 12:01 PM:

I’d like to start with a little housekeeping. Some readers asked me how I made the nifty table in part

# Ridge said on June 29, 2007 4:22 PM:

So I'm unclear here...  Are you advocating that the end-user developer create compiled queries for best perf or are you suggesting that behind-the-scenes (post beta 1) that you're doing this for us?

# Linq in Action News said on July 4, 2007 5:39 PM:

Some quick links about LINQ: Articles about extension methods by the Visual Basic team Third-party LINQ

# Fabrice's weblog said on July 4, 2007 5:40 PM:

Some quick links about LINQ: Articles about extension methods by the Visual Basic team Third-party LINQ

# Public Sector Developer Weblog said on July 5, 2007 5:31 PM:

One of the things I get asked quite often is "How does LINQ to SQL affect performance compared to writing

# Noticias externas said on July 5, 2007 5:55 PM:

One of the things I get asked quite often is &quot;How does LINQ to SQL affect performance compared to

# neuhawk said on July 7, 2007 4:15 AM:

linq to sql 的动态条件查询方法

# Marco Russo said on July 9, 2007 8:17 PM:

Rico Mariani did a very good job analyzing performance implications of LINQ to SQL queries. He is currently

# Mike Taulty's Blog said on July 17, 2007 5:18 AM:

Also been catching up on Rico Mariani's notes on improvements to LINQ to SQL performance between the...

# Charlie Calvert's Community Blog said on August 13, 2007 2:36 AM:

There are several good new blogs from members of the community team. Nevertheless, the most important

# Ronan Geraghty's Weblog said on August 24, 2007 1:03 PM:

I've been meaning to dig into LINQ performance for some time (actually since it came up during one of

# MSDN Ireland Blog said on August 24, 2007 1:03 PM:

I've been meaning to dig into LINQ performance for some time (actually since it came up during one of

# jankyBlog said on August 27, 2007 3:58 AM:

Risorse su Linq to SQL

# Wriju's BLOG said on November 1, 2007 1:31 PM:

Some of the best blogs on LINQ to SQL I found are available for great learning, Scott Guthrie The Famous

# Rico Mariani's Performance Tidbits said on November 14, 2007 8:21 PM:

[ By popular demand, here are links for all 5 parts in the series Part 1 , Part 2 , Part 3 , Part 4 ,

# Wriju's BLOG said on July 16, 2008 6:07 PM:

ADO.NET is our contemporary data access component and now we have written many applications. Now there

New Comments to this post are disabled

Search

This Blog

Syndication

Page view tracker