Exploring the Performance of the ADO.NET Entity Framework – Part 2

Published 11 February 08 02:04 PM | dpblogs 

Query Performance

During this post I’ll show a few patterns on how to improve the query performance.

A major design element for performance is the query cache. Once a query is executed, parts of the query are maintained in a global cache. Because of query and metadata caching, the second run always completes faster than the first run. For example, consider a query using Entity SQL query (don’t worry, we’ll discuss LINQ in just a second) with the following code.

using (PerformanceArticleContext ne = new PerformanceArticleContext())

{

    ObjectQuery<Orders> orders

     = ne.CreateQuery<Orders>("Select value o from Orders as o");

    foreach (Orders o in orders)

    {

        int i = o.OrderID;

    }

}

 

On the first run, the query completes in 179 milliseconds. The next time, the query completes in 15 milliseconds. The difference between the two is the cost of building the command tree that gets passed down to the provider for execution. All subsequent queries complete in or around the same time of 15 milliseconds.

LINQ Queries

LINQ queries utilize some of the same logic as Entity SQL queries, except that not all parts of the query are cached and some parts are rebuilt each time the query is executed. Take a look at the query below.

using (PerformanceArticleContext ne = new PerformanceArticleContext())

{

    var orders = from order in ne.Orders

                 select order;

    foreach (Orders o in orders)

    {

        int i = o.OrderID;

    }

}

 

Executing this query takes 202 milliseconds on the first execution and only 18 milliseconds on subsequent executions, which is still slower than using Entity SQL. Now, let’s take a look at using compiled LINQ queries to improve performance further. The advantage of compiling a LINQ query is that the expression tree is built when the query is compiled and doesn’t need to be rebuilt on subsequent executions.

Here’s the code for a compiled LINQ query that uses a PerformanceArticleContext delegate.

public static Func<PerformanceArticleContext, IQueryable<Orders>>

           compiledQuery = CompiledQuery.Compile(

                               (PerformanceArticleContext ne) =>

                                           (from o in ne.Orders

                                            select o)

                               );

 

using (PerformanceArticleContext ne = new PerformanceArticleContext())

{

    foreach (Orders o in compiledQuery(ne))

    {

        int i = o.OrderID;

    }

}

 

The times for this compiled LINQ query are 305 milliseconds on the first execution and 15 milliseconds on subsequent executions.

Here’s the standard LINQ.

using (PerformanceArticleContext ne = new PerformanceArticleContext())

{

    ne.Orders.MergeOption = MergeOption.NoTracking;

    var orders = from order in ne.Orders

                 where order.ShipCity == "London"

                 select order;

    foreach (Orders o in orders)

    {

        int i = o.OrderID;

    }

}

 

Now here is the compiled LINQ query.

public static Func<PerformanceArticleContext, IQueryable<Orders>>

           compiledQuery = CompiledQuery.Compile(

                               (PerformanceArticleContext ne) =>

                                   (from o in ne.Orders

                                    where o.ShipCity == "London"

                                    select o)

                               );

 

using (PerformanceArticleContext ne = new PerformanceArticleContext())

{

    foreach (Orders o in compiledQuery(ne))

    {

        int i = o.OrderID;

    }

}

 

In this query the result set is only 33 items. For the non-compiled query, the execution time is 207 milliseconds on the first execution and 17 milliseconds on subsequent executions. By comparison, the result for the compiled query is 268 milliseconds on the first execution and only 3 milliseconds on subsequent executions.

No Tracking/Tracking

In the previous examples, all the queries result in the creation of an object that gets added to the ObjectStateManager so that we can track updates. When it is not important to track updates or deletes to objects, then executing queries using the NoTracking merge option may be a better option. For example, NoTracking may be a good option in an ASP.NET web application that queries for a specific category name but doesn’t make updates to the returned data. In a case like this, there is a performance benefit to using NoTracking queries. Let’s demonstrate this by taking the same queries that we used previously and making them all into NoTracking queries. The following table demonstrates the cost of tracking objects in the query examples:

 

Tracking v. NoTracking on full query without parameters

Query Type

Tracking

First Run (milliseconds)

Tracking

Second Run (milliseconds)

NoTracking

First Run (milliseconds)

NoTracking

Second Run (milliseconds)

Entity SQL

179

15

149

6

LINQ

202

18

282

8

Compiled LINQ

305

15

276

6

 

Based on these numbers, the NoTracking option provides a big reduction in the amount of time, where most of this gain comes when we stop tracking changes and managing relationships. For a NoTracking query, the compiled LINQ query outperforms the standard LINQ query both in first execution and in subsequent executions. Note that the second execution of the compiled LINQ query is equivalent to the second execution of the Entity SQL query.

Below are the single parameter filtered queries for both tracking and no tracking.

Query Type

Tracking

First Run

(milliseconds)

Tracking

Second Run

(milliseconds)

NoTracking

First Run

(milliseconds)

NoTracking

Second Run

(milliseconds)

Entity SQL

126

3

140

2

LINQ

207

17

275

18

Compiled LINQ

268

3

277

2

 

Summary

When optimizing query performance in the Entity Framework, you should consider what works best for your particular programming scenario. Here are a few key takeaways:

·         Initial creation of the ObjectContext includes the cost of loading and validating the metadata.

·         Initial execution of any query includes the costs of building up a query cache to enable faster execution of subsequent queries.

·         Compiled LINQ queries are faster than Non-compiled LINQ queries.

·         Queries executed with a NoTracking merge option work well for streaming large data objects or when changes and relationships do not need to be tracked.

 

Brian Dawson,

Program Manager, ADO.NET

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

# Ryan said on February 11, 2008 6:45 PM:

Do you have any comparison between doing LINQ like

'from x in y select x;'

vs.

'x.Select()'

More complex queries would be more useful but I am interested in which type is actually faster

# Phil Bolduc said on February 11, 2008 10:01 PM:

I assume you took into consideration the database buffer cache?

# Jim Wooley said on February 11, 2008 10:05 PM:

I'm glad to finally see some EF performance information. It would be helpful to see a comparison against the similar process with a DataReader, DataSet, LINQ to SQL and other OR Mappers to see just what the performance trade-offs are. In this regard, I point to Rico Mariani's series on performance with LINQ to SQL as an example.

In the case of the examples you've presented so far, the queries are identical to the kind of 1-1 table mapping that is possible with LINQ to SQL. It would be nice to also see some performance on more complex object mapping schemes (multiple tables to single object for example).

# Cosmin said on February 12, 2008 4:22 AM:

Is there any way that we can reduce the time when the first query is run? I thought that using compiled queries would reduce that time.

Thank you.

# JuST ANOTHER GeeK said on February 12, 2008 2:33 PM:

ADO.NET Entity FrameWork Performance

# juliel said on February 12, 2008 9:34 PM:

Brian

Thank you so much for going into this detail. I look forward to more.

# Matthieu MEZIL said on February 13, 2008 8:20 AM:

J'ai quelque peu délaissé mon blog ces derniers temps mais maintenant que les techdays sont passés (du

# QuintaStation said on February 13, 2008 6:27 PM:

En este post comento una serie buenas prácticas para mejorar la eficiencia de ADO.NET Entity Framework

# Jim L said on February 15, 2008 7:29 PM:

What is the ability to persist this global cache? Given that the abstraction layer probably doesn't change a heck of a lot, there's no need to rebuild it unless it is invalidated. I'd hate to put the user through unneeded work if its not quick.

# Brian Vallelunga said on February 18, 2008 10:42 AM:

I'm very curious about best-practices for an asp.net environment. Obviously, keeping around the compiled query would be great for the performance benefits of subsequent queries, yet in a web scenario, I'm generally creating a context, executing a query, and then destroying both. Is there some advice out there about how to make this more efficient so that I don't pay the 200ms penalty each time?

# JasonBSteele said on February 18, 2008 2:14 PM:

In the stats above it appears that there is a decrease in performance for the "No Tracking" in several scenarios for the first run. For example, in the LINQ no parameters scenario the Tracking time is 202, whereas the No Tracking time is 282.

Is it true that No Tracking is actually slower in some cases, and if so why?

Many Thanks

# TerryLee said on February 20, 2008 10:23 AM:

概述春节后的第一期推荐系列文章,共有10篇文章:1.ASP.NETMVCExampleApplicationoverNorthwindwiththeEntityFramework...

# El blog de Vicente said on February 20, 2008 12:44 PM:

Como hace unos días apuntaba Unai , en el blog del equipo de desarrollo de ADO.NET están publicando una

# Jacky_Xu said on February 20, 2008 8:33 PM:

概述 春节后的第一期推荐系列文章,共有10篇文章: 1.ASP.NETMVCExampleApplicationoverNorthwindwiththeEntityFr...

# Hot Topics said on February 21, 2008 4:28 PM:

The first post, Exploring the Performance of the ADO.NET Entity Framework - Part 1 , began: Performance

# Ayman Saleh said on March 10, 2008 8:16 AM:

Do you have any performance statistics relating to using SQL Server stored procedures?

# TJE said on March 12, 2008 12:12 PM:

I love all the amazing work that has been done with the ADO.NET Entity Framework and Data Services.  There is one piece that I can't find any information on, however:  Support for multi domain models or multiple databases in the Entity Framework (and thus Data Services).

We have 10+ large enterprise databases all hosted on the same server/instance of sql server and these databases are somewhat used as "schema" conatiners, i.e. we have many stored procedures that make calls to stored procedures/tables/views in another database and there are logical relationships between tables in different databases.  This requires our Entity Model to have entity relationships across databases and thus across Entity Models (since, AFAIK, currently the Entity Framework design surface can only include tables from one database).

I believe that this is a big weakness and needs to be addressed as soon as possible.  The other issue I have is how much the Entity Framework design surface slows down (and the entity boxes become unmanageble from a visual real-estate perspective) after about 20-30 tables on it...

Any comments on this from the Entity Framework team would be very much appreciated...

# Paymon said on March 17, 2008 10:52 AM:

Great post!

It will be helpful if you could also include comparison metrics from the DBMS side.

Splitting the execution time between application (CLR) and Sql server (DBMS) for each thread will also be of a huge help in understanding and tuning performance.

Thanks

# Brian Dawson - MSFT said on March 19, 2008 2:53 PM:

Thanks for all the comments and questions!  It's great to see the interest in the Entity Framework. I'm working on another post about performance that I'll try to post soon. It addresses some of the questions.

-Brian

# Ranga said on March 20, 2008 1:48 AM:

Is there any news on the Release Date of Entity Framework?

# ADO.NET team blog said on March 27, 2008 9:48 PM:

There have been a few questions from the last performance blog post about how the Entity Framework compares

# Matthieu MEZIL said on March 28, 2008 4:53 AM:

L'ADO .Net Team vient de poster deux nouveaux posts : le premier concerne l'utilisation des procédures

# Damir Dobric Posts said on May 24, 2008 6:01 PM:

For everybody who is interested on internals of Entity Framework (ER) in relation to referential integrity

# Dating said on June 1, 2008 9:33 AM:

Query Performance During this post I’ll show a few patterns on how to improve the query performance. A major design element for performance is the query cache. Once a query is executed, parts of the query are maintained in a global cache. Because of quer

# Weddings said on June 5, 2008 12:08 PM:

Query Performance During this post I’ll show a few patterns on how to improve the query performance. A major design element for performance is the query cache. Once a query is executed, parts of the query are maintained in a global cache. Because of quer

# André Magalhães said on July 16, 2008 9:59 AM:

Analyzing the test results, to use compiled LINQ queries is better than non-compiled LINQ queries only in the second execution. In the first execution the compiled LINQ queries are always slower.

But what is exactly the second execution? Is an execution using the same instance of the objectcontext or is an execution in a new thread with a new objectcontext?

What I mean is, does the query cache survive between different threads or it is alive just during a thread execution lifetime?

# Podlipensky said on August 21, 2008 2:50 AM:

Do you have comparsion data for EF with stored procedures and standalone stored procedures? Does it have some perfomance difference?

# Jiří {x2} Činčura said on January 11, 2009 12:40 PM:

If we’ll skip exact details, we can say, that internal behavior of whole modeling and mapping is based

Leave a Comment

(required) 
(optional)
(required) 
Page view tracker