Tips, Tricks, and Advice from the SQL Server Query Optimization Team

Row Goals in Action

Today, we'll talk about row goals.  The optimizer in SQL Server has a feature that can bias plan choices to retrieve a certain number of rows quickly instead of the whole results.  This shows up in a few places, but the primary areas are in TOP N queries and in subqueries that need to check for the existance of "a" row - WHERE EXISTS, for example.  The query hint "option (fast N)" also translates into the same feature. 

 

One of the problems that can arise with row goals is that the query plan can change when you add them into a query request.  Especially with the option(fast N) hint, you can find cases where the first row may come back quickly but the whole results come back more slowly.  So, if you send option(fast N) but retrieve the whole results, your system won't perform as well.

 

Effectively, we bias the optimizer to favor plans that can return a few rows quickly compared to the minimum cost to return all rows.  In practice, this often means that joins will choose nested loops joins for row-goal limited plans and hash joins otherwise.


The following example demonstrates the issue:

 

use tempdb

 

create table A(col1 int, col2 binary(100), col3 int)

declare @i int

set @i = 0

while @i < 5000

begin

insert into A(col1, col2, col3) values (@i, 0x3333, rand()*1000)

set @i = @i + 1

end

 

create clustered index i1 on A(col1)

 

set statistics time on

-- should pick a series of hash joins

select A1.* from

A as A1 inner join A as A2 on A1.col1 = A2.col1

inner join A as A3 on A1.col1 = A3.col1

 

-- if there is a row goal, we’ll pick the loop join plan that returns one (or a few) row(s) quickly

set statistics time on

select A1.* from

A as A1 inner join A as A2 on A1.col1 = A2.col1

inner join A as A3 on A1.col1 = A3.col1

option (fast 1)

 

You can run these on your installation to see the time difference for retrieving all rows with the hash join plan vs. the loop join plan (just keep adding rows until you see the difference). 

 

This pattern is caused by the row goal logic in the optimizer.  When we have a very low row goal, the nested loops join is preferred because its initial cost (the cost for the first row) is comparatively low – it just involves a single seek for this example.  The hash join plan has a higher initial cost because it has to build a hash table before any rows can be returned.  Once built, however, the hash join plan is generally cheaper and would be picked if the estimated number of rows gets large.
Published Thursday, March 30, 2006 8:20 AM by QueryOptTeam

Comments

 

Gaurav Bindlish said:

Very cool ....
March 31, 2006 12:50 AM
 

Jerry Nixon said:

There is very little documentation about FAST and how it could be used in the context of an application. Could you point to some more information - Books Online has very little on FAST.
April 21, 2006 12:56 PM
 

Yukonism said:

May 11, 2006 3:41 PM
 

Tips, Tricks, and Advice from the SQL Server Query Processing Team said:

Since we’re starting a new blog for the whole query processing team, we think it’s appropriate to start...
September 25, 2006 8:52 PM
 

Web 2.0 - Social Media - Internet News - Blogging » Row Goals in Action said:

November 20, 2007 4:19 PM
 

FAST n Query Hint &laquo; Home of the Scary DBA said:

May 15, 2008 10:56 AM
Anonymous comments are disabled

© 2008 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement
Microsoft
Page view tracker