What you should know if you are using Logical IOs to measure query plan quality


One of the presenters at the recent SQL Server Open World conference in Denmark (http://www.miracleas.dk/index.asp?page=168&page2=323) suggested tuning SQL queries using logical IOs to measure the quality of the query plan. The IO count is returned for each object accessed in your query after you issue "set statistics IO on". Here is an example of the returned message:

Table 't1'. Scan count 10, logical reads 20, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

I have seen this approach many times in the past, and I agree that in general, using fewer logical IOs is better and faster than using more. Therefore, tuning a query toward a lower number of logical IOs in most cases produces a query that is faster and uses fewer resources. But I would like to point out one potential pitfall when using this counter. In a nutshell, reading the same data may end up showing very different numbers of logical IOs depending on the "data access method". This is because each new "seek" into any page (even repeatedly into the same one) is counted as a new logical IO (in fact, as several logical IOs, because of the index navigation!), while reading all rows on the same page using a "scan" counts as a single logical IO.

If you run the following statements in SSMS after you create table t1 as shown in the appendix,

set statistics io on
select * from t1 with (index=2) where a in (1,2,3,4,5,6,7,8,9,10)   -- Query 1
select * from t1 with (index=0) where a in (1,2,3,4,5,6,7,8,9,10)   -- Query 2

you will find the following two messages in the "Messages" tab of the SSMS result window:

Table 't1'. Scan count 10, logical reads 20, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 't1'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The first query performs 10 index seeks, showing "20 logical reads", using the following query plan:

|--Index Seek(OBJECT:([test].[dbo].[t1].[i1]), SEEK:([test].[dbo].[t1].[a]=(1) OR [test].[dbo].[t1].[a]=(2) OR [test].[dbo].[t1].[a]=(3) OR [test].[dbo].[t1].[a]=(4) OR [test].[dbo].[t1].[a]=(5) OR [test].[dbo].[t1].[a]=(6) OR [test].[dbo].[t1].[a]=(7) OR [test].[dbo].[t1].[a]=(8) OR [test].[dbo].[t1].[a]=(9) OR [test].[dbo].[t1].[a]=(10)) ORDERED FORWARD)

The reason SQL Server counts 2 logical IOs for each seek is that it has to interrogate the "root" page of the index first to locate the correct "data" page. If an index has more levels, locating the data page takes one logical IO for each non-leaf node that must be accessed. Additionally, in some cases more than one "leaf" page has to be visited.
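The per-seek cost therefore tracks the depth of the index B-tree. If you want to check that depth directly, one possible query (a sketch, assuming the t1 table and i1 index from the appendix exist) is:

```sql
-- INDEXPROPERTY with 'IndexDepth' returns the number of B-tree levels;
-- a single-page seek touches one page per level, so this approximates
-- the logical reads charged for each such seek.
select INDEXPROPERTY(OBJECT_ID('t1'), 'i1', 'IndexDepth') as index_depth
```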

The second query produces a different plan:

|--Table Scan(OBJECT:([test].[dbo].[t1]), WHERE:([test].[dbo].[t1].[a]=(1) OR [test].[dbo].[t1].[a]=(2) OR [test].[dbo].[t1].[a]=(3) OR [test].[dbo].[t1].[a]=(4) OR [test].[dbo].[t1].[a]=(5) OR [test].[dbo].[t1].[a]=(6) OR [test].[dbo].[t1].[a]=(7) OR [test].[dbo].[t1].[a]=(8) OR [test].[dbo].[t1].[a]=(9) OR [test].[dbo].[t1].[a]=(10)))

The difference is in the "index seek" versus "table scan" operators. Accessing all rows by a table scan counts each visited page as a single logical IO (one page visited in our case), while accessing rows in an index using seeks counts each "seek" separately and, as pointed out above, each seek costs two or more logical IOs depending on how many pages must be visited to complete it.
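In this example the whole table occupies a single data page, which is why the scan reports only one logical read. One way to verify the page count (a sketch using sys.dm_db_index_physical_stats, available in SQL Server 2005 and later; the appendix uses the older sys.sysindexes view for the same purpose):

```sql
-- index_id = 0 is the heap; page_count shows how many data pages
-- a full table scan has to read.
select index_id, page_count
from sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('t1'), 0, NULL, 'LIMITED')
```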

Interestingly, if we remove the index hint from the query, the SQL Server optimizer will choose the query plan using more logical IOs! To make the situation even more complicated, we should be careful about what counts as a single "seek" and what constitutes "multiple seeks". If you slightly change our query to

select * from t1  where a between 1 and 10  -- Query 3

you will get the following plan:

  |--Index Seek(OBJECT:([test].[dbo].[t1].[i1]), SEEK:([test].[dbo].[t1].[a] >= CONVERT_IMPLICIT(int,[@1],0) AND [test].[dbo].[t1].[a] <= CONVERT_IMPLICIT(int,[@2],0)) ORDERED FORWARD)

And the IO count for Query 3 is

Table 't1'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Query 3 uses a single seek (therefore it shows "scan count 1") when returning all rows qualifying by the "between 1 and 10" predicate. The similar Query 1 above used 10 seeks ("scan count 10") because its seek predicate contains OR, and each "branch" of the OR predicate uses a separate seek.

Conclusion: When you use the "logical IO" count to evaluate your query efficiency or to compare various query modifications, be aware that "logical reads" may have a very different meaning depending on the data access methods chosen by the query optimizer.
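Because a logical read is not a fixed unit of work, it can also help to measure time alongside the IO counts when comparing alternatives. A minimal sketch (SET STATISTICS TIME ON is the standard companion to SET STATISTICS IO ON):

```sql
set statistics io on
set statistics time on
-- Compare both CPU/elapsed time and logical reads for the two access methods
select * from t1 with (index=2) where a in (1,2,3,4,5,6,7,8,9,10)
select * from t1 with (index=0) where a in (1,2,3,4,5,6,7,8,9,10)
set statistics time off
set statistics io off
```

If the plan with fewer logical reads also uses less CPU and elapsed time, you can be more confident that it is really the cheaper one.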

Notes:

  1. I have used the CTP6 build of SQL Server 2008 to run the examples, but the conclusion is also valid for SQL Server 2000 and 2005.
  2. You can use SQL Profiler to capture Logical Reads in the "Batch completed" event, and you can get the same numbers using sys.dm_exec_query_stats. However, these are cumulative numbers for all accessed objects, as opposed to the per-object numbers returned by SET STATISTICS IO ON as described in this blog.
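For example, a query along the following lines (a sketch; the filter on the query text is only illustrative) returns the cumulative logical reads per cached plan:

```sql
select qs.execution_count,
       qs.total_logical_reads,
       qs.total_logical_reads / qs.execution_count as avg_logical_reads,
       st.text
from sys.dm_exec_query_stats as qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) as st
where st.text like '%t1%'
order by qs.total_logical_reads desc
```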

 

Appendix:

create table t1 (a int, b char(10))
declare @i int
set @i=0
set nocount on

while (@i<10)
begin
    set @i=@i+1;
    insert into t1 values (@i,'x')
end

create index i1 on t1 (a,b)

-- Following query shows how many pages are used by the table and its index

select OBJECT_NAME(id), indid, dpages from sys.sysindexes where id=OBJECT_ID('t1')

 

Lubor


  • Good one, Lubor.

    Something else regarding LIO: As with most other things in databases, the number of times something happens has no correlation to its duration.

    Anjo Kolk found that there are many different kinds of LIO, some fast, some slow, some with serialisation, some not - all depending on their purpose when called upon from the source code.

    So if you have 100 LIOs taking a total of 100 seconds (that would be a horrible day, indeed, but let's pretend for the sake of clarity) then we don't know if each of them took 1 second each or whether two of them took 42 seconds each and the rest (98 LIOs) took 16 seconds in total.

    If there's heavy skew like the above, we should be able to focus on the two really bad ones and go for their source.

    So that's ANOTHER level we have to dive into one day, my friend :-)).

    best regards,

    Mogens

    PS: This has no relevance here, but inside Oracle Anjo found close to 1,000 different kinds of LIO - I don't think it's much different inside SQL Server.

  • It's a bit of an artificial example, and a bit of an "it depends" scenario. I usually try to make sure I use a table of several million rows and varying widths when I do this type of test. In fairness, I'm usually looking to tune queries which take thousands (and sometimes millions) of IOs - I rarely get down to looking at queries under 100 IOs. Good points though.

  • I agree - the example is totally artificial and its goal is only to demonstrate the point. The point about tuning queries with large amounts of I/Os first is correct, unless you are dealing with queries executed tens of thousands of times a minute. That is a frequent scenario for web backends; there, shaving off several I/Os from each execution may work a miracle.

  • In the case of the table scan the CPU time was higher, as it has to check each and every row to see whether it matches one of the passed values. Thus you could say 10 CPU operations. But 10 CPU operations should be cheaper than 18 logical IOs (through the index seek). Also, I am wondering whether SQL Server uses 100 CPU operations, or 10, or 55 for those 10 rows, so that should be included as well. The index is a covering index. If it were not, things might have been different (I have not tried it yet), but I will try it later.

  • GulliMeel - you are correct. And the number of CPU operations per row with a scan depends on many things - the predicate, the number and data types of the columns in the row, the data types in the comparison, the need for implicit data type conversion, etc. Therefore it is impossible to give a generic answer "it takes x CPU cycles to process each row".

    The "seeks" cost also depends on many factors. Usually we are seeking for all matching values - so we have to first "locate" the start of the match and then continue walking forward while the compared values match.

    I think it is almost impossible to come up with general rule “what is better” covering the various combinations. The best is to measure…
