SQL Server Storage Engine

When can allocation order scans be used?

I know this is jumping the gun a little as I haven't made it this far in my series on fragmentation, but this came up in a chalk-talk I did yesterday at TechEd Developers in Barcelona and is worth blogging about.

You'd expect a select * from mytable query on a table with a clustered index to use an allocation order scan to return the data, as that's the fastest way to read all the pages at the leaf-level of an index. In fact, the query plan will show an unordered clustered index scan. Well, guess again. What I didn't remember, and thanks to SQL Server MVP Maciej Pilecki for pointing this out during my talk (even though I didn't believe him at first), is that an allocation order scan can't be used when there's any possibility of the data changing beneath the scan.

Consider an example where an allocation order scan is progressing and a page that the scan has already read then splits - adding a newly allocated page at the end of the allocation-order list of pages. The scan will eventually come to the new page and then re-read some of the rows it has already read - producing duplicates in the scan output - clearly undesirable.

The only time such a scan will be used is when there's no possibility of the data changing (e.g. when the TABLOCK hint is specified, or when the table is in a read-only database) or when its explicitly stated that we don't care (e.g. when the NOLOCK hint is specifed or under READ UNCOMMITTED isolation level). As a further twist, there's a trade-off with setup cost of the allocation order scan against the number of pages that will b read - an allocation order scan will only be used if there's more than 64 pages to be read.

Below I've included a simple script that demonstrates the behavior. For me, the funny thing is that I should have remembered this behavior as it's exactly the same reason why DBCC SHOWCONTIG takes a shared table lock when in the default mode - it's using an allocation order scan and needs to ensure that the data doesn't change and produce duplicates. Oh well - you (re)learn something every day!

-- Drop and recreate the test database

USE master;

DROP DATABASE allocationordertest;

CREATE DATABASE allocationordertest;

USE allocationordertest;

GO

-- Create a simple table that we can fill up quickly, plus a clustered index

CREATE TABLE t1 (c1 INT, c2 VARCHAR (8000));

CREATE CLUSTERED INDEX t1c1 ON t1 (c1);

GO

-- Add some rows, making sure to produce an out-of-order dataset when scanned in allocation order.

DECLARE @a INT;

SELECT @a = 10;

WHILE (@a < 100)

BEGIN

    INSERT INTO t1 VALUES (@a, replicate ('a', 5000))

    SELECT @a = @a + 1

END;

SELECT @a = 1;

WHILE (@a < 10)

BEGIN

    INSERT INTO t1 VALUES (@a, replicate ('a', 5000))

    SELECT @a = @a + 1

END;

GO

-- Now try to do an allocation order scan - doesn't work...

SELECT * FROM t1;

GO

-- Until we add the right conditions.

SELECT * FROM t1 WITH (TABLOCK);

GO

Published Thursday, November 09, 2006 2:31 AM by Paul Randal - MSFT

Comments

 

Christian Robert said:

That's nice to have anwer to that question...

November 9, 2006 7:31 PM
 

mindycurnutt said:

Paul,

I am trying to clarify the whole 64K "stripe size" idea, and I am getting very confused.

I have been told that the 64K size that is recommended for SQL Server really corresponds to the "allocation unit" that you see when you do a chkdsk. For instance, on my laptop, the allocation unit size is 4K. This is the windows default. I have been told that for SQL to work the "best" that this size should be 64K, with an offset of 1 (can be check using diskpar).

Now, combine this with the technology of a SAN (EMC for example). When they build RAID groups, they use a stripe size of 128K. Does this make any difference to SQL? Is this in any way related to the 64K allocation unit size tha is recommended?

Would you mind shedding some light on this. I have read and googled my heart out and am still very confused. I have lots of people telling me different things, and none of it is giving me a clear picture. Do you know something I could read that would clear this up?

Mindy

November 15, 2006 4:49 PM
 

Paul Randal - MSFT said:

Hi Mindy,

There's a whitepaper you can read on Physical Database Design. The link is http://www.microsoft.com/technet/prodtechnol/sql/2005/physdbstor.mspx - it also has some links to other resources. Let me know if that doesn't help. You can also read the two whitepapers on SQL Server's IO subsystem. Links are http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqlIObasics.mspx for SQL Server 2000 and http://www.microsoft.com/technet/prodtechnol/sql/2005/iobasics.mspx for SQL Server 2005 specifics.

Thanks

November 16, 2006 7:41 AM
 

Cine said:

One thing I am wondering about all the defragmentation talk is that none of the possible ways to "defragment" handle all the types of fragmentation.

alter index reorganize handles pages that are out of order

alter index rebuild rebuilds the index, but it still allocates space where there is available space making it only slighty better than reorganize

dbcc emptyfile doesn't help much either

How can I force it to completly rebuild an index that is 100% continuously allocated on disk?

I have found one way, but it is incredibly annoying to work with: Create a new tablespace, drop index, recreate in new table stable. For clustered, create a new table and move over all data. And then do it all again the other way around when the old tablespace is empty.

November 22, 2006 12:00 AM
 

Paul Randal - MSFT said:

Hi Cine,

That's about the only way to guarantee 100% continuously allocated - create a new filegroup and rebuild the index using CREATE WITH DROP_EXISTING and specifying the new filegroup.

Achieving 100% contiguity is not vital - as long as there are as few fragments as possible compared to the data volume to be read during a scan then the interruption to large-IO readahead will be minimized (and hence scan performance will hardly be affected).

Thanks

December 3, 2006 7:27 PM
 

Microsoft SQL Server Development Customer Advisory Team said:

I received a question from one of our customers about using the NOLOCK hint: can it cause missing rows

February 1, 2007 2:07 PM
 

SSQA- Users & SQL tools said:

On day-to-day basis a DBA might come across with the issues on the fragmentation on the database, it

April 23, 2007 5:43 AM
 

stswordman said:

Thispostwaspublishedtostswordmanat6:09:31PM4/14/2009

tablescanagainstheapwithforward...

April 14, 2009 6:23 AM
Anonymous comments are disabled

About Paul Randal - MSFT

Paul started in the industry in 1994 working for DEC on the VMS file system and check/repair tools. In 1999 he moved to Microsoft to work on SQL Server, specifically on DBCC. For SQL Server 2000, he concentrated on index fragmentation (writing DBCC INDEXDEFRAG and DBCC SHOWCONTIG) plus various algorithms in DBCC CHECKDB. During SQL Server 2005 development Paul was the lead developer/manager of one the core dev teams in the Storage Engine, responsible for data access and storage (DBCC, allocation, indexes & heaps, pages/records, text/LOB storage, snapshot isolation, etc). He also spent several years rewriting DBCC CHECKDB and repair. For SQL Server 2008, Paul managed the Program Management team for the core Storage Engine to become more focused on customer/partner engagement and feature set definition. In 2007, after 8.5 years on the SQL Server team, Paul left Microsoft to join his wife, Kimberly Tripp, running SQLskills.com and pursuing his passion for presenting and consulting. Paul regularly presents at conferences and user groups around the world on high-availability, disaster recovery and Storage Engine internals. His popular blog is at http://www.sqlskills.com/blogs/paul/.

© 2009 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement
Microsoft
Page view tracker