Previously committed rows might be missed if NOLOCK hint is used

Previously committed rows might be missed if NOLOCK hint is used

Rate This
  • Comments 9

I received a question from one of our customers about using the NOLOCK hint: can it cause missing rows in scans even if the rows were committed well before my SELECT with NOLOCK starts?

 

The NOLOCK hint is employed by many users to avoid contention on tables where updates are performed concurrently with selects. The popular belief is that the only side effect of using the NOLOCK hint is that it may cause reading uncommitted rows.  These would be the rows that may be either rolled back or are part of larger concurrent transaction from which some rows may not be retrieved because they had not been written to the pages at the time the NOLOCK SELECT has been scanning them. It is much less obvious that even rows committed a long time before my NOLOCK transaction started might be skipped in the scan. (After I asked several SQL Server experts around me, I found that no one except for the development lead who owns the component in question knew this.) Additionally, a scenario theoretically exists (I don’t have a repro yet; read below) where the same record might be accounted for twice if SELECT reads a table with a NOLOCK hint.

 

This blog contains a simple repro – two scripts that you have to run on two connections concurrently showing double-counting rows. The third script is used to identify cases when the count of rows in the table decreased in the subsequent scan despite the fact we are only inserting rows into the table.

 

I will then show how you can realize the same concurrency benefit in SQL Server 2005 that is provided by NOLOCK while maintaining transactional consistency, and without using the NOLOCK hint. Later, I will explain which table scans might have this problem and which will not (even with NOLOCK), and how you can distinguish the scans by investigating their showplans. Then, I put up a challenge – create a repro that shows double-counting previously committed rows with the NOLOCK hint.

 

First, you should understand that SQL Server is using two kinds of scans for reading data from tables and indexes: allocation scans and range scans. An allocation scan is scanning the pages in physical (allocation) order as they are laid down on the disk. In this situation, no particular logical order of rows is guaranteed. If the optimizer requires rows to be scanned in a certain order, then a range scan is employed. This is usually when there is a predicate on the leading column of an index or if the optimizer requires visiting the rows in the logical order (for example, because of aggregation or ORDER BY in the query).

 

Next, you should know that if a new row is inserted into an index (clustered or not), it might cause page split. This happens if the existing page is full and the new row belongs between two existing rows (because of the key value of the new row is between the key values of the two rows) and it cannot fit physically into the same page. The page is split into two (approximately ½ empty each), and the new row is inserted where it belongs.

 

Here is a simple scenario that shows that the NOLOCK hint might miss some rows when concurrently executed updates (INSERTs in our case) are causing page splits. The INSERTs in script #2 will cause page splits because the table has a clustered key. First, you insert two rows, one with key value 1, and another with key value 10000. Then, you insert rows with key values 2, 9999, 3, 9998, etc. When the first page is full, it is split as described above. However, if the concurrently performed SELECT is using the NOLOCK hint on the same table, and the optimizer had chosen the allocation scan, the scan is moving only forward. Therefore, it might miss the ½ page of rows if the newly allocated page falls behind the scan.

Here are the scripts:

 

Script #1 – This script is trying to count the rows in the table t1. You can run three versions; see the “activate only one…” comment in the script. The first version has no hint, the second has the NOLOCK hint, and the third has both the NOLOCK hint and a BETWEEN predicate. The predicate causes the optimizer to select the range scan instead of the unordered scan in the previous two cases. You start this script and let it loop on the while statement until 10000 rows are inserted into the table t1 by the second script.

 

use test;

go

drop table tcnt;

create table tcnt(num int, cnt int);

drop table t1;

create table t1 (a int primary key, b char(500));

declare @i int, @tcnt1 int, @tcnt2 int;

set @i=1;

set @tcnt1=0;

set @tcnt2=0;

insert into tcnt values (0,0);

while (@tcnt2<10000)

begin

 

-- Activate only one of the following three SELECT statements and compare the behaviors

-- select @tcnt2=count(*) from t1; --- this will perform the correct count

-- select @tcnt2=count(*) from t1 with (NOLOCK); -- this one might skip some previously committed records

--select @tcnt2=count(*) from t1 with (NOLOCK) where a between 0 and 10001; -- this one will work correctly because the predicate is causes the optimizer to select a range scan

if (@tcnt2<>@tcnt1)

begin

insert into tcnt values (@i,@tcnt2);

set @i=@i+1;

set @tcnt1=@tcnt2;

end;

end;

 

Script #2 – You activate this script from another connection against the same database where you ran the first one.

 

use test;

go

declare @imin int, @imax int;

set @imin=1;

set @imax=10000;

 

while (@imax>@imin)

begin

insert into t1 values (@imin,'x');

insert into t1 values (@imax,'y');

set @imin=@imin+1;

set @imax=@imax-1;

end;

 

Script #3 – You run this one to investigate the result after the two previous scripts are finished. It will show all cases when the count(*) value decreased after additional committed inserts were performed. You can observe this only in the second scenario in script #1, when you run the query with NOLOCK hint and without the BETWEEN predicate. If NOLOCK is not used or the NOLOCK hint is used in conjunction with the BETWEEN predicate, the following query returns an empty result:

 

select * from tcnt t1, tcnt t2 where t1.num<t2.num and t1.cnt>t2.cnt;

 

If the statement select @tcnt2=count(*) from t1 with (NOLOCK); is executed in script #1, the result of the above select * is non-empty. The following shows some of many rows I see in my output:

 

num         cnt         num         cnt

----------- ----------- ----------- ----

622         1189        624         1187

623         1191        624         1187

687         1310        690         1309

688         1312        690         1309

689         1313        690         1309

1044        2004        1051        2003

1045        2006        1051        2003

1046        2008        1051        2003

1047        2010        1051        2003

1048        2012        1051        2003

1049        2014        1051        2003

1050        2015        1051        2003

1460        2817        1463        2815

……..

The “num” column is the sequence number of the count(*) against the table where we inserted rows. The “cnt” column is the result of the same count(*). Therefore, one can expect that if my count(*) #622 counted 1189 rows, count #624 must logically see more rows because we are only inserting. However, we see only 1187 rows. This shows that scan #624 must have missed some rows we have encountered in scan #622. An important detail is that the insert transaction of each single row was committed, which is the default behavior.

 

In SQL Server 2005, no-blocking scans can be performed by using alter database <dbname>set READ_COMMITTED_SNAPSHOT ON (for more information, see (http://msdn2.microsoft.com/en-us/library/ms189050.aspx). This is the best way to get rid of the NOLOCK statements, achieve greater concurrency, and return consistent data. But be aware that there can be negative performance implications – for example, the READ_COMMITTED_SNAPSHOT will force using range scans while the NOLOCK hint is using allocation scan. Paul Randal’s blog at http://blogs.msdn.com/sqlserverstorageengine/archive/2006/11/09/when-can-allocation-order-scans-be-used.aspx explains many cases when SQL Server might use allocation scans. However, the blog does not mention cases when the allocation scan is replaced by range scan even if NOLOCK (or TABLOCK or READ UNCOMMITED SNAPSHOT isolation) is used. You can find out if range scan is used with NOLOCK hint by investigating the graphical showplan. There you will see the following Seek Predicate property (at least one of the Start or End Range must be present for the scan to be “range scan”):

Start Range: [test].[dbo].[t1].a >= CONVERT_IMPLICIT(int,[@1],0), End Range: [test].[dbo].[t1].a <= CONVERT_IMPLICIT(int,[@2],0)

Similarly, in the XML showplan, you will find either something similar to StartRange ScanType="GE” or EndRange ScanType="LE" (note the GE is for the >= , and LE is for the <= predicate type, respectively).

 

I’m challenging readers to find similar reproductions showing twice-scanned rows. Please submit your solution here, or submit feedback to http://blogs.msdn.com/sqlcat/contact.aspx. I will reward the first correct solution by sending the author one of the books from the “Inside Microsoft® SQL Server™ 2005” series (http://www.microsoft.com/mspress/findabook/list/series_SH.aspx). 

 

After I finished the above investigation I searched the Web and found Tony Rogerson’s blog with the same theme – skipping and double-counting previously committed rows when NOLOCK hint is used. The blog is located at http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/11/10/1280.aspx. Maybe I should have searched first; but this way, I enjoyed the discovery, and creating the repro. I decided to post my own blog for several reasons - the repro is a bit different, I explain the discrepancy between the range and allocation scans, and I show how you can find out if your NOLOCK is returning unexpected row sets. Most importantly, I think there are still many users using the NOLOCK hint without realizing the potentially serious consequences that both blogs describe.

 

Lubor Kollar

SQL Server Development

Customer Advisory Team

Leave a Comment
  • Please add 2 and 5 and type the answer here:
  • Post
  • PingBack from http://sqlserverforum.org/archives/2007/02/01/previously-committed-rows-might-be-missed-if-nolock-hint-is-used/

  • Hi lubor,we think we have managed to answer your challange  and repro a solution that demonstrates twice scanned rows :

    -- first create tables,index

    Create table DemoPageSplits (a int ,b varchar(6000))

    Create unique clustered index ix_clust on DemoPageSplits(a)

    Create table SaveInsertResults (Counter int,KeyValue int,InsTime datetime)

    -- fill the table

    TRUNCATE TABLE DemoPageSplits

    GO

    set nocount on

    declare @i int ;

    set @i=1;

    while (@I<=10000)

    begin

    INSERT INTO DemoPageSplits VALUES (@I,'X')

    set @i=@i+1;

    end;

    -- RUN SCRIPT 2 and SCRIPT 3 in a sperate spid

    -- make sure you feed inside SCRIPT 3 the coorect spid of SCRIPT 2

    -- SCRIPT 2

    -- update the table and create page splits

    declare @imin int, @imax int;

    set @imin=1;

    set @imax=9999;

    While (@imax>@imin)

    begin

    update DemoPageSplits set b= replicate('x',6000)

    Where a=@imin

    update DemoPageSplits set b= replicate('y',6000)

    where a=@imAX

    set @imin=@imin+1;

    set @imax=@imax-1;

    end;

    --SCRIPT 3

    --replace SPID= with the correct SPID of SCRIPT 2

    -- we make sure each insert has a specific id (counter)

    -- so we can determine if we get duplicate values from

    -- the same INSERT ... SELECT statment

    TRUNCATE TABLE SaveInsertResults

    GO

    DECLARE @cnt INT

    SET @cnt=1

    WHILE EXISTS(SELECT STATUS FROM sys.sysprocesses

    WHERE SPID=(spid of script2 )AND STATUS<>'SLEEPING')

    BEGIN

    INSERT INTO SaveInsertResults

    SELECT @cnt,A,getdate() FROM DemoPageSplits (NOLOCK)

    SET @cnt=@cnt+1

    END

    --SCRIPT 4

    --- check the results

    --  we just get the first occurance of having count more then 10000

    --  create index just to make query run faster

    create clustered index inx_clust on SaveInsertResults(Counter,keyvalue,InsTime)

    -------------------------------------------------------

    SELECT Counter,KeyValue,InsTime,COUNT(*) CountDuplicates FROM SaveInsertResults

    WHERE Counter in (SELECT top 1 Counter FROM SaveInsertResults

     GROUP BY Counter

     HAVING COUNT(*)>10000

                     )

    GROUP BY Counter,KeyValue,InsTime

    HAVING COUNT(*)>1

    partial results :

    Counter     KeyValue    InsTime                 CountDuplicates

    ----------- ----------- ----------------------- ---------------

    24          451         2007-02-12 14:01:23.140 2

    24          562         2007-02-12 14:01:23.140 2

    24          563         2007-02-12 14:01:23.140 2

    24          564         2007-02-12 14:01:23.140 2

    24          565         2007-02-12 14:01:23.140 2

    24          566         2007-02-12 14:01:23.140 2

    24          567         2007-02-12 14:01:23.140 2

    24          568         2007-02-12 14:01:23.140 2

    24          569         2007-02-12 14:01:23.140 2

    24          570         2007-02-12 14:01:23.140 2

    24          571         2007-02-12 14:01:23.140 2

    24          572         2007-02-12 14:01:23.140 2

    24          573         2007-02-12 14:01:23.140 2

    24          574         2007-02-12 14:01:23.140 2

    24          575         2007-02-12 14:01:23.140 2

    24          576         2007-02-12 14:01:23.140 2

    24          577         2007-02-12 14:01:23.140 2

    24          578         2007-02-12 14:01:23.140 2

    24          579         2007-02-12 14:01:23.140 2

    24          580         2007-02-12 14:01:23.140 2

    24          581         2007-02-12 14:01:23.140 2

    24          582         2007-02-12 14:01:23.140 2

    24          583         2007-02-12 14:01:23.140 2

    24          584         2007-02-12 14:01:23.140 2

    24          585         2007-02-12 14:01:23.140 2

    24          586         2007-02-12 14:01:23.140 2

    24          587         2007-02-12 14:01:23.140 2

    ......

    this show duplicate keys created in the destination table becasue of twice scanned rows.

    Uri Munitz &

    Danny Ravid

    Clalit Health Services

  • Uri and Danny - your solution is correct and you will receive a book of your choice (handled separately in private mail).

    After posting my blog I was contactet by Itzik Ben-Gan who pointed me to his article in SQL Server magazine http://www.sqlmag.com/Articles/Index.cfm?ArticleID=92888&DisplayTab=Article where he provided solution to my problem even before I have posted my blog... I would like thank to Itzik for not spoiling this competition and let other try to come up with their own scripts to show the behavior!

  • Over the past month or so, I've looked at pretty much every isolation level except for read uncommitted

  • Isolation level read commited - linii duplicate

  • This post talks about talks about ways to Store Spatial Reference data in SQL in order to improve performance

  • I often get asked about the implications of using nolock query hints or read uncommitted and my response

  • Frequently I see production code, created by professional development teams, peppered with NOLOCK &amp;

  • Just to clarify: you state "might miss some rows if the new page allocated comes after the scan".  How can this be without doing any deletes?  Are the new pages/extents not allocated in file order?  Or do you mean maybe with multiple files in a filegroup?

    Or do you mean:

    a) the page split occurs

    b) half the rows are moved off the page

    c) the concurrent read scan reads the half page

    d) the concurrent read scan reads the rest of the pages, completing its scan

    e) the new page is allocated from an extent

    f) the other half of rows are put on this new page

    Is this the "typical" interleaving of events that leads to this behavior?  Or am I missing something?

    If the above is the case, then the "read duplicate rows" case is where the order of the above events is:

    c (but this time reads the full page) -> a -> b -> e -> f -> d

    right?

    Thanks!

    Steve Ash

Page 1 of 1 (9 items)