Troubleshooting deadlocks in SQL2005

Troubleshooting deadlocks in SQL2005

Rate This
  • Comments 11

Deadlock in the context of SQL Server means two or more transactions or tasks are waiting on each other to acquire the resources needed to complete. A resource can be a logical lock or can be memory grant or can be a worker thread and so on.

 

Normally, when a transaction T1 requests a lock to access a resource, it may get blocked because some other transaction T2 may have already locked the resource in a conflicting mode. In that case, T1 waits for the lock to be released before proceeding. Under normal operations, one T2 releases the lock; it is then acquired by T1. Two blocking scenarios are of interest here. First, what if T2 never releases the lock? In this case, unless a lock time out is specified, T1 will wait indefinitely. SQL Server has no way of knowing that it is indeed the case and it does not do anything. An external intervention is required to break it, typically by killing the first transaction. Second, what if T2 requests a lock on another resource that is already locked by T1 in conflicting mode? Since T1 is waiting for T2, and now T2 is waiting for T1, it results in a deadlock and in this case no amount of waiting will help. SQL Server has implemented the logic to detect such cycles in resource acquisition since the very beginning. A background task, called the deadlock monitor, checks for cycles periodically, typically every 5 seconds but it can vary depending how often the deadlock is detected. When a deadlock is discovered, it breaks the deadlock by choosing a victim that has consumed to the lowest resources so far. The victim’s session will get 1205 error and the transaction will be aborted. SQL Server provides you some control, by means of deadlock priority, which task will be chosen as deadlock victim.

 

When a session gets 1205 error, it does not provide any information on why the deadlock happened, what sessions and resources were involved in the deadlock cycle. Without this information, it is difficult to understand the causes of deadlock and to prevent or minimize it in the future. To get the detailed diagnostic information on the deadlock, it is recommended that you run SQL Server with TF-1222 (new in SQL2005) and provides you more detailed information than the related TF-1204 which also can be used for the similar purpose. Note there is no performance overhead of running SQL Server with these traces flags enabled as these trace flags don’t control deadlock monitor task. However, if your SQL Server encounters deadlocks frequently, you may see some impact as the detailed diagnostic information needs to be generated for every deadlock.

 

Let us consider an example of a simple deadlock between two transactions as shown in the table below. You will notice that T1 holds X lock on all rows with c1=5 on table t_lock1 while T2 holds X lock on all rows with C1=1 on table t_lock2. Now each of these transactions wants to update the rows previously locked by the other. This results in a deadlock.

 

Transaction T1

Transaction T2

begin tran

 

update t_lock1 set c2 = 10

where c1 = 5

 

begin tran

 

update t_lock2 set c2 = 10

where c1 = 1

update t_lock2 set c2 = 10

where c1 = 1

 

update t_lock1 set c2 = 11

where c1 = 5

commit

commit

 

 

 

Here is the output of TF-1222 (with prefix information removed for clarity) that is generated for this deadlock event. This provides lots of useful details (hi-lighted) like the sessions involved in the deadlock, the locks currently held, locks they are blocked on, and the currently executing statements. It shows that there are two sessions (listed under <process-list>) that are blocked on each other and there two resources (listed under <resource-list>) indicating lock owners and waiters. Note, there is no detail when and where the lock was acquired. For that you will need to analyze all the statements in each of the tasks. With this information, you may be able to identify the cause of the deadlock and then make appropriate changes in the application to reduce its occurrence. For the deadlock in the example here, you can possibly eliminate this deadlock in two ways. First, you can execute each statement in its own transaction (i.e. by keeping the transaction short). Second, you can change the order of update statements in one of the transaction such that they access the tables in the same order (i.e. access objects in the same order). However, this may not always be possible for a given application logic. For more details please refer to product documentation.

 

<deadlock-list>

 <deadlock victim="process699978">

  <process-list>

 

   <process id="process698d48" taskpriority="0" logused="216" waitresource="RID: 6:1:143:4" waittime="5027" ownerId="4031" transactionname="user_transaction" lasttranstarted="2005-12-28T17:44:51.830" XDES="0x3dcba00" lockMode="U" schedulerid="1" kpid="892" status="suspended" spid="53" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2005-12-28T17:44:56.147" lastbatchcompleted="2005-12-28T17:44:51.830" clientapp="Microsoft SQL Server Management Studio - Query" hostname="SUNILA2" hostpid="904" loginname="REDMOND\sunila" isolationlevel="read committed (2)" xactid="4031" currentdb="6" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">

    <executionStack>

     <frame procname="adhoc" line="1" stmtstart="38" sqlhandle="0x02000000f998880690371817254885b67f17fbc1b16c6672">

UPDATE [t_lock1] set [c2] = @1  WHERE [c1]=@2     </frame>

     <frame procname="adhoc" line="1" sqlhandle="0x02000000247b290d01bea785a1b6b540fbe5cd7f0c4ff756">

update t_lock1 set c2 = 11 where c1 = 5     </frame>

    </executionStack>

    <inputbuf>

update t_lock1 set c2 = 11 where c1 = 5    </inputbuf>

 

   </process>

   <process id="process699978" taskpriority="0" logused="216" waitresource="RID: 6:1:181:0" waittime="90" ownerId="4021" transactionname="user_transaction" lasttranstarted="2005-12-28T17:44:45.720" XDES="0x3dcade0" lockMode="U" schedulerid="1" kpid="3748" status="suspended" spid="52" sbid="0" ecid="0" priority="0" transcount="2" lastbatchstarted="2005-12-28T17:45:01.080" lastbatchcompleted="2005-12-28T17:44:45.720" clientapp="Microsoft SQL Server Management Studio - Query" hostname="SUNILA2" hostpid="3796" loginname="REDMOND\sunila" isolationlevel="read committed (2)" xactid="4021" currentdb="6" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">

    <executionStack>

     <frame procname="adhoc" line="1" stmtstart="38" sqlhandle="0x02000000f276f80843b1ada37f08a824a0226584ce0e87fa">

UPDATE [t_lock2] set [c2] = @1  WHERE [c1]=@2     </frame>

     <frame procname="adhoc" line="1" sqlhandle="0x02000000a7311f2c650673880275b86837c003b590d4df0c">

update t_lock2 set c2 = 10 where c1 = 1     </frame>

    </executionStack>

    <inputbuf>

update t_lock2 set c2 = 10 where c1 = 1    </inputbuf>

   </process>

 

  </process-list>

 

 

  <resource-list>

 

   <ridlock fileid="1" pageid="143" dbid="6" objectname="general.dbo.t_lock1" id="lock35c6e40" mode="X" associatedObjectId="72057594041139200">

    <owner-list>

     <owner id="process699978" mode="X"/>

    </owner-list>

    <waiter-list>

     <waiter id="process698d48" mode="U" requestType="wait"/>

    </waiter-list>

   </ridlock>

 

   <ridlock fileid="1" pageid="181" dbid="6" objectname="general.dbo.t_lock2" id="lock35c7040" mode="X" associatedObjectId="72057594041270272">

    <owner-list>

     <owner id="process698d48" mode="X"/>

    </owner-list>

    <waiter-list>

     <waiter id="process699978" mode="U" requestType="wait"/>

    </waiter-list>

   </ridlock>

 

  </resource-list>

 </deadlock>

</deadlock-list>

 

You can also get a graphical representation of this deadlock by enabling deadlock_graph event. The best way to troubleshoot deadlocks is to avoid one by designing your application appropriately. We touched upon couple of those in the previous example. Here are some general guidelines:

·        Use short transactions where possible. Longer running transaction that the locks are held longer thereby increasing the likelihood of deadlock.

·        Avoid user interactions within a transaction. User interactions can potentially take a long unbounded amount of time. This can hold up the crucial resources in your application and deny other transactions from proceeding.

·        Try at access resources in the same order. Like we described in the example above, if all transactions access resources in the same order, you will minimize deadlocks.

·        Run your transactions at lower level of isolation, if possible. This minimizes the duration of the locks held. Note, this is only applicable to SH locks only.

Use row versioning based isolation levels (e.g. SI and RCSI), where possible. Queries executing under these isolation levels don’t acquire any locks on the data. Note, queries will still acquire schema stability lock under these isolation levels, just like it is done for queries running under read-uncommitted isolation level and can get blocked if there is a concurrent DDL operation on the objects referenced in the query

Leave a Comment
  • Please add 8 and 2 and type the answer here:
  • Post
  • Hello Sunil,

    Thank you for the nice article.
    How would you recommend to act in situations when input buf for one of the process is empty, like in this example:

    [skipped first process which states sql statement in the input buf]

    process id=process33b0e38 taskpriority=20 logused=189668 waitresource=KEY: 5:72057594080198656 (7800019533e2) waittime=3609 ownerId=237893973 transactionname=ftAutoInterface lasttranstarted=2006-06-08T00:23:47.577 XDES=0x361174e0 lockMode=U schedulerid=15 kpid=20240 status=background spid=17 sbid=0 ecid=0 priority=0 transcount=0
    2006-06-08 00:23:51.36 spid20s         executionStack
    2006-06-08 00:23:51.36 spid20s         inputbuf
    2006-06-08 00:23:51.36 spid20s       resource-list
    2006-06-08 00:23:51.36 spid20s        keylock hobtid=72057594080198656 dbid=5 objectname=FP.sys.fulltext_index_map_70291310 indexname=i1 id=lock3dab9d80 mode=X associatedObjectId=72057594080198656
    2006-06-08 00:23:51.36 spid20s         owner-list
    2006-06-08 00:23:51.36 spid20s          owner id=processfd55b8 mode=X
    2006-06-08 00:23:51.36 spid20s         waiter-list
    2006-06-08 00:23:51.36 spid20s          waiter id=process33b0e38 mode=U requestType=wait
    2006-06-08 00:23:51.36 spid20s        keylock hobtid=72057594080198656 dbid=5 objectname=FP.sys.fulltext_index_map_70291310 indexname=i1 id=lock53155540 mode=X associatedObjectId=72057594080198656
    2006-06-08 00:23:51.36 spid20s         owner-list
    2006-06-08 00:23:51.36 spid20s          owner id=process33b0e38 mode=X
    2006-06-08 00:23:51.36 spid20s         waiter-list
    2006-06-08 00:23:51.36 spid20s          waiter id=processfd55b8 mode=U requestType=wait

    Where do I go from here? Obviously this is related to full-text search engine, but is it a querying or indexing part of it? and I know, it's probably out of the scope of this blog entry, but how would one approach situations where 1 of the processes involved in deadlock is a system process, particulary FT indexing?

    Many thanks, Alex
  • Info from full-text team>
    Deadlock between ft work tasks or between ft work task and user statement is expected in some scenarios. Work task will always be killed and we will retry the task. You can safely ignore this kind of deadlock. Are you seeing server gets stuck?
  • Thank you for the response.
    The problem I have is that the second process involved in deadlock is a replication procedure and when it gets stuck - replication latency gets way to big with undistributed commands queue growing very quickly which leads to even to more issues.

    I have posted this quesion in usergroup http://www.microsoft.com/communities/newsgroups/list/en-us/default.aspx?dg=microsoft.public.sqlserver.fulltext&mid=e8d913a5-4756-44ed-922a-16dc556beec6   and would appreciate a lot if you can point me in the right direction.

    Alex Sibilev
    SQL Server MVP
  • SQL Server 2005 adds new capabilities to troubleshoot deadlocks. I talked about the new trace flag #1222...
  • Hi, Sunil

    So it seems like these two updates involved in the deadlock..they all came from Transaction T2

    update t_lock2 set c2 = 10

    where c1 = 1

    update t_lock1 set c2 = 11

    where c1 = 5

    However, i think it should be this update which came from transaction T1

    update t_lock2 set c2 = 11

    where c1 = 1

    Thank you,

    Bill

  • Can you please send me the deadlock graph with TF-1222 and also the schema of the two tables. Based on the above information, T1 should just cause blocking but NOT deadlock.

  • can I get your script that resulted in deadlock?

    from your 1222 info..

    transaction t2 <process id="process698d48"

    update t_lock1 set c2 = 11 where c1 = 5    </inputbuf>

    transaction t1 <process id="process699978"

    update t_lock2 set c2 = 10 where c1 = 1    </inputbuf>

    However, i think it should be this update which came from transaction T1..

    update t_lock2 set c2 = 11

    where c1 = 1

    and this update came from transaction T2 (which correclty shown)

    update t_lock1 set c2 = 11

    where c1 = 5

    unless your script actually says for transaction t1, the 2nd update stmt..

    update t_lock2 set c2 = 10 where c1 = 1

    Then it'd make sense however your example reflected the wrong script for transaction t1 though.

    I think my script below closely matches your table and returned correct deadlock info..

    CREATE DATABASE deadlocktest

          GO

          USE deadlocktest

          SET NOCOUNT ON

          DBCC TRACEON (1222, -1)

          GO

          IF OBJECT_ID ('t1') IS NOT NULL DROP TABLE t1

          IF OBJECT_ID ('t2') IS NOT NULL DROP TABLE t2

          IF OBJECT_ID ('p1') IS NOT NULL DROP PROC p1

          IF OBJECT_ID ('p2') IS NOT NULL DROP PROC p2

          GO

          CREATE TABLE t1 (c1 int, c2 int)

          GO

          DECLARE @x int

          SET @x = 1

          WHILE (@x <= 10) BEGIN

            INSERT INTO t1 VALUES (@x, @x+1)

            SET @x = @x + 1

          END

          GO

          CREATE TABLE t2 (c1 int, c2 int)

          GO

          DECLARE @x int

          SET @x = 1

          WHILE (@x <= 10) BEGIN

            INSERT INTO t2 VALUES (@x, @x+1)

            SET @x = @x + 1

          END

          GO

    -- transaction1

    begin tran

    update t1 set c2 = 10

    where c1 = 5

    WAITFOR DELAY '00:00:06'

    update t2 set c2 = 11

    where c1 = 1

    --commit

    --transaction2

    begin tran

    update t2 set c2 = 10

    where c1 = 1

    WAITFOR DELAY '00:00:08'

    update t1 set c2 = 11

    where c1 = 5

    --commit

    2007-09-07 03:00:45.98 spid15s     deadlock-list

    2007-09-07 03:00:45.98 spid15s      deadlock victim=process89ab68

    2007-09-07 03:00:45.98 spid15s       process-list

    2007-09-07 03:00:45.98 spid15s        process id=process89a988 taskpriority=0 logused=216 waitresource=RID: 10:1:171:0 waittime=11234 ownerId=490528 transactionname=user_transaction lasttranstarted=2007-09-07T03:00:28.687 XDES=0x5af6d48 lockMode=U schedulerid=1 kpid=1508 status=suspended spid=53 sbid=0 ecid=0 priority=0 transcount=2 lastbatchstarted=2007-09-07T03:00:28.590 lastbatchcompleted=2007-09-07T03:00:19.607 clientapp=Microsoft SQL Server Management Studio - Query hostname=host-06 hostpid=2876 loginname=host\bill isolationlevel=read committed (2) xactid=490528 currentdb=10 lockTimeout=4294967295 clientoption1=671090784 clientoption2=390200

    2007-09-07 03:00:45.98 spid15s         executionStack

    2007-09-07 03:00:45.98 spid15s          frame procname=adhoc line=9 stmtstart=38 sqlhandle=0x0200000032d6641a604886e839c75eb498d42358c5c6911a

    2007-09-07 03:00:45.98 spid15s     UPDATE [t2] set [c2] = @1  WHERE [c1]=@2    

    2007-09-07 03:00:45.98 spid15s          frame procname=adhoc line=9 stmtstart=168 sqlhandle=0x020000005e0c7e1c8cddc9f43edb54fd4f262e857aea4175

    2007-09-07 03:00:45.98 spid15s     update t2 set c2 = 11

    2007-09-07 03:00:45.98 spid15s     where c1 = 1    

    2007-09-07 03:00:45.98 spid15s         inputbuf

    2007-09-07 03:00:45.98 spid15s     begin tran

    2007-09-07 03:00:45.98 spid15s     update t1 set c2 = 10

    2007-09-07 03:00:45.98 spid15s     where c1 = 5

    2007-09-07 03:00:45.98 spid15s     WAITFOR DELAY '00:00:06'

    2007-09-07 03:00:45.98 spid15s     update t2 set c2 = 11

    2007-09-07 03:00:45.98 spid15s     where c1 = 1

    2007-09-07 03:00:45.98 spid15s        process id=process89ab68 taskpriority=0 logused=216 waitresource=RID: 10:1:169:4 waittime=4578 ownerId=490536 transactionname=user_transaction lasttranstarted=2007-09-07T03:00:33.390 XDES=0x5af72e8 lockMode=U schedulerid=1 kpid=5528 status=suspended spid=55 sbid=0 ecid=0 priority=0 transcount=2 lastbatchstarted=2007-09-07T03:00:33.390 lastbatchcompleted=2007-09-07T03:00:22.437 clientapp=Microsoft SQL Server Management Studio - Query hostname=host-06 hostpid=2876 loginname=host\bill isolationlevel=read committed (2) xactid=490536 currentdb=10 lockTimeout=4294967295 clientoption1=671090784 clientoption2=390200

    2007-09-07 03:00:45.98 spid15s         executionStack

    2007-09-07 03:00:45.98 spid15s          frame procname=adhoc line=9 stmtstart=38 sqlhandle=0x02000000c78ec102da3fe6a72a422a4f5672cacbea7eb738

    2007-09-07 03:00:45.98 spid15s     UPDATE [t1] set [c2] = @1  WHERE [c1]=@2    

    2007-09-07 03:00:45.98 spid15s          frame procname=adhoc line=9 stmtstart=168 sqlhandle=0x020000000493c6023c8fe018ebf64eb439f353703a6c340b

    2007-09-07 03:00:45.98 spid15s     update t1 set c2 = 11

    2007-09-07 03:00:45.98 spid15s     where c1 = 5    

    2007-09-07 03:00:45.98 spid15s         inputbuf

    2007-09-07 03:00:45.98 spid15s     begin tran

    2007-09-07 03:00:45.98 spid15s     update t2 set c2 = 10

    2007-09-07 03:00:45.98 spid15s     where c1 = 1

    2007-09-07 03:00:45.98 spid15s     WAITFOR DELAY '00:00:08'

    2007-09-07 03:00:45.98 spid15s     update t1 set c2 = 11

    2007-09-07 03:00:45.98 spid15s     where c1 = 5

    2007-09-07 03:00:45.98 spid15s       resource-list

    2007-09-07 03:00:45.98 spid15s        ridlock fileid=1 pageid=171 dbid=10 objectname=deadlocktest.dbo.t2 id=lock3ca4480 mode=X associatedObjectId=72057594038583296

    2007-09-07 03:00:45.98 spid15s         owner-list

    2007-09-07 03:00:45.98 spid15s          owner id=process89ab68 mode=X

    2007-09-07 03:00:45.98 spid15s         waiter-list

    2007-09-07 03:00:45.98 spid15s          waiter id=process89a988 mode=U requestType=wait

    2007-09-07 03:00:45.98 spid15s        ridlock fileid=1 pageid=169 dbid=10 objectname=deadlocktest.dbo.t1 id=lock3ca4680 mode=X associatedObjectId=72057594038517760

    2007-09-07 03:00:45.98 spid15s         owner-list

    2007-09-07 03:00:45.98 spid15s          owner id=process89a988 mode=X

    2007-09-07 03:00:45.98 spid15s         waiter-list

    2007-09-07 03:00:45.98 spid15s          waiter id=process89ab68 mode=U requestType=wait

  • yes, you are correct. I updated it. Thanks for catching it. I must have been trying different things and pasted the wrong TF output from errorlog

  • Have you ever seen a situation when to processes block on each other, but the resolver does not identify the deadlock?  I have that at a customer site.  Two SPIDs pointing to each other.  Quite static, but I had to kill one manually.  This happens every few days.  Does that ring a bell?

    Thanks

    -- Ran

  • Is there any function on SQL which will break the dead lock situation or any procedure available which will check for any deadlock periodically and break such situation.

    Thanks

  • How we can findout that in which Port No, SQL serevr is running??

Page 1 of 1 (11 items)