Deadlock Troubleshooting, Part 1


A deadlock is a circular blocking chain, where two or more threads are each blocked by another so that none of them can proceed.  When the deadlock monitor thread in SQL Server detects a circular blocking chain, it selects one of the participants as a victim, cancels that spid’s current batch, and rolls back its transaction in order to let the other spids continue with their work.  The deadlock victim will get a 1205 error:

 

Transaction (Process ID 52) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

 

A deadlock is a special type of blocking scenario, but blocking and deadlocking are not the same thing.  Sometimes we have people report that they are experiencing "deadlocking" when they are really only seeing blocking.
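If you're not sure which of the two you're looking at, a quick check of who is blocking whom can help. The query below is only a sketch, and it assumes SQL 2005 or later (it relies on the sys.dm_exec_requests DMV). Plain blocking shows up here as rows whose wait_time keeps growing and no 1205 error is ever raised; a true deadlock is normally broken by the deadlock monitor within a few seconds, and one of the sessions receives error 1205.

```sql
-- List requests that are currently blocked, and the session blocking each one.
SELECT r.session_id,
       r.blocking_session_id,   -- spid that holds the conflicting lock
       r.wait_type,
       r.wait_time,             -- milliseconds spent waiting so far
       r.wait_resource,
       t.text AS current_batch
FROM sys.dm_exec_requests AS r
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```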

 

With very few exceptions, deadlocks are a natural side effect of blocking, not a SQL Server bug.  The typical deadlock solution is either a stored proc/app code tweak, or a schema/indexing change. 

 

Here’s how to troubleshoot deadlocks.  These steps apply to most deadlocks, and they’ll allow you to resolve many of them without even having to dig into query plans or other nitty gritty details.  What’s that?  You like digging into query plans, and have nitty grits for breakfast every morning?  OK then, we’ll look at a deadlock scenario from the inside out a bit later.  But first, here are the basics:

 

  1. Turn on trace flag 1222 with “DBCC TRACEON (1222, -1)” or by adding “-T1222” as a SQL startup parameter.  This trace flag is new in SQL 2005, and is a much improved version of the tried-and-true -T1204.  If you’re running SQL 2005, you should be using 1222 instead of 1204 unless you have deep-seated masochistic tendencies. Alternatives to 1222:
    • If you are using SQL 2000 or SQL 7.0, you’ll have no choice but to fall back on the older -T1204. 
    • There’s a “Deadlock graph” Profiler trace event that provides the same info as -T1222.  Feel free to use this instead of -T1222 if you’re on SQL 2005.  But don’t waste your time with the “Lock:Deadlock” and “Lock:Deadlock Chain” trace events that are in SQL 2000, as they provide an unacceptably incomplete picture of the deadlock. 
  2. Get the -T1222 output from the SQL errorlog after the deadlock has occurred.  You’ll see output that looks like this:

deadlock-list

 deadlock victim=processdceda8

  process-list

   process id=processdceda8 taskpriority=0 logused=0 waitresource=KEY: 2:72057594051493888 (0400a4427a09) waittime=5000 ownerId=24008914 transactionname=SELECT lasttranstarted=2006-09-08T15:54:22.327 XDES=0x8fd9a848 lockMode=S schedulerid=1 kpid=4404 status=suspended spid=54 sbid=0 ecid=0 priority=0 transcount=0 lastbatchstarted=2006-09-08T15:54:22.293 lastbatchcompleted=2006-09-08T15:54:22.293 clientapp=OSQL-32 hostname=BARTD2 hostpid=3408 loginname=bartd isolationlevel=read committed (2) xactid=24008914 currentdb=2 lockTimeout=4294967295 clientoption1=538968096 clientoption2=128056

    executionStack

     frame procname=tempdb.dbo.p1 line=2 stmtstart=60 sqlhandle=0x03000200268be70bd

       SELECT c2, c3 FROM t1 WHERE c2 = @p1    

     frame procname=adhoc line=2 stmtstart=32 stmtend=52 sqlhandle=0x020000008a4df52d3

       EXEC p1 3    

    inputbuf

       EXEC p1 3

   process id=process3c54c58 taskpriority=0 logused=16952 waitresource=KEY: 2:72057594051559424 (0900fefcd2fe) waittime=5000 ownerId=24008903 transactionname=UPDATE lasttranstarted=2006-09-08T15:54:22.327 XDES=0x802ecdd0 lockMode=X schedulerid=2 kpid=4420 status=suspended spid=55 sbid=0 ecid=0 priority=0 transcount=2 lastbatchstarted=2006-09-08T15:54:22.327 lastbatchcompleted=2006-09-08T15:54:22.310 clientapp=OSQL-32 hostname=BARTD2 hostpid=2728 loginname=bartd isolationlevel=read committed (2) xactid=24008903 currentdb=2 lockTimeout=4294967295 clientoption1=538968096 clientoption2=128056

    executionStack

     frame procname=tempdb.dbo.p2 line=2 stmtstart=58 sqlhandle=0x030002005fafdb0c

       UPDATE t1 SET c1 = FLOOR (c1), c2 = FLOOR (c2) WHERE c1 = @p1    

     frame procname=adhoc line=2 stmtstart=32 stmtend=52 sqlhandle=0x020000006f878816

       EXEC p2 3    

    inputbuf

       EXEC p2 3

  resource-list

   keylock hobtid=72057594051559424 dbid=2 objectname=tempdb.dbo.t1 indexname=idx1 id=lock83642a00 mode=S associatedObjectId=72057594051559424

    owner-list

     owner id=processdceda8 mode=S

    waiter-list

     waiter id=process3c54c58 mode=X requestType=wait

   keylock hobtid=72057594051493888 dbid=2 objectname=tempdb.dbo.t1 indexname=cidx id=lock83643780 mode=X associatedObjectId=72057594051493888

    owner-list

     owner id=process3c54c58 mode=X

    waiter-list

     waiter id=processdceda8 mode=S requestType=wait

 

  3. “Decode” the -T1222 output to better understand the deadlock scenario.  The deadlock is summarized by a “process-list” and a “resource-list”.  A “process” is a spid or worker thread that participates in the deadlock.  Each process is assigned an identifier, like “processdceda8”.  A “resource” is something (usually a lock) that one of the participants owns and that the other participant is waiting on.  I like to use a format like the one below to summarize the deadlock.  You can skip this step if you want, but I never do; I find it really helps me understand the deadlock situation more clearly.  I’ve highlighted in yellow each of the data points within the 1222 output that you would need to reconstruct this summary on your own.

    Spid 54 is running this query (line 2 of proc [p1]):
        SELECT c2, c3 FROM t1 WHERE c2 = @p1
    Spid 55 is running this query (line 2 of proc [p2]):
        UPDATE t1 SET c1 = FLOOR (c1), c2 = FLOOR (c2) WHERE c1 = @p1

    Spid 54 is waiting for a Shared KEY lock on index t1.cidx.
        (Spid 55 holds a conflicting X lock.)
    Spid 55 is waiting for an eXclusive KEY lock on index t1.idx1.
        (Spid 54 holds a conflicting S lock.)

    For most lock types (including KEY locks, as shown in this example), SQL will directly identify the index by name in the output.  For some lock types, though, you'll get an "associatedObjectId", but no object name.  An example: 

          pagelock fileid=1 pageid=95516 dbid=9 objectname="" id=lock177a9e280 mode=IX associatedObjectId=72057596554838016

    The attribute "associatedObjectId" isn't the type of Object ID that you're probably familiar with; it's actually a partition ID.  You can determine the database name by running "SELECT DB_NAME(9)", where the "9" in this example comes from the "dbid" attribute, highlighted in blue.  Then you can determine the index and table name by looking up the associatedObjectId/PartitionId in the indicated database: 

         SELECT OBJECT_NAME(i.object_id), i.name
         FROM sys.partitions AS p
         INNER JOIN sys.indexes AS i ON i.object_id = p.object_id AND i.index_id = p.index_id
         WHERE p.partition_id = 72057596554838016 

    For those of you on SQL 2005 who think that the -T1222 output is a bit overwhelming, you're right.  But you may also want to count your blessings and be thankful that you don’t have to wade through -T1204 output, which is a lot more difficult to interpret than -T1222 and doesn’t provide nearly as much useful information about the deadlock.  Check out the file "Decoding_T1204_Output.htm" attached to this post for annotated -T1204 output.
  4. Run the queries involved in the deadlock through Database Tuning Advisor.  Plop the query in a Management Studio query window, change db context to the correct database, right-click the query text and select “Analyze Query in DTA”.  Don’t skip this step; more than half of the deadlock issues we see are resolved simply by adding an appropriate index so that one of the queries runs more quickly and with a smaller lock footprint.  If DTA recommends indexes (it'll say “Estimated Improvement: <some non-zero>%”), create them and monitor to see if the deadlock persists.  You can select “Apply Recommendations” from the Action drop-down menu to create the index immediately, or save the CREATE INDEX commands as a script to create them during a maintenance window.  Be sure to tune each of the queries separately. 
  5. Make sure the query is using the minimum necessary transaction isolation level (-T1222 will tell you this – search the output for “isolationlevel”).  Queries run by transactional COM+ components will default to serializable, which is usually overkill.  This can be reduced by query hints (“...FROM tbl1 WITH (READCOMMITTED)...”), a SET TRANSACTION ISOLATION LEVEL command, or, in Windows 2003 and later, by configuring the object in the Component Services MMC plugin.
  6. Make sure that your transactions are as brief as they can be while still meeting the relevant business constraints.  Try not to use implicit transactions, as this model of transaction management encourages unnecessarily long transactions. 
  7. Look for other opportunities to improve the efficiency of the queries involved in the deadlock, either through query changes or through indexing improvements.  A query that locks the minimum number of resources will be much less likely to deadlock with another query.  Table scans, index scans, and large hashes or large sorts in the query plan may indicate opportunities for improvement.
  8. If one or both spids are running a multi-statement transaction, you may need to capture a Profiler trace that spans the deadlock in order to identify the full set of queries that were involved in the deadlock.  Unfortunately, both -T1204 and -T1222 only print out the two queries that “closed the loop”, and it’s possible that one of the blocking locks was acquired by an earlier query run within the same transaction.
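For reference, here is roughly what the trace flag and isolation level steps look like when typed into a query window. Treat it as a sketch rather than a script to paste into production: it assumes you have sysadmin rights, and that READ COMMITTED is actually an acceptable isolation level for the transaction in question.

```sql
-- Enable trace flag 1222 for all sessions (-1 = global).
-- This lasts only until the next restart; add -T1222 as a startup
-- parameter if the flag needs to survive a restart.
DBCC TRACEON (1222, -1);

-- Confirm that the flag is active globally.
DBCC TRACESTATUS (1222, -1);

-- Lower the isolation level for the current session
-- (for example, when a component defaulted to SERIALIZABLE).
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Check which isolation level the session is actually using;
-- look for the "isolation level" row in the result.
DBCC USEROPTIONS;

-- Turn the trace flag off again once the deadlock output has been captured.
DBCC TRACEOFF (1222, -1);
```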

These are all general recommendations that you can apply to any deadlock without having to really roll up your sleeves and get dirty.  If after doing all of this you haven’t resolved it, though, you’ll have to dive a bit deeper and tailor a solution to the specifics of the scenario.  Here’s a menu of some common techniques that you can choose from when deciding how best to tackle a deadlock:

 

  • Access objects in the same order.   Consider the following two batches:

    Batch 1                        Batch 2
    1. Begin Transaction           1. Begin Transaction
    2. Update Part table           2. Update Supplier table
    3. Update Supplier table       3. Update Part table
    4. Commit Transaction          4. Commit Transaction

These two batches may deadlock frequently.  If both are about to execute step 3, they may each end up blocked by the other because they both need access to a resource that the other connection locked in step 2. 
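A sketch of the fix, with the second batch reordered so that both batches touch Part first and Supplier second. The column names and key values here are hypothetical; only the table names come from the example above.

```sql
-- Hypothetical schema: dbo.Part(PartID, Price), dbo.Supplier(SupplierID, Rating).

-- Batch 1: unchanged. Updates Part, then Supplier.
BEGIN TRANSACTION;
    UPDATE dbo.Part     SET Price  = Price * 1.05 WHERE PartID = 10;
    UPDATE dbo.Supplier SET Rating = 5            WHERE SupplierID = 3;
COMMIT TRANSACTION;

-- Batch 2: originally updated Supplier first. Reordered so that it
-- also updates Part first, then Supplier.
BEGIN TRANSACTION;
    UPDATE dbo.Part     SET Price  = Price * 0.95 WHERE PartID = 10;
    UPDATE dbo.Supplier SET Rating = 4            WHERE SupplierID = 3;
COMMIT TRANSACTION;
```

With a consistent order, whichever batch reaches the Part row second simply waits at its first UPDATE until the other batch commits. That's ordinary blocking, with no circular chain.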

  • If both deadlock participants are using the same index, consider adding an index that can provide an alternate access path to one of the spids.  For example, adding a covering nonclustered index for a SELECT involved in a deadlock may prevent the problem (assuming that none of the covering index keys are modified by the other deadlock participant).
  • On the other hand, if the spids are deadlocking because they took alternate paths (indexes) to a common required data row or page, consider whether one of the indexes can be removed or an index hint used to force both queries to share an access path.  Be cautious of potential performance hits as a result of this approach.
  • Deadlocks are a special type of blocking where two spids both end up blocking the other.  Sometimes the best way to prevent a deadlock is to force the blocking to occur at an earlier point in one of the two transactions.  For example, if you force spid A to be blocked by spid B at the very beginning of A’s transaction, it may not have a chance to acquire the lock resource that later ends up blocking spid B.  Doesn’t this mean you are deliberately causing blocking?  Yes, but remember that you already have blocking or you wouldn’t be in a deadlock situation, and simple blocking is a big improvement over a deadlock.  As soon as B commits his transaction, A will be able to proceed.  HOLDLOCK and UPDLOCK hints can be useful for this.
  • If a high priority process is being selected as a victim in a deadlock with a lower priority process, the lower priority process could be modified to SET DEADLOCK_PRIORITY LOW.  Spids that set this will offer themselves up as the sacrificial lamb in any deadlock they encounter. 
  • Avoid placing clustered indexes on columns that are frequently updated. Updates to clustered index key columns will require locks on the clustered index (to move the row) and all nonclustered indexes (since the leaf level of NC indexes references rows by clustered index key value). 
  • In some cases it may be appropriate to add a NOLOCK hint, assuming that one of the queries is a SELECT statement.  While this is a tempting path because it is a quick and easy solution for many deadlocks, approach it with caution as it carries with it all the usual caveats surrounding read uncommitted isolation level (a query could return a transactionally inconsistent view of the data).  If you are unfamiliar with the risks, read the "SET TRANSACTION ISOLATION LEVEL" topic in SQL Books Online. 
  • In SQL 2005 you could consider the new SNAPSHOT isolation level.  This will avoid most blocking while avoiding the risks of NOLOCK.  An even cooler new feature IMHO is the new READ COMMITTED SNAPSHOT database option (see ALTER DATABASE), which allows you to use a variant of snapshot isolation level without changing your app.  
  • If one or both locks involved in the deadlock are S/X TAB (table) locks, lock escalation may be involved.  You can reduce the likelihood of lock escalation by enabling trace flag 1224 (SQL 2005 and later) or 1211 (see KB 323630).  Note that this does not apply to "intent" TAB locks, which have a capital "I" prefix (e.g. IS / IX TAB locks).
  • If the deadlock is intermittent, sometimes the simplest solution is to add deadlock retry logic. The retry logic could be in T-SQL, as long as (a) you're on SQL 2005 or later so that you can use BEGIN TRY, and (b) your transaction is wholly contained within a single stored proc or batch. See this article for details. If the deadlock transaction spans multiple batches you can still add deadlock retry logic, but it would need to be moved out to the client app code. If you can only add deadlock retry logic to one of the participants in the deadlock, you can use SET DEADLOCK_PRIORITY LOW to ensure that the engine preferentially aborts the transaction of the guy that has the retry logic.
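To make the last item a bit more concrete, here is a minimal sketch of T-SQL retry logic. It is not the code from the article mentioned above; it assumes SQL 2005 or later, a transaction that fits in a single batch, and a placeholder UPDATE against the t1 table from the earlier example. The retry count and the one-second delay are arbitrary choices.

```sql
-- Volunteer this session as the victim, since it knows how to retry.
SET DEADLOCK_PRIORITY LOW;

DECLARE @retries INT;
DECLARE @msg NVARCHAR(2048);
SET @retries = 3;   -- maximum number of attempts (arbitrary)

WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

        -- Placeholder work; replace with the real statements.
        UPDATE dbo.t1 SET c2 = c2 + 1 WHERE c1 = 3;

        COMMIT TRANSACTION;
        SET @retries = 0;                 -- success: leave the loop
    END TRY
    BEGIN CATCH
        -- Clean up whatever is left of the transaction.
        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;

        IF ERROR_NUMBER() = 1205 AND @retries > 1
        BEGIN
            -- Chosen as deadlock victim: pause briefly, then try again.
            SET @retries = @retries - 1;
            WAITFOR DELAY '00:00:01';
        END
        ELSE
        BEGIN
            -- Some other error, or out of attempts: report it and stop.
            SET @msg = ERROR_MESSAGE();
            RAISERROR('%s', 16, 1, @msg);
            SET @retries = 0;
        END
    END CATCH
END
```

Note that the whole transaction is rerun on each attempt, which is why this only works when the transaction is wholly contained in the batch.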

In a follow-up post I’ll look at a fairly typical deadlock in detail.  This will provide an example of what you'd have to do if the 8 high-level steps listed above fail you, forcing you to understand the scenario at a deeper level so that you can craft a custom solution.  

  

(This post series is continued in Deadlock Troubleshooting, Part 2.)

  

Attachment: SQL2000_Deadlocks_T1204.htm
Leave a Comment
  • Great Article, Can you recommend me some tips to solve a deadlock problem?
    I have a Master Stored Procedure, which internally calls other procedures as needed. This master proc is invoked from a Web Application (ASPX). Sporadically I get error messages
    Error Detail: System.Data.SqlClient.SqlException: Transaction (Process ID 60) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
      at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream)
      at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()

    1. How do I detect where the problem is... meaning, which section of the code is causing the deadlock?
    2. How should I rectify them?

    ? Should I wrap the master proc with begin/end tran? (Since it's a very frequently called proc, I left it out initially.)
    ? Should I wrap my internal procs with tran blocks?

    TIA
  • > 1. How do detect where the problem is...means which section of the code is causing deadlock?

    This is discussed in steps 1-3 in the post.  To recap that info: If you're on SQL 2005, turn on -T1222.  This will tell you the final 2 statements involved in the deadlock.  If you're on SQL 2000, turn on -T1204 and -T3605 and capture a profiler trace that includes the SP:StmtStarting, Lock:Deadlock, and Exception events (at a minimum).  


    > 2. How should I rectify them?
    Once you identify the queries involved in the deadlock, follow steps 4-8 in the post.  
  • Hi Bart,

    In the decoding process, you identify one as a conflicting Update lock, another one you just call an Update lock. What's the difference between a conflicting lock and an ordinary lock? Does the trace flag 1222 deadlock graph provide the detail so we would know which one is a conflicting lock and which is not? Does it make a difference whether a lock is conflicting or not?

    Thanks a lot,

    Roger

  • Roger,

    By "conflicting" I simply meant that the existing lock was incompatible with the new lock request.  "Conflicting lock" == blocking lock.  That's just my word choice, not a technical term.  If two lock requests are compatible (e.g. two shared locks), they will both be granted.  If the two requests conflict, one will be blocked and the other granted.

    The 1222 output does identify which lock requests have been granted (<owner-list>) and which are blocked (<waiter-list>).

    Bart

  • Thanks so much for the info... It helped me tremendously, especially when I was trying to repeat the problem. FYI: my problem was related to indexes. We have a legacy app with stored procedures updating the same row for different purposes, and one of the columns being updated also has a non-clustered index.  I open 2 Query Analyzer windows and put while 1=1 to run 2 SPs and then Boom, I get the error in either a few seconds or 20 seconds. The only time I don't get an error is when I remove all the indexes, which is not an option. Btw, this table doesn't have a primary key... I know I know.... It wasn't me who created it and that person is no longer with the company... :) But I have to fix it.

  • Check the Companion tool at www.sqlminds.com  It will do all of the above steps for you and more. For example, if you have 3 or 4 SIDs deadlocking and each SID has multiple statements per DB transaction, the above approach will fail since it will report ONLY the statements which deadlocked:

    SID1 - begin tran update t1 ... where PK = 1

    SID2 - begin tran update t1 ... where PK = 2

    SID3 - begin tran update t1 ... where PK = 3

    then

    SID1 - select * from t1 where PK = 3

    SID2 - select * from t1 where PK = 1

    SID3 - select * from t1 where PK = 2

    This is where the deadlock monitor (spid=4) will kick in and guess what, you'll be getting the last three statements in the output

    SID1 - select * from t1 where PK = 3

    SID2 - select * from t1 where PK = 1

    SID3 - select * from t1 where PK = 2

    I don't think you can figure out the deadlock given only these three statements.  You can do some tedious digging into the outstanding locks and figure out the deadlock but this can be done with the assumption that you know intimately your statements (i.e. what if you are an ASP - App service provider...).  Check out the tool I've mentioned; it will give you the blocking chain PLUS the timing.  HTH

  • The tool you describe sounds pretty cool.  

    FWIW, step 8 in the instructions above mentions that a profiler trace may be necessary if one or more of the deadlock participants are involved in a multi-batch transaction.  

  • Is it possible to cause a deadlock with 2 "select" statements?

    Thanks a lot for any help!

  • Possibly.  One such case would be if the SELECT statements used a hint to change the type of locks being acquired (e.g. UPDLOCK, XLOCK).  You could also see this if the SELECT statements were part of a multi-statement transaction.  For example, these two transactions could deadlock on the SELECT statements:

      Connection 1:
         begin tran
         update t1 set ... where c1 = x
         select * from t1 where c1 = y

      Connection 2:
         begin tran
         update t1 set ... where c1 = y
         select * from t1 where c1 = x

    Troubleshoot these just as you would any other deadlock.

  • In the text above, on point #2 you have highlighted the TEXT with YELLOW back-ground, which is really cool.

    Will this feature (text with Yellow back-ground) be available out-of-box, if it can be, then it will be really cool

  • Prasanna,

    No, the SQL errorlog is just a plain text file; the yellow text highlighting is my emphasis.  I did it to call out some of the data points in the 1222 output that can be the most useful when trying to understand a deadlock.  You'll have to locate these data points in your own -T1222 output yourself.  

    Bart

  • Hi Bart thanks for your post, very helpful.  I have read through your Decoding_T1204_Output.htm and have a small question.  

    You:

    Spid 52 is running a DELETE statement on line 6 of the stored proc spClearItemStatus.  He holds an X lock on the key resource KEY: 7:2121058592:2 (a70064fb1eac).  This lock is blocking spid 52, who is waiting to acquire a U lock on the same key.

    Q: Should it say: This lock is blocking spid 51, instead?

    You:

    Spid 51 is running an UPDATE statement on line 47 of the stored proc spUpdateItemProp.  He holds an X lock on key KEY: 7:1977058079:1 (02014f0bec4e).  His X lock is blocking spid 51, who is waiting to acquire an X lock on the same key.  

    Q: Should it say: This lock is blocking spid 52, instead?

    Regards,

    Dmitrey

  • Dmitrey,

    You're absolutely right -- those were errors.  I've fixed them in the HTM file attached to the post.

    Thanks!

    Bart

  • Hi, Bart

    Which profiler event(s) should I capture? I use Deadlock Graph and Blocked Process Report, are there others?

    Thank you,

    Bill

    If one or both spids is running a multi-statement transaction, you may need to capture a profiler trace that spans the deadlock in order to identify the full set of queries that were involved in the deadlock.  Unfortunately, both -T1204 and -T1222 only print out the two queries that “closed the loop”, and it’s possible that one of the blocking locks was acquired by an earlier query run within the same transaction.

  • Hi,

    Which profiler events capture such scenario?

    If one or both spids is running a multi-statement transaction, you may need to capture a profiler trace that spans the deadlock in order to identify the full set of queries that were involved in the deadlock.  Unfortunately, both -T1204 and -T1222 only print out the two queries that “closed the loop”, and it’s possible that one of the blocking locks was acquired by an earlier query run within the same transaction.

    Bill
