Deadlock Troubleshooting, Part 1


A deadlock is a circular blocking chain, where two or more threads are each blocked by the other so that none of them can proceed.  When the deadlock monitor thread in SQL Server detects a circular blocking chain, it selects one of the participants as a victim, cancels that spid’s current batch, and rolls back its transaction in order to let the other spids continue with their work.  The deadlock victim will get a 1205 error:

 

Transaction (Process ID 52) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

 

A deadlock is a special type of blocking scenario, but blocking and deadlocking are not the same thing.  Sometimes we have people report that they are experiencing "deadlocking" when they are really only seeing blocking.

 

With very few exceptions, deadlocks are a natural side effect of blocking, not a SQL Server bug.  The typical deadlock solution is either a stored proc/app code tweak, or a schema/indexing change. 

 

Here’s how to troubleshoot deadlocks.  These steps apply to most deadlocks, and they’ll allow you to resolve many of them without even having to dig into query plans or other nitty gritty details.  What’s that?  You like digging into query plans, and have nitty grits for breakfast every morning?  OK then, we’ll look at a deadlock scenario from the inside out a bit later.  But first, here are the basics:

 

  1. Turn on trace flag 1222 with “DBCC TRACEON (1222, -1)” or by adding “-T1222” as a SQL startup parameter.  This trace flag is new in SQL 2005 and is a much improved version of the tried-and-true -T1204.  If you’re running SQL 2005, you should be using 1222 instead of 1204 unless you have deep-seated masochistic tendencies. Alternatives to 1222:
    • If you are using SQL 2000 or SQL 7.0, you’ll have no choice but to fall back on the older -T1204. 
    • There’s a “Deadlock graph” Profiler trace event that provides the same info as -T1222.  Feel free to use this instead of -T1222 if you’re on SQL 2005.  But don’t waste your time with the “Lock:Deadlock” and “Lock:Deadlock Chain” trace events that are in SQL 2000, as they provide an unacceptably incomplete picture of the deadlock. 
  2. Get the -T1222 output from the SQL errorlog after the deadlock has occurred.  You’ll see output that looks like this:

deadlock-list

 deadlock victim=processdceda8

  process-list

   process id=processdceda8 taskpriority=0 logused=0 waitresource=KEY: 2:72057594051493888 (0400a4427a09) waittime=5000 ownerId=24008914 transactionname=SELECT lasttranstarted=2006-09-08T15:54:22.327 XDES=0x8fd9a848 lockMode=S schedulerid=1 kpid=4404 status=suspended spid=54 sbid=0 ecid=0 priority=0 transcount=0 lastbatchstarted=2006-09-08T15:54:22.293 lastbatchcompleted=2006-09-08T15:54:22.293 clientapp=OSQL-32 hostname=BARTD2 hostpid=3408 loginname=bartd isolationlevel=read committed (2) xactid=24008914 currentdb=2 lockTimeout=4294967295 clientoption1=538968096 clientoption2=128056

    executionStack

     frame procname=tempdb.dbo.p1 line=2 stmtstart=60 sqlhandle=0x03000200268be70bd

       SELECT c2, c3 FROM t1 WHERE c2 = @p1    

     frame procname=adhoc line=2 stmtstart=32 stmtend=52 sqlhandle=0x020000008a4df52d3

       EXEC p1 3    

    inputbuf

       EXEC p1 3

   process id=process3c54c58 taskpriority=0 logused=16952 waitresource=KEY: 2:72057594051559424 (0900fefcd2fe) waittime=5000 ownerId=24008903 transactionname=UPDATE lasttranstarted=2006-09-08T15:54:22.327 XDES=0x802ecdd0 lockMode=X schedulerid=2 kpid=4420 status=suspended spid=55 sbid=0 ecid=0 priority=0 transcount=2 lastbatchstarted=2006-09-08T15:54:22.327 lastbatchcompleted=2006-09-08T15:54:22.310 clientapp=OSQL-32 hostname=BARTD2 hostpid=2728 loginname=bartd isolationlevel=read committed (2) xactid=24008903 currentdb=2 lockTimeout=4294967295 clientoption1=538968096 clientoption2=128056

    executionStack

     frame procname=tempdb.dbo.p2 line=2 stmtstart=58 sqlhandle=0x030002005fafdb0c

       UPDATE t1 SET c1 = FLOOR (c1), c2 = FLOOR (c2) WHERE c1 = @p1    

     frame procname=adhoc line=2 stmtstart=32 stmtend=52 sqlhandle=0x020000006f878816

       EXEC p2 3    

    inputbuf

       EXEC p2 3

  resource-list

   keylock hobtid=72057594051559424 dbid=2 objectname=tempdb.dbo.t1 indexname=idx1 id=lock83642a00 mode=S associatedObjectId=72057594051559424

    owner-list

     owner id=processdceda8 mode=S

    waiter-list

     waiter id=process3c54c58 mode=X requestType=wait

   keylock hobtid=72057594051493888 dbid=2 objectname=tempdb.dbo.t1 indexname=cidx id=lock83643780 mode=X associatedObjectId=72057594051493888

    owner-list

     owner id=process3c54c58 mode=X

    waiter-list

     waiter id=processdceda8 mode=S requestType=wait
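
None of this output reaches the errorlog unless the trace flag from the first step is active. A minimal sketch of turning it on for the whole instance and confirming that it took effect, assuming SQL 2005 or later and a login with sysadmin rights (a flag enabled this way does not survive a restart, which is where the -T1222 startup parameter comes in):

```sql
-- Enable trace flag 1222 globally (-1 applies it to all connections).
DBCC TRACEON (1222, -1);

-- Confirm it is on: the row for 1222 should show Status = 1 and Global = 1.
DBCC TRACESTATUS (1222, -1);

-- Later, to stop writing deadlock reports to the errorlog:
-- DBCC TRACEOFF (1222, -1);
```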

 

  3. “Decode” the -T1222 output to better understand the deadlock scenario.  The deadlock is summarized by a “process-list” and a “resource-list”.  A “process” is a spid or worker thread that participates in the deadlock.  Each process is assigned an identifier, like “processdceda8”.  A resource is something that one of the participants owns (usually a lock) and that another participant is waiting on.  I like to use a format like the one below to summarize the deadlock.  You can skip this step if you want, but I never do; I find it really helps me understand the deadlock situation more clearly.  The data points you need from the 1222 output to reconstruct this summary on your own are each process's spid, procname, line, and statement text, plus each resource's lock type, indexname, and owner-list and waiter-list entries with their modes.

               
    Spid 54 is running this query (line 2 of proc [p1]):
        SELECT c2, c3 FROM t1 WHERE c2 = @p1
    Spid 55 is running this query (line 2 of proc [p2]):
        UPDATE t1 SET c1 = FLOOR (c1), c2 = FLOOR (c2) WHERE c1 = @p1

    Spid 54 is waiting for a Shared KEY lock on index t1.cidx.
        (Spid 55 holds a conflicting X lock.)
    Spid 55 is waiting for an eXclusive KEY lock on index t1.idx1.
        (Spid 54 holds a conflicting S lock.)



    For most lock types (including KEY locks, as shown in this example), SQL will directly identify the index by name in the output.  For some lock types, though, you'll get an "associatedObjectId", but no object name.  An example: 


          pagelock fileid=1 pageid=95516 dbid=9 objectname="" id=lock177a9e280 mode=IX associatedObjectId=72057596554838016


    The attribute "associatedObjectId" isn't the type of Object ID that you're probably familiar with; it's actually a partition ID.  You can determine the database name by running "SELECT DB_NAME(9)", where the "9" in this example comes from the "dbid" attribute.  Then you can determine the index and table name by looking up the associatedObjectId/PartitionId in the indicated database: 

         SELECT OBJECT_NAME(i.object_id), i.name
         FROM sys.partitions AS p
         INNER JOIN sys.indexes AS i ON i.object_id = p.object_id AND i.index_id = p.index_id
         WHERE p.partition_id = 72057596554838016 

    For those of you on SQL 2005 who think that the -T1222 output is a bit overwhelming, you're right.  But you may also want to count your blessings and be thankful that you don’t have to wade through -T1204 output, which is a lot more difficult to interpret than -T1222 and doesn’t provide nearly as much useful information about the deadlock.  Check out the file "Decoding_T1204_Output.htm" attached to this post for annotated -T1204 output.
  4. Run the queries involved in the deadlock through Database Engine Tuning Advisor (DTA).  Plop the query in a Management Studio query window, change db context to the correct database, right-click the query text and select “Analyze Query in DTA”.  Don’t skip this step; more than half of the deadlock issues we see are resolved simply by adding an appropriate index so that one of the queries runs more quickly and with a smaller lock footprint.  If DTA recommends indexes (it'll say “Estimated Improvement: <some non-zero>%”), create them and monitor to see if the deadlock persists.  You can select “Apply Recommendations” from the Action drop-down menu to create the index immediately, or save the CREATE INDEX commands as a script to create them during a maintenance window.  Be sure to tune each of the queries separately. 
  5. Make sure the query is using the minimum necessary transaction isolation level (-T1222 will tell you this – search the output for “isolationlevel”).  Queries run by transactional COM+ components will default to serializable, which is usually overkill.  This can be reduced by query hints (“...FROM tbl1 WITH (READCOMMITTED)...”), a SET TRANSACTION ISOLATION LEVEL command, or, in Windows 2003 and later, by configuring the object in the Component Services MMC plugin.
  6. Make sure that your transactions are as brief as they can be while still meeting the relevant business constraints.  Try not to use implicit transactions, as this model of transaction management encourages unnecessarily long transactions. 
  7. Look for other opportunities to improve the efficiency of the queries involved in the deadlock, either through query changes or through indexing improvements.  A query that locks the minimum number of resources will be much less likely to deadlock with another query.  Table scans, index scans, and large hashes or large sorts in the query plan may indicate opportunities for improvement.
  8. If one or both spids are running a multi-statement transaction, you may need to capture a Profiler trace that spans the deadlock in order to identify the full set of queries that were involved in the deadlock.  Unfortunately, both -T1204 and -T1222 only print out the two queries that “closed the loop”, and it’s possible that one of the blocking locks was acquired by an earlier query run within the same transaction.
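
For the isolation level step above, the two T-SQL options might look roughly like this. It is only a sketch: tbl1 comes from the hint example in that step, and col1 is a placeholder column name.

```sql
-- Option 1: lower the isolation level for everything that follows
-- on this connection.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Option 2: override the isolation level for a single table reference.
-- tbl1 and col1 are placeholder names.
SELECT col1
FROM tbl1 WITH (READCOMMITTED)
WHERE col1 = 1;

-- Check what the current session is actually using
-- (look for the "isolation level" row in the result).
DBCC USEROPTIONS;
```

The hint only affects the table it is attached to, so a session-level SET is usually the simpler choice when every statement in the batch should run at the lower level.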

These are all general recommendations that you can apply to any deadlock without having to really roll up your sleeves and get dirty.  If after doing all of this you haven’t resolved it, though, you’ll have to dive a bit deeper and tailor a solution to the specifics of the scenario.  Here’s a menu of some common techniques that you can choose from when deciding how best to tackle a deadlock:

 

  • Access objects in the same order.   Consider the following two batches:

    Batch 1                      Batch 2
    1. Begin Transaction         1. Begin Transaction
    2. Update Part table         2. Update Supplier table
    3. Update Supplier table     3. Update Part table
    4. Commit Transaction        4. Commit Transaction

These two batches may deadlock frequently.  If both are about to execute step 3, they may each end up blocked by the other because they both need access to a resource that the other connection locked in step 2. 
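
One way to remove this particular cycle is to rewrite the second batch so that it touches the tables in the same order as the first. A minimal sketch, assuming hypothetical dbo.Part and dbo.Supplier tables (all column names and key values here are made up for illustration):

```sql
-- Batch 1: Part first, then Supplier.
BEGIN TRANSACTION;
    UPDATE dbo.Part     SET UnitPrice = UnitPrice * 1.05 WHERE PartID = 42;
    UPDATE dbo.Supplier SET LastOrderDate = GETDATE()    WHERE SupplierID = 7;
COMMIT TRANSACTION;

-- Batch 2, reordered to match: Part first, then Supplier.
BEGIN TRANSACTION;
    UPDATE dbo.Part     SET QtyOnHand = QtyOnHand - 1   WHERE PartID = 42;
    UPDATE dbo.Supplier SET OpenOrders = OpenOrders + 1 WHERE SupplierID = 7;
COMMIT TRANSACTION;
```

With both batches updating Part first, whichever one arrives second simply blocks on the Part row until the first commits, rather than each holding a lock that the other needs. This only addresses the ordering cycle; it does nothing for deadlocks caused by the access paths within a single statement.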

  • If both deadlock participants are using the same index, consider adding an index that can provide an alternate access path to one of the spids.  For example, adding a covering nonclustered index for a SELECT involved in a deadlock may prevent the problem (assuming that none of the covering index keys are modified by the other deadlock participant).
  • On the other hand, if the spids are deadlocking because they took alternate paths (indexes) to a common required data row or page, consider whether one of the indexes can be removed or an index hint used to force both queries to share an access path.  Be cautious of potential performance hits as a result of this approach.
  • Deadlocks are a special type of blocking where two spids both end up blocking the other.  Sometimes the best way to prevent a deadlock is to force the blocking to occur at an earlier point in one of the two transactions.  For example, if you force spid A to be blocked by spid B at the very beginning of A’s transaction, it may not have a chance to acquire the lock resource that later ends up blocking spid B.  Doesn’t this mean you are deliberately causing blocking?  Yes, but remember that you already have blocking or you wouldn’t be in a deadlock situation, and simple blocking is a big improvement over a deadlock.  As soon as B commits his transaction, A will be able to proceed.  HOLDLOCK and UPDLOCK hints can be useful for this.
  • If a high priority process is being selected as a victim in a deadlock with a lower priority process, the lower priority process could be modified to SET DEADLOCK_PRIORITY LOW.  Spids that set this will offer themselves up as the sacrificial lamb in any deadlock they encounter. 
  • Avoid placing clustered indexes on columns that are frequently updated. Updates to clustered index key columns will require locks on the clustered index (to move the row) and all nonclustered indexes (since the leaf level of NC indexes references rows by clustered index key value). 
  • In some cases it may be appropriate to add a NOLOCK hint, assuming that one of the queries is a SELECT statement.  While this is a tempting path because it is a quick and easy solution for many deadlocks, approach it with caution as it carries with it all the usual caveats surrounding read uncommitted isolation level (a query could return a transactionally inconsistent view of the data).  If you are unfamiliar with the risks, read the "SET TRANSACTION ISOLATION LEVEL" topic in SQL Books Online. 
  • In SQL 2005 you could consider the new SNAPSHOT isolation level.  This will avoid most blocking while avoiding the risks of NOLOCK.  An even cooler new feature IMHO is the new READ COMMITTED SNAPSHOT database option (see ALTER DATABASE), which allows you to use a variant of snapshot isolation level without changing your app.  
  • If one or both locks involved in the deadlock are S/X TAB (table) locks, lock escalation may be involved.  You can reduce the likelihood of lock escalation by enabling trace flag 1224 (SQL 2005 and later) or 1211 (see KB 323630).  Note that this does not apply to "intent" TAB locks, which have a capital "I" prefix (e.g. IS / IX TAB locks).
  • If the deadlock is intermittent, sometimes the simplest solution is to add deadlock retry logic. The retry logic could be in T-SQL, as long as (a) you're on SQL 2005 or later so that you can use BEGIN TRY, and (b) your transaction is wholly contained within a single stored proc or batch. See this article for details. If the deadlock transaction spans multiple batches you can still add deadlock retry logic, but it would need to be moved out to the client app code. If you can only add deadlock retry logic to one of the participants in the deadlock, you can use SET DEADLOCK_PRIORITY LOW to ensure that the engine preferentially aborts the transaction of the guy that has the retry logic.
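
The retry idea in the last bullet could be sketched in T-SQL along these lines. This is an illustration rather than a drop-in implementation: it assumes SQL 2005 or later, a transaction that fits in a single batch, and a hypothetical dbo.Part table; the retry count and delay are arbitrary.

```sql
-- Volunteer this session as the victim, since it knows how to retry.
SET DEADLOCK_PRIORITY LOW;

DECLARE @retries_left INT, @err_msg NVARCHAR(4000);
SET @retries_left = 3;                  -- arbitrary retry budget

WHILE @retries_left > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
            -- Hypothetical work; replace with the real statements.
            UPDATE dbo.Part SET QtyOnHand = QtyOnHand - 1 WHERE PartID = 42;
        COMMIT TRANSACTION;
        SET @retries_left = 0;          -- success: let the loop end
    END TRY
    BEGIN CATCH
        -- Roll back whatever is left of the failed transaction.
        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;

        IF ERROR_NUMBER() = 1205 AND @retries_left > 1
        BEGIN
            -- Deadlock victim with retries remaining: pause briefly, try again.
            SET @retries_left = @retries_left - 1;
            WAITFOR DELAY '00:00:00.200';
        END
        ELSE
        BEGIN
            -- Not a deadlock, or out of retries: report the error and stop.
            SET @err_msg = ERROR_MESSAGE();
            SET @retries_left = 0;
            RAISERROR ('%s', 16, 1, @err_msg);
        END
    END CATCH
END
```

Only error 1205 is retried; any other error is re-raised on the first occurrence so that genuine failures are not masked by the loop.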

In a follow-up post I’ll look at a fairly typical deadlock in detail.  This will provide an example of what you'd have to do if the 8 high-level steps listed above fail you, forcing you to understand the scenario at a deeper level so that you can craft a custom solution.  

  

(This post series is continued in Deadlock Troubleshooting, Part 2.)

  

Attachment: SQL2000_Deadlocks_T1204.htm
  • FrankG, you may have meant it tongue-in-cheek :), but yes, I think your best bet may be to contact MS support for assistance with your deadlock involving MS repl-created tables.  The only option available to you without modifying system procs or system tables would be to force a different plan with a plan guide, and that approach to a solution may not "stick" across service packs or QFEs if the change updates the merge trigger or the MSMerge stored proc involved in the deadlock.  

  • I have already described several deadlock scenarios that involve only one table in another post. This

  • This is a really great article.

    I wish that I had access to something like this resource 5 years ago when I had to solve some spectacular deadlocking issues on a SQL Server 2000 app.

    I especially like the index tuning advisor hint - that is sooooo true.

  • Can 2 processes acquire rowlock on same row?

    I'm facing a deadlock where 2 processes have acquired a row lock on the same row and are waiting for each other. Here is the deadlock graph.

    2009-07-28 14:13:29.50 spid18s        ridlock fileid=1 pageid=10089 dbid=5 objectname=dcmdb.dcmdbuser.fs_payaccount id=lock1254ce80 mode=X associatedObjectId=72057595031977984

    2009-07-28 14:13:29.50 spid18s         owner-list

    2009-07-28 14:13:29.50 spid18s          owner id=process93af28 mode=X

    2009-07-28 14:13:29.50 spid18s         waiter-list

    2009-07-28 14:13:29.50 spid18s          waiter id=process93a988 mode=U requestType=wait

    2009-07-28 14:13:29.50 spid18s        ridlock fileid=1 pageid=10089 dbid=5 objectname=dcmdb.dcmdbuser.fs_payaccount id=lock1257ba00 mode=X associatedObjectId=72057595031977984

    2009-07-28 14:13:29.50 spid18s         owner-list

    2009-07-28 14:13:29.50 spid18s          owner id=process93a988 mode=X

    2009-07-28 14:13:29.50 spid18s         waiter-list

    2009-07-28 14:13:29.50 spid18s          waiter id=process93af28 mode=U requestType=wait

    As you can see above the associatedObjectId is same for both ridlocks.

    When can this happen? From the logic of my program 2 threads will never update the same row in fs_payaccount table.

  • Rashmi,

    "associatedObjectId" isn't a row identifier; it's the heap or B-tree identifier (HoBT ID -- see http://msdn.microsoft.com/en-us/library/ms178104.aspx for documentation of the 1222 output fields that aren't doc'ed in this post).  In other words, it just identifies the table or index.  You can look up the object name associated with a HoBT ID using "SELECT OBJECT_NAME(object_id) FROM sys.partitions WHERE hobt_id = xxx", but it would just tell you that the row lock is on table "dcmdb.dcmdbuser.fs_payaccount", which you can already tell from the rest of the 1222 output.  

    In other words, there are two different rows (from the same table) involved in this deadlock.  There's not enough information here to be sure, but my guess would be that one or both of the queries involved are scanning this table to locate the rows to update.  This would acquire Update locks on all rows. 

    Bart

  • Thank you very much. Great article and it saved my day....

  • Great article!  Fixed my deadlock problems by creating a new index.

    The DTA tool does seem a little index happy though.

    What is the downside of adding too many indexes?

    Thanks!

  • Hello Bart,

    I am new to this deadlock theory. Just gone through your article and it is explained very nice way.

    Actually I am facing a deadlock problem while I transfer database from SQL Server 2005 Express Edition to SQL Server 2008 Express Edition with SP1 (both have different instances and I am transferring database from 2005 instance to 2008 instance).

    Actually I have posted this problem on msdn site. I would appreciate if   you can spare some minutes and take a look at the below given link. I have explained whole problem in that thread.

    Regards,

    Jigs

  • @Chad -

    The most common downside of too many indexes is incrementally more expensive updates, inserts, and deletes.  For most scenarios this cost won't be noticeable, though.  If an index is effective in eliminating scans and preventing a deadlock, I'd only pause before adding it in one of these two scenarios:

    - the workload is OLTP-like, and the table has relatively high sustained modification rates (say, >50 transactions/sec that modify the table)

    - the table is frequently reindexed, or large amounts of data are regularly inserted into the table (e.g. in a warehouse with a regular data feed from other systems), and the insert/reindex job already threatens to exceed a reasonable execution time.

  • Thanks for the good article. Hope to see more articles from you.

  • Bart,

    You have shown great insight into deadlocks... I have learned tremendously from your 3 articles...

    Please keep it up!

    Thanks!

  • Thank you so much for this article, great info and very (very!) helpful


  • Hi,

    This is a great blog... Thanks !!

    As a beginner DBA, how can we conclude at the first instance that a deadlock has occurred on the server?

    (When neither the trace flags are turned on nor the profiler is set..)

    What are the symptoms of a deadlock..?

  • @Pastille, if you don't have either trace flag enabled and you're not running profiler, then the only symptom of most deadlocks is an error message returned to the application.  (See the first paragraph of this blog post.)
