Here’s an example of the classic scenario that is usually used to introduce the concept of a deadlock in a database:
Process A:
1. Begin Transaction
2. Update Part table
3. Update Supplier table
4. Commit Transaction

Process B:
1. Begin Transaction
2. Update Supplier table
3. Update Part table
4. Commit Transaction
If Process A and Process B each reached step #3 in their respective transactions at approximately the same time, it’s easy to see how they could end up blocking each other. The most obvious solution to this deadlock is to change the order of the UPDATE statements in one of the transactions, so that lock resources are acquired in a consistent order.
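The "consistent lock order" fix can be sketched outside of SQL as well. Below is a minimal Python simulation (thread and lock names are hypothetical, standing in for the two tables): both transactions acquire the Part lock before the Supplier lock, so neither can hold one resource while waiting on the other.

```python
import threading

# Two locks standing in for the Part and Supplier tables (hypothetical names).
part_lock = threading.Lock()
supplier_lock = threading.Lock()

log = []

def run_transaction(first, second, name):
    """Acquire two 'table' locks in the given order, mimicking steps 2-3 above."""
    with first:
        with second:
            log.append(name)

# The fix: both processes lock Part, then Supplier -- a consistent order.
a = threading.Thread(target=run_transaction, args=(part_lock, supplier_lock, "A"))
b = threading.Thread(target=run_transaction, args=(part_lock, supplier_lock, "B"))
a.start(); b.start()
a.join(); b.join()
print(sorted(log))  # ['A', 'B'] -- both transactions complete
```

If one thread were instead given the locks in the opposite order, the two threads could each grab their first lock and then wait on the other forever, which is the thread-level analogue of the deadlock above (SQL Server differs in that it detects the cycle and kills one participant rather than hanging).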
Instead of this overly simplistic deadlock, let’s take a closer look at the deadlock scenario demonstrated in Deadlock Troubleshooting, Part 2. In that case, these two stored procedures ended up deadlocked:
CREATE PROC p1 @p1 int AS
SELECT c2, c3 FROM t1 WHERE c2 BETWEEN @p1 AND @p1+1
CREATE PROC p2 @p1 int AS
UPDATE t1 SET c2 = c2+1 WHERE c1 = @p1
UPDATE t1 SET c2 = c2-1 WHERE c1 = @p1
There’s no UPDATE, DELETE, or INSERT in the first proc; it consists of a single SELECT. And even the two UPDATE statements in the second proc aren’t wrapped in an outer BEGIN TRAN/COMMIT TRAN. Each UPDATE ran within its own autocommit transaction, which means that only one of them could have been involved in the deadlock. Clearly this doesn’t fit the stereotypical “modify A then modify B / modify B then modify A” deadlock model described above. This isn’t an edge case, by the way. We actually see this type of deadlock – where one or both of the participants are in the middle of a single-query, autocommit transaction – more often than easy-to-understand deadlock scenarios involving two multi-statement transactions that simply modify two tables in a different order.
So, what would you do if DTA hadn’t automagically recommended a new index that prevented this deadlock? To craft your own solution by hand, you need a deeper understanding of the deadlock than we have at the moment.
What caused this deadlock?
We’ll need to refer back to the deadlock summary that was distilled from the -T1222 output (see Deadlock Troubleshooting, Part 1 for a refresher on decoding -T1222):
Spid X is running this query (line 2 of proc [p1], inputbuffer “… EXEC p1 4 …”):
    SELECT c2, c3 FROM t1 WHERE c2 BETWEEN @p1 AND @p1+1
Spid Y is running this query (line 2 of proc [p2], inputbuffer “EXEC p2 4”):
    UPDATE t1 SET c2 = c2+1 WHERE c1 = @p1
The SELECT is waiting for a Shared KEY lock on index t1.cidx. The UPDATE holds a conflicting X lock.
The UPDATE is waiting for an eXclusive KEY lock on index t1.idx1. The SELECT holds a conflicting S lock.
First, let’s look at the query plan for the SELECT query. To view the plan, execute “SET STATISTICS PROFILE ON”, then run “EXEC p1 4”.
SELECT c2, c3 FROM t1 WHERE c2 BETWEEN @p1 AND @p1+1
|--Nested Loops(Inner Join, OUTER REFERENCES:([Uniq1002], [t1].[c1]))
|--Index Seek(OBJECT:([t1].[idx1]), SEEK:([t1].[c2] >= [@p1] AND [t1].[c2] <= [@p1]+(1)) ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([t1].[cidx]), SEEK:([t1].[c1]=[t1].[c1] AND [Uniq1002]=[Uniq1002]) LOOKUP ORDERED FORWARD)
A Nested Loop join executes its first child operator once, and executes the second child operator for each row returned by the first child (see this post for details). In this case, the first child is a nonclustered Index Seek to find the rows “WHERE c2 BETWEEN @p1 AND @p1+1”. For each qualifying row in the nonclustered index, a second seek is done on the clustered index to look up the whole data row. This clustered index seek is necessary because the nonclustered index does not cover the query. If you’re running SQL 2000, you’ll see a different-looking plan:
SELECT c2, c3 FROM t1 WHERE c2 BETWEEN @p1 AND @p1+1
|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([t1]))
|--Index Seek(OBJECT:([t1].[idx1]), SEEK:([t1].[c2] >= [@p1] AND [t1].[c2] <= [@p1]+1) ORDERED FORWARD)
For practical purposes, these two plans are identical. The purpose of the Bookmark Lookup operator in the SQL 2000 plan is to visit the clustered index to retrieve the full set of columns for a row identified by a nonclustered index. In SQL 2005 this same operation is expressed as a loop join between the nonclustered index and the clustered index. For this deadlock, it’s simply important to note that both plans call for a seek on the nonclustered index, followed by a seek on the clustered index.
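The shape of this plan can be sketched as a small program. Here is a minimal Python sketch (the dict-based "indexes" and row values are hypothetical simplifications, not SQL Server's actual structures): an outer seek on the nonclustered index drives a per-row lookup into the clustered index, exactly the Nested Loops / Bookmark Lookup pattern above.

```python
# Simplified stand-ins for the two indexes on t1 (hypothetical data).
# Nonclustered idx1: maps key column c2 -> clustering key c1
# (the uniquifier from the real plan is omitted for brevity).
idx1 = {3: 1, 4: 2, 5: 3}
# Clustered cidx: maps c1 -> the full data row (c1, c2, c3).
cidx = {1: (1, 3, "x"), 2: (2, 4, "y"), 3: (3, 5, "z")}

def select_c2_c3(p1):
    """SELECT c2, c3 FROM t1 WHERE c2 BETWEEN @p1 AND @p1+1 -- plan shape:
    seek on idx1 (outer), then a clustered-index lookup per qualifying row."""
    results = []
    for c2, c1 in idx1.items():      # Index Seek on idx1 (S key lock per row)
        if p1 <= c2 <= p1 + 1:
            row = cidx[c1]           # Clustered Index Seek / Bookmark Lookup
            results.append((row[1], row[2]))
    return results

print(select_c2_c3(4))  # [(4, 'y'), (5, 'z')]
```

The key point the sketch makes visible: for every row, the query touches a nonclustered index entry first and a clustered index row second, and in the real engine each of those touches takes a key lock.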
Now let’s look at the UPDATE:
UPDATE t1 SET c2 = c2+1 WHERE c1 = @p1
|--Clustered Index Update(OBJECT:([t1].[cidx]), OBJECT:([t1].[idx1]), SET:([t1].[c2] = [Expr1004]))
|--Compute Scalar(DEFINE:([Expr1004]=[t1].[c2]+(1), [Expr1013]=CASE WHEN CASE WHEN ...
|--Top(ROWCOUNT est 0)
|--Clustered Index Seek(OBJECT:([t1].[cidx]), SEEK:([t1].[c1]=[@p1]) ORDERED FORWARD)
The UPDATE has a fairly simple query plan. The two most significant operators are the first and the last one. The Clustered Index Seek locates the rows that qualify for the “WHERE c1 = @p1” predicate. Once a qualifying row has been found, the Clustered Index Update operator acquires an eXclusive key lock on the clustered index and modifies the row.
We now have a full understanding of how the UPDATE blocks the SELECT: the UPDATE acquires an X lock on a clustered index key, and that lock blocks the SELECT’s bookmark lookup on the clustered index. But the other half of the deadlock – the reason that the SELECT blocks the UPDATE – isn’t quite so obvious. The -T1222 told us “The UPDATE is waiting for an eXclusive KEY lock on index t1.idx1. The SELECT holds a conflicting S lock.” It’s not very apparent from the plan, but the UPDATE needs an X lock on the nonclustered index [idx1] because the column it is updating ([c2]) is one of the non-clustered index’s key columns. Any change to an index key column means that a row in the index must be relocated, and that relocation requires an X lock.
This is a key point to remember when trying to understand many deadlocks: the access path to find the qualifying rows is important, but index updates implied by the columns being modified can be just as important. To make things more confusing, sometimes you’ll see explicit “Index Update” or “Index Delete” operators in the plan for each nonclustered index that needs to be updated, while other times these don’t show up in the plan. (For more info on this check out Wide vs. Narrow Plans.)
To summarize: the SELECT used the nonclustered index to find a qualifying row. While holding a Shared lock on the nonclustered index, it needs to jump over to the clustered index and retrieve some columns that aren’t part of the nonclustered index. While it’s doing this, the UPDATE is busy doing a seek on the clustered index. It finds a row, locks it, and modifies it. But because one of the columns being modified is a key column in the nonclustered index, it then has to move to the nonclustered index and update that index, too. This requires a second X key lock on the nonclustered index. So the SELECT ends up blocked waiting for the UPDATE to release its X lock on the clustered index, while the UPDATE winds up blocked waiting for the SELECT to release its S lock on the nonclustered index.
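Writing each statement's lock acquisitions out as an ordered list makes the reversal explicit. This is a purely illustrative sketch (the resource tuples are my own notation, not a SQL Server structure):

```python
# Each statement's key locks on the contested row, in acquisition order:
# (index locked, lock mode) per step. Names follow the -T1222 summary above.
select_path = [("idx1", "S"), ("cidx", "S")]  # NC index seek, then bookmark lookup
update_path = [("cidx", "X"), ("idx1", "X")]  # clustered update, then NC index maintenance

select_order = [index for index, mode in select_path]
update_order = [index for index, mode in update_path]

# Same two resources, touched in opposite order -- the deadlock ingredient.
print(select_order == list(reversed(update_order)))  # True
```

This is the single-query analogue of the two-transaction scenario at the top of the post: the "out-of-order access" is baked into the two query plans rather than into the order of statements in a batch.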
Hopefully it’s clear that even though each participant in this deadlock is just a single query, this is still a problem caused by out-of-order resource access patterns. The SELECT statement locks a key in the nonclustered index, then locks a key in the clustered index. The problem is that the UPDATE needs to lock the same two resources, but because of its query plan, it tries to lock them in the opposite order. In a sense, it’s really the same problem as the simple deadlock scenario described at the beginning of this post.
The locks acquired by a query aren’t acquired all at once. A query plan is like a little program. It wouldn’t be terribly inaccurate, for example, to think of a nested loop join as a FOR loop. Each iteration of the loop acquires a key lock on the outer table, then holds that lock while looking up (and locking) matching rows in the inner table. Deadlocks like this one are a little harder to figure out because the order of resource access within a single query depends on the query plan, and can’t be determined just by looking at the T-SQL.
How did DTA’s new index avoid the deadlock?
Here’s an index that will prevent this deadlock:
CREATE INDEX idx2 ON t1 (c2, c3)
This index “covers” the query “SELECT c2, c3 FROM t1 WHERE c2 BETWEEN @p1 AND @p1+1”, which is just another way of saying that the index includes all of the columns referenced by the query. SQL will use this index instead of the [idx1] index because the plan based on the covering index is cheaper. The fact that the index covers the query means that the bookmark lookup against the clustered index is no longer necessary. And since the SELECT no longer needs to access the clustered index, it won’t get blocked by the UPDATE’s lock on the clustered index.
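Extending the earlier sketch shows why covering breaks the cycle: each entry in the covering index already carries every column the query needs, so the per-row visit to the clustered index simply disappears. (Again, the dict-based "index" and its data are hypothetical simplifications.)

```python
# Covering index idx2 on (c2, c3): each entry already holds both columns
# the SELECT needs, so no clustered-index visit is required (hypothetical data).
idx2 = {3: (3, "x"), 4: (4, "y"), 5: (5, "z")}  # c2 -> (c2, c3)

def select_c2_c3_covered(p1):
    """Same query as before, but the seek on idx2 never touches cidx,
    so it cannot block on (or be blocked by) the UPDATE's X lock there."""
    return [entry for c2, entry in idx2.items() if p1 <= c2 <= p1 + 1]

print(select_c2_c3_covered(4))  # [(4, 'y'), (5, 'z')]
```

Note the trade-off: the UPDATE now has one more nonclustered index to maintain when c2 changes, but the SELECT's lock footprint no longer overlaps the clustered index at all, so this particular cycle cannot form.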
What other solutions are available?
All deadlocks boil down to out-of-order resource access patterns. In the simple deadlock scenario described at the beginning of this post, the solution is obvious: just reverse the two UPDATE statements in one of the transactions, and you won’t end up deadlocked. But in the more complex scenario that we just explored, it’s not so clear how to change the order in which locks are acquired. Each deadlock participant is running a single-query, autocommit transaction, so you can’t just swap the order of two queries to acquire resources in a different order. SQL is a language designed to express high-level set operations; the specifics of how the database should go about retrieving and updating the specified set of data is generally left up to the SQL engine, with good reason. However, you do have some options for either influencing which lock resources a query needs, or modifying the order in which it acquires the locks. Below are six different possible solutions to this deadlock. Some of these are not ideal for this particular deadlock, but they are still worth exploring since the approach to deadlock avoidance that they illustrate may be the best possible solution for some other deadlock you encounter.
If you can think of any other solutions, please share them in a comment.
Yes, this is probably the case, but these are different processes -- wouldn't that imply different connections/spids and therefore different transactions? And if they were in the same transaction, why would they block? The transaction would already hold the lock, right?
I took some of the attributes out of the deadlock trace for space-- I will look at the transaction id on the processes.
Thanks so much for the feedback! Any other thoughts?
(I wasn't logged in before, so sorry if I am double posting this response)
> Wouldn't that imply different connections/spids and therefore different transactions?
> And if they were in the same transaction, why would they block? The transaction would already hold the lock, right?
Yes, the different spids would have different transactions. That's exactly why they would block each other. Your scenario is probably something like:
1. spid A and spid B both start (separate) transactions
2. spid A modifies row #1, acquiring and holding an X lock on this row
3. spid B modifies row #2, acquiring and holding an X lock on this row
4. spid A tries to modify row #2, but gets blocked because spid B still holds an X lock on this row
5. spid B tries to modify row #1, but gets blocked because spid A still holds an X lock on this row
In the deadlock graph, you see the two spids immediately following step #5.
Right- I am completely stupid. I was thinking that these two were updating the exact same row (from the application standpoint), but yea- these are probably updating different rows from previous statements in transactions (hits head for overlooking the obvious). This is the easiest of all deadlock scenarios! I am really sorry for wasting your time!
I guess natural follow ups would be:
1) Does the execution stack always only show the last statement or is it supposed to show all of the statements in the transaction?
2) Is there any way to find out what key value (i.e. the actual PK value from the ID field) these locks were guarding? I.e. from the hash in the waitresource, or from the "id=lockeebe0200" in the owner list? Is there anything that can be inferred from the lockeebe0200, or is that just an identifier for the logical lock object in memory?
Ahh- this makes me so mad that I posted such a stupid question!
No worries :).
1) It only shows the last statements, the ones that closed the loop. Right now, the server does not keep a record of which statement acquired which lock, so the info that would allow going "back in time" to see past statements that also acquired some of the locks isn't available in memory. Often you can infer the statement from the lock types, especially if you have some knowledge of the app or can crack open the stored procs that were running (most commonly the BEGIN TRAN/COMMIT TRAN is within a single stored procedure). If this fails, you have to fall back on a Profiler trace, which requires waiting for a recurrence of the problem.
2) Unfortunately not. This is a (one-way) hash of the key value. If you needed to see the statement parameter values, you'd have to use profiler here, too. Luckily, it is rarely necessary to know the param values or specific rows in order to understand the deadlock.
I understand the scenarios in Part #1, 2 and 3 but I am failing to understand why the following scenario causes a deadlock.
Suppose we have the following table:
CREATE TABLE Widgets (
    WidgetID int NOT NULL,
    WidgetNumber int NOT NULL,
    WidgetValue int NOT NULL,
    CONSTRAINT pk_widgets PRIMARY KEY NONCLUSTERED ( WidgetID )
)
CREATE UNIQUE INDEX IX_Widgets ON Widgets ( WidgetNumber )
INSERT Widgets ( WidgetID, WidgetNumber, WidgetValue )
VALUES ( 1, 10, 100 )
1. Process A and B start a transaction
2. Process A updates the row in the Widgets table
3. Process B attempts to update the same row and has to wait for Process A to finish
4. Process A attempts to update the same row and causes a deadlock
SQL server terminates Process B!
Process B should continue to wait until the transaction is complete!
You mention the option of forcing one of the transactions to block early on the process, by doing something like:
SELECT @x = COUNT(*) FROM t1 WITH (HOLDLOCK, UPDLOCK) WHERE c1 = @p1
Would this in general:
a) lock a single set of rows (i.e. the ones where C1=@p1) in the index cidx?
b) effectively place an index-wide lock?
If it is a), then the issue I have is that the two processes are independent, (different users logged into different clients!) so the runtime values of @p1 for the two procedures are more than likely to be completely different - in fact the values apply to different columns in the table, so there is no relationship between the parameters for the procedures at all.
In addition, the count(*) is likely to be expensive in some cases -- what if there are millions of rows? If the locking is an index-wide lock, would this lock still be held even if the value for @p1 was chosen such that there were no rows at all where c1 = @p1?
These may be silly questions !
This is such black art stuff.
Bart, thank you for this excellent primer.
I am repeatedly getting a deadlock that I'm not sure how to solve, even after reading your tips. I think I understand what is causing the deadlock but I'm just clueless as to the best way to solve it. Let me explain:
I have a table that has a TEXT column as well as a couple of CHAR columns. The "text in row" table option is currently off, which means, if I understand correctly, that the TEXT column's data will be stored in a separate page than the rest of the data for a given row. I realize that the TEXT datatype is supposed to be phased out in favor of VARCHAR(MAX) but unfortunately, I cannot change the schema for the table in question because it is not under our control (it was created by an application we purchased).
Looking at the deadlock graph, we have one process that does a SELECT on that table, selecting all columns, including the TEXT column. The other process in the graph does an INSERT on that table, inserting a single row, including the TEXT column.
The SELECT process owns a page lock (mode S) on one page and is requesting a page lock (mode S) on a second page. On the other hand, the INSERT process owns a page lock (mode IX) on the second page and is requesting a page lock (mode IX) on the first page. Clearly, this is a deadlock.
I presume that one of the pages holds the data for the TEXT column and the other page holds the data for the other columns. Is this a correct assumption?
The SELECT process seems to always be the loser because it is the least expensive (i.e. it has Log Used: 0). Our current "fix" is to detect the deadlock in the SELECT process and try it again when it occurs. Unfortunately, as we add more users to the system, the deadlocks seem to be happening more frequently and the system also seems to be less responsive.
To improve this situation, I'm considering the following options:
1) Rewrite the SELECT process so it does NOT select the TEXT column. This is for sort of a "ticker tape" use-case and the users might be able to live without seeing the data in the TEXT column. If they want to see it for a particular row, perhaps they can "drill down". But I don't think our UI developer likes this option.
2) Use the "WITH (NOLOCK)" option in the SELECT process. However, what happens when you set this option and the INSERT process has only written one page and not the other when the SELECT goes to read the same record? Will the SELECT only see part of the record (e.g. either the TEXT column or the non-TEXT columns but not both)??? This worries me...
3) Set the isolation level of the SELECT process to read uncommitted. But I have the same worry as in option 2.
4) Set the "text in row" table option to ON, so that all of a row's data is (hopefully) stored on the same page. However, if the vendor of the application ever updates the schema for that table, I guess I'll have to remember to reapply that table option?
Any recommendations or wisdom would be appreciated.
>> I presume that one of the pages holds the data for the TEXT column and the other page holds the data for the other columns. Is this a correct assumption?
Maybe. Send me the -T1222 output (you can use the blog contact form, or I'll send you an offline msg w/my email address) and there might be some clues to confirm this. Do the pagelock resources in the -T1222 output identify an index ID for each page? Text/image data is always on index ID 255. Alternatively, there should be an associatedObjectId attribute. This is a partition ID, and you can figure out which index it belongs to by querying sys.partitions as described in blogs.msdn.com/.../deadlock-troubleshooting_2c00_-part-1.aspx.
The fact that the SELECT is acquiring non-intent Shared page locks indicates that it is probably doing a full table or index scan. That usually points to a lack of indexes on the table that would support a more efficient plan. (Either that, or the SELECT returns a large % of the data in the table.) My recommendation would generally be to eliminate all unnecessary table/index scans before doing anything else, but it sounds like you may be reluctant to add any indexes because you don't "own" the schema...
>> 1) Rewrite the SELECT process so it does NOT select the TEXT column.
That might work, assuming that the text data is typically small enough to fit on page, and assuming that the two pages involved in the deadlock are a text page and a non-text clustered index page.
>> 2) Use the "WITH (NOLOCK)" option in the SELECT process. ...
>> what happens when you set this option and the INSERT process has only written
>> one page and not the other when the SELECT goes to read the same record?
>> Will the SELECT only see part of the record (e.g. either the TEXT column or the
>> non-TEXT columns but not both)??? This worries me...
I'm not sure if the scenario you describe is possible, but NOLOCK / dirty reads can definitely cause odd behavior. (I have never seen a case where an inconsistent version of a single row was returned for a table that used text/image... the typical symptom is the query simply failing with a strange error.)
>> 4) Set the "text in row" table option to ON
That might work, again assuming that the text data is typically small enough to fit on page, and assuming that the two pages involved in the deadlock are a text page and a non-text clustered index page.
Have you considered enabling the read committed snapshot database option? That would probably address this along with most other blocking problems where one of the participants is only doing reads.
I posted a comment earlier, but am not sure it got through as I was not logged in. In a nutshell, we use cursors, the "Fetch" cursor is being marked as the deadlock victim when we are updating the same table, the same result set, sometimes. Currently, removing the cursors would be an unfeasible solution. The SQL uses a distinct index and selects all columns for the given table. Is there any good way to find the reason for the deadlock in this kind of scenario? Using the snapshot feature seems kind of dangerous in our application, and the indexes seem OK. None are clustered, by the way. Don't know if that matters or not. Any ideas? If you need more information, and did not get my other comment post, then tell me. Was doing it from memory, so I might have forgotten some details. Thanks.
tbaldarelli - A dynamic cursor essentially runs a slightly modified version of your query anew for each fetch. Other cursor types run the query once, pausing the scan while the code processes the row in between each fetch call. In either case, it's mostly the same as other deadlocks. The data collection process is the same, and the same deadlock avoidance techniques generally apply. I would capture the -T1222 output and decode it as discussed in Part 1 (blogs.msdn.com/.../747119.aspx) to better understand why the two spids are deadlocked. Then try index tuning, which is always worth a shot even if the existing indexes seem reasonable. If all else fails, you'll need to apply one of the deadlock avoidance techniques mentioned above. If you post the -T1222 output I can try to help you with the first step.
Thanks a lot for the clear and concise explanation..... Most sites explain the classic scenario involving out of order updates on different tables thru different update statements... What you have explained here is a more practical and subtle variation of the same scenario...