A post on how to troubleshoot blocking clients.

 

The scenario is that you have many users that complain that all of a sudden the applications hang and no data is returned.

The setup for demoing this scenario:

 

Machine 1: This is the SQL Server 2008 machine, called MIKESPIKE\SPIKE2008 in this case.

Machine 2: A client machine called MIKESPIKELAPTOP in this case (which will be the one that creates the blocking).

 

Step one is to create the uncommitted transaction in order to create the blocking.

On machine 2, start a command prompt and connect to machine 1 using SQLCMD, switch to Northwind and execute an uncommitted transaction.

All steps below:

 

C:\>sqlcmd -S MIKESPIKE\SPIKE2008

1> USE Northwind

2> GO

Changed database context to 'Northwind'.

1> BEGIN TRAN

2> UPDATE Shippers SET CompanyName = 'abcdef' WHERE ShipperID = 1

3> GO

 

(1 rows affected)

 

Then on machine 1, start a command prompt and connect locally to the same server using SQLCMD, switch to Northwind and issue a SELECT against the Shipper table.

All steps below:

 

C:\>sqlcmd -S .\Spike2008

1> USE Northwind

2> GO

Changed database context to 'Northwind'.

1> SELECT * FROM Shippers

2> GO

 

This will now hang since we have the uncommitted transaction from machine 2.

So, how to gather more information on this (we are assuming of course that we have no clue on the actions performed above)?

First thing would be to open SQL Server Management Studio on Machine 1. Then issue sp_who2

 

exec sp_who2

 

this should give an output similar to this:

 

SPID           Status         Login                        HostName                     BlkBy          DBName    Command          CPUTime DiskIO LastBatch      ProgramName

---------------------------------------------------------------------------------------------------------------------------------------------

...

54             sleeping       EUROPE\maspeng                MIKESPIKELAPTOP              .              Northwind AWAITING COMMAND 0       0      05/03 13:14:17 SQLCMD

55             SUSPENDED      EUROPE\maspeng                MIKESPIKE                    54             Northwind SELECT           0       0      05/03 13:16:16 SQLCMD

 

Here we can see that the local connection (MIKESPIKE in the hostname column) is blocked by SPID 54.

We know that this is from our client machine (since we know the client is called MIKESPIKELAPTOP) but if we didn’t,

the above would not tell us much except that the call from MIKESPIKE is blocked by MIKESPIKELAPTOP.

 

Armed with the knowledge of these two SPIDs (54, 55) we can execute the following:

 

select session_id, connect_time, net_transport, auth_scheme,

client_net_address, client_tcp_port, local_net_address, local_tcp_port

from sys.dm_exec_connections where session_id = 54 or session_id = 55

 

In this case, we will get something like the following:

 

session_id  connect_time            net_transport         auth_scheme    client_net_address           client_tcp_port local_net_address            local_tcp_port

----------- ----------------------- ---------------------------------------- ---------------------------------------- -----------------------------------

54          2010-05-03 13:13:45.127 TCP                    KERBEROS       157.xxx.xxx.93               65500           157.xxx.xxx.187              56644

55          2010-05-03 13:16:09.907 Shared memory         NTLM           <local machine>              NULL            NULL                             NULL

 

So what does this tell us?

We can see that SPID 54 is using TCP and Kerberos.

This makes sense since the connection is remote, while SPID 55 uses Shared Memory and NTLM which is also expected since this connection is local.

We can also see that the client IP address and port for the owner of the blocking SPID is 157.xxx.xxx.93 and 65500 but for the local connection these are all NULL. Also expected.

 

So know we know what machine, and potentially what user on that machine, to investigate for whatever code is blocking and deal with it accordingly.

Here we can simply close the command window on machine 2 (the transaction will not be committed) and we should immediately see the rows returned on machine 1.

We could of course commit the transaction, but when (and if) the user who instantiated this is not present, this may be the only option.

 

I’ve seen issues where a user on a client machine created an uncommitted transaction and then left for lunch.

This then had the effect that all other users of this database couldn’t select anything from the table in question, effectively preventing them doing their tasks.

 

The “sys.dm_exec_connection” is also useful for other things as well, for example (as seen above) we can clearly see that the connection is using Kerberos,

so we can verify that this works and that the connection isn’t falling over to NTLM.

 

See more here:

 

“sys.dm_exec_connections (Transact-SQL)”

http://msdn.microsoft.com/en-us/library/ms181509.aspx