I want to share this troubleshooting scenario since it may help you:
After upgrading SQL Server 2000 SP4 to SQL Server 2008 SP2, SQL Server experienced performance problems and many a times, connection failures
SQL Server errorlog shows this message during the problem period:
Error: 18056, Severity: 20, State: 29.
The client was unable to reuse a session with SPID xxx, which had been reset for connection pooling. The failure ID is 29. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message.I was thinking that this is the same issue as described in http://support.microsoft.com/kb/2543687 but as per this KB, issue is currently being investigated and scenario described in the KB is not matching the issue I'm troubleshooting.
There is also a fix released for SQL Server 2005 through http://support.microsoft.com/kb/937745 and again it is not applicable here.
Started looking into the data collected through PSSDiag (http://diagmanager.codeplex.com/) and looked into wait stats along with other information. Wait stats showed this:
Threadpool worried me since this is not expected to be the Topper.
Looked into sp_configure:
name minimum maximum config_value run_value
max worker threads 128 32767 128 128
As per http://msdn.microsoft.com/en-us/library/ms187024(v=SQL.100).aspx, Upgrading an instance of the SQL Server 2000 Database Engine to a newer version retains the configuration value for max worker threads.
SQL Server 2008 instance with 8 processors will create 576 worker threads but in sp_configure it is hardcoded to 128 threads which caused contention in this case.
So we then changed this value to
EXEC sp_configure 'max worker threads',0
RECONFIGURE WITH OVERRIDE
and restarted instance, now we longer have the 18056 error.
THREADPOOL waits only occur when a task is waiting to get assigned to a worker thread. This wait means server is out of worker threads and can’t bind the new connection request to a worker thread. In most cases the cause of THREADPOOL waits is that existing workers created by SQL Server are tied up with a long blocking chain or running long running pre-emptive work such as extended stored procedures. So when the connection (login) cannot get a worker thread, sp_reset_connection command timed out and it recorded the Error: 18056, Severity: 20, State: 29. The client was unable to reuse a session with SPID 350, which had been reset for connection pooling. The failure ID is 29.
For other 18056 State numbers, please refer this blog for the description http://blogs.msdn.com/b/psssql/archive/2010/08/03/how-it-works-error-18056-the-client-was-unable-to-reuse-a-session-with-spid-which-had-been-reset-for-connection-pooling.aspx
Also, the 18056 error could be noise in certian scenarios, refer http://blogs.msdn.com/b/psssql/archive/2010/05/05/error-18056-can-be-unwanted-noise-in-certain-scenarios.aspx for more information
We have the max workerthreads running value and it is pointing to 0 but still
gettting 18056 error
Reply from Sakthi:Then it should be a performance problem which needs details bottleneck analysis.