In the past couple of months, SQL Server support team has come across some customers running into performance issues attributed to TokenAndPermUserStore in SQL Server 2005. This blog post attempts to compile all the information we have so far regarding this problem.
TokenAndPermUserStore is one of the many caches present in the SQL Server 2005 memory architecture. As the name implies, this cache stores various security related information used by the SQL Server Engine. If you are curious to know the different types of tokens currently cached in this cache on your server, the following query will provide you the information:
SELECT COUNT(*) as TokenCount, *
FROM
(SELECT
x.value('(//@name)[1]', 'varchar (100)') AS [Token Name],
x.value('(//@class)[1]', 'bigint') AS [Class],
x.value('(//@subclass)[1]', 'int') AS [SubClass]
FROM
(SELECT CAST (entry_data as xml)
FROM sys.dm_os_memory_cache_entries
WHERE type = 'USERSTORE_TOKENPERM')
AS R(x)
) a
GROUP BY [Token Name],[Class],[SubClass]
NOTE: Please do not run this command on a production server during peak load as it could take a while to finish depending upon the number of entries present in the cache.
The problems we have seen are specific to token type TokenAccessResult with a class of 65535. These tokens represent information about cumulative permission checks for queries. For more information on this, refer to the KB article: Queries take a longer time to finish running when the size of the TokenAndPermUserStore cache grows in SQL Server 2005.
There are several indicators you can monitor to determine if you are running into this class of problems.
1. The amount of memory used by this security token cache
2. The number of entries present in this security token cache
3. The extent of contention on this security token cache
To find out the amount of memory consumed by this token cache, you can query the DMV’s as follows:
SELECT SUM(single_pages_kb + multi_pages_kb) AS "SecurityTokenCacheSize(kb)"
FROM sys.dm_os_memory_clerks
WHERE name = 'TokenAndPermUserStore'
There is no specific threshold for this size beyond which the problem starts to happen. The characteristic you need to monitor is the rate at which this cache size is growing. If you are encountering problems with this cache, then you will notice that as the size of the cache grows, the nature of the problems you experience becomes worse. On a sample server that experienced this problem, the cache grew at a rate approximately 1MB per min to reach close to 1.2 GB. We have seen the problem starting to show up even when the size of this cache reaches several hundred MB.
Next, to understand if there is contention while accessing this cache, you have to execute commands like the following:
Servers running builds < SQL 2005 SP2
SET NOCOUNT ON
CREATE TABLE #spins([Spinlock Name] varchar(50),Collisions numeric,Spins numeric,[Spins/Collision] float)
INSERT INTO #spins EXECUTE ('DBCC SQLPERF (''SPINLOCKSTATS'')')
SELECT TOP 20 * FROM #spins ORDER BY Collisions DESC
DROP TABLE #spins
Servers running builds >= SQL 2005 SP2
SET NOCOUNT ON
CREATE TABLE #spins([Spinlock Name] varchar(50),Collisions numeric,Spins numeric,[Spins/Collision] float,[Sleep Time (ms)] numeric,Backoffs numeric)
INSERT INTO #spins EXECUTE ('DBCC SQLPERF (''SPINLOCKSTATS'')')
SELECT TOP 20 * FROM #spins ORDER BY Collisions DESC
DROP TABLE #spins
In this output, the row you need to pay attention to is the following one:
|
Spinlock Name |
Collisions |
Spins |
Spins/Collision |
Sleep Time (ms) |
Backoffs |
|
MUTEX |
7,714,984 |
1,020,994,305,589 |
132,339 |
55,000,343 |
101,024,820 |
|
MUTEX |
8,143,896 |
1,055,469,534,588 |
129,603 |
61,055,906 |
104,387,627 |
This sample output was captured across 30 minute duration on a server experiencing this problem. Basically this output shows that various threads inside the SQL Server process is contending for a spinlock named MUTEX. The values above are various attributes that indicate the extent and nature of contention. Spinlock is a very lightweight synchronization mechanism used with the SQL Server engine. Depending upon the data structure that a particular spinlock protects, it is given a unique name within the SQL Engine. MUTEX is the name of the spinlock which protects the security token cache (among a few other things) that we are discussing here. Very similar to the size, what you want to monitor is the rate at which the various values here increase.
The symptoms that you want to correlate with the above data points include a combination of the following:
1. Queries which normally finish faster take a long time
2. CPU usage of SQL Server process is relatively higher. CPU usage could come down after remaining high for a period of time.
3. Connections from your applications keep increasing (specifically in connection pool environments)
4. You encounter connection or query timeouts
Once you have analyzed all this data and made a determination, then we recommend you to ensure that you have applied the following fixes on a SQL 2005 SP2 instance. Here is a short description on what fix each build is going to offer you:
SQL Server 2005 SP2 build [9.00.3042.00]
TokenAccessResult security cache entries for ad hoc queries are cached when a specific ad hoc query is executed for the second time
KBA: 927396: Queries take a longer time to finish running when the size of the TokenAndPermUserStore cache grows in SQL Server 2005
SQL Server 2005 post SP2 hotfix build [9.00.3153.00]
Fix to prevent duplication of TokenAccessResult entries due to change in the protection timestamp used.
KBA: 933564: A gradual increase in memory consumption for the USERSTORE_TOKENPERM cache store occurs in SQL Server 2005
SQL Server 2005 post SP2 hotfix build [9.00.3171.00]
Fix to prevent memory allocation contention for TokenAccessResult entries
KBA: 937745: You may receive error messages when you try to log in to an instance of SQL Server 2005 and SQL Server handles many concurrent connections
SQL Server 2005 post SP2 hotfix build [09.00.3179.00]
Fix to prevent Memory consumption increase by the TokenandPermUserStore even if the number of entries does not increase
KBA: 939871: Not yet published to support.microsoft.com
In order to get all of these fixes, you can install the Cumulative update package 3 for SQL Server 2005 Service Pack 2. This will take you to build 9.00.3186.00. You might also install a later Cumulative Update package and that will include all of these fixes as well.
Microsoft SQL Server Best Practices Analyzer will generate the following warning:
Apply Cumulative Update to Improve Query Performance (SQL Server BPA)
On versions of SQL Server before 9.0.3171.0, a known issue can degrade query performance on 32-bit and 64-bit systems with 2 gigabytes (GB) or more of memory. When you execute queries under the context of a login that is not a member of the sysadmin fixed server role, you might encounter performance degradation symptoms that arise from a large and growing Security Token cache. These issues can include performance degradation of queries, high CPU usage for