This article describes how query memory grant works in Microsoft SQL Server. It applies to both SQL2005 and SQL2008. Written by Jay Choe, Microsoft SQL Server Engine.
Query memory grant (a.k.a. query work buffer) is a part of server memory used to store temporary row data while sorting and joining rows. It is called "grant" because the server requires those queries to "reserve" before actually using memory. This reservation improves query reliability under server load, because a query with reserved memory is less likely to hit out-of-memory while running, and the server prevents one query from dominating entire server memory. In the following sections, I will describe how SQL server estimates query memory requirement, and how it throttles memory grants when multiple queries are competing.
When SQL server receives a user query, it follows well-defined steps to produce a result for user. First, it creates a compiled plan, which is a set of logical instructions such as how to join rows. Next, it creates an execution plan based on the compiled plan. This execution plan contains instructions with all logical references translated to actual objects, and tracking mechanisms for query execution. Finally, the server starts execution from the top of instruction tree. Creating a compiled plan is expensive because the server needs to find the optimal plan out of hundreds of potential candidates. Distinguishing between compile and execution helps overall server performance because each compiled plan can be cached and shared among multiple execution plans. The memory grant estimate follows the same overall sequence. It has parameters saved in compiled plan, and a mechanism to calculate actual grant size at execution time.
Memory consumersNow that I briefly covered the lifetime of query execution, I would like to show where query memory grant fits in overall server memory usage. As hinted in the previous paragraph, a successful query execution involves 3 major memory consumers: compile, cache, and memory grant.
SQL Server maintains the balance between these 3 consumers with internal facility called "memory broker". Based on the usage and physical memory available, the memory broker sets the usage limit and tells each component to trim its memory usage if it anticipates a shortage. Generally, a well behaving server would have about the same contributions from these consumers.
Grant parametersWhen SQL server creates a compiled plan, it calculates two memory grant parameters called "required memory" and "additional memory".
For example, let's consider a simple query which needs to sort 1 million rows of 10 bytes each in size. The required memory for this query is 512KB because this is the minimum amount SQL server needs to construct internal data structures to handle one sort. Since it would take 10MB to store all rows, the additional memory would be 10MB (slightly higher when overhead is included). This calculation becomes complicated if the compiled plan has multiple sorts and joins because SQL server also considers the lifetime of each operator for more efficient memory usage. You would generally see smaller estimate than the sum of all sorts and joins. If you are interested in the relative memory usages among operators, you can check out <MemoryFractions> tag in Showplan XML. The following sections show how these parameters are used when calculating grant size at runtime.
DOP dependenceIf SQL server has more than 1 CPU, it can run a query in parallel mode for improved performance by sharing work among parallel workers. These workers run independent of each other, and use "parallelism operator (a.k.a. exchange)" to transfer processed rows. This parallel mode increases memory usage because each worker needs its own copy of sort or hash join, and the parallelism operator needs buffers for temporary storage of transferred rows. Since DOP N would use N parallel workers, the query would need N times more required memory. On the other hand, the total number of rows to handle (and memory cost to store them) does not change with DOP. This means that the additional memory would stay the same regardless of DOP setting. Starting with SQL 2008, the buffer memory used by parallelism operator is also counted as a required part of memory grant.
Memory grant processIn the previous section, we discussed how parallelism affects the query memory requirement. In this section, we will discuss how SQL server takes server memory and number of concurrent queries into considerations. The server needs to consider such dynamic factors to avoid committing memory beyond its physical limit. This is done in 2 distinct steps. First, the server calculates how much memory to grant for given query. Then it uses the internal facility called Resource Semaphore to reserve actual memory, or throttle if too many queries ask for memory. First, the following steps show how the request size is determined.
Resource SemaphoreResource Semaphore is responsible for satisfying memory grant requests while keeping overall memory grant usages within the server limit.
Debugging memory grant related issuesSQL server supplies a few dynamic management views (DMV) to help investigate memory grant related issues. Please refer BOL (books-on-line) for detailed information on DMVs discussed below.
sys.dm_exec_query_resource_semaphoresThis DMV shows the current status of Resource Semaphore mentioned in the previous section. On SQL 2005, you will find 2 of them. One with non-null max_target_memory_kb column is called Regular Resource Semaphore, and the other is called Small Resource Semaphore. As its name implies, Regular Resource Semaphore is used by all queries under normal condition, while Small Resource Semaphore is used by small size queries (less than 5 MB) when they have to wait (see step 6 in the previous section). This would improve response time of small queries which are expected to finish very fast. Note max_target_memory_kb column shows the server memory limit used in step 2 of previous section.
sys.dm_exec_query_memory_grantsThis DMV shows all queries consuming memory grants including those waiting in Resource Semaphore queue. Waiting queries would have null grant_time column. Resource Semaphore uses internal query cost estimate to prioritize memory grants, and is_next_candidate column shows which query to wake up when memory is available.
sys.dm_os_wait_statsThis DMV shows wait statistics of all server objects. Memory grant uses "RESOURCE_SEMAHORE" wait type. If you see significant waits on this wait type, you may have an issue with big queries.
Sample queriesThe following sample queries show how memory grant DMV are used.
Find all queries waiting in the memory queue:
SELECT * FROM sys.dm_exec_query_memory_grants where grant_time is null
Find who uses the most query memory grant:
SELECT mg.granted_memory_kb, mg.session_id, t.text, qp.query_plan FROM sys.dm_exec_query_memory_grants AS mgCROSS APPLY sys.dm_exec_sql_text(mg.sql_handle) AS tCROSS APPLY sys.dm_exec_query_plan(mg.plan_handle) AS qpORDER BY 1 DESC OPTION (MAXDOP 1)
Search cache for queries with memory grants:
SELECT t.text, cp.objtype,qp.query_planFROM sys.dm_exec_cached_plans AS cpJOIN sys.dm_exec_query_stats AS qs ON cp.plan_handle = qs.plan_handleCROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qpCROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS tWHERE qp.query_plan.exist('declare namespace n="http://schemas.microsoft.com/sqlserver/2004/07/showplan"; //n:MemoryFractions') = 1