In this post, I’ll give an overview of three interesting properties of iterators that can affect query performance: memory, non-blocking vs. blocking, and dynamic cursor support.
All iterators require some small fixed amount of memory to store state, perform calculations, and so forth. We do not track this fixed memory or try to reserve this memory before executing a query. When we cache an executable plan, we cache this fixed memory so that we do not need to allocate it again and to speed up subsequent executions of the cached plan.
However, some iterators which we refer to as memory consuming iterators require additional memory to execute. This additional memory is used to store row data. The amount of memory required by a memory consuming operator is generally proportional to the number of rows processed. To ensure that the server does not run out of memory and that queries containing memory consuming iterators do not fail, we estimate how much memory we need and we reserve a memory grant before we execute such a query.
Memory consuming iterators can affect performance in a few ways.
The main memory consuming iterators are sort, hash aggregate, and hash join.
Non-blocking vs. blocking iterators
Most iterators can be classified into two categories:
The compute scalar iterator is a simple example of a non-blocking iterator. It read an input row, computes a new output value using the input values from the current row, immediately outputs the new value, and continues to the next input row.
The sort iterator is a good example of a blocking iterator. The sort cannot determine the first output row until it has read and sorted all input rows. (The last input row could be the first output row; there is no way to know until we have seen it.)
Blocking iterators often but not always consume memory. For example, as I noted above, sort consumes memory. The count(*) example which I used my earlier posts (and other scalar aggregates such as sum(*), min(*), max(*), etc.) do not consume memory yet are blocking. It is not possible to know the number of rows until we’ve read and counted them all.
If an iterator has two children, the iterator may be blocking with respect to one and non-blocking with respect to the other. Hash join (which I’ll cover in a future post) is a good example of such an iterator.
Non-blocking iterators are generally optimal for OLTP queries where response time is important. They are especially desirable for TOP N queries or queries with a FAST N hint. Since the goal is to return rows as quickly as possible, we’d like to avoid blocking operators which might process more data than necessary before returning the first rows. Non-blocking iterators can also be useful when evaluating an EXISTS subquery where we again would like to avoid processing more data than necessary to conclude that at least one output row exists.
Hint: If a query requires ordered output, creating the right index can enable the optimizer to find a query plan that uses the index to produce the desired order instead of introducing a blocking sort. This may dramatically improve response time. Of course, there are other reasons to use indexes too.
Dynamic cursor support
The iterators used in a dynamic cursor query plan have special properties. Among other things, a dynamic cursor must be able to return a subset of the result set at a time, must be able to scan forward or backward, and must be able to acquire scroll locks. To support this functionality, an iterator must be able to save and restore its state, scan forward or backward, must process one input row for each output row it produces, and must be non-blocking. Not all iterators have all of these properties.
For a query to be executed using a dynamic cursor, the optimizer must be able to find a query plan that uses only iterators that support dynamic cursors. This is not always possible and this is why some queries cannot be executed using a dynamic cursor.
Hint: Just a creating the right index can enable the optimizer to eliminate a sort, for the same reason, it can sometimes (though not always) enable the optimizer also to find a dynamic cursor query plan. Unfortunately, this is not always possible. For example, there is no way to perform aggregation without violating the one input row for each output row property (though it is sometimes possible to work around this problem using an indexed view).
There are other ways to categorize iterators and there are many nuances that I have not covered, but I believe that the three properties that I mentioned are among the most useful to understand.
After reading Hugo’s post about when snapshot isolation doesn’t really live up to its promise , I decided
In a prior post , I introduced the notion that update plans consist of two parts: a read cursor that