12.1 Trace Events
12.1.1 Performance Statistics Trace Event
The performance statistics trace event introduced in SQL Server 2005, gives persisted plan and runtime statistics information. With the information part of the trace event data when combined with the information available through the plan cache DMV sys.dm_exec_query_stats the performance history of any query can be reconstructed. The advantage of capturing this information via trace along with other plan cache trace events is that the trace data can be persisted by saving it to a trace file on the disk. The trace file contains a chronological list of trace events fired including query plans that can be analyzed for poor plans causing performance problems.
In SQL Server 2000, monitoring applications would have to capture SQL:BatchStarting, SQL:BatchCompleted and the ShowPlanXML events typically and analyze the trace data from all these events for performance problems. The batch starting and completed events are fired every time batch is executed and can be high volume events causing some performance degradation. In SQL Server 2005, the performance statistics trace event essentially obliterates the need to monitor these three different events. The trace data from the performance statistics trace event can be analyzed for poor plans causing performance problems. Performance statistics trace event is also a relatively lower volume trace (compared to the batch started and completed events) event since it is fired for unique queries and not every query.
The performance statistics trace event has 4 subclasses:
select distinct * from sys.trace_subclass_values
where trace_event_id = 165
and trace_column_id = 21
order by trace_event_id, subclass_value
go
Trace_event_id
Trace_column_id
Subclass_name
Subclass_value
165
21
SQL
0
SP:Plan
1
Batch:Plan
2
QueryStats
3
Sql batch text event (Subclass_name: SQL) is fired when sql text is inserted into the SQLMGR cache. This event subclass has batch scope; it fired once for every batch. The sql batch text event is not fired for object plans.
The object plan (Subclass_name: SP:Plan) and adhoc plan (Subclass_name: Batch:Plan) events are scoped at the statement level and are fired whenever the query is compiled or recompiled. They are not fired for queries with recompile hints. The adhoc plan events are fired only for DML and select queries and not for DDL, declare statements, while loop statements etc.
The run time stats event (Subclass_name: QueryStats) is fired when a compiled plan is removed from cache due to reasons including recompilation or memory pressure. The TextData column here contains the query execution statistics.
BOL has information regarding the trace event columns and how to interpret the trace data in the various columns for each even subclass.
Let us look at an example to see how to use performance statistics trace event:
Create a file trace with the performance statistics trace event (event id = 165) on the server:
Now suppose we have table that contains the all sales information and has three columns: transaction id, country id, and total_amount. The transaction id is the primary key column. In addition, the table has a non clustered index on the country id column. We also know that majority of the sales occur in country with id = 1, and country with id = 2 has far fewer rows in the sales table. If we had a stored procedure that reported sales by country id as below:
create proc sales_report @country_id int
as begin
select * from sales where country_id = @country_id
end
Now let us suppose this procedure is executed with parameter value 2 to generate the sales report for country id 2:
exec sales_report 2
The parameter value is sniffed in this case and we do an index seek (more details availbel in this previous posting).
Now consider the scenario where the system after achieving certain predictable performance for a period of time, now shows a sudden deterioration(an unexpected) in performance when the same stored procedure is re-executed with same parameter value (country id 2). Analyzing the performance statistics trace data in conjunction with the query statistics reported in sys.dm_exec_query_stats give important clues in understanding the reason for this unexpected drop in performance. Now let us check the DMVs to find get the query plan and the execution time statistics:
select total_worker_time/execution_count as avg_cpu_time,
substring(st.text, (qs.statement_start_offset/2) + 1,
((case statement_end_offset
when -1
then datalength(st.text)
else
qs.statement_end_offset
- qs.statement_start_offset)/2) + 1) as statement_text
, cast(query_plan as xml)
from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
cross apply sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset)
order by total_worker_time/execution_count desc;
Avg_
Cpu_
Time
Statement_
Text
Query_plan
126408
<ShowPlanXML
.
PhysicalOp=
"Clustered Index Scan"
LogicalOp=
<ParameterList>
<ColumnReference
Column="@country_id"
ParameterCompiledValue="(1)" />
</ShowPlanXML>
From the query plan we can see that we are now doing an index scan and the sniffed parameter value is 1. This means the plan that was cached after the initial execution of the stored procedure with parameter value country id = 2 was probably removed from cache. Further, it a new plan for the stored procedure was inserted into cache, this time with parameter value country id = 2.
The performance statistics trace event data can confirm if indeed this was what happened. The stored procedure below analyzes the performance statistics trace data looking for object plans that are first inserted into cache and later deleted by matching the sql and plan handles. From the run time stats event, the average CPU time of the deleted plan is computed. The deleted query plan is obtained from the TextData column of the object plan event. The average CPU time of the new cached plan is computed from sys.dm_exec_query_stats. The query plan is available in sys.dm_exec_text_query_plan.
create procedure analyze_perf_stats_trace_data
as
begin
if object_id('trace_data', 'U') is not null
drop table trace_data;
if object_id('perf_stats_data_inserted', 'U') is not null
drop table perf_stats_data_inserted;
if object_id('perf_stats_data_deleted', 'U') is not null
drop table perf_stats_data_deleted;
if object_id('perf_stats_inserted_with_corresponding_deleted_object_events', 'U') is not null
drop table perf_stats_inserted_with_corresponding_deleted_object_events;
if object_id('perf_stats_deleted_with_corresponding_inserted_object_events', 'U') is not null
drop table perf_stats_deleted_with_corresponding_inserted_object_events;
if object_id('perf_stats_removed_plan', 'U') is not null
drop table perf_stats_removed_plan;
create table perf_stats_inserted_with_corresponding_deleted_object_events (TextData xml, EventClass int,
EventSubClass int, SqlHandle varbinary(64),
PlanHandle varbinary(64), StartTime datetime,
EventSequence int, PlanGenerationNumber bigint,
IntegerData2 int, Offset int);
create table perf_stats_deleted_with_corresponding_inserted_object_events (TextData xml, EventClass int,
---read all the trace data from file
select * into trace_data from
::fn_trace_gettable ('c:\temp\perfstats.trc', default);
---select all rows corresponding to all object events
---cast the Sql and Plan Handle into varbinary(64)
select convert(xml, TextData) as TextData, EventClass,
EventSubClass, convert(varbinary(64), SqlHandle) as SqlHandle,
convert(varbinary(64), PlanHandle) as PlanHandle,
StartTime, EventSequence, BigintData1,
IntegerData2, Offset
into perf_stats_data_inserted
from trace_data
where EventClass = 165 and
EventSubClass = 1;
---select all rows corresponding to query stats events
---these events are generated when an plan is removed from cache
---for object and adhoc plans
into perf_stats_data_deleted
EventSubClass = 3;
/* for every query statistics (plan deleted event) find the corresponding plan inserted events (object plans only). The plan inserted event should have lower event sequence number tha the QE stats event so we find plan insert events that occured earlier than the QE stats events. It should also have the
same plan handle as the QE stats event, statement start and end offset as the QE stats event. The QE stats and object plan events should also have the same plan generation number (bigintdata1). By matching plan generation numbers
we avoid duplicate rows from recompiles.
*/
declare @TextData xml, @EventClass int, @EventSubClass int, @SqlHandle varbinary(64), @PlanHandle varbinary(64), @StartTime datetime, @EventSequence int, @PlanGenerationNum bigint,
@IntegerData2 int, @Offset int;
declare event_cursor cursor for select * from perf_stats_data_deleted;
open event_cursor;
fetch next from event_cursor into @TextData, @EventClass, @EventSubClass, @SqlHandle, @PlanHandle, @StartTime, @EventSequence, @PlanGenerationNum, @IntegerData2, @Offset;
while @@fetch_status =0
insert into perf_stats_inserted_with_corresponding_deleted_object_events
select * from perf_stats_data_inserted where EventSequence < @EventSequence
and PlanHandle = @PlanHandle and BigIntData1 = @PlanGenerationNum
and IntegerData2 = @IntegerData2 and Offset = @Offset;
if exists(select * from perf_stats_data_inserted where EventSequence < @EventSequence
and PlanHandle = @PlanHandle)
insert into perf_stats_deleted_with_corresponding_inserted_object_events values(@TextData, @EventClass, @EventSubClass, @SqlHandle,
@PlanHandle, @StartTime, @EventSequence, @PlanGenerationNum, @IntegerData2, @Offset)
fetch next from event_cursor into @TextData, @EventClass, @EventSubClass, @SqlHandle,
@PlanHandle, @StartTime, @EventSequence, @PlanGenerationNum, @IntegerData2, @Offset;
close event_cursor;
deallocate event_cursor;
---compute the CPU time of the deleted plan from the QE stats
---event Text Data, plan handle, sql handle, and query plan that ---was removed and save the data into a table
select
d.TextData.value('(/QueryExecutionStats/WorkerTime/@Total)[1]', 'bigint') /
d.TextData.value('(/QueryExecutionStats/GeneralStats/@ExecutionCount)[1]', 'bigint')
as avg_cpu_time_removed,
i.TextData as query_plan_removed,
d.PlanGenerationNumber as plangenerationnum_removed,
d.PlanHandle as planhandle_removed,
d.SqlHandle as sqlhandle
into perf_stats_removed_plan
from
perf_stats_deleted_with_corresponding_inserted_object_events d, perf_stats_inserted_with_corresponding_deleted_object_events i
where i.PlanHandle = d.PlanHandle;
---note that plan generation numbers >2 indicate recompiles
select * from perf_stats_removed_plan;
/* Check if sys.dm_exec_query_stats has any rows with same Sql Handle and plan generation one greater than the deleted plan
If yes, get the CPU time, sql text, query plan, plan handle, sql handle compare results from this query with rows from perf_stats_removed_plan to indentify queries that are running slower */
declare @Sql_Handle varbinary(64), @PlanGenNum bigint;
declare event_cursor cursor for
select distinct sqlhandle, plangenerationnum_removed from perf_stats_removed_plan;
fetch next from event_cursor into @Sql_Handle, @PlanGenNum;
if exists(select * from sys.dm_exec_query_stats where sql_handle = @Sql_Handle and plan_generation_num <= @PlanGenNum + 1)
select total_worker_time/execution_count as avg_cpu_time_current,
- qs.statement_start_offset)/2) + 1) as statement_text,
cast(query_plan as xml) as query_plan_current,
plan_generation_num as plan_generation_num_current,
plan_handle as planhandle_current,
sql_handle as sqlhandle
where qs.sql_handle = @Sql_Handle and
qs.plan_generation_num <= @PlanGenNum + 1 ;
exec analyze_perf_stats_trace_data
The results returned by the stored procedure as below.
cpu_
time_
Removed
Query_
Plan_
Plan
gene
ration
num_
removed
handle_
Sql
Handle
229
PhysicalOp="Index Seek" LogicalOp="Index Seek"
Column="@country_id" ParameterCompiledValue="(2)" />
0x050001
00894682
0FB8010C
05000000
00000000
00
0x030001
0F639506
01B89800
00010000
current
text
"Clustered
Index Scan"
Column=
"@country_id" Parameter
CompiledValue
="(1)" />
0FB8C118
From the result set returned by the above stored procedure we can explain the slow down in execution time due to a new query plan. The plan handles are different while the sql handles are the same indicating we have a different plan for the exact same batch text. The plans before and after have a plan generation number of 1 indicating no recompiles occurred.
In the old query plan the parameter value was sniffed and we did an index seek to get all rows with country id 2. After the plan is deleted from cache (due to memory pressure or cache flush), the stored procedure is now executed with parameter value (country id = 1). The query plan does an index scan for sniffed parameter value 1. Now, when the stored procedure is executed with country id 2, we get poor performance because we re-use the cached plan which is not the optimal plan for country id 2.
Now let us look an example with recompiles:
create table t1 (a int, b int)
insert into t1 values (1, 1)
create procedure RowCountDemo
declare @i int;
set @i = 0;
while (@i < 1000)
insert into t1 values (@i, 2*@i - 50);
select a from t1 where a < 10 or ((b > 20 or a >=100) and (a < 10000)) group by a ;
set @i = @i + 1;
exec RowCountDemo
As described in this whitepaper, after 500 modifications to a table, any stored procedure referencing that table will need to be recompiled because the table statistics needs to be refreshed.
Now execute procedure analyze_perf_stats_trace_data and examine the result sets generated:
Remo
ved
gen
era
tion
remo
3558
<RelOp
NodeId="0" PhysicalOp
="Sort" LogicalOp
="Distinct Sort"
0047E572
35B82199
04000000
35DD9000
01BA9800
53
insert into #t1 values (@i, 2*@i - 50);
NodeId="0" PhysicalOp=
"Table
Insert"
LogicalOp
="Insert"
1166
select a from #t1 where a < 10 or ((b > 20 or a >=100) and (a < 10000)) group by a ;
NodeId="1" PhysicalOp
="Table
Scan"
There are several things to note here:
1. Now the removed plan (from the query stats event) has a plan generation number of 1, while the current plan has a plan generation number of 2, and this indicates a recompile of the select query occurred. Also notice that both the sql and plan handles are the same because the query was recompiled.
2. A stored procedure can have several queries. For each one of these queries an object plan event is fired. To find the exact query that caused the recompile and therefore the query stats event to be fired, we would need to search for:
a. Object plan events that occurred earlier than the query stats event (i.e. object plan events with a lower EventSequence than the query stats events.
b. Object plan events that have the same plan handle as the query stats event.
c. Object plan events that have the same statement offsets (trace columns IntegerData2 and Offset) as the query stats event.
This would identify the object plan event corresponding to the query stats event.
12.1.2 Cache Hit, Miss, Insert and Remove Events
The SP:Cache Hit, Miss, Insert and Remove events are fired for compiled plans only. None of these events are part of the default trace. In this section we will describe briefly when these trace events are fired and changes made to plan cache trace events in SQL Server 2005.
It was observed that 99% of cases when a SP:Cache Miss event was fired, it was followed immediately by a SP:Cache Insert event with the exact same information. In the interest of keeping trace event volume low and eliminating duplicate trace information, in SQL Server 2005 we fire the SP:Cache Miss event only if the cache miss is not immediately followed by an cache insert. In other words, we defer the decision to fire the SP:Cache Miss event until we determine if we will fire an SP:Cache Insert event.
Consider a case when the compiled plan for the query below is not found in cache, and new compiled plan is inserted into cache:
select * from t1 where col1 = 5
TextData
Event
Class
Data
base
Id
Name
Object
Type
SPID
(@1 tinyint)SELECT * FROM [t1] WHERE [col1]=@1
SP:Cache
Insert
master
2573
0849
20816 - PQ
58
9039
31361
20801 - AQ
When this query is executed for the first time, since a compiled plan for the query is not found in cache, the query is compiled and both the parameterized and the shell query are inserted into cache. In this case we fire the SP:Cache Insert event two times: once for the parameterized query and once for the shell query. Notice that we don’t fire the SP:Cache Miss event at all in this case. The firing of the SP:Cache Miss event is deferred until we determine whether or not to fire an SP:Cache Insert event. If we do fire the cache insert event, we will not fire the cache miss event. This optimization significantly reduces event volume without compromising on information we provide to the user via trace events. Another interesting thing to note here is that for prepared and adhoc compiled plans, the objectid column does not contain the ID of the object in the query as reported in sys.objects. It is a hash of the sql text. The DatabaseId column gives the id of the database from which the query was executed.
We adopt a similar strategy for stored procedures/functions as well as demonstrated in the example below:
exec sales_report @country_id = 2
sales_report
2636
71987
8272 - P
When a stored procedure is executed for the first time, the SP:Cache Insert event is fired and the text data of the trace event contains the stored procedure name, and not the entire body of the stored procedure. Notice that in this case too the SP:Cache Miss event is not fired. The object ID in this case is that of the stored procedure as reported in sys.objects.
Now let us consider an example where we will fire the SP:Cache Miss Event:
create table t1(col1 int primary key, col2 int, col3 int)
Miss
6993
70779
Certain DDL statements are not cached e.g. create table DDL. When the create table DDL is executed, we fire the SP:Cache Miss event because we are sure that it will not be followed immediately by a SP:Cache Insert event.
There are 2 conditions under which a compiled plan is removed from cache: memory pressure or procedure cache flush. In both these cases we fire the SP:Cache Remove event. In SQL Server 2005 we distinguish these two by providing different eventsubclasses for them: 1 - Compplan Remove, 2 - Proc Cache Flush. To help identify database operations flushing the plan cache in SQL Server 2005 SP2 an Errorlog event is fired with the following text data: "SQL Server has encountered %d occurrence(s) of cachestore flush for the '%s' cachestore (part of plan cache) due to some database maintenance or reconfigure operations". Note that the event if fired in 5 minute intervals as long as there was a cache flush operation in that time interval. For example:
dbcc freeproccache
Sub
Entire Procedure Cache Flushed
Remove
2 - Proc Cache Flush
Consider a case we query a table in an auto-close database which results in plan cache flush:
alter database test_db set auto_close on
select * from test_db.dbo.test_table
2 - Proc
Cache
Flush
18
2006-12-31 16:17:13.93 spid61 Setting database option AUTO_CLOSE to ON for database test_db.
ErrorLog
61
2006-12-31 16:17:14.24 spid18s SQL Server has encountered 1 occurrence(s) of cachestore flush for the 'Object Plans' cachestore (part of plan cache) due to some database maintenance or reconfigure operations.
2006-12-31 16:17:14.24 spid18s SQL Server has encountered 1 occurrence(s) of cachestore flush for the 'SQL Plans' cachestore (part of plan cache) due to some database maintenance or reconfigure operations.
2006-12-31 16:17:14.24 spid18s SQL Server has encountered 1 occurrence(s) of cachestore flush for the 'Bound Trees' cachestore (part of plan cache) due to some database maintenance or reconfigure operations.
The SP:Cache Hit event is fired when we find the compiled plan and/or the MXC in the cache for query submitted for execution. If we find the MXC in the cache from previous execution, then we will fire SP:Cache Hit event with eventsubclass: 1- Execution Context Hit. For example:
select * from t1 where col1 = 5;
In this case, the after the first query is compiled the parameterized query is inserted into the cache. After query execution the execution context is as cached as well. Since the exact same query is re-executed and the execution context was cached, we get an execution context cache hit event.
5610
17432
Hit
1 -
Exec
ution Context
If the MXC has been deleted due to memory pressure, or it was never cached, or if its not locally available on the same node in a multi-node machine, then we just fire the SP:Cache Hit event with Eventsubclass 2- Compplan Hit. For example:
select * from t1 where col1 = 6;
In this case after the compilation of the first query the parameterized query is cached and this fires the cache insert event. For the second query in the batch we have a cache hit for the compiled plan but not an execution context hit since the parameter values are different in the second query.
20816
- PQ
2 -
Comp
plan
The SP:Recompile and SQL:StmtRecompile trace event is discussed in this whitepaper. The plan cache related trace events and their different eventsubclasses can be found using:
where trace_event_id in (34, 35, 36, 37, 38)
Trace_
event_
id
column_
Subclass_
name
value
36
Compplan Remove
Proc Cache Flush
37
Schema changed
Statistics changed
Deferred compile
Set option change
4
Temp table changed
5
Remote rowset changed
6
For browse permissions changed
7
Query notification environment changed
8
PartitionView changed
9
Cursor options changed
10
Option (recompile) requested
11
38
Execution Context Hit
Compplan Hit
12.2 Performance Counters
Plan cache counters defined in SQL Server 2005 include Cache Hits Ratio, Cache Pages, Cache Objects Count, and Cache Objects in Use. Plan cache performance counters in SQL Server 2005 are available under the performance object ‘SQLServer: Plan Cache’. The plan cache performance counters have one instance for each of five cache stores: Bound Trees, Extended Stored Procedures, Object Plans, Sql Plans and Temporary Tables and Table Variables and one additional _Total instance. In other words, in SQL Server 2005 we now report hit ratio, cache pages, object counts etc per cache store instead of the individual types of cached objects.
Examine the SQL Plans and Object Plans instances of the Cache Hit Ratio counter to see if compiled plans are being re-used. Low cache hit ratio (<20%) along with a sustained query execution rate (SQLServer: SQL Statistics\Batch Requests/sec) indicates that compiled plans are not being re-used. It should be noted that the hit ratio counter may be skewed by internal lookups performed.
The Cache Pages counter is useful counter to watch when performance degradation observed on the server. For a workload with adhoc queries, typically the database pages increase along with the plan cache pages. It in SQL Server 2005 RTM And SP1 since the plan cache can grow to about ~80% of the buffer pool without being under memory pressure, this would leave very little room for database pages. This results in the database pages being evicted and subsequent performance degradation. To verify if this is indeed what is happening track the following performance counters: 'SQL Server: Plan Cache\Cache Pages(_Total)' and 'SQLServer: BufferManager\Database pages' and look for a significant decrease in the database pages counter as the plan cache pages counter increases and reaches the threshold value after which memory pressure sets in. This problem has been addressed in SQL Server 2005 SP2 by limiting the cache store sizes (more details will be available in blog post titled ‘Improvements made to Plan Cache behavior in SQL Server 2005 SP2’). Other potential solutions would be to reduce the amount of caching by examining the workload to see if queries have been parameterized, or can be rewritten with stored procedures.
Other performance counters that provide useful plan cache related information include SQL Server:SQL Statistics\Unsafe Auto-Params/sec. An increment in the SQL Server:SQL Statistics\Unsafe Auto-Params/sec performance counter indicates that the query was unsafe auto-parameterized. Likewise the counters SQL Server: SQL Statistics\Forced Parameterizations/sec and SQL Server: SQL Statistics\Safe Auto-Params/sec give then rate of forced parameterizations and safe auto parameterizations.
The counters SQLServer: SQL Statistics\SQL Compilations/ sec and SQLServer: SQL Statistics\SQL Re-Compilations/ sec give the number of compilations and recompilations per second. In order to benefit from caching, ideally the rate of compilations and recompilations should be low.
One additional thing to note here is that queries compiled with a RECOMPILE hint are not actually considered recompiles. The word recompile in these hints is a misnomoer because the new plan is generated before query execution begins these would more accurately be classified as compiled and not recompiles. Hence we see an increment in the SQL Server:SQL Statistics\SQL Compilations/sec counter and not the SQL Server: SQL Statistics\SQL Re-compilations/sec counter. Consider the example below: Sales table has three columns: transaction id, country id, and total_amount. We also know that majority of the sales occur in country with id = 1, and country with id = 2 has far fewer rows in the sales table. If we had a stored procedure that reported sales by country id as below:
If we now executed the stored proc with country id 2 first, then the query_plan chosen based on parameter sniffing is a sub-optimal plan for country id 1. To avoid parameter sniffing, we could adopt one of three approaches:
Approach 1: Create the procedure by adding WITH RECOMPILE to stored procedure definition as below.
create proc sales_report @country_id int with recompile
Before taking approach one, evaluate if the compilation time of the stored procedure is relatively small because this approach would mean compilation of all queries inside the stored procedure. Also the compiled plan is not inserted into cache.
Approach 2: Add the WITH (RECOMPILE) hint to the query inside the stored procedure
select * from sales where country_id = @country_id option (recompile)
Before taking approach two, evaluate if additional compilation time of the query relative to the execution time savings is relatively small then this approach is may provide the more suitable solution.
Approach 3: Execute the stored procedure WITH RECOMPILE.
Approach 3 implies a new plan will be compiled for every execution of the stored procedure. The compiled plan generated will not be inserted into cache. However, a previously cached plan for the stored procedure will not be removed.
exec sales_report @country_id = 2 with recompile
For all three approaches, the compilations counter is incremented, and not the recompilations counter.