12.0 Plan Cache Trace Events and Performance Counters
12.1 Trace Events
12.1.1 Performance Statistics Trace Event
The performance statistics trace event introduced in SQL Server 2005, gives persisted plan and runtime statistics information. With the information part of the trace event data when combined with the information available through the plan cache DMV sys.dm_exec_query_stats the performance history of any query can be reconstructed. The advantage of capturing this information via trace along with other plan cache trace events is that the trace data can be persisted by saving it to a trace file on the disk. The trace file contains a chronological list of trace events fired including query plans that can be analyzed for poor plans causing performance problems.
In SQL Server 2000, monitoring applications would have to capture SQL:BatchStarting, SQL:BatchCompleted and the ShowPlanXML events typically and analyze the trace data from all these events for performance problems. The batch starting and completed events are fired every time batch is executed and can be high volume events causing some performance degradation. In SQL Server 2005, the performance statistics trace event essentially obliterates the need to monitor these three different events. The trace data from the performance statistics trace event can be analyzed for poor plans causing performance problems. Performance statistics trace event is also a relatively lower volume trace (compared to the batch started and completed events) event since it is fired for unique queries and not every query.
The performance statistics trace event has 4 subclasses:
select distinct * from sys.trace_subclass_values
where trace_event_id = 165
and trace_column_id = 21
order by trace_event_id, subclass_value
go
|
Trace_event_id |
Trace_column_id |
Subclass_name |
Subclass_value |
|
165 |
21 |
SQL |
0 |
|
165 |
21 |
SP:Plan |
1 |
|
165 |
21 |
Batch:Plan |
2 |
|
165 |
21 |
QueryStats |
3 |
Sql batch text event (Subclass_name: SQL) is fired when sql text is inserted into the SQLMGR cache. This event subclass has batch scope; it fired once for every batch. The sql batch text event is not fired for object plans.
The object plan (Subclass_name: SP:Plan) and adhoc plan (Subclass_name: Batch:Plan) events are scoped at the statement level and are fired whenever the query is compiled or recompiled. They are not fired for queries with recompile hints. The adhoc plan events are fired only for DML and select queries and not for DDL, declare statements, while loop statements etc.
The run time stats event (Subclass_name: QueryStats) is fired when a compiled plan is removed from cache due to reasons including recompilation or memory pressure. The TextData column here contains the query execution statistics.
BOL has information regarding the trace event columns and how to interpret the trace data in the various columns for each even subclass.
Let us look at an example to see how to use performance statistics trace event:
Create a file trace with the performance statistics trace event (event id = 165) on the server:
Now suppose we have table that contains the all sales information and has three columns: transaction id, country id, and total_amount. The transaction id is the primary key column. In addition, the table has a non clustered index on the country id column. We also know that majority of the sales occur in country with id = 1, and country with id = 2 has far fewer rows in the sales table. If we had a stored procedure that reported sales by country id as below:
create proc sales_report @country_id int
as begin
select * from sales where country_id = @country_id
end
go
Now let us suppose this procedure is executed with parameter value 2 to generate the sales report for country id 2:
exec sales_report 2
go
The parameter value is sniffed in this case and we do an index seek (more details availbel in this previous posting).
Now consider the scenario where the system after achieving certain predictable performance for a period of time, now shows a sudden deterioration(an unexpected) in performance when the same stored procedure is re-executed with same parameter value (country id 2). Analyzing the performance statistics trace data in conjunction with the query statistics reported in sys.dm_exec_query_stats give important clues in understanding the reason for this unexpected drop in performance. Now let us check the DMVs to find get the query plan and the execution time statistics:
select total_worker_time/execution_count as avg_cpu_time,
substring(st.text, (qs.statement_start_offset/2) + 1,
((case statement_end_offset
when -1
then datalength(st.text)
else
qs.statement_end_offset
end
- qs.statement_start_offset)/2) + 1) as statement_text
, cast(query_plan as xml)
from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
cross apply sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset)
order by total_worker_time/execution_count desc;
go
|
Avg_
Cpu_
Time |
Statement_
Text |
Query_plan |
|
126408 |
select * from sales where country_id = @country_id |
<ShowPlanXML
.
.
PhysicalOp=
"Clustered Index Scan"
LogicalOp=
"Clustered Index Scan"
.
<ParameterList>
<ColumnReference
Column="@country_id"
ParameterCompiledValue="(1)" />
.
</ShowPlanXML> |
From the query plan we can see that we are now doing an index scan and the sniffed parameter value is 1. This means the plan that was cached after the initial execution of the stored procedure with parameter value country id = 2 was probably removed from cache. Further, it a new plan for the stored procedure was inserted into cache, this time with parameter value country id = 2.
The performance statistics trace event data can confirm if indeed this was what happened. The stored procedure below analyzes the performance statistics trace data looking for object plans that are first inserted into cache and later deleted by matching the sql and plan handles. From the run time stats event, the average CPU time of the deleted plan is computed. The deleted query plan is obtained from the TextData column of the object plan event. The average CPU time of the new cached plan is computed from sys.dm_exec_query_stats. The query plan is available in sys.dm_exec_text_query_plan.
create procedure analyze_perf_stats_trace_data
as
begin
if object_id('trace_data', 'U') is not null
drop table trace_data;
if object_id('perf_stats_data_inserted', 'U') is not null
drop table perf_stats_data_inserted;
if object_id('perf_stats_data_deleted', 'U') is not null
drop table perf_stats_data_deleted;
if object_id('perf_stats_inserted_with_corresponding_deleted_object_events', 'U') is not null
drop table perf_stats_inserted_with_corresponding_deleted_object_events;
if object_id('perf_stats_deleted_with_corresponding_inserted_object_events', 'U') is not null
drop table perf_stats_deleted_with_corresponding_inserted_object_events;
if object_id('perf_stats_removed_plan', 'U') is not null
drop table perf_stats_removed_plan;
create table perf_stats_inserted_with_corresponding_deleted_object_events (TextData xml, EventClass int,
EventSubClass int, SqlHandle varbinary(64),
PlanHandle varbinary(64), StartTime datetime,
EventSequence int, PlanGenerationNumber bigint,
IntegerData2 int, Offset int);
create table perf_stats_deleted_with_corresponding_inserted_object_events (TextData xml, EventClass int,
EventSubClass int, SqlHandle varbinary(64),
PlanHandle varbinary(64), StartTime datetime,
EventSequence int, PlanGenerationNumber bigint,
IntegerData2 int, Offset int);
---read all the trace data from file
select * into trace_data from
::fn_trace_gettable ('c:\temp\perfstats.trc', default);
---select all rows corresponding to all object events
---cast the Sql and Plan Handle into varbinary(64)
select convert(xml, TextData) as TextData, EventClass,
EventSubClass, convert(varbinary(64), SqlHandle) as SqlHandle,
convert(varbinary(64), PlanHandle) as PlanHandle,
StartTime, EventSequence, BigintData1,
IntegerData2, Offset
into perf_stats_data_inserted
from trace_data
where EventClass = 165 and
EventSubClass = 1;
---select all rows corresponding to query stats events
---these events are generated when an plan is removed from cache
---for object and adhoc plans
select convert(xml, TextData) as TextData, EventClass,
EventSubClass, convert(varbinary(64), SqlHandle) as SqlHandle,
convert(varbinary(64), PlanHandle) as PlanHandle,
StartTime, EventSequence, BigintData1,
IntegerData2, Offset
into perf_stats_data_deleted
from trace_data
where EventClass = 165 and
EventSubClass = 3;
/* for every query statistics (plan deleted event) find the corresponding plan inserted events (object plans only). The plan inserted event should have lower event sequence number tha the QE stats event so we find plan insert events that occured earlier than the QE stats events. It should also have the
same plan handle as the QE stats event, statement start and end offset as the QE stats event. The QE stats and object plan events should also have the same plan generation number (bigintdata1). By matching plan generation numbers
we avoid duplicate rows from recompiles.
*/
declare @TextData xml, @EventClass int, @EventSubClass int, @SqlHandle varbinary(64), @PlanHandle varbinary(64), @StartTime datetime, @EventSequence int, @PlanGenerationNum bigint,
@IntegerData2 int, @Offset int;
declare event_cursor cursor for select * from perf_stats_data_deleted;
open event_cursor;
fetch next from event_cursor into @TextData, @EventClass, @EventSubClass, @SqlHandle, @PlanHandle, @StartTime, @EventSequence, @PlanGenerationNum, @IntegerData2, @Offset;
while @@fetch_status =0
begin
insert into perf_stats_inserted_with_corresponding_deleted_object_events
select * from perf_stats_data_inserted where EventSequence < @EventSequence
and PlanHandle = @PlanHandle and BigIntData1 = @PlanGenerationNum
and IntegerData2 = @IntegerData2 and Offset = @Offset;
if exists(select * from perf_stats_data_inserted where EventSequence < @EventSequence
and PlanHandle = @PlanHandle)
insert into perf_stats_deleted_with_corresponding_inserted_object_events values(@TextData, @EventClass, @EventSubClass, @SqlHandle,
@PlanHandle, @StartTime, @EventSequence, @PlanGenerationNum, @IntegerData2, @Offset)
fetch next from event_cursor into @TextData, @EventClass, @EventSubClass, @SqlHandle,
@PlanHandle, @StartTime, @EventSequence, @PlanGenerationNum, @IntegerData2, @Offset;
end
close event_cursor;
deallocate event_cursor;
---compute the CPU time of the deleted plan from the QE stats
---event Text Data, plan handle, sql handle, and query plan that ---was removed and save the data into a table
select
d.TextData.value('(/QueryExecutionStats/WorkerTime/@Total)[1]', 'bigint') /
d.TextData.value('(/QueryExecutionStats/GeneralStats/@ExecutionCount)[1]', 'bigint')
as avg_cpu_time_removed,
i.TextData as query_plan_removed,
d.PlanGenerationNumber as plangenerationnum_removed,
d.PlanHandle as planhandle_removed,
d.SqlHandle as sqlhandle
into perf_stats_removed_plan
from
perf_stats_deleted_with_corresponding_inserted_object_events d, perf_stats_inserted_with_corresponding_deleted_object_events i
where i.PlanHandle = d.PlanHandle;
---note that plan generation numbers >2 indicate recompiles
select * from perf_stats_removed_plan;
/* Check if sys.dm_exec_query_stats has any rows with same Sql Handle and plan generation one greater than the deleted plan
If yes, get the CPU time, sql text, query plan, plan handle, sql handle compare results from this query with rows from perf_stats_removed_plan to indentify queries that are running slower */
declare @Sql_Handle varbinary(64), @PlanGenNum bigint;
declare event_cursor cursor for
select distinct sqlhandle, plangenerationnum_removed from perf_stats_removed_plan;
open event_cursor;
fetch next from event_cursor into @Sql_Handle, @PlanGenNum;
while @@fetch_status =0
begin
if exists(select * from sys.dm_exec_query_stats where sql_handle = @Sql_Handle and plan_generation_num <= @PlanGenNum + 1)
select total_worker_time/execution_count as avg_cpu_time_current,
substring(st.text, (qs.statement_start_offset/2) + 1,
((case statement_end_offset
when -1
then datalength(st.text)
else
qs.statement_end_offset
end
- qs.statement_start_offset)/2) + 1) as statement_text,
cast(query_plan as xml) as query_plan_current,
plan_generation_num as plan_generation_num_current,
plan_handle as planhandle_current,
sql_handle as sqlhandle
from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
cross apply sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset)
where qs.sql_handle = @Sql_Handle and
qs.plan_generation_num <= @PlanGenNum + 1 ;
fetch next from event_cursor into @Sql_Handle, @PlanGenNum;
end
close event_cursor;
deallocate event_cursor;
end
go
exec analyze_perf_stats_trace_data
go
The results returned by the stored procedure as below.
|
Avg_
cpu_
time_
Removed |
Query_
Plan_
Removed |
Plan
gene
ration
num_
removed |
Plan
handle_
removed |
Sql
Handle |
|
229 |
<ShowPlanXML
.
.
PhysicalOp="Index Seek" LogicalOp="Index Seek"
.
.
Column="@country_id" ParameterCompiledValue="(2)" />
.
.
</ShowPlanXML> |
1 |
0x050001
00894682
0FB8010C
05000000
00000000
00000000
00 |
0x030001
00894682
0F639506
01B89800
00010000
00000000
00 |
|
Avg_
cpu_
time_
current |
Statement_
text |
Query_
Plan_
current |
Plan
gene
ration
num_
current |
Plan
handle_
current |
Sql
Handle |
|
126408 |
select * from sales where country_id = @country_id |
<ShowPlanXML
.
.
PhysicalOp=
"Clustered
Index Scan"
LogicalOp=
"Clustered
Index Scan"
.
.
Column=
"@country_id" Parameter
CompiledValue
="(1)" />
.
.
</ShowPlanXML> |
1 |
0x050001
00894682
0FB8C118
05000000
00000000
00000000
00 |
0x030001
00894682
0F639506
01B89800
00010000
00000000
00 |
From the result set returned by the above stored procedure we can explain the slow down in execution time due to a new query plan. The plan handles are different while the sql handles are the same indicating we have a different plan for the exact same batch text. The plans before and after have a plan generation number of 1 indicating no recompiles occurred.
In the old query plan the parameter value was sniffed and we did an index seek to get all rows with country id 2. After the plan is deleted from cache (due to memory pressure or cache flush), the stored procedure is now executed with parameter value (country id = 1). The query plan does an index scan for sniffed parameter value 1. Now, when the stored procedure is executed with country id 2, we get poor performance because we re-use the cached plan which is not the optimal plan for country id 2.
Now let us look an example with recompiles:
create table t1 (a int, b int)
go
insert into t1 values (1, 1)
go
create procedure RowCountDemo
as
begin
declare @i int;
set @i = 0;
while (@i < 1000)
begin
insert into t1 values (@i, 2*@i - 50);
select a from t1 where a < 10 or ((b > 20 or a >=100) and (a < 10000)) group by a ;
set @i = @i + 1;
end
end
go
exec RowCountDemo
go
As described in this whitepaper, after 500 modifications to a table, any stored procedure referencing that table will need to be recompiled because the table statistics needs to be refreshed.
Now execute procedure analyze_perf_stats_trace_data and examine the result sets generated:
|
Avg_
cpu_
time_
Remo
ved |
Query_
Plan_
Removed |
Plan
gen
era
tion
num_
remo
ved |
Plan
handle_
removed |
Sql
Handle |
|
3558 |
<ShowPlanXML
.
.
<RelOp
NodeId="0" PhysicalOp
="Sort" LogicalOp
="Distinct Sort"
.
.
</ShowPlanXML>
|
1 |
0x050001
0047E572
35B82199
04000000
00000000
00000000
00 |
0x030001
0047E572
35DD9000
01BA9800
00010000
00000000
00 |
|
Avg_
cpu_
time_
current |
Statement_
text |
Query_
Plan_
current |
Plan
gene
ration
num_
current |
Plan
handle_
current |
Sql
Handle |
|
53 |
insert into #t1 values (@i, 2*@i - 50); |
<ShowPlanXML
.
.
<RelOp
NodeId="0" PhysicalOp=
"Table
Insert"
LogicalOp
="Insert"
.
.
</ShowPlanXML>
|
1 |
0x050001
0047E572
35B82199
04000000
00000000
00000000
00 |
0x030001
0047E572
35DD9000
01BA9800
00010000
00000000
00 |
|
1166 |
select a from #t1 where a < 10 or ((b > 20 or a >=100) and (a < 10000)) group by a ; |
<ShowPlanXML
.
|