The other day we got a call from a puzzled friend. He has rewritten a stored procedure using table variable instead of temp table, it makes the stored procedure code look more tidy. The puzzling part is the same stored procedure now running a lot slower. The estimated plans looked the same using temp table and table variable, but the execution plans are very different. Unbeknownst to him, rewriting the procedure using table variable bumped into a query optimizer blind spot. While the “Estimated” and “Actual” query plans are same in most cases, there are some exception. If a table (temporary or permanent) is created in the same batch with the query, SQL Serve has to recompile the query since the query definition is unknown when the batch is compiled the first time. This is not the case for table variables (and that was the main reasons SQL Server 2005 introduce them – to reduce recompilations when unnecessary). In order to explain in detail, let’s use the following example:
--0. create test data
use test
if (OBJECT_ID('Test1') is not null)
DROP TABLE Test1
go
CREATE TABLE Test1 (ID int)
DECLARE @i int
SET @i = 0
SET NOCOUNT ON
WHILE @i < 20000
BEGIN
INSERT INTO Test1 (ID) Values (@i)
SET @i = @i + 1
END
CREATE CLUSTERED INDEX IX_Test1 ON dbo.Test1 (ID)
--1. Query using table variable
DECLARE @Tmp1 TABLE (ID int)
INSERT INTO @Tmp1(ID)
SELECT ID
FROM Test1
SELECT *
FROM Test1
WHERE ID NOT IN (SELECT ID FROM @Tmp1)
--2. Query using temp table
CREATE TABLE #Tmp1(ID int)
INSERT INTO #Tmp1(ID)
SELECT ID
FROM Test1
SELECT *
FROM Test1
WHERE ID NOT IN (SELECT ID FROM #Tmp1)
DROP TABLE #Tmp1
The estimated plans for both SELECT queries in the example above look the same – Nested Loops (Left Anti Semi Join) with the outer (top in the graphical display) table the Clustered index on Test1 and the inner table is the table variable in the first case and the temporary table in the second case. But if you run both scripts you will see that the first runs considerably slower – a minute or two depending on your processor speed, memory available to SQL Server and disk I/O capacity. How is this possible?
When the queries are compiled neither the table variable nor temporary table are populated and the query optimizer assumes “minimum” number of rows and that is always 1 in SQL Server. This estimation is used when SQL Server generates the “Estimated” query plan. While the “Estimated” and “Actual” query plans are same in most cases, there are some exception. If a table (temporary or permanent) is created in the same batch with the query, SQL Serve has to recompile the query since the query definition is unknown when the batch is compiled the first time. This is not the case for table variables.
Now turn on “Display Actual Execution Plan” in SSMS and re-run the scripts 1. in one query window and 2. in another one. You will see that the actual plan for script 1 is the same as estimated plan and took substantially longer than the script 2 to complete. The actual execution plan for Script 2 is very different from its estimated plan. The actual execution plan for script 2 uses Hash Match to execute the Anti Semi Join at the top of the plan. There is a table scan of the temporary table #Tmp1 at the top of the plan building the hash table. If you examine the properties of this scan you will find out that the number of Estimated rows is 20000 (was 1 in the Estimated plan as well as it is 1 for the table variable plan).
Figure 1. Estimated plan

Figure 2. Actual plan of SELECT query using temp table
![clip_image001[10]](http://blogs.msdn.com/blogfiles/sqlcat/WindowsLiveWriter/Tablevariablevs.temptabl.actualqueryplan_A74C/clip_image001%5B10%5D_thumb.jpg)
When examining query plans be careful to consider possible plan change during the batch execution due to recompiles. Bear in mind that while table variables are not causing recompiles, you may need just the opposite to get the best query plan. You should always use temporary tables in your queries unless you intentionally want to avoid query recompilation and you are confident query variable will give you a good plan. Introduction of statement level recompilation in SQL Server 2005 substantially lowered the need to use table variables in your queries.
You will find some great information about table variables and temporary tables in
http://databases.aspfaq.com/database/should-i-use-a-temp-table-or-a-table-variable.html .
Please note that the actual execution plans for the above batches will change if you merge 1 and 2 into to a single batch and will change again if you merge all three into single batch… There are some interesting learnings surrounding such modifications and we will talk about them in some later blog entries! Stay tuned!
Lubor Kollar, Lindsey Allen
Under Windows Server 2000 and 2003 (RTM), the interrupts from a single network adaptor (NIC) cannot be handled by multiple CPU Cores. A Deferred Procedure Call (DPC) call gets scheduled to run as a consequence of the NIC firing an interrupt. The DPC will deliver the received packets from the NIC to the networking subsystem of the OS. The NIC driver will block interrupts from the network card until the DPC has been handled. If your system makes heavy use of network bandwidth, sending all interrupts and DPC to one CPU may overwhelm the system and cause a bottleneck. In extreme cases the system may even become unresponsive as the amount of DPC traffic consumes all CPU cycles.
One workaround, if you run Windows 2000 and 2003 Server, is to use the Interrupt Affinity Filter Tool. This tool, which applies only to x86 version of for Windows 2000/2003 is documented here: KB 252867. Using the Interrupt Affinity Filter Tool, you can affinitize different network adaptors to different CPU cores. This overcomes parts of the problem and increases scalability by distributing interrupts between CPU cores. Be aware that the Windows 2008 tool for network affinity is not the same tool as under Windows 2000/2003 Server. You can specify network affinity in Window 2008 by editing the registry directly or by using the new Interrupt-Affinity Policy Tool.
The Windows 2003 Scalable Networking Pack (SNP) improves network scaling further by introducing Receive Side Scaling (RSS). RSS is also included in both Windows 2003 SP2 and Windows 2008. This improvement includes the ability to concurrently and asynchronously handle DPCs from a single network adaptor. The improvements are documented here: Scalable Networking with RSS. Do be aware that your network card must support this feature to gain the full advantage of the implementation.
Even with RSS, high bandwidth workloads can still benefit from affinitizing the network cards to specific CPU cores. For example, using interrupt affinity and RSS was a crucial optimization for the ETL World Record. When scaling Integration Services to 64 cores, we dedicated every 8th core – one core and NIC for each NUMA node – to network traffic. Using once NIC per NUMA node keeps the memory local to the node. The rest of the cores we affinitized to SQL Server. At full bulk insert speed on the 64 cores Unisys ES/7000, our CPU load looked like this:
Notice the highlighted cores – these cores are dedicated to handling network traffic. The system had throughput of over 800 MB/sec at the point above.
Other uses of network affinity and RSS include: Analysis Services Process Data, Data Mart Loading, Heavy ETL activity and other scenarios where the network cards on the server are under heavy load.
/Thomas Kejser
Reviewers: Stuart Ozer, Carl Rabeler, Denny Lee, Prem Mehra, Ashit Gosalia, Ahmed Talat, Kun Cheng, Burzin Patel, Special Thanks to Rade Trimceski
Thomas Kejser and I are doing some work to produce the SQL Server 2008 Analysis Services Performance Guide. Among a bunch of other things, it discusses the IIF function. But because I recently mentioned MDX query hints in a recent podcast, I wanted to get this out quickly. Some references are a bit hazy (expensive vs inexpensive query plans, default values and the like) and I’ll elaborate either in future blogs or in the white paper itself.
Anyway, the IIF mdx function is a commonly used expression that can be very costly to evaluate. It takes 3 arguments:
iif(<condition>, <then branch>, <else branch>)
Where the condition evaluates to true, the value from the “then branch” is used otherwise the “else branch” expression is used.
Note the term “used” – one or both branches may be evaluated even if its value is not used. It may be cheaper for the engine to evaluate the expression over the entire space and use it when needed (termed an eager plan) rather than chop up the space into a potentially enormous number of fragments and evaluate only where needed (a strict plan).
The first consideration is whether the query plan is expensive or inexpensive. An expensive query plan is one that is evaluated in cell-by-cell mode (more on this in a future post) or one that requires reading a cube data (most commonly because a member navigation function is used). Most IIF condition query plans are inexpensive – but nasty nested conditions with more IIFs can force it to go to cell-by-cell (functions that are not cell by cell are listed in the sql documentation).
The next consideration the engine makes is what value the condition takes most. This is driven by the condition’s default value. If the condition’s default value is true, then the “then branch” is the default branch – the branch that is evaluated over most of the subspace. Simple investigation and knowing a few simple rules on how the condition is evaluated helps to determine the default branch:
· In sparse expressions most cells are empty. So the default value of the isempty function on a sparse expression is true.
· Comparison to zero of a sparse expression is true
· Default value of IS operator is false
· If the condition cannot be evaluated in subspace mode, there is no default branch
For example, one of the most common uses of the IIF function is to check whether the denominator is non-zero:
iif([Measures].[Internet Sales Amount]=0, null, [Measures].[Internet Order Quantity]/[Measures].[Internet Sales Amount])
There is no calculation on Internet Sales Amount so it is sparse. Therefore the default value of the condition is true and therefore the default branch is the then branch with the null expression.
The table below shows how each branch of an IIF function is evaluated:
|
Branch Query Plan |
Branch is default branch |
Branch expression sparsity |
Evaluation |
|
Expensive |
n/a |
n/a |
Strict |
|
Inexpensive |
Unknown |
n/a |
Eager |
|
Inexpensive |
True |
n/a |
Eager |
|
Inexpensive |
False |
Dense |
Strict |
|
Inexpensive |
False |
Sparse |
Eager |
In SQL Server 2008 Analysis Services, you can overrule the default behavior with query hints:
Iif( <condition>, <then branch> [hint [Eager|Strict]], <else branch> [hint [Eager | Strict]] )
When would you want to override the default behavior? The most common scenarios where you might want to change the default behavior are:
· Engine determines the query plan for the condition is expensive and evaluates each branch in strict mode
· Condition is evaluated in cell by cell mode and each branch is evaluated in eager mode
· Branch expression is dense but easily evaluated.
For example, consider the simple expression below taking the inverse of a measure:
with member
measures.x as
iif(
[Measures].[Internet Sales Amount]=0
, null
, (1/[Measures].[Internet Sales Amount]))
select {[Measures].x} on 0,
[Customer].[Customer Geography].[Country].members on 1
from [Adventure Works]
cell properties value
The query plan is not expensive, the else branch is not the default branch and the expression is dense, so it is evaluated in strict mode. This forces the engine to materialize the space over which it is evaluated. (This can be seen in profiler with query subcube verbose events selected – but it’s a time consuming investigating each event’s subspace definition).
To prevent the query plan from partitioning the space, the query can be modified as follows:
with member
measures.x as
iif(
[Measures].[Internet Sales Amount]=0
, null
, (1/[Measures].[Internet Sales Amount]) hint eager)
select {[Measures].x} on 0,
[Customer].[Customer Geography].[Country].members on 1
from [Adventure Works]
cell properties value
OVERVIEW: One of the common complaints in using database snapshots is how to get queries and reports to switch over and start using the new snapshot as soon as it is available. There are three basic ways to accomplish this:
1. Delete the old and rename the new database snapshot. This works well if no users are using either snapshot.
2. Changing the connection string. This works pretty well and has the advantage that long running reports can continue to run while new reports/queries get directed to the new database snapshot. But doesn’t work well if you have many places to change the connection string or have to deploy a new application version.
3. Use Synonyms in for the base objects to point to a database snapshot. That’s what this blog is all about.
Steps to take to implement synonyms with database snapshots:
1. Create db snapshot
2. Create synonyms in main db that point to snapshot objects
3. Sometime in the future, create another snapshot
4. Update the synonyms in a DDL transaction. Warning: Blocking can occur so read the blocking section below.
Important Notes:
· You may get some application timeouts if synonym updates get blocked for too long.
· READ_UNCOMMITTED could be problem if you implanted scenario where a query would join one table from one snapshot to a table in a different snapshot.
· Could combine synonyms with shared scalable database. This is a scenario where you could use multiple servers to attach to SAN snapshots and then use synonyms to point to the current SAN snapshot.
Potential Blocking Scenario
The DDL transaction to update all the synonyms to point to another snapshot will update system tables. This transaction will be blocked by any readers accessing the synonym table, which can cause problems. And it will block any new queries from accessing the synonym table while the transaction is running. It could mean that the application can experience timeouts if it is blocked for too long.
Trying to work around the blocking by creating the reader sessions with READ_UNCOMMITTED isolation level would probably work but is not recommended. I can imagine a scenario where a query can access a table in the new snapshot and join to a table in the old snapshot, which is the main reason why the synonyms are updated in a transaction. So the blocking is actually a good thing as long as it doesn’t cause excessive application time outs.
The DDL transaction runs quickly. On my laptop, I updated 300 synonyms in about 1 second. There are certain ERP applications out there with 10,000+ tables so this synonym idea would be a bit more painful to implement.
One clarification: Queries that are already running will continue to run. New ones are only blocked from resolving synonyms. Once they have resolved the synonym name and start reading the tables then there will not be blocking.
Sample Code:
To use this technique, we need to look at the original, or source database, three different ways; 1) the actual source database, which we’ll call SourceDB, 2) a snapshot of the source database, which we’ll call SnapshotDB, and 3) the reporting database, which we’ll call the ReportDB. The SnapshotDB represents a point-in-time version of SourceDB, and is created using SQL similar to the following:
USE SourceDB
CREATE DATABASE SourceDB_<timestamp>_Snapshot ON
( NAME = SourceDB, FILENAME =
'C:\SourceDB\SourceDB_<timestamp>_Snapshot.ss' )
AS SNAPSHOT OF SourceDB;
GO
The ReportDB contains no source data, only synonyms. The synonyms point to the relevant tables and views in SnapshotDB, and are created using SQL similar to the following:
USE ReportDB
CREATE SYNONYM dbo.TableName FOR
SourceDB_<timestamp>_Snapshot.dbo.TableName
With this basic setup, all report-type queries can be executed against the ReportDB, using the same schema and object names defined in SourceDB, and the returned data will be datetime-consistent within the point-in-time source database snapshot, SnapshotDB. Additional objects, such as views, stored procedures, etc., can be created in ReportDB that refer to the synonyms, and will behave as expected.
In most cases, it will be desired to periodically drop and re-create SnapshotDB to have a more current point-in-time version of SourceDB. However, to maintain 24/7 application access while not requiring the application to use different connection strings, we cannot simply drop SnapshotDB, as queries against the synonyms in ReportDB would immediately fail, and dropping any database requires there are no open connections to it. We therefore must create a second, more current SnapshotDB, transactionally drop and re-create all the synonyms updated to point to the new SnapshotDB, then drop the old SnapshotDB. There may be queries still running against the first snapshot so you may want to comment out the DROP DATABASE command and have a separate job to do it at a later time. Beware that the more database snapshots you have active the more of a performance hit you will notice. See the whitepaper at the end.
The attached store procedure, usp_ReportingSynoSnap, implements this logic, using the appropriate metadata from SourceDB. It will create any schemas in ReportDB that exist in SourceDB, creating the corresponding synonyms in the correct schema. If access to SnapshotDB is only through synonyms in ReportDB, then there will effectively by no open connections to SnapshotDB, only to ReportDB. If those connections are active, executing queries against the synonyms, they will block the stored procedure from completing until the blocking queries finish resolving the synonym names, but once they have the stored procedure will drop the old SnapshotDB.
The stored procedure assumes the ReportDB already exists, and will only contain synonyms pointing to SnapshotDB, as it always drops all synonyms to ensure there are no orphans. The template used for naming SnapshotDB is <SourceDB>_Rpt_YYYYMMDD_HHMMSS_ss, where YYYYMMDD_HHMMSS is replaced with an actual timestamp of when the snapshot was created. Creating snapshots require enumerating all the logical data files for SourceDB, declaring a corresponding snapshot physical file for each. The stored procedure creates a snapshot physical file in the same location as the source physical file, named identically except for a suffix with the following format .Rpt_YYYMMDD_HHMMSS_ss (ex; AdventureWorks.mdf.Rpt_20080101_010101_ss), where YYYYMMDD_HHMMSS will be identical to the timestamp within the SnapshotDB name. Lastly, the stored procedure will drop the old SnapshotDB, determining the name of the old snapshot by finding a snapshot database with the same name as the new snapshot, but with the highest (in sort order) YYYYMMDD_HHMMSS. To use the stored procedure, create in either the SourceDB or ReportDB, and call with “EXEC usp_ReportingSynoSnap ‘SourceDbName’, ‘ReportDbName’”; ex: “EXEC usp_ReportingSynoSnap ‘AdventureWorks’, ‘AdventureWorks_Reporting’” (AdventureWorks_Reporting must exist before calling). If you need to keep the old snapshot around for a while, just remove or comment out the DROP DATABASE command near the end of the script.
The HHMMSS suffix in the example above may be misleading. It may take seconds or minutes depending on what is happening in the source database. And the data in the snapshot is not current as of the start or the end of the CREATE DATABASE command. All of this is explained in the whitepaper links at the end.
It should be noted that Windows will report the logical snapshot file size as identical to the corresponding source file size, however as the snapshot file is actually a sparse file, a feature of NTFS, the actual size consumed on disk will initially be zero, and will grow incrementally by 64k page sizes as data is changed in SourceDB, and SQL saves the original (before the changes are applied) SourceDB page to the snapshot (note this only happens the first time data is changed on the source page). This has two consequences, one is that there is a normally a slight write performance hit when using snapshots, and second, that if snapshots are not dropped and lots of new changes happen in the source, you may eventually end up consuming near twice the physical space actually needed for the source. Beware that there are some performance consequences that you need to be aware of when running snapshots. See the whitepapers below for more detail.
Copy the following script and run it in a query window to create the stored procedure in one of your application databases.
SOURCE CODE
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[usp_ReportingSynoSnap]
@sourceDbName NVARCHAR( MAX )
,@reportDbName NVARCHAR( MAX )
,@scriptOnly BIT = 0
AS
BEGIN
DECLARE
@sqlStmt NVARCHAR( MAX )
,@dtStamp NVARCHAR( MAX )
,@prevRptSsName NVARCHAR( MAX )
SET @dtStamp = REPLACE( REPLACE( REPLACE( CONVERT( NVARCHAR, GETDATE(), 120 ), '-', '' ), ' ', '_' ), ':', '' )
SELECT @prevRptSsName = MAX( ssdb.name )
FROM sys.databases ssdb
JOIN sys.databases db
ON db.database_id = ssdb.source_database_id
WHERE db.name = @sourceDbName
AND ssdb.name LIKE @sourceDbName + '_Rpt[_]%[_]ss'
SET @sqlStmt =
'
DECLARE srcSchemaCursor CURSOR LOCAL FAST_FORWARD FOR
SELECT DISTINCT
TABLE_SCHEMA
FROM ' + @sourceDbName + '.INFORMATION_SCHEMA.TABLES
DECLARE srcDbFileCursor CURSOR LOCAL FAST_FORWARD FOR
SELECT f.name
,f.physical_name
FROM sys.master_files f
JOIN sys.databases d
ON d.database_id = f.database_id
WHERE d.name = ''' + @sourceDbName + '''
AND f.type = 0
DECLARE rptSynonymCursor CURSOR LOCAL FAST_FORWARD FOR
SELECT sc_name = sc.name
,syn_name = syn.name
FROM ' + @reportDbName + '.sys.schemas sc
JOIN ' + @reportDbName + '.sys.synonyms syn
ON syn.schema_id = sc.schema_id
DECLARE srcTableCursor CURSOR LOCAL FAST_FORWARD FOR
SELECT TABLE_SCHEMA
,TABLE_NAME
FROM ' + @sourceDbName + '.INFORMATION_SCHEMA.TABLES
DECLARE
@schema_name NVARCHAR( MAX )
,@table_name NVARCHAR( MAX )
,@syn_name NVARCHAR( MAX )
,@file_name NVARCHAR( MAX )
,@file_physical_name NVARCHAR( MAX )
,@tmpSqlStmt NVARCHAR( MAX )
,@synSqlStmt NVARCHAR( MAX )
,@isFirstFile BIT
SET @tmpSqlStmt =
''
CREATE DATABASE [' + @sourceDbName + '_Rpt_' + @dtStamp + '_ss] ON''
SET @isFirstFile = 1
OPEN srcDbFileCursor
GOTO srcDbFileCursor_fetchFirst
WHILE @@FETCH_STATUS = 0
BEGIN
SET @tmpSqlStmt = @tmpSqlStmt + CASE WHEN @isFirstFile = 0 THEN '','' ELSE '''' END +
''
(NAME = ['' + @file_name + ''], FILENAME = '''''' + @file_physical_name + ''.Rpt_' + @dtstamp + '_ss'''')''
SET @isFirstFile = 0
srcDbFileCursor_fetchFirst:
FETCH NEXT FROM srcDbFileCursor INTO
@file_name
,@file_physical_name
END
CLOSE srcDbFileCursor
DEALLOCATE srcDbFileCursor
SET @tmpSqlStmt = @tmpSqlStmt +
''
AS SNAPSHOT OF ' + @sourceDbName + '''
SET @synSqlStmt =
''
EXEC sys.sp_executesql N'''''' + REPLACE( @tmpSqlStmt, '''''''', '''''''''''' ) + ''''''
USE ' + @reportDbName + '''
OPEN srcSchemaCursor
GOTO srcSchemaCursor_fetchFirst
WHILE @@FETCH_STATUS = 0
BEGIN
IF NOT EXISTS (SELECT * FROM ' + @reportDbName + '.sys.schemas WHERE name = @schema_name)
SET @synSqlStmt = @synSqlStmt +