I wrote the following stored procedure which creates a table with the specified number of partitions:
CREATE PROCEDURE CreateTable @N INTASBEGIN DECLARE @Cmd VARCHAR(8000) DECLARE @I INT SET @Cmd = 'CREATE PARTITION FUNCTION PF(INT) AS RANGE FOR VALUES (1' SET @I = 2 WHILE @I < @N BEGIN SET @Cmd = @Cmd + ',' + CONVERT(VARCHAR(4),@I) SET @I = @I + 1 END SET @Cmd = @Cmd + ')' EXECUTE (@Cmd) CREATE PARTITION SCHEME PS AS PARTITION PF ALL TO ([PRIMARY]) CREATE TABLE T1 (A INT, B INT) CREATE CLUSTERED INDEX T1A ON T1(A) ON PS(A)END
I then created a table with either 2, 10, 100, or 1000 partitions. For each table, I ran the following batch:
SET NOCOUNT ONDECLARE @T DATETIMEDECLARE @I INT, @J INT, @K INTSET @I = 0SET @J = 0SET @T = GETDATE()WHILE @J < 10000 BEGIN SELECT @K = B FROM T1 WHERE A < @I SET @J = @J + 1 ENDSELECT DATEDIFF (MS, @T, GETDATE())
The main SELECT statement in this batch scans a single partition but, because the WHERE clause references a variable, SQL Server does not know this information at compile time. This query is too fast to measure a single execution accurately. Thus, I magnify the cost by repeating the query 10,000 times in a WHILE loop. The SET NOCOUNT ON statement and the assignment in the SELECT statement merely serve to minimize client communication costs which would otherwise dominate the batch cost.
The table is empty which makes this test a worst case scenario. The cost of the query is dominated by the cost of locating the correct partition.
Looking at the SQL Server 2005 STATISTICS PROFILE output for the 1000 partition table, we can see that SQL Server tests each partition before scanning only the one that is not eliminated:
0 1 |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1004]) PARTITION ID:([PtnIds1004])) 1 1 |--Filter(WHERE:([PtnIds1004]<=RangePartitionNew([@I],(0),(1),(2),(3),... 1000 1 | |--Constant Scan(VALUES:(((1)),((2)),((3)),... 0 1 |--Clustered Index Seek(OBJECT:([T1].[T1A]), SEEK:([T1].[A] < [@I]) ORDERED FORWARD PARTITION ID:([PtnIds1004]))
By comparison, SQL Server 2008 seeks directly to the correct partition:
0 1 |--Clustered Index Seek(OBJECT:([T1].[T1A]), SEEK:([PtnId1000] >= (1) AND [PtnId1000] <= RangePartitionNew([@I],(0),(1),(2),(3),...
Here are the results. In each case, before collecting any data, I ran the batch once to eliminate compilation overhead and warm up costs. I then took the median of 5 runs. Times are in milliseconds for the full 10,000 executions of the query. Note that we cannot compare the SQL Server 2005 results directly to the SQL Server 2008 results as I ran these tests on two different machines. However, the relative cost of adding partitions in SQL Server 2005 vs. SQL Server 2008 is clear.
# Partitions
SQL Server 2005
SQL Server 2008
2
233
216
10
313
203
100
1216
1000
18060
It is important to remember that these are times for 10,000 executions of the query. Thus, even on SQL Server 2005 with 1000 partitions, the cost of a single execution of the query is only 1.8 milliseconds. Moreover, if the table contains any real data and if the query returns even a modest number of rows, the cost of the query will be dominated by the actual data processing.
Thanks Craig for the concrete information...
This is good information, thanks
Nice way to show the difference.
Thanks,