An ISV recently found an anomoly in their implementation of a "drop table partition" function which could lead to unexpected partitions being switched out.
Typically, to do this "drop partition logic" SQL Server partitions are manipulated using the following operations:
· switch-out the partition which isn't needed anymore (or being archived) into an empty target table
· merge / reset the partition function ranges of the partitioned source table. For example; move some of the remaining data into a new partition / filegroup
· possibly drop the target table
The simple, standard way to switch-out a partition is to specify a partition number. The ISV implemented a select statement which used a boundary value of a partition range to return the corresponding partition number. Then the TSQL switch command was used to switch out this partition. This works fine as long as there is only one process doing the "switching".
But now let's assume that two processes would try this at the same time. Both processes will run the selects to get the partition numbers. Then the first one does the switch-out and the merge of the partition function. Unfortunately, the latter command will change the partition numbers as they are dynamically maintained by SQL Server. An activity like merge or splits of partitions will trigger a re-enumeration of all or parts of the partitions of a the table.
Afterwards the partition number which the second process got before the re-enumaration (tiggered by the first process) might not be accurate any more. A switch-out using the old partition number could result in switching out the wrong partition.
The solution is simple. We recommend using the $PARTITION function for the switch-out which allows you to specify a partition boundary value instead of a partition number.
The boundary values are not dynamic and therefore this issue won't come up.
The repro script below does the following:
· create a partitioned test table with 5 boundary values: 1960, 1970, 1980, 1990, 2000
· insert 1 row into the 1970 range, 2 rows into the 1980 range and 5 rows into the 1990 range
· now switch out the 1970 range and the 1980 range by specifying the boundary values
· as expected, three ranges remain: 1960, 1990 with 5 rows and 2000
· now repeat the same test by specifying a partition number instead of the boundary value
· the output after creating the test table shows partition number 2 for boundary 1970 and 3 for boundary 1980
· using "hard-coded" partition numbers for the "switch partition" function simulates the select mentioned above
· however, now the results look different. Like before the boundaries 1960, 1990 and 2000 remain as expected. But the number of rows are not at all what we would expect!
· instead of 5 rows for boundary 1990 we see only 2 ! What happened ?
· well - the first process did the merge of the partition function. This changed the partition number of boundary 1990 from 4 to 3 and the one of boundary value 1980 from 3 to 2. The call of the "switch partition" function with partition number 3 will now switch-out the rows of boundary value 1990. Therfore the 5 rows are gone and the 2 rows of 1980 will be kept. The merge function would still use the correct boundary value. So the list of boundary values looks ok but the content is wrong !
The solution to use the $PARTITION function also works in case two processes interfere between switch-out and the partition function merge.
if exists ( select * from sys.procedures where name = 'dp_reset_test' )
drop procedure dp_reset_test
if exists ( select * from sys.procedures where name = 'dp_list_partitions' )
drop procedure dp_list_partitions
if exists ( select * from sys.procedures where name = 'dp_switch_partition_via_boundary' )
drop procedure dp_switch_partition_via_boundary
if exists ( select * from sys.procedures where name = 'dp_switch_partition_via_partno' )
drop procedure dp_switch_partition_via_partno
set nocount on
-- stored procedure to create partitioned test table
create procedure dp_reset_test
if exists ( select * from sys.objects where name = 'dp_test1' and type = 'U' )
drop table dp_test1
if exists ( select * from sys.objects where name = 'dp_test1_clone' and type = 'U' )
drop table dp_test1_clone
if exists ( select * from sys.partition_schemes where name = 'ps_year' )
drop partition scheme ps_year
if exists ( select * from sys.partition_functions where name = 'pf_year' )
drop partition function pf_year
CREATE PARTITION FUNCTION pf_year (int)
RANGE LEFT FOR VALUES ( 1960, 1970, 1980, 1990, 2000 )
CREATE PARTITION SCHEME ps_year AS PARTITION pf_year ALL TO ([PRIMARY])
create table dp_test1
( col1 int default 99 ,
col3 int primary key
) on ps_year(col3)
create table dp_test1_clone
insert into dp_test1 values ( 1966,1966,1966 )
insert into dp_test1 values ( 1971,1971,1971 )
insert into dp_test1 values ( 1972,1972,1972 )
insert into dp_test1 values ( 1984,1984,1984 )
insert into dp_test1 values ( 1985,1985,1985 )
insert into dp_test1 values ( 1986,1986,1986 )
insert into dp_test1 values ( 1987,1987,1987 )
insert into dp_test1 values ( 1988,1988,1988 )
-- stored procedure to print partitions info
create procedure dp_list_partitions ( @tabname char(20) )
declare @p_number int
declare @p_rows int
declare @boundary_value int
declare p_details cursor for
select partition_number, rows, convert(int,sprv.value)
from sys.partitions sp,
where object_id = OBJECT_ID(@tabname) and
spf.function_id = sprv.function_id and
sprv.boundary_id = sp.partition_number and
( sp.index_id = 1 or sp.index_id = 0 ) and
spf.name = 'pf_year'
order by partition_number
FETCH NEXT FROM p_details
INTO @p_number, @p_rows, @boundary_value
print 'part no ' +
'# rows ' +
WHILE @@FETCH_STATUS = 0
print convert(char(10), @p_number) +
' ' +
convert(char(10), @p_rows) +
' ' +
-- stored procedure to get rid of a partition by specifying the partion number
create procedure dp_switch_partition_via_partno ( @partno int, @boundary int )
truncate table dp_test1_clone
alter table dp_test1 switch partition @partno to dp_test1_clone
alter partition function [pf_year]() merge range (@boundary)
-- stored procedure to get rid of a partition by specifying the boundary value
create procedure dp_switch_partition_via_boundary ( @boundary int, @merge_flag int )
if( @merge_flag = 1 )
alter table dp_test1 switch partition $PARTITION.pf_year(@boundary) to dp_test1_clone
if( @merge_flag = 2 )
if( @merge_flag = 3 )
-- test sample
-- create partitioned test table
print 'Test table with 5 boundary values : '
execute dp_list_partitions 'dp_test1'
print 'Switch partitions with boundary 1970,1980 via boundary value : '
execute dp_switch_partition_via_boundary 1970, 1
execute dp_switch_partition_via_boundary 1980, 1
print 'as expected boundary values 1970 and 1980 are gone and 1990 remains with 5 rows'
-- reset test table
print 'Switch partitions with boundary 1970,1980 via partno : '
execute dp_switch_partition_via_partno 2, 1970
execute dp_switch_partition_via_partno 3, 1980
print 'boundary values 1970 and 1980 are gone but 1990 remains with 2 rows which is unexpected !'
print 'Switch partitions with boundary 1970,1980 via boundary value with '
print '"deferred merge of the partition function" : '
-- switch out only
execute dp_switch_partition_via_boundary 1970, 2
execute dp_switch_partition_via_boundary 1980, 2
-- merge partition function only
execute dp_switch_partition_via_boundary 1970, 3
execute dp_switch_partition_via_boundary 1980, 3