When you compress an object (index or table or partition), there are two components of space savings that you get. First component is fragmentaton (i.e. the original object might have been fragmented). The object gets degragmented as part of doing compression so you get some space savings. Second component is actual data compression savings. Many customers have asked the question 'Is there a way to know how much savings do I get from each of the two components?".

Interestingly, starting with SQL2008R2, you indeed have this capability. Here is one example where I create a table t1 and then simulate fragmentation by updating a column. Once the table is fragemented, I can use the sp_estimate_data_compression stored procedure to determine how much space savings I can get by defragmenting it.

drop table t1

go

 

-- Create the table

create table t1 (c1 int, c2 char(100), c3 varchar (300))

go

 

-- create a clustered index 

create clustered index t1_ci on t1(c1)

 

-- load 10000 rows. Note,  that for column C3, I am storing full-values that

-- take all 300 characters

declare @i int

select @i = 0

 

while (@i < 10000)

begin

     insert into t1 values (@i, 'hello', replicate ('1', 300))

     select @i = @i + 1

end

 

-- Now let us measure the average length of the row. It comes to 417 bytes.

-- This means that there are 19 rows per page

 

select avg_record_size_in_bytes

from sys.dm_db_index_physical_stats (db_id('TEMPDB'),

object_id('T1'),

                   null, null, 'SAMPLED')

where object_name (object_id) like 't1' AND index_id =1

 

 

-- check the space usage. This shows the space usage of 4296 KB

sp_spaceused t1

 

-- This update will cause fragmentation because I am changing the column C3 from 300

-- chars to 50 chars. With the result, the average row length

-- will now become 167 but the SQL Server will still keep 19 rows/page which

-- leaves more that 50% unused space on each page

update t1 set c3 = REPLICATE ('2', 50)

 

-- check the space usage. It shows that the allocated or reserved space

-- has not changed

  sp_spaceused t1

 

-- estimate how much space you can save by defragmentation. Note

-- here I am executing the stored procedure on an 'uncompressed' table

-- with target compression setting is 'NONE',(i.e. no compression). 

                                   

exec sp_estimate_data_compression_savings 'dbo',

                     'T1',1, NULL, 'none'

 

-- This stored proc returns

 

object_name    schema_name index_id    partition_number

------------   ----------- --------    ---------------

T1              dbo          1           1               

 

 

size_with_current_compression_setting(KB) size_with_requested_compression_setting(KB)

----------------------------------------  ------------------------------------------

4480                                      1720     

 

-- To verify this, I rebuild the index and see the space consumption             

alter index t1_ci on t1 rebuild

 

-- check the space usage. It shows that the table now takes 1744KB

  sp_spaceused t1

 

In SQL2008RTM, the stored procedure did not compute the space savings if the source and target compression settings were same. In otherwords, the stored proc would have returned the following instead

 

size_with_current_compression_setting(KB) size_with_requested_compression_setting(KB)

----------------------------------------  ------------------------------------------

4480                                      4480 

 

 

Interestingly, you can use it to get tell-tale signs to know if ROW compressed table already takes advantage of unicode compression in SQL2008R2. Let us say customer was using ROW compression in SQL2008 and now upgrades to SQL10.5, s/he can use to the sp_estimate_data_compression_savings stored proc by specifying target compression to be ROW as well. Note, in this case, the compression savings will include the potential savings from data defragmentation as well so one can’t easily deduce that the savings are from Unicode compressions or due to both. However, customer can  get some clues by referring to the sys.dm_db_index_physical_stats DMV to see if the object in question was fragmented or not.

 

Thanks

Sunil