SQL Server Storage Engine

Overhead of Row Versioning

Last week I was teaching a class on snapshot isolation and we discussed the overheads of snapshot isolation. There are three main overheads that you need to be aware of when you enable snapshot isolation read-committed-snapshot on a database. First, each UPDATE/DELETE operation generates a row version. Note, that the row version is not generated when you INSERT a new row. There is an exceptions to this, but I will skip that discussion here. Second, row versions are created and kept in tempdb that leads to increase space usage and IOs there. Third, the queries run under snapshot isolation or read-committed-snapshot need to traverse the row version chain which may lead to as many logical IOs and potentially as many 'random' physical IOs because row versions for a given row are not necessarily on the same or even contiguous physical page(s). It is interesting to note that if we did not traverse the row versions, the query would have blocked otherwise. So basically, at some expense of IO/CPU, the SQL Server can return the transactional consistent results without incurring blocking. We recommend enabling row versioning based isolation levels only when your application is incurring reader/writer blocking.

One question that came up was on the overhead of row versioning on the data or index row. When you enable snapshot isolation or read-committed-snapshot option on the database, the SQL Server starts adding a 14 byte overhead to each row to keep the following information:

  • XTS (transaction sequence number). It takes 6 bytes. This is used for marking the XSN that did the DML operation on the row
  • RID (row identifier) that points to the versioned row. It takes 8 bytes.

But this overhead is only added when an existing row is actually modified or if the database was already enabled for either of these options. Let me give a very simple example.

create table foo..t2_snapshot (c1 int)

go

 

insert into foo..t2_snapshot values (1)

go

 

 

-- show the max rowsize in bytes

select max_record_size_in_bytes

from sys.dm_db_index_physical_stats (db_id('foo'),

                                       object_id('foo.t2_snapshot'),

    null, null, 'DETAILED')

This will show a max row size of 11 bytes (note, the 7 bytes are the metadata overhead within the row and other 4 bytes store the integer value). Now, let me enable snapshot isolation on the database foo. Note, this is a metadata operation and it will NOT change the existing rows in the tables to add extra 14 bytes.

alter database foo set allow_snapshot_isolation ON

go

 

-- you can verify the datbaase state using the following 

select is_read_committed_snapshot_on, snapshot_isolation_state_desc,

snapshot_isolation_state from sys.databases where name='foo'

go

 

Now I run the query

select max_record_size_in_bytes

from sys.dm_db_index_physical_stats (db_id('foo'),

                                     object_id('foo.t2_snapshot'),

                                      null, null, 'DETAILED')

You will notice that the row length still remains 11 bytes. No extra 14 bytes overhead even though the database foo has been enabled for snapshot isolation.  Let us now update the data row as follows

update foo..t2_snapshot set c1 = 2

After the update, you will notice the length of the row has become 25 bytes. In other words, there is a 14 byte overhead. Also, the new rows that you insert will have a length of 25 bytes (including row versioning overhead of 14 bytes).There are two interesting observations to be made. First, an update of a fixed length column may lead to page splits because the length of the row has increased. This only happens the first time the row is updated. Second, the increased row length will cause the table size to grow. A typical row size in production is around 300 bytes. So this means that snapshot isolation or read-committed-snapshot will cause 5% growth in the size of the table (in the worst case when all the rows were updated). 

Now, if I disable the snapshot isolation and update the same row again, the extra 14 bytes will go away. This is because row versioning has been disabled. I can do the same by rebuilding the index

Thanks

Sunil

 

Published Sunday, March 30, 2008 12:41 AM by Sunil Agarwal
Filed under:

Comments

 

GProssliner said:

When RowVersioning is enabled on a Database (even with ROLLBACK IMMEDIATE), it takes about a second (with a small or empty database) or several minutes to complete. You say that the extra Data needed for each row is added in a lazy way only when needed (my assumption was that it does it when executing the ALTER DATABASE command), so

1. what happens when executing the ALTER DATABASE Command?

2. is there a kind of "formula" how long will it takes approximately for an existing Database? Does it depend solely on the Database-Size?

Thank you!

March 31, 2008 3:35 AM
 

Sunil Agarwal said:

No formula. It depends on the existing transactional activity in the database.

Here is from BOL:

For SNAPSHOT ISOLATION::

When you set ALLOW_SNAPSHOT_ISOLATION to a new state (from ON to OFF, or from OFF to ON), ALTER DATABASE does not return control to the caller until all existing transactions in the database are committed. If the database is already in the state specified in the ALTER DATABASE statement, control is returned to the caller immediately. If the ALTER DATABASE statement does not return quickly, use sys.dm_tran_active_snapshot_database_transactions to determine whether there are long-running transactions. If the ALTER DATABASE statement is canceled, the database remains in the state it was in when ALTER DATABASE was started. The sys.databases catalog view indicates the state of snapshot-isolation transactions in the database. If snapshot_isolation_state_desc = IN_TRANSITION_TO_ON, ALTER DATABASE ALLOW_SNAPSHOT_ISOLATION OFF will pause six seconds and retry the operation.

FOR RCSI::

To set READ_COMMITTED_SNAPSHOT ON or OFF, there must be no active connections to the database except for the connection executing the ALTER DATABASE command. However, the database does not have to be in single-user mode. You cannot change the state of this option when the database is OFFLINE.

March 31, 2008 12:36 PM
 

Eladio Rincón said:

I've done a few tests and reorganizing the index does not free the extra bytes. Am I correct? I understand that the difference is not big, because only affects to the modified rows, but there are scenarios where customers cannot afford rebuilding the index -- specially non-EE. Would be expensive including that code in the reorganize algorithm?

Thank you for this post Sunil.

Eladio

June 6, 2008 6:44 AM
 

Sunil Agarwal said:

Can you please send me a simple repro? thanks

June 6, 2008 1:16 PM
 

Eladio Rincón said:

The script is based on you example; in the original script before querying the DMV I inserted a waitfor because takes a bit refreshing the DMV; some of the comments are in spanish, but the TSQL is easy.

Thank you for checking it Sunit.

--

-- Implicaciones en filas de niveles de aislamiento row-versioning

--

-- Eladio Rincón (script basado en Sunil Agarwal )

-- http://blogs.msdn.com/sqlserverstorageengine/archive/2008/03/30/overhead-of-row-versioning.aspx

-- Solid Quality Mentors

--

set nocount on

go

use master

go

alter database Northwind

set allow_snapshot_isolation Off

go

alter database Northwind

set READ_COMMITTED_SNAPSHOT off

go

use northwind

go

if exists (select * from sys.tables where name = 't2_snapshot')

 drop table dbo.t2_snapshot

go

--

-- tabla de 1000 y pico bytes

--

create table dbo.t2_snapshot (

 c1 int identity

 , c2 char(1000) default('a')

 , constraint pk_t2_snapshot primary key (c1)

)

go

--

-- inserción de 1000 filas

--

insert dbo.t2_snapshot (c2)

select top (1000) 'a'

from [Order Details]

go

--

-- habilitar snapshot_isolation

--

-- no requiere libre de conexiones

-- si para read-commited-snapshot

--

alter database Northwind

set allow_snapshot_isolation ON

go

select

 index_level, min_record_size_in_bytes

 , avg_record_size_in_bytes, max_record_size_in_bytes

from sys.dm_db_index_physical_stats (

 db_id('Northwind')

 , object_id('t2_snapshot')

 , null, null, 'DETAILED')

go

/*

index_level min_record_size_in_bytes avg_record_size_in_bytes max_record_size_in_bytes

----------- ------------------------ ------------------------ ------------------------

0           1011                     1011                     1011

1           11                       11                       11

*/

update dbo.t2_snapshot set c2 = 'b'

where c1 = 1

go

select

 index_level, min_record_size_in_bytes

 , avg_record_size_in_bytes, max_record_size_in_bytes

from sys.dm_db_index_physical_stats (

 db_id('Northwind')

 , object_id('t2_snapshot')

 , null, null, 'DETAILED')

go

/*

index_level min_record_size_in_bytes avg_record_size_in_bytes max_record_size_in_bytes

----------- ------------------------ ------------------------ ------------------------

0           1011                     1011.014                 1025

1           11                       11                       11

*/

alter index pk_t2_snapshot

on dbo.t2_snapshot

reorganize

go

select

 index_level, min_record_size_in_bytes

 , avg_record_size_in_bytes, max_record_size_in_bytes

from sys.dm_db_index_physical_stats (

 db_id('Northwind')

 , object_id('t2_snapshot')

 , null, null, 'DETAILED')

go

/*

index_level min_record_size_in_bytes avg_record_size_in_bytes max_record_size_in_bytes

----------- ------------------------ ------------------------ ------------------------

0           1011                     1011.014                 1025

1           11                       11                       11

*/

alter index pk_t2_snapshot

on dbo.t2_snapshot

rebuild

go

select

 index_level, min_record_size_in_bytes

 , avg_record_size_in_bytes, max_record_size_in_bytes

from sys.dm_db_index_physical_stats (

 db_id('Northwind')

 , object_id('t2_snapshot')

 , null, null, 'DETAILED')

go

/*

index_level min_record_size_in_bytes avg_record_size_in_bytes max_record_size_in_bytes

----------- ------------------------ ------------------------ ------------------------

0           1011                     1011                     1011

1           11                       11                       11

*/

June 6, 2008 5:42 PM
 

Sunil Agarwal said:

The reorganize (i.e. defrag) does not remove the versioning information but  offline rebuild of index can remove it. But you will need to disable snapshot isolation. Please try the script  that I sent in my blog to start off.

June 6, 2008 8:51 PM
 

Eladio Rincón said:

The versioning informations is only removed when you rebuild the index. Reorganize does not remove the extra bytes in any case (with or without snapshot enabled).

Maybe si my version:

select @@version

Microsoft SQL Server 2005 - 9.00.3042.00 (Intel X86)

this is the script I've used. the only difference with your:

1) table with clustered index

2) table in Northwind Database

create table dbo.t2_snapshot (

 c1 int identity

 , c2 char(1000) default('a')

 , constraint pk_t2_snapshot primary key (c1)

)

this is the complete repro:

use Northwind

go

truncate table dbo.t2_snapshot

go

insert dbo.t2_snapshot (c2)

select top (1000) 'a'

from [Order Details]

go

alter database Northwind

set allow_snapshot_isolation ON

go

update dbo.t2_snapshot set c2 = 'b'

where c1 = 1

go

select max(max_record_size_in_bytes) max

from sys.dm_db_index_physical_stats (

 db_id('Northwind'),

 object_id('dbo.t2_snapshot'),

 null, null, 'DETAILED')

go

/*

max

-------

1025

*/

alter index pk_t2_snapshot

on dbo.t2_snapshot

reorganize

go

select max(max_record_size_in_bytes) max

from sys.dm_db_index_physical_stats (

 db_id('Northwind'),

 object_id('dbo.t2_snapshot'),

 null, null, 'DETAILED')

go

/*

max

-------

1025

11

*/

alter database Northwind

set allow_snapshot_isolation Off

go

alter index pk_t2_snapshot

on dbo.t2_snapshot

reorganize

go

select max(max_record_size_in_bytes) max

from sys.dm_db_index_physical_stats (

 db_id('Northwind'),

 object_id('dbo.t2_snapshot'),

 null, null, 'DETAILED')

go

/*

max

-------

1025

*/

go

alter index pk_t2_snapshot

on dbo.t2_snapshot

rebuild

go

select max(max_record_size_in_bytes) max

from sys.dm_db_index_physical_stats (

 db_id('Northwind'),

 object_id('dbo.t2_snapshot'),

 null, null, 'DETAILED')

go

/*

max

-------

1011

*/

go

truncate table dbo.t2_snapshot

go

insert dbo.t2_snapshot (c2)

select top (1000) 'a'

from [Order Details]

go

alter database Northwind

set allow_snapshot_isolation ON

go

update dbo.t2_snapshot set c2 = 'b'

where c1 = 1

go

select max(max_record_size_in_bytes) max

from sys.dm_db_index_physical_stats (

 db_id('Northwind'),

 object_id('dbo.t2_snapshot'),

 null, null, 'DETAILED')

go

/*

max

-------

1025

*/

alter index pk_t2_snapshot

on dbo.t2_snapshot

rebuild

go

select max(max_record_size_in_bytes) max

from sys.dm_db_index_physical_stats (

 db_id('Northwind'),

 object_id('dbo.t2_snapshot'),

 null, null, 'DETAILED')

go

/*

max

-------

1011

*/

June 8, 2008 6:44 AM
Anonymous comments are disabled

About Sunil Agarwal

Sunil Agarwal is a program manager in the SQL Server Storage Engine Group at Microsoft. He is responsible for concurrency, indexes, tempdb, LOBS, supportability, and bulk import/export. Prior to joining Microsoft, Sunil worked at DEC, Sybase, BMC Software and DigitalThink, focusing primarily on core database engine technologies and related applications.

© 2008 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement
Microsoft
Page view tracker