SQL Server Storage Engine

  • Update on minimal logging for SQL Server 2008

    As part of SQL2008 release, you can get minimal logging when bulk importing into a HEAP using the following command

    insert into <heap> with (TABLOCK) select * from <source>

    This was detailed in the blog  entry http://blogs.msdn.com/sqlserverstorageengine/archive/2008/03/06/minimal-logging-changes-in-sql-server-2008-part-1.aspx.

    We had similar functionality available when inserting data into btree in SQL Server 2008 but it was removed couple of months back but unfortunately I did not update my blog. I have now removed those blog entries. My apologies for the confusion caused. We will reconsider this functionality in the next release of SQL Server.

     

  • Does data compression lead to more fragmentation?

    Lately I have been asked how data compression impacts fragmentation (i.e. does it cause more or less fragmentation?).  I believe this question is best answered by looking at how does fragmentation occur in the first place and then analyze each of these cases for compressed and uncompressed data.

    Let us start off with create clustered index on a table so  there is no fragmentation to begin with. Now the following operations will cause fragmentation

    ·         Delete Operaton: This means the pages are not as full as they can be. The delete operation will cause fragmentation both with/without data compression  but the impact will be lower with data compression. For example, if you assume 50% compression (i.e. compressed data row is ½ the size of the uncompressed row) for an average uncompressed rowsize of 200 bytes. Now if we delete 10 rows on a page, the uncompressed page will have 2000 bytes of free space (i.e. 25% additional fragmentation within the page) while the compressed page will have 1000 free bytes (i.e. 12.5% additional fragmentation within the page)

    ·         Insert Operation: when a row is inserted, the SQL Server will find the appropriate page to insert the row. If there is space in the page, the row will be inserted without additional fragmentation. However, if there is not sufficient space on the page to store the row, it will cause a page split which will potentially lead to fragmentation. You should see similar fragmentation with/without data compression

    ·         Update Operation: An update operation can cause a page split if the update is causes rowsize to increase, a common case when updating variable length columns. As you may recall, both with ROW and PAGE compression, we store fixed length data type as variable length. So if you are changing the value of a say an integer column from 1 to 100000, it will cause the rowsize  to increase from 1 byte to 3 bytes.  If there was no space on the page, it will lead to page split thereby causing potential fragmentation. However, if you had not compressed the data, then fixed length columns take the maximum space possible in the row so such updates will not cause any changes to the size of the row. So if your application changes fixed length column types, you can potentially see additional fragmentation with data compression.

    As you can see from the discussion above, the impact of data compression on fragmentation will depend on the schema, data distribution and the application. You can minimize fragmentation by using a lower fill factor but then you table will take more space.

  • Enabling FILESTREAM post SQL2008 Setup - a Known Issue in SQL Config Manager

    In SQL Server 2008 Feb CTP, there is a bug in the WMI provider for FILESTREAM, inside SQL Configuration Manager (select instance, r-click properties and go to FILESTREAM tab). This bug causes the FILESTREAM configuration UI, to fail. This happens only if:

        1) you want to enable FILESTREAM after SQL setup

        2) another instance of SQL 2005 or SQL 2000 (Pre-SQL2008) is installed on the same machine, for example when you install SQL 2008 (any SKU other than CEC) on a machine running Visual Studio 2008 (which installs SQL 2005 Express by default). 

     

    The most common case is when you decide to install SQL 2008 on a machine where you installed Visual Studio 2008 (install with default settings) and during SQL 2008 setup you didn't want to enable FILESTREAM; then decide to decide to enable FILESTREAM post setup (for example, in order to use the SQL 2008 AdventureWorks database samples, which require FILESTREAM to be enabled). In this case, you need to enable FILESTREAM in SQL Config Manager first for needed Windows configuration steps. 

     

    This issue will repro also if you install SQL2008 first (FILESTREAM disabled), install Vidual Studio second (which installs SQL 2005) then decide to enable FILESTREAM (which has to be done in SQL Config Manager for Windows settings then in Management studio for SQL instance specific settings).

     

    This UI issue will be fixed for SQL 2008 (whatever next milestone).

     

     

    Workarounds for SQL 2008 February CTP and Upcoming Release Candidate 0:

    Any of the following will work, but option 2 is the recommended one:

     

          1)      Uninstall the SQL 2005 instance, or

    2)      During SQL 2008 setup, check the enable FILESTREAM box, or

    3)      If Filestream was not enabled during setup:

    a.      Copy the WMI script on http://www.codeplex.com/SQLSrvEngine/Wiki/View.aspx?title=FileStreamEnable&referringTitle=Home to a local folder.

    b.     Run the script as explained on codeplex.

     

    thanks!

    joanna

  • TempDB:: Table variable vs local temporary table

    As you know the tempdb is used by user applications and SQL Server alike to store transient results needed to process the workload. The objects created by users and user applications are called ‘user objects’ while the objects created by SQL Server engine as part of executing/processing the workload are called ‘internal objects’. In this blog, I will focus on user objects and on table variable in particular.

    There are three types of user objects; ##table, #table and table variable. Please refer to BOL for specific details. While the difference between ##table (global temporary table) and #table (local temporary table) are well understood, there is a fair amount of confusion between #table and table variable.  Let me walk through main differences between these.

    A table variable, like any other variable, is a very useful programming construct. The scoping rules of the table variable are similar to any other programming variables. For example, if you define a variable inside a stored procedure, it can’t be accessed outside the stored procedure. Incidentally, #table is very similar. So why did we create table variables? Well, a table variable can be very powerful when user with stored procedures to pass it as input/output parameters (new functionality available starting with SQL Server 2008) or to store the result of a table valued function. Here are some similartities and differences between the two:

    ·         First, the table variable is NOT necessarily memory resident. Under memory pressure, the pages belonging to a table variable can be pushed out to tempdb. Here is one example shows space taken by table variable in tempdb

    use tempdb

    go

     

    drop table #tv_source

    go

     

    create table #tv_source(c1 int, c2 char(8000))

    go

     

    declare @i int

    select @i = 0

    while (@i < 1000)

    begin

           insert into #tv_source values (@i, replicate ('a', 100))

           select @i = @i + 1

    end

     

    DECLARE @tv_target TABLE (c11 int, c22 char(8000))

     

     

     INSERT INTO @tv_target (c11, c22)

        SELECT c1, c2

        FROM  #tv_source

     

    -- checking the size through DMV.

    -- The sizes here are in 8k pages. This shows the allocated space

    -- to user objects to be 2000 pages (1000 pages for #tv_source and

    -- 1000 pages for @tv_target

     

    Select total_size = SUM (unallocated_extent_page_count) +

    SUM (user_object_reserved_page_count) +

    SUM (internal_object_reserved_page_count) +

    SUM (version_store_reserved_page_count) +

    SUM (mixed_extent_page_count),

     

    SUM (unallocated_extent_page_count) as freespace_pgs,

    SUM (user_object_reserved_page_count) as user_obj_pgs,

    SUM (internal_object_reserved_page_count) as internal_obj_pgs,

    SUM (version_store_reserved_page_count)  as version_store_pgs,

    SUM (mixed_extent_page_count) as mixed_extent_pgs

    from sys.dm_db_file_space_usage

    ·         Second, when you create a table variable, it is like a regular DDL operation and its metadata is stored in system catalog. Here is one example to check this

    declare @ttt TABLE(c111 int, c222 int)

    select name from sys.columns where object_id > 100 and name like 'c%'

    This will return two rows containing columns c111 and c222. Now this means that if you were encountering DDL contention, you cannot address it by changing a #table to table variable.

    ·         Third, transactional and locking semantics. Table variables don’t participate in transactions or locking. Here is one example


    -- create a source table

    create table tv_source(c1 int, c2 char(100))

    go

     

    declare @i int

    select @i = 0

    while (@i < 100)

    begin

           insert into tv_source values (@i, replicate ('a', 100))

           select @i = @i + 1

           end

    -- using #table

    create table #tv_target (c11 int, c22 char(100))

    go

     

    BEGIN TRAN

     

        INSERT INTO #tv_target (c11, c22)

                SELECT c1, c2

                FROM  tv_source

     

     

    -- using table variable

     

    DECLARE @tv_target TABLE (c11 int, c22 char(100))

     

    BEGIN TRAN

           INSERT INTO @tv_target (c11, c22)

            SELECT c1, c2

        FROM  tv_source

     

     

    -- Now if I look at the locks, you will see that only

    -- #table takes locks. Here is the query that used

    -- to check the locks   

    select 

        t1.request_session_id as spid, 

        t1.resource_type as type,  

        t1.resource_database_id as dbid, 

        (case resource_type

          WHEN 'OBJECT' then object_name(t1.resource_associated_entity_id)

          WHEN 'DATABASE' then ' '

          ELSE (select object_name(object_id) 

                from sys.partitions 

                where hobt_id=resource_associated_entity_id)

        END) as objname, 

        t1.resource_description as description,  

        t1.request_mode as mode, 

        t1.request_status as status,

           t2.blocking_session_id

    from sys.dm_tran_locks as t1 left outer join sys.dm_os_waiting_tasks as t2

    ON t1.lock_owner_address = t2.resource_address

     

    Another interesting aspect is that if I rollback the transaction involving the table variable earlier, the data in the table variable is not rolled back.

    Rollback

    -- this query will return 100 for table variable but 0 for #table.

    SELECT COUNT(*) FROM @tv_target

    ·         Fourth, the operations done on table variable are not logged. Here is the example I tried

    -- create a table variable, insert bunch of rows and update

    DECLARE @tv_target TABLE (c11 int, c22 char(100))

     

    INSERT INTO @tv_target (c11, c22)

        SELECT c1, c2

        FROM  tv_source

     

     

    -- update all the rows

    update @tv_target set c22 = replicate ('b', 100)

     

     

    -- look at the top 10 log records. I get no records for this case

    select top 10 operation,context, [log record fixed length], [log record length], AllocUnitId, AllocUnitName

    from fn_dblog(null, null)

    where AllocUnitName like '%tv_target%'

    order by [Log Record Length] Desc

     

    -- create a local temptable

    drop table #tv_target

    go

     

    create table #tv_target (c11 int, c22 char(100))

    go

     

    INSERT INTO #tv_target (c11, c22)

        SELECT c1, c2

        FROM  tv_source

     

    -- update all the rows

    update #tv_target set c22 = replicate ('b', 100)

     

     

    -- look at the log records. Here I get 100 log records for update

    select  operation,context, [log record fixed length], [log record length], AllocUnitName

    from fn_dblog(null, null)

    where AllocUnitName like '%tv_target%'

    order by [Log Record Length] Desc

     

    ·         Fifth, no DDL is allowed on table variables. So if you have a large rowset which needs to be queried often, you may want to use #table when possible so that you can create appropriate indexes. You can get around this by creating unique constraints when declaring table variable.

     

    ·         Finally, no statistics is maintained on table variable which means that any changes in data impacting table variable will not cause recompilation of queries accessing table variable. Queries involving table variables don't generate parallel plans.

     

     

    Thanks

    Sunil

     

  • Overhead of Row Versioning

    Last week I was teaching a class on snapshot isolation and we discussed the overheads of snapshot isolation. There are three main overheads that you need to be aware of when you enable snapshot isolation read-committed-snapshot on a database. First, each UPDATE/DELETE operation generates a row version. Note, that the row version is not generated when you INSERT a new row. There is an exceptions to this, but I will skip that discussion here. Second, row versions are created and kept in tempdb that leads to increase space usage and IOs there. Third, the queries run under snapshot isolation or read-committed-snapshot need to traverse the row version chain which may lead to as many logical IOs and potentially as many 'random' physical IOs because row versions for a given row are not necessarily on the same or even contiguous physical page(s). It is interesting to note that if we did not traverse the row versions, the query would have blocked otherwise. So basically, at some expense of IO/CPU, the SQL Server can return the transactional consistent results without incurring blocking. We recommend enabling row versioning based isolation levels only when your application is incurring reader/writer blocking.

    One question that came up was on the overhead of row versioning on the data or index row. When you enable snapshot isolation or read-committed-snapshot option on the database, the SQL Server starts adding a 14 byte overhead to each row to keep the following information:

    • XTS (transaction sequence number). It takes 6 bytes. This is used for marking the XSN that did the DML operation on the row
    • RID (row identifier) that points to the versioned row. It takes 8 bytes.

    But this overhead is only added when an existing row is actually modified or if the database was already enabled for either of these options. Let me give a very simple example.

    create table foo..t2_snapshot (c1 int)

    go

     

    insert into foo..t2_snapshot values (1)

    go

     

     

    -- show the max rowsize in bytes

    select max_record_size_in_bytes

    from sys.dm_db_index_physical_stats (db_id('foo'),

                                           object_id('foo.t2_snapshot'),

        null, null, 'DETAILED')

    This will show a max row size of 11 bytes (note, the 7 bytes are the metadata overhead within the row and other 4 bytes store the integer value). Now, let me enable snapshot isolation on the database foo. Note, this is a metadata operation and it will NOT change the existing rows in the tables to add extra 14 bytes.

    alter database foo set allow_snapshot_isolation ON

    go

     

    -- you can verify the datbaase state using the following 

    select is_read_committed_snapshot_on, snapshot_isolation_state_desc,

    snapshot_isolation_state from sys.databases where name='foo'

    go

     

    Now I run the query

    select max_record_size_in_bytes

    from sys.dm_db_index_physical_stats (db_id('foo'),

                                         object_id('foo.t2_snapshot'),

                                          null, null, 'DETAILED')

    You will notice that the row length still remains 11 bytes. No extra 14 bytes overhead even though the database foo has been enabled for snapshot isolation.  Let us now update the data row as follows

    update foo..t2_snapshot set c1 = 2

    After the update, you will notice the length of the row has become 25 bytes. In other words, there is a 14 byte overhead. Also, the new rows that you insert will have a length of 25 bytes (including row versioning overhead of 14 bytes).There are two interesting observations to be made. First, an update of a fixed length column may lead to page splits because the length of the row has increased. This only happens the first time the row is updated. Second, the increased row length will cause the table size to grow. A typical row size in production is around 300 bytes. So this means that snapshot isolation or read-committed-snapshot will cause 5% growth in the size of the table (in the worst case when all the rows were updated). 

    Now, if I disable the snapshot isolation and update the same row again, the extra 14 bytes will go away. This is because row versioning has been disabled. I can do the same by rebuilding the index

    Thanks

    Sunil

     

  • CHECKSUM and Tempdb

    You may recall that starting with SQL Server 2005, you have an option available to enable CHECKSUM on the user databases.  For details, please refer to http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/29/Enabling-CHECKSUM-in-SQL2005.aspxIn fact, any new database created in SQL2005 has CHECKSUM enabled automatically but it does not happen to databases that are upgraded from previous versions of SQL Server. For the upgraded databases from SQL Server versions earlier than SQL Server2005, you will need to enable CHECKSUM explicitly using ALTER DATABASE command. Enabling CHECKSUM is critical and it allows SQL Server to detect the corruption in the IO path (e.g. a disk malfunction) when the page is read as part of query or when you run DBCC CHECKDB command. While this has been very useful to our customers, there was one missing link and that was that SQL Server did not allow enabling CHECKSUM on tempdb. This provided a window where a page corrupted due to mis-behaving disk found its way into user database even if you had enabled CHECKSUM on the databse. Here is one such scenario

    Scenario:: bulk import the data into a temp table for staging purposes and then move it to the user database. The user database has CHECKSUM enabled, so the new page, when written to the disk, will have checksum computed. But guess what is missing? If the tempdb disk corrupts the pages in tempdb, SQL Server will have no way of knowing that the page was corrupted and it will go to user database without detection. Yes, when the page is subsequently read, depending upon what the corruption was, the SQL Server may detect it or may not. For example if a bit flip happened for the integer value, it will go undetected.

    With CHECKSUM available on tempdb starting with SQL2008, you can finally close this window.  You can use the following command

    ALTER DATASE tempdb set PAGE_VERIFY CHECKSUM

    For new installs of SQL Server 2008, all tempdbs will have CHECKSUM enabled by default. You can always disable it using ALTER DATABASE command but we don't recommend it. For databases upgraded to SQL Server 2008, you will need to explicitly enable CHECKSUM on the tempdb. We measured the performance impact of enabling CHECKSUM in tempdb and the impact is very low (<2% of CPU) which is similar to what you would expect in user database. Since the CHECKSUM is only computed when page is written to the disk, the added point is that there is no 'checkpoint' in tempdb, so a page in tempdb is written to disk ONLY under memory pressure. So you may not see as many CHECKSUM calculations in tempdb.

     Note, this is not available in CTP-6 but will be available in RTM bits.

    Thanks

    Sunil

  • Minimal Logging changes in SQL Server 2008

    Please refer to the earlier post for the background information on minimal logging. Starting with SQL Server 2008, the minimal logging has been enhanced. These enhancements are available to regular TSQL Insert as well. One of the key customer scenario was to be able to transfer data from a staging table to the target table. The only choice the customers had was to use SELECT * INTO <target> from <staging-table> if they wanted minimal logging. The limitation of this solution was that customers had no control on DDL aspect of the target table. WIth this new enhancement, this restriction is now removed. 

    I used a flavor of the following query to find log records

     

    select top 10 operation,context, [log record fixed length], [log record length], AllocUnitId, AllocUnitName

    from fn_dblog(null, null)

    where allocunitname='dbo.t_heap'

    order by [Log Record Length] Desc

    (1)    Insert into a HEAP is minimally logged under TABLOCK but fully logged without TABLOCK. This is one of the features that customers have been asking for. By the way, the only drawback is that it holds X lock, unlike Bulk Insert which holds BU lock, so you cannot insert using multiple threads.

     

    -- create the source table

    create table t_source (c1 int, c2 int, c3 char (100), c4 char(1000))

    go

     

    declare