DML operations on a HEAP and compression

DML operations on a HEAP and compression

  • Comments 3

In my previous blog http://blogs.msdn.com/sqlserverstorageengine/archive/2008/12/20/enabling-compression-on-a-heap.aspx, I described how to enable compression on a HEAP. You can enable either ROW or PAGE compression on the heap. Let me now describe the implications of DML operations on a heap.

ROW compression: No issues. The updated/inserted data stays ROW compressed though out its life time.

PAGE Compression: There are two key points to note (a) when you enable PAGE compression on an index or a heap, SQL Server evaluates each page and only enable PAGE compression if the space savings crosses a minimal (internally defined) threshold. So it is possible that when you enable PAGE compression on a heap, not all pages will have PAGE compression structures.  (b) The PAGE compression is only considered when the insert of a row causes a page split. Since pages are never split in a heap either for update (it uses row-forwarding) or insert operations, the DML operation on a heap offer special challenges with PAGE compression.

Let us start with the assumption that the heap has many pages worth of rows and all of them are PAGE compressed

·         Update operations on existing rows will continue to use PAGE compression information (page dictionary and column prefix) as long as the updated row stays on the page and does not get forwarded. 

·         When the number of DML operations on a given page exceeds an internally defined threshold, the column prefix and page dictionary may get re-computed. Again, this is not any different than btree.

·         For Insert operations, the PAGE compression may or may not get preserved. The behavior is as follows

o   If a singleton Insert goes to an existing page with PAGE compression, SQL Server will see if it can take advantage column prefix and page dictionary structures. Note, this happens only on the page(s) that are PAGE compressed.

o   If a singleton insert goes to an existing page that does not have PAGE compression structures, it is only ROW compressed. A singleton insert into a heap will never cause a conversion to PAGE compression for the reasons mentioned earlier.

o    As a special case, the PAGE compression for new rows is only supported for the following cases.

§  Optimizd Bul lk Load into a heap

§  Insert into <heap> with (TABLOCK) select  * from <source>. Note, that the TABLOCK is required and if you don't specify it, the rows that land on a new page will not be PAGE compressed but still be ROW compressed.

In these cases, when the data page does not have the space to accommodate the new row being inserted, it evaluates if PAGE compression will save space greater than the internally defined threshold and if the answer is yes, it enables PAGE compression on the page and the row is then inserted on that page.

 

Leave a Comment
  • Please add 7 and 2 and type the answer here:
  • Post
  • I know this is an old article, BUT ... it's sometimes recommended (in various blogs on the Web) to drop all indexes (changing a table into a heap) before inserting or update any significant amount of data.  

    I prefer to dlsable the non-clustered indexes and then re-enable them instead of dropping them and re-defining them.  (Sometimes I leave the indexes in place to avoid the overhead of rebuilding them after the data updates.)  Of course, you can't disable the primary key (actually, the clustered index) and then access the table at all.

    Are there any considerations for compressed tables related to dropping and rebuilding indexes?

  • HI David,

    >>Are there any considerations for compressed tables related to dropping and rebuilding indexes?

    Dropping clsutered index with page compression on will leave heap in page compressed state. Dropping a CI will not remove page compression from heap.

    blogs.msdn.com/.../enabling-compression-on-a-heap.aspx

  • HI Sunil,

    I was observing this behavior (what you mentioned for heap) in clustered index and found out what you mentioned for heAP also holds *somewhat* for ci Indexes as well. I will try to reproduce information though query.In CI when I ran sys.dm_db_index_physical_stats the col compressed_page_count was only updated when index page was more tahn half filled it actually did not showed number of pages compressed. More details in below link

    social.technet.microsoft.com/.../25119.sql-server-what-does-column-compressed-page-count-value-signify-in-dmv-sys-dm-db-index-physical-stats.aspx

Page 1 of 1 (3 items)