SQL Server Storage Engine

What happens to non-clustered indexes when the table structure is changed?

Here’s a topic that’s cropped up several times during Q&As at TechEd this year – what happens to non-clustered indexes when changes are made to the underlying table? Are they always rebuilt or not?

 

Before we get into that discussion, I’ll give you a little background.

 

One way to describe a non-clustered index is whether it is a covering index or not. A covering index is one which has all the table columns necessary to satisfy a query, and so there is no need to go back to the underlying table (which I’ll call the base table) to fetch additional columns. Most non-clustered indexes are not covering indexes and so it is necessary for each non-clustered index record to include a link back to the corresponding base table record. This link takes various forms, depending on the structure of the base table:

  • Heap: A heap does not impose any logical ordering on the records within it, so the only way to find a particular record within a heap is to do a full table scan or to know the physical location of the record (i.e. which record slot on which data page) – called the physical record ID or physical RID. Clearly doing a full table scan every time a query needs to go from a non-clustered index record back to a heap is not feasible, so each non-clustered index record stores the physical RID of the corresponding heap record.
  • Clustered index: A clustered index does impose a logical ordering on the records within it and the ordering exactly matches the definition of the clustered index keys – known as the cluster key. The fastest way to find a particular record in a clustered index is to use the cluster key to navigate through the clustered index b-tree to find the exact record. So, non-clustered indexes over a clustered index include the cluster key of the base table record in each non-clustered index record. This is known as the logical RID. There is a twist though, depending on whether the clustered index is unique or not.
    • Non-unique clustered index: Every record in a clustered index HAS to be unique, otherwise there would be no way to deterministically navigate to a particular record using the index b-tree. In a non-unique clustered index, SQL Server has to add a special column, called the uniquifier, to each record, so that if multiple records have the same cluster key values, the uniquifier column will be the tie-breaker. This uniquifier column is added as part of the cluster key, and so it is also present in all non-clustered index records as part of the logical RID.
    • Unique clustered index: Every record in a unique clustered index is already unique and so no extra column is required in the cluster key.

Ok - background out of the way. Now let's discuss what happens to non-clustered indexes when various actions are performed to the base table.

 

SQL Server 2000

  • Going from a heap to a clustered index: Changing the fundamental structure of a table from a heap to a clustered index means you're changing the way that records can be looked up in the table, from physical to logical. This means that all the physical RIDs that are stored in the non-clustered index records need to be changed to logical RIDs. So - in this case, all non-clustered indexes are rebuilt after the clustered index has been built.
  • Going from a clustered index to a heap: This is the opposite of the case above - you're changing the way records are looked up from logical to physical. This means that all the logical RIDs that are stored in the non-clustered index records need to be changed to physical RIDs. So - in this case, all non-clustered indexes are rebuilt after the clustered index has been dropped.
  • Rebuilding a unique clustered index: This operation has no effect on the cluster key, so there is no need for the non-clustered indexes to be rebuilt. Various early Service Packs of SQL Server 2000 had bugs that caused this behavior to be broken - and this is the cause of much of the confusion around this behavior.
  • Rebuilding a non-unique clustered index: In SQL Server 2000, when a non-unique clustered index (which contains uniquifier columns) is rebuilt, all the uniquifier values are regenerated. This essentially means that the cluster keys have changed and so all the non-clustered indexes are rebuilt after the clustered index rebuild has completed.
  • Changing the clustered index schema: This is simple - any change to the clustered index schema that changes the cluster keys will cause all the non-clustered indexes to be rebuilt.

 SQL Server 2005

  • Going from a heap to a clustered index: This is the same as SQL Server 2000 - all non-clustered indexes are rebuilt, regardless of whether the clustered index was created online or offline.
  • Going from a clustered index to a heap: Again, all non-clustered indexes are rebuilt, regardless of whether the clustered index is dropped online or offline.
  • Rebuilding a unique clustered index: This is the same as SQL Server 2000 - the cluster keys aren't changing and so the non-clustered indexes are not rebuilt.
  • Rebuilding a non-unique clustered index: Aha! This is different from SQL Server 2000. SQL Server 2005 will RE-USE the old uniquifier values so the cluster keys don't change. This means that non-clustered indexes are NOT rebuilt in this case - that's very cool!
  • Changing the clustered index schema:
    • Changing the cluster key: This behavior has to be the same as SQL Server 2000 - any change to the cluster keys forces all the non-clustered indexes to be rebuilt after the new clustered index has been created.
    • Changing anything else (e.g. partitioning it or moving it): This is one of the cases that confuses people in SQL Server 2005. Applying a partitioning scheme to a clustered index, or moving it to a different filegroup, doesn't change the cluster key at all - so there's no need to rebuild the non-clustered indexes.

So the rule of thumb to remember is: if the physical or logical RID contained in the non-clustered index records needs to change then all the non-clustered indexes need to be rebuilt. Its as simple as that.

 

Hopefully this will help to clear up some confusion - let me know of any questions!

Published Thursday, June 07, 2007 9:52 AM by Paul Randal - MSFT
Filed under:

Comments

 

arun.philip said:

Quite informative. Thank you, Paul

June 8, 2007 4:27 AM
 

lverhave said:

Hi Paul,

You mentioned in "changing the clustered index schema - changing anything esle" in SQL Server 2005 that if you move the clustered index to a different filegroup one would not have to rebuild the non-clustered indexes. What if the filegroup is in another drive, wouldn't the base table also be moved thus changing the physical RID?

Thanks

Leigh

June 8, 2007 4:27 AM
 

parikht said:

This is a very helpful article. Your rule of thumb says it all. If we start thinking of rebuilding indexes in that manner, it's pretty straight forward then. Thank you, Pal, for this helpful information.

Tejas

June 11, 2007 12:59 PM
 

Paul Randal - MSFT said:

Hi Leigh,

Moving a clustered index has no effect on non-clustered indexes as the RID that is included in the non-clustered index is purely logical - it has nothing to do with the physical location of the clustered index.

Thanks

June 11, 2007 6:31 PM
 

KS. said:

We have a heap table with non-clustered indexes on it ,If we partition it and move the half the data & related index from current filegroup to other filegroup which is on other drive will it affect the non-clustered index ie will they be rebuilt .

In 2000 does the non-clustered index (on Heap )needs a rebuilt when we move it to other file group on other disk .

June 13, 2007 7:10 AM
 

Paul Randal - MSFT said:

KS - yes - ANY time the physical or logical RID changes, non-clustered indexes have to be rebuilt. In your example, the physical RID of the heap rows will change when they're moved to the new filegroups.

Thanks

June 13, 2007 2:16 PM
 

Michael's Coding Den said:

Last update: June 13 , 2007 Document version 0.6 Preface If you have something to add, or want to take

June 13, 2007 3:13 PM
 

Michael's Coding Den said:

Last update: June 13 , 2007 Document version 0.6 Preface If you have something to add, or want to take

June 13, 2007 3:13 PM
 

TiggerJoe said:

Moving a unique clustered index to a different filegroup, or anywhere else, doesn't affect the nonclustered indexes because the leaf level of the nonclsutered index references sthe clustering key of the clustered index, not a RID or logical RID. The engine works through the B-Tree of the clustered index using the clustering key. A search using a non-clustered index where there is a clsutered index, works through two B-Trees to get to the full data row: one search through the B-Tree for the nonclustered index, and one for the clustered index.

January 2, 2008 11:38 AM
Anonymous comments are disabled

About Paul Randal - MSFT

Paul started in the industry in 1994 working for DEC on the VMS file system and check/repair tools. In 1999 he moved to Microsoft to work on SQL Server, specifically on DBCC. For SQL Server 2000, he concentrated on index fragmentation (writing DBCC INDEXDEFRAG and DBCC SHOWCONTIG) plus various algorithms in DBCC CHECKDB. During SQL Server 2005 development Paul was the lead developer/manager of one the core dev teams in the Storage Engine, responsible for data access and storage (DBCC, allocation, indexes & heaps, pages/records, text/LOB storage, snapshot isolation, etc). He also spent several years rewriting DBCC CHECKDB and repair. For SQL Server 2008, Paul managed the Program Management team for the core Storage Engine to become more focused on customer/partner engagement and feature set definition. In 2007, after 8.5 years on the SQL Server team, Paul left Microsoft to join his wife, Kimberly Tripp, running SQLskills.com and pursuing his passion for presenting and consulting. Paul regularly presents at conferences and user groups around the world on high-availability, disaster recovery and Storage Engine internals. His popular blog is at http://www.sqlskills.com/blogs/paul/.

© 2008 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement
Microsoft
Page view tracker