Last week, we touched on the topic of fragmentation in Fragmentation and What You Can Do about It. This week, we take a deeper look into the problem and tell you how to defragment your indexes.

Why You Should Defragment Indexes

It's all about performance. If you have correctly installed and configured SQL Server and Windows Server on a reasonably up-to-date computer, defragmenting your indexes is a good way to help get the best performance from Teamcenter running on SQL Server.

You have to defragment to grease the wheels. It's just one of those things, you know? If you do it regularly and well, things run great, and nobody knows why. But if you don't, things slow down, and everybody calls you.

Causes of Index Fragmentation

Index fragmentation can occur when data in a table or indexed view is modified. When you perform data operations such as the INSERT or UPDATE statement in a table, the underlying clustered and nonclustered indexes will be modified if they cover the range of data that was modified. This modification can cause index page splits. If the modified data that belongs to the index can't be accommodated in the same page, a new leaf page will be added. This new leaf page will contain part of the original page and space for new data. While the new leaf page will maintain the logical ordering of the index data rows, it usually will not be next to the old leaf page. So, the logical ordering of the pages will not be maintained.

2010-03-17-1

This illustration shows the leaf ordering of a virgin index. Note the tidy physical and logical sequence.

2010-03-17-2

This illustration shows the leaf ordering after just a single page split. How unordered will your index leaf be after many page splits?

But it gets worse.

2010-03-17-3

The red arrow shows the allocation order. The black arrows show the logical order.

This illustration shows what happens to the leaf order after a series of random inserts and deletions.

Based on these diagrams, you can see that two forms of fragmentation exist within an index: external fragmentation and internal fragmentation. In the last diagram, the black lines represent external fragmentation, and the amount of data that is filling each index page depicts internal fragmentation.

External fragmentation is always undesirable in an index, but a small amount of internal fragmentation can be desirable in highly transactional databases. However, both large-scale internal and external fragmentation adversely affect data retrieval performance.

External fragmentation causes a noncontiguous sequence of index pages on the disk, with new leaf pages far from the original leaf pages and their physical ordering different from their logical ordering. Consequently, a range scan on an index will need more switches between the corresponding extents than ideally required. Also, a range scan on an index will be unable to benefit from read-ahead operations that are performed on the disk. If the pages are arranged contiguously, a read-ahead operation can read pages in advance without much head movement.

In the case of internal fragmentation, rows are distributed sparsely across a large number of pages, increasing the number of disk I/O operations required to read the index pages into memory and the number of logical reads required to read multiple index rows from the memory.

For general index fragmentation information, you can run the following query in SQL Server Management Studio on the database that you want to analyze:

SELECT * 
FROM sys.dm_db_index_physical_stats (NULL,NULL,NULL,NULL,'DETAILED')

To get the indexes that need to be reorganized, you can run the following query in SQL Server Management Studio, changing the USE statement to the database that you want to check:

USE AdventureWorks
GO

SELECT OBJECT_NAME(ps.[object_id]) AS 'Table Name'
    ,ind.[name] AS 'Index Name'
    ,partition_number 
    ,ps.avg_fragmentation_in_percent --Logical Fragmentation
    ,ps.avg_page_space_used_in_percent --Page Density
FROM sys.dm_db_index_physical_stats (
    NULL
    ,NULL
    ,NULL
    ,NULL
    ,'DETAILED'
    ) ps
INNER JOIN sys.indexes ind
    ON ps.[object_id] = ind.[object_id]
    AND ps.index_id = ind.index_id
WHERE ind.[name] IS NOT NULL
    --AND DATA PAGES > 8 (Table must have more than 8 pages)
    --Logical Fragmentation
    AND (ps.avg_fragmentation_in_percent BETWEEN 10 AND 15
    --Page Density
    OR ps.avg_page_space_used_in_percent BETWEEN 60 AND 75)

To get the indexes that need to be rebuilt, you can run the following query in SQL Server Management Studio, changing the USE statement to the database that you want to check:

USE AdventureWorks
GO

SELECT OBJECT_NAME(ps.[object_id]) AS 'Table Name'
    ,ind.[name] AS 'Index Name'
    ,partition_number 
    ,ps.avg_fragmentation_in_percent --Logical Fragmentation
    ,ps.avg_page_space_used_in_percent --Page Density
FROM sys.dm_db_index_physical_stats (
    NULL
    ,NULL
    ,NULL
    ,NULL
    ,'DETAILED'
) ps
    INNER JOIN sys.indexes ind ON ps.[object_id] = ind.[object_id]
        AND ps.index_id = ind.index_id
WHERE 
    ind.[name] IS NOT NULL
    AND (ps.avg_fragmentation_in_percent > 15
    OR (ps.avg_page_space_used_in_percent > 0 AND
                     ps.avg_page_space_used_in_percent < 60))

and page fragmentation by rebuilding your indexes or by building and dropping a clustered index on a heap table.

Go ahead and try it in your environment. If you have questions or want to share your experience and expertise, speak up in the comments section.