Fragmentation (part 2): What are pages?


(Boston continues its reputation - with me at least - for great seafood - calamari and pan-seared fresh halibut this evening. Yum! I was tempted to get on the T - Boston's subway system - and see what's happening downtown but with no jacket I'd get wet. So, instead I'll be boring and geeky and write another blog post - you guys are spoiled today.)

Hopefully I'm getting close to resolving the chicken-and-egg problem of being able to discuss topics without having to promise further posts to explain terms.

In the previous post in this series I explained what records are. Pages exist to store records. A database page is an 8192-byte (8KB) chunk of a database data file. Pages are aligned on 8KB boundaries within the data files, starting at byte offset 0 in the file.
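
If you want to poke around yourself, here's a rough sketch (the database and table names are made up, and DBCC IND is an undocumented command) showing how to list the pages allocated to a table, and how a page number translates into a byte offset in the file:

    -- Undocumented: lists the pages allocated to a table ('MyDatabase' and 'MyTable' are hypothetical names).
    -- PageFID and PagePID in the output are the file ID and page number of each page.
    DBCC IND ('MyDatabase', 'MyTable', -1);

    -- A page's byte offset within its data file is just its page number multiplied by 8192.
    -- e.g. page (1:280) starts at byte offset 280 * 8192 = 2293760 in file 1.
    SELECT 280 * 8192 AS [byte offset of page 280];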

A page has a structure as shown below (all pages have the same basic structure):

Page header

The 96-byte page header is the same for every page. You can see an example of the fields that are stored within a page header in the last post on DBCC PAGE. When considering fragmentation, the most important fields are:

  • Next page pointer
    • The pages in each level of the index are joined in a doubly-linked list according to the logical order (as defined by the index keys) of the index. The next page pointer does not necessarily point to the immediately adjacent physical page in the file (fragmentation...).
    • As these are doubly-linked lists, there is also a previous page pointer.
  • Page level
    • The various algorithms I'll describe later have to know whether the page being examined is from the leaf level of the index (i.e. page level = 0) or not. Page level increases from the leaf level up to the root page at the top of the b-tree. (The exception is clustered indexes in SQL Server 2000, where there are two level-0 levels - the leaf level and the level above it - this is just for simplicity :-) How do you tell them apart? You need to look at the page type too - data pages are at the leaf level, index pages are in the level above.)
  • Free space count
    • This shows how much free space is available in the page. This is used to calculate the average page density - the less free space, the higher the page density.
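
To see how these fields surface without dumping pages by hand, here's a sketch (SQL Server 2005, with a hypothetical table name) using sys.dm_db_index_physical_stats - the logical fragmentation value is derived from following the next page pointers, and the page density value from the free space counts:

    SELECT
        index_level,                        -- 0 = leaf level
        avg_fragmentation_in_percent,       -- how out-of-order the next page pointers are
        avg_page_space_used_in_percent,     -- average page density, from the free space counts
        page_count
    FROM sys.dm_db_index_physical_stats
        (DB_ID (), OBJECT_ID ('MyTable'), NULL, NULL, 'DETAILED');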

Other interesting fields in the page header are:

  • object ID and index ID
    • in SQL Server 2000, these correspond to the actual relational object ID and index ID to which the page is allocated.
    • in SQL Server 2005, this mapping no longer exists. The two IDs on each page correspond (through a calculation) to the allocation unit ID of the allocation unit the page is allocated to. A series of metadata lookups are necessary to determine the parent object and index.
  • page type
    • There are a bunch of different page types, most of which I'll be explaining more about in subsequent posts over the next few weeks. Some interesting ones are (well, interesting to me anyway):
      • data page (holds data records)
      • index page (holds non-clustered index leaf records and non-leaf records from clustered and non-clustered indexes)
      • IAM page (holds allocation information about extents within a fixed 4GB GAM interval that are allocated to an index or allocation unit, in SQL Server 2000 and 2005 respectively. IAM = Index Allocation Map.)
      • GAM and SGAM pages (holds global allocation information about extents in a GAM interval. GAM = Global Allocation Map. SGAM = Shared GAM.)
      • PFS page (holds allocation and free space information about pages within a PFS interval - approx 64MB. PFS = Page Free Space.)
      • text page (actually two types that hold leaf and intermediate nodes in text trees)
      • sort page (holds sort records being used in active sort operations)
      • differential bitmap page (holds information about which extents in a GAM interval have changed since the last full or differential backup)
      • bulk-changed map page (holds information about which extents in a GAM interval have changed while in bulk-logged mode since the last backup. This is what allows you to switch to bulk-logged mode for bulk-loads and index rebuilds without worrying about breaking a backup chain.)
      • boot page (holds information about the database - one page used per database)
      • fileheader page (holds information about the file - one page per file)
  • slot count
    • This lists the number of occupied record slots on the page (including ghost records).
    • It can also be described as giving the number of entries in the slot array.
  • ghost record count
    • self-explanatory
  • page ID
    • a page's ID is made up of the file ID and the page position within the file and is always written as (file:page)
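
To look at all of these header fields on a real page, you can use DBCC PAGE (again a sketch - DBCC PAGE is undocumented, and the database name and page number here are made up):

    -- DBCC PAGE output goes to the errorlog by default; trace flag 3604 sends it to the client.
    DBCC TRACEON (3604);

    -- DBCC PAGE (dbname or dbid, file ID, page number, print option). Print option 0 dumps just the header.
    DBCC PAGE ('MyDatabase', 1, 280, 0);

    -- In the header output: m_pageId is the (file:page) ID, m_type is the page type,
    -- m_level is the page level, m_prevPage/m_nextPage are the page pointers,
    -- m_objId/m_indexId are the object and index IDs (with the 2000 vs 2005 caveat above),
    -- m_slotCnt is the slot count, m_ghostRecCnt the ghost record count, and m_freeCnt the free space.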

Slot array

It is a very common misconception that records within a page are always stored in logical order. This is not true. There is another misconception that all the free space in a page is always maintained in one contiguous chunk. This also is not true. (Yes, the image above shows the free space in one chunk and that very often is the case for pages that are being filled gradually.)

If a record is deleted from a page, everything remaining on the page is not suddenly compacted - inserters pay the cost of compaction when it's necessary, not deleters.

Consider a completely full page - this means that record deletions cause free space holes within the page. If a new record needs to be inserted onto the page, and one of the holes is big enough to squeeze the record into, why go to the bother of compacting the page? Just stick the record in and carry on.

Ah - but hold on a minute, what if the record should logically have come at the end of all other records on the page, but we've just inserted it in the middle - doesn't that screw things up somewhat?

No, because the slot array is ordered and gets reshuffled as records are inserted into and deleted from pages. As long as the first slot array entry points to the logically first record on the page, everything's fine. Each slot entry is just a two-byte pointer into the page - so it's far more efficient to manipulate the slot array than it is to manipulate a bunch of records on the page. Only when we know there's enough free space within the page to fit a record, but it's spread about the page, do we compact the records on the page to make the free space into a contiguous chunk.

One interesting fact is that the slot array grows backwards from the end of the page, so the free space is squeezed from the top by new rows, and from the bottom by the slot array.
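
You can see this for yourself with DBCC PAGE print option 1, which dumps each record and then the slot array (labelled OFFSET TABLE in the output). The slots are listed in logical order, but the offsets they contain don't have to be ascending within the page. (Again, a sketch with a made-up database name and page number.)

    DBCC TRACEON (3604);

    -- Print option 1 dumps the page header, each record, and the slot array ('OFFSET TABLE').
    -- Slot 0 points to the logically-first record, wherever it happens to be stored on the page.
    DBCC PAGE ('MyDatabase', 1, 280, 1);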

Records

You know what these are already - we talked about them last time!

IO

Pages are the basic unit of IO that SQL Server uses. The buffer pool maps individual pages, and torn-page protection and page checksums are implemented per page. Fragmentation is mostly concerned with pages, and the efficiency of being able to read them in logical order (as defined by their next page pointers).
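
The classic way to see both of these effects (again a sketch, with a made-up table name) is DBCC SHOWCONTIG, which exists in both SQL Server 2000 and 2005:

    -- 'Logical Scan Fragmentation' = percentage of out-of-order pages, i.e. pages whose next page
    --   pointer doesn't point to the next physical page allocated to the index.
    -- 'Avg. Page Density (full)'   = average page fullness, derived from the free space counts.
    DBCC SHOWCONTIG ('MyTable');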

They can be allocated individually or in extents, which we'll cover next time...

Comments
  • Paul - great series.  FYI, the picture of a page structure did not post.
  • Paul, I've recently come across your blogs and I am very excited to find someone specializing in the Storage Engine. One thing that I have always found cryptic is what is TRULY considered to be contiguous. In countless blogs, books, BOL, etc., discussions of logical vs external fragmentation always come back to discussions of contiguity, both logical and physical. DBAs concern themselves with the possibilities of interleaved extents using INDEXDEFRAG and the possible non-contiguous assignment of free extents using DBREINDEX, so we try to find a happy medium (I am way over-simplifying - just making a point). The confusion comes into play in that whenever possible we are always in hot pursuit of contiguous disk storage - hot on the heels of read-ahead benefits, sequential I/O and the avoidance of forward pointers and extent switches, etc. Yet, when turning our asynchronous IOPs over to a RAID array or other storage medium, do we really have any idea what contiguity we are actually getting?

    The following link gets into one example of how folks often think they are getting contiguous physical storage from OS level defragmentation - but are really not:
    http://www.raxco.com/products/perfectdisk2k/whitepapers/pd_raid.pdf

    I have had other discussions with DB vendors (not MS) that stated truly contiguous physical storage in an OLTP environment might actually destroy performance.

    So, (after going on and on) my question for you becomes...  Can you please shed some light on contiguity in the following manner:  Start with disk drives-->up through RAID/Disk controllers-->up through the O/S--> and finally to SQL Server's pages/extents (which are not understood by the OS anyway).

    Please shed some light on what is truly contiguous in nature.

    Thanks in advance.  RH
  • I'm a little confused here about slots and how they track the records. I mean, is there any particular column entry they have, or something else?
    Thanks
    --
    Farhan
  • Hi folks,

    Replies to these comments have been delayed while I was on vacation.

    rhunt - very valid questions and I appreciate the confusion there is out there. I plan to cover 'What is fragmentation?' in post #9 in the fragmentation series. If you can hold on until then, I'll try my best to address your concerns.

    Farhan - let me try to boil down the explanation as far as possible. A table or index row is stored on-disk in a structure called a record. Records are stored on database pages. Records can vary in size over time, and can be deleted and inserted, leaving gaps on the page, as the page is not compacted each time a record is changed/deleted/inserted. For this reason, we must keep track of where each record is stored on the page - we do this using the page slot array (an array of pointers into the page). Each record has an entry in the page slot array - and so is said to occupy a slot.

    An example of how a 'slot' is used: consider a non-clustered index over a heap. Each non-clustered index row contains a physical pointer back to the heap row it maps to. This physical pointer is in the form [file:page:slot] - so the matching heap row can be found by reading the page, going to the slot number in the slot array to find the record's offset in the page, and then reading the record at that offset.

    Let me know if this makes more sense. If not I'll take it offline with you over email.

    Thanks for the comments!
  • hi,

    This blog regarding pages is good. I am now aware of the details of pages, but have one doubt. A page is 8KB, which is 8192 bytes, out of which 96 bytes are used for the page header and 8060 bytes for storing data. Are the remaining 36 bytes used for the slot array? And each entry in the slot array is 2 bytes, so is it that a max of 18 records can exist in a page? Please clarify.
  • Hi Jaya,

    You've misunderstood slightly. 8060 is the maximum amount of column data that can be stored in a single record. In the case where you only have a single maximum-size record on the page, the other 36 bytes are used for the slot array entry, the record overhead, and any possible forwarded-record back-pointer (10 bytes) if the record is a forwarded record in a heap.

    If you have smaller records then the number of rows you can have on a page increases accordingly.

    Hope this explains things.