Ghost records are something that are a bit of an enigma for most folks working with SQL Server, and not just because of the name. Today, I’ll seek to explain the concept, as well as identify some troubleshooting techniques.
The main reason behind introducing the concept of Ghost records was to enhance performance. In the leaf level of an index, when rows are deleted, they're marked as ghost records. This means that the row stays on the page but a bit is changed in the row header to indicate that the row is really a ghost. The page header also reflects the number of ghost records on a page. What this means, in effect, is that the DML operation which fired the delete will return to the user much faster, because it does not have to wait for the records to be deleted physically. Rather, they’re just marked as “ghosted”.
Ghost records are present only in the index leaf nodes. If ghost records weren't used, the entire range surrounding a deleted key would have to be locked. Here’s an example i picked up from somewhere: Suppose you have a unique index on an integer and the index contains the values 1, 30, and 100. If you delete 30, SQL Server will need to lock (and prevent inserts into) the entire range between 1 and 100. With ghosted records, the 30 is still visible to be used as an endpoint of a key-range lock so that during the delete transaction, SQL Server can allow inserts for any value other than 30 to proceed.
SQL Server provides a special housekeeping thread that periodically checks B-trees for ghosted records and asynchronously removes them from the leaf level of the index. This same thread carries out the automatic shrinking of databases if you have that option set.The ghost record(s) presence is registered in:
The ghost records can be cleaned up in 3 ways:
The Ghost cleanup process divides the “ghost pages” into 2 categories:
The Ghost cleanup thread is able to retrieve the list of Cold pages from the DBTABLE for that database, or the PFS Page for that interval. The cleanup task cleans up a maximum of 10 ghost pages at a time. Also, while searching for the ghost pages, if it covers 10 PFS Pages, it yields.
As far as hot ghost pages are concerned, the ghost cleanup strives to keep the number of such pages below a specified limit. Also, if the thread cleans up 10 hot ghost pages, it yields. However, if the number of hot ghost pages is above the specified (hard-coded) limit, the task runs non-stop till the count comes down below the threshold value.
If there is no CPU usage on the system, the Ghost cleanup task runs till there are no more ghost pages to clean up.
Troubleshooting
So now we get to the interesting part. If your system has some huge delete operations, and you feel the space is not being freed up at all or even not at the rate it should be, you might want to check if there are ghost records in that database. I’ll try to break down the troubleshooting into some logical steps here:
Another thing that deserves mention is the special role of the PAGLOCK hint w.r.t ghost records:
These steps might or might not solve your problem, but what they will do is give you an insight into how the SQL Server Database Engine works w.r.t Ghost records and their cleanup. One of the most common (and quickest) resolutions for a ghost records issue is to restart SQL Server.
Once again, this post does not come with any guarantees, and the contents are in no way endorsed by Microsoft or any other corporation or individual.
Hope this helps you understand the concept of Ghost Records somewhat. You’re more than welcome to share your experiences/opinions/knowledge in the comments section, and I shall be delighted to include them in the contents of the post if suitable.