Don't try this at home kids... (data recovery using DBCC PAGE)

Don't try this at home kids... (data recovery using DBCC PAGE)

Rate This
  • Comments 5

Time to cough up a real-life data recovery scenario to temporarily stop those people who badger me relentlessly for scenarios...

We have a situation currently with a customer who managed to delete all their data during a SAN reconfig and then found out that their tape backups hadn't been working properly (the same sad story I've heard a thousand times). They did have a primary filegroup backup plus a backup of a secondary filegroup from a different point. This is on SQL Server 2005.

The backups are restored, but the storage engine metadata for the table with the critical data in is messed up - sys.allocunits somehow has the wrong allocation unit IDs, first page, and root pages. So, what the hell can you do?

Easy (and fun if you're twisted like me).

We need to find the following info for the data allocation unit:

  • the page ID of the head of the IAM chain for the allocation unit
  • the root page of the clustered index
  • the first page of the clustered index

And there's a twist, the table has a column of type image so we also need to find the allocation unit ID and page ID of the head of the IAM chain for the LOB allocation unit.

The easiest way to do this is to scan through the data files, doing DBCC PAGEs of every page and storing the results in a table for later analysis. How? Ah, there's something I haven't told you about DBCC PAGE - it has a WITH TABLERESULTS option.

So, setup a simple script to INSERT/EXECDBCC PAGE of every page with option 0 (for speed, otherwise it has to take the time to go through and crack and format every record on every page) and save the results to a table.

To find the root page of the clustered index, query the table for the page with m_type = 2 (an index page) and the highest m_level. The level number increases from the leaf of the b-tree to the root page, so the page with the highest level is the root.

To find the first page, crack open the first record on the root page, and extract the page ID of the child page that the index record points to. This is the left-hand edge of the next level lower in the b-tree. Continue doing this until you find the page with m_level = 0. (Note that on SQL Server 2000, you'll need to look for m_level = 0 AND m_type = 1 - this is because pages at the leaf level and the level above in a SQL Server 2000 clustered index all have m_level = 0).

To find the head of the IAM chain is a little trickier. You need to find any page with m_type = 10 and following the m_prevPage links until its NULL. Then do an option 3 dump of the page and ensure that the sequenceNumber in the IAM page header is 0 (remember that an IAM chain is ordered by the sequence numbers in the IAM pages).

Cool - we've done the clustered index.

Now for the LOB allocation unit. Crack any row at the leaf level of the clustered index. You'll see that one of the columns is a text pointer, containing a page ID and slot number of the text fragment in the LOB allocation unit. Do a DBCC PAGE of that page and now you know the allocation unit ID of the LOB allocation unit and can use the methodology above to find the head of its IAM chain.

Now all you have to do is alter the hidden and unbindable sys.allocunits table to contain the correct data... Who's buying the beer? :-)

 

Leave a Comment
  • Please add 3 and 3 and type the answer here:
  • Post
  • OK, so I was  inspired by WITH TABLERESULTS option, and started cooking up something like this:

    DECLARE
    @INPUT VARCHAR(50),
    @INDEX1 INT,
    @INDEX2 INT,
    @obj int

    /* paste here the value from waitresource */

    SET @INPUT=RTRIM((select waitresource from master..sysprocesses with (nolock) where spid=58))

    /* get the position of the first colon */
    SET  @INDEX1=CHARINDEX(':',@INPUT)

    /* get the position of the second colon */
    SET @INDEX2=@INDEX1+CHARINDEX(':',right(@INPUT,LEN(@INPUT)-@INDEX1))

    /* create a temp table to hold dbcc page results */
    create table #my_tb
    (
    parentObject varchar(200),
    object varchar(200),
    field varchar(200),
    value varchar(200)
    )

    /* convert the page address format from colon delimited to comma delimited */
    DECLARE @STR VARCHAR (2000)
    SET @STR='dbcc PAGe ('+LEFT(@INPUT,@INDEX1-1)+','+
    SUBSTRING(@INPUT,@INDEX1+1,@INDEX2-@INDEX1-1)+','+
    RIGHT(@INPUT,LEN(@INPUT)-@INDEX2)+',0) with tableresults'

    /* store the output of dbcc page into the temp table */
    insert into #my_tb exec(@STR)

    /* query sysobjects to find out what object this page belongs to */
    set @obj=(select value from #my_tb where field='m_objid')

    declare
    @statement nvarchar(500),
    @params1 Nvarchar(200),
    @params2 Nvarchar(200),
    @name_db sysname,
    @name_obj sysname

    set @name_db=db_name(LEFT(@INPUT,@INDEX1-1))
    set @name_obj=object_name(@obj)


    set @statement='select @db_name as [Database], @obj_name as [Object], Type from '+
    rtrim(@name_db)+'..sysobjects where id=@obj'
    set @params1='@db_name sysname, @obj_name sysname, @obj int'
    set @params2='@db_name=db_name(LEFT(@INPUT,@INDEX1-1)),@obj_name=object_name(@obj)'
    exec sp_executesql @statement,@params1,
    @db_name=@name_db,@obj_name=@name_obj,@obj=@obj

    /* clean up */
    drop table #my_tb

    now, there are a couple of things that I am looking forward to changing in this.

    First is simple, which is to ignore null results (because waitresource column can sometimes be empty).

    Next, dbcc page gives object id for pfs pages 99 (or at least in my DB), object_name(99) gives "allocation", and of course there is no object with id=99 in sysobjects and the code above does not work. My question is what other special object ids exist out there that can through a similar monkey ranch into this code?

    finally, for more of a long term research, I would like to figure out how to handle "KEY" and "TAB" values in waitresource.

    Looking forward to future posts,

    Mor Deror
  • Cool script Mor. Object ID 99 is the only 'special' one we have - it's used for all allocation bitmaps except IAM pages.
  • Paul, it is a good article. But why you don't use a DBCC IND command ???

    It shows physical info about indexes and don't need to search root and first pages of index.

    Oh... DBCC IND works with clause 'with tableresults' too =)

  • I'll do a post on DBCC IND soon - both Kimberly and Kalen are hassling me about that. One undocumented command at a time :-)

  • Cool) I'll be wating for future posts. This topic is very interesting to me. And you are the best, who describe it... Very very useful posts to understand Storage Engine

Page 1 of 1 (5 items)