Fixing damaged pages using page restore or manual inserts

Fixing damaged pages using page restore or manual inserts

Rate This
  • Comments 11

Here's an interesting scenario that cropped up today. You have a database on a RAID array that failed and has zero'd out a page. How can you get the data back?

There are two ways to do it, depending on the database recovery model and version of SQL Server - single-page restore or manual insert/select - both of which rely on you having a backup of the database. You can use single-page restore if you're on SQL Server 2005 and the database is in Full or Bulk-Logged recovery mode, otherwise you need to use the manual method, and that will only work as long as you know the data being salvaged hasn't changed since the last backup.

Let's try them both. Here's a script to create a test database and make a backup of it:

-- Create the database.

USE master;

GO

CREATE DATABASE dbccpagetest;

GO

ALTER DATABASE dbccpagetest SET RECOVERY FULL;

GO

-- Create a table to play with.

USE dbccpagetest;

GO

CREATE TABLE sales (

salesID INT IDENTITY,

customerID INT DEFAULT CONVERT (INT, 100000 * RAND ()),

salesDate DATETIME DEFAULT GETDATE (),

salesAmount MONEY);

CREATE CLUSTERED INDEX salesCI ON sales (salesID);

GO

-- Populate the table

SET NOCOUNT ON;

GO

DECLARE @count INT

SELECT @count = 0

WHILE (@count < 5000)

BEGIN

INSERT INTO sales (salesAmount) VALUES (100 * RAND ());

SELECT @count = @count + 1

END;

GO

-- Take a full backup.

BACKUP DATABASE dbccpagetest TO DISK = 'C:\dbccpagetest.bak' WITH INIT;

GO

I'm going to simulate our scenario by shutting down the database and using a hex editor to zero out page 158 of the database. (This translates to byte offset 1294336 of the file being zero'd for 8192 bytes).

Now if I run checkdb, I get the following:

Msg 8909, Level 16, State 1, Line 1

Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 0 (type Unknown), page ID (1:158) contains an incorrect page ID in its page header. The PageId in the page header = (0:0).

CHECKDB found 0 allocation errors and 1 consistency errors not associated with any single object.

Msg 8928, Level 16, State 1, Line 1

Object ID 2073058421, index ID 1, partition ID 72057594038386688, alloc unit ID 72057594042384384 (type In-row data): Page (1:158) could not be processed. See other errors for details.

CHECKDB found 0 allocation errors and 1 consistency errors in table 'sales' (object ID 2073058421).

CHECKDB found 0 allocation errors and 2 consistency errors in database 'dbccpagetest'.

repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (dbccpagetest).

What does the page look like?

DBCC TRACEON (3604);

GO

DBCC PAGE (dbccpagetest, 1, 158, 3);

GO

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

PAGE: (0:0)

BUFFER:

BUF @0x02C0632C

bpage = 0x04C12000 bhash = 0x00000000 bpageno = (1:158)

bdbid = 9 breferences = 0 bUse1 = 37241

bstat = 0xc00009 blog = 0x89898989 bnext = 0x00000000

PAGE HEADER:

Page @0x04C12000

m_pageId = (0:0) m_headerVersion = 0 m_type = 0

m_typeFlagBits = 0x0 m_level = 0 m_flagBits = 0x200

m_objId (AllocUnitId.idObj) = 0 m_indexId (AllocUnitId.idInd) = 0 Metadata: AllocUnitId = 0

Metadata: PartitionId = 0 Metadata: IndexId = -1 Metadata: ObjectId = 0

m_prevPage = (0:0) m_nextPage = (0:0) pminlen = 0

m_slotCnt = 0 m_freeCnt = 0 m_freeData = 0

m_reservedCnt = 0 m_lsn = (0:0:0) m_xactReserved = 0

m_xdesId = (0:0) m_ghostRecCnt = 0 m_tornBits = 16777216

Allocation Status

GAM (1:2) = ALLOCATED SGAM (1:3) = NOT ALLOCATED

PFS (1:1) = 0x60 MIXED_EXT ALLOCATED 0_PCT_FULL DIFF (1:6) = NOT CHANGED

ML (1:7) = NOT MIN_LOGGED

Msg 2514, Level 16, State 5, Line 2

DBCC PAGE error: Invalid page type - dump style 3 not possible.

Note the error at the end of the output - DBCC PAGE can't do an in-depth dump because it doesn't know what page type the page is. Let's try a full page hex dump using dump style 2 instead:

DBCC PAGE (dbccpagetest, 1, 158, 2);

GO

PAGE: (0:0)

<page header etc skipped for brevity>

DATA:

Memory Dump @0x44F3C000

44F3C000: 00000000 00020000 00000000 00000000 †................

44F3C010: 00000000 00000000 00000000 00000000 †................

<deleted for brevity>

44F3DFE0: 00000000 00000000 00000000 00000000 †................

44F3DFF0: 00000000 00000000 00000000 00000000 †................

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

It really is all zero. First we'll fix it using page restore.

USE master;

GO

RESTORE DATABASE dbccpagetest PAGE = '1:158' FROM DISK = 'C:\dbccpagetest.bak';

GO

Processed 1 pages for database 'dbccpagetest', file 'dbccpagetest' on file 1.

The roll forward start point is now at log sequence number (LSN) 32000000047000001. Additional roll forward past LSN 33000000001700001 is required to complete the restore sequence.

RESTORE DATABASE ... FILE=<name> successfully processed 1 pages in 0.176 seconds (0.046 MB/sec).

Isn't that cool? You can restore up to 1000 single pages from a backup at a time. For VLDBs, this cuts the recovery time WAY down. Now we need to roll forward the log. We don't have any more log backups so we can finish the roll forward by backing up and restoring the tail of the log.

-- Need to complete roll forward. Backup the log tail...

BACKUP LOG dbccpagetest TO DISK = 'C:\dbccpagetest_log.bak' WITH INIT;

GO

-- ... and restore it again.

RESTORE LOG dbccpagetest FROM DISK = 'C:\dbccpagetest_log.bak';

GO

Processed 5 pages for database 'dbccpagetest', file 'dbccpagetest_log' on file 1.

BACKUP LOG successfully processed 5 pages in 0.146 seconds (0.248 MB/sec).

Processed 0 pages for database 'dbccpagetest', file 'dbccpagetest' on file 1.

RESTORE LOG successfully processed 0 pages in 0.004 seconds (0.000 MB/sec).

And now we should have a clean database:

DBCC CHECKDB (dbccpagetest) WITH NO_INFOMSGS;

GO

Command(s) completed successfully.

Easy. But what if we can't do a page restore? Assuming I've corrupted the database in exactly the same way again, the first thing is to do is make sure we can restore the backup and then see what data range is on that page:

RESTORE DATABASE dbccpagetest_copy FROM DISK = 'C:\dbccpagetest.bak' WITH

MOVE N'dbccpagetest' TO N'C:\dbccpagetest_copy.mdf',

MOVE N'dbccpagetest_log' TO N'C:\dbccpagetest_log.ldf',

REPLACE;

GO

DBCC PAGE (dbccpagetest_copy, 1, 158, 3);

GO

Processed 184 pages for database 'dbccpagetest_copy', file 'dbccpagetest' on file 1.

Processed 2 pages for database 'dbccpagetest_copy', file 'dbccpagetest_log' on file 1.

RESTORE DATABASE successfully processed 186 pages in 0.361 seconds (4.205 MB/sec).

PAGE: (1:158)

BUFFER:

BUF @0x02BE8D38

bpage = 0x03FB4000 bhash = 0x00000000 bpageno = (1:158)

bdbid = 10 breferences = 1 bUse1 = 38283

bstat = 0xc00009 blog = 0x159a2159 bnext = 0x00000000

PAGE HEADER:

Page @0x03FB4000

m_pageId = (1:158) m_headerVersion = 1 m_type = 1

m_typeFlagBits = 0x4 m_level = 0 m_flagBits = 0x8200

m_objId (AllocUnitId.idObj) = 68 m_indexId (AllocUnitId.idInd) = 256

Metadata: AllocUnitId = 72057594042384384

Metadata: PartitionId = 72057594038386688 Metadata: IndexId = 1

Metadata: ObjectId = 2073058421 m_prevPage = (1:157) m_nextPage = (1:159)

pminlen = 28 m_slotCnt = 245 m_freeCnt = 11

m_freeData = 7691 m_reservedCnt = 0 m_lsn = (24:453:8)

m_xactReserved = 0 m_xdesId = (0:0) m_ghostRecCnt = 0

m_tornBits = -1020457745

Allocation Status

GAM (1:2) = ALLOCATED SGAM (1:3) = NOT ALLOCATED

PFS (1:1) = 0x60 MIXED_EXT ALLOCATED 0_PCT_FULL DIFF (1:6) = NOT CHANGED

ML (1:7) = NOT MIN_LOGGED

Slot 0 Offset 0x60 Length 31

Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP

Memory Dump @0x4542C060

00000000: 10001c00 d5030000 5bd30000 3f771101 †........[...?w..

00000010: b9980000 baa10a00 00000000 0500e0††††...............

UNIQUIFIER = [NULL]

Slot 0 Column 1 Offset 0x4 Length 4

salesID = 981

Slot 0 Column 2 Offset 0x8 Length 4

customerID = 54107

Slot 0 Column 3 Offset 0xc Length 8

salesDate = Jan 17 2007 4:35PM

Slot 0 Column 4 Offset 0x14 Length 8

salesAmount = 69.68

<deleted for brevity>

Slot 244 Offset 0x1dec Length 31

Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP

Memory Dump @0x4542DDEC

00000000: 10001c00 c9040000 bfa10000 57771101 †............Ww..

00000010: b9980000 c6b80500 00000000 0500e0††††...............

UNIQUIFIER = [NULL]

Slot 244 Column 1 Offset 0x4 Length 4

salesID = 1225

Slot 244 Column 2 Offset 0x8 Length 4

customerID = 41407

Slot 244 Column 3 Offset 0xc Length 8

salesDate = Jan 17 2007 4:35PM

Slot 244 Column 4 Offset 0x14 Length 8

salesAmount = 37.50

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

So we're looking at salesID range 981 to 1225 inclusive. Before we can copy the rows back to the damaged database, we need to get rid of the corrupt page. Repair should delete the page for us. First I'm going to take another backup though - just in case something goes wrong!

BACKUP DATABASE dbccpagetest TO DISK = 'C:\dbccpagetest_corrupt.bak' WITH INIT;

GO

ALTER DATABASE dbccpagetest SET SINGLE_USER;

GO

DBCC CHECKDB (dbccpagetest, REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS;

GO

ALTER DATABASE dbccpagetest SET MULTI_USER;

GO

Processed 184 pages for database 'dbccpagetest', file 'dbccpagetest' on file 1.

Processed 4 pages for database 'dbccpagetest', file 'dbccpagetest_log' on file 1.

BACKUP DATABASE successfully processed 188 pages in 0.380 seconds (4.052 MB/sec).

Msg 8909, Level 16, State 1, Line 1

Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 0 (type Unknown), page ID (1:158) contains an incorrect page ID in its page header. The PageId in the page header = (0:0).

The error has been repaired.

CHECKDB found 0 allocation errors and 1 consistency errors not associated with any single object.

CHECKDB fixed 0 allocation errors and 1 consistency errors not associated with any single object.

Repair: The Clustered index successfully rebuilt for the object "dbo.sales" in database "dbccpagetest".

Repair: The page (1:158) has been deallocated from object ID 2073058421, index ID 1, partition ID 72057594038386688, alloc unit ID 72057594042384384 (type In-row data).

Msg 8945, Level 16, State 1, Line 1

Table error: Object ID 2073058421, index ID 1 will be rebuilt.

The error has been repaired.

Msg 8928, Level 16, State 1, Line 1

Object ID 2073058421, index ID 1, partition ID 72057594038386688, alloc unit ID 72057594042384384 (type In-row data): Page (1:158) could not be processed. See other errors for details.

The error has been repaired.

Msg 8976, Level 16, State 1, Line 1

Table error: Object ID 2073058421, index ID 1, partition ID 72057594038386688, alloc unit ID 72057594042384384 (type In-row data). Page (1:158) was not seen in the scan although its parent (1:154) and previous (1:157) refer to it. Check any previous errors.

The error has been repaired.

Msg 8978, Level 16, State 1, Line 1

Table error: Object ID 2073058421, index ID 1, partition ID 72057594038386688, alloc unit ID 72057594042384384 (type In-row data). Page (1:159) is missing a reference from previous page (1:158). Possible chain linkage problem.

The error has been repaired.

CHECKDB found 0 allocation errors and 3 consistency errors in table 'sales' (object ID 2073058421).

CHECKDB fixed 0 allocation errors and 3 consistency errors in table 'sales' (object ID 2073058421).

CHECKDB found 0 allocation errors and 4 consistency errors in database 'dbccpagetest'.

CHECKDB fixed 0 allocation errors and 4 consistency errors in database 'dbccpagetest'.

We should check the row count to see that the count has dropped from the initial 5000 rows we inserted:

USE dbccpagetest;

GO

SELECT COUNT (*) FROM SALES;

GO

SELECT COUNT (*) FROM sales WHERE salesID > 980 AND salesID < 1226;

GO

And we're down to 4755 rows, as expected with zero rows in that range. All we need to do now is to copy the missing rows over from the restored copy. Remember, this will only work if you know that the data being salvaged hasn't changed since the backup was taken - otherwise you'll have old and new data mixed in the table which will play havoc with your business. Before we copy the rows, we know we're got an identity column we'd like to preserve so we set IDENTITY_INSERT on which tells the server not to generate new identity values for the inserted rows.

-- Make sure identity values survive.

SET IDENTITY_INSERT sales ON;

GO

-- Insert the missing rows.

SET NOCOUNT OFF;

GO

INSERT INTO sales (salesID, customerID, salesDate, salesAmount)

SELECT * FROM dbccpagetest_copy.dbo.sales AS R

WHERE R.salesID > 980 AND R.salesID < 1226;

GO

-- Restore identity behavior.

SET IDENTITY_INSERT sales OFF;

GO

(245 row(s) affected)

We copy over 245 rows and checking the row count again says we're back to 5000 rows.

I want to post a lot more scenarios like this from now on - this may mean the posts are a little less meaty than last year but should happen a lot more often. If there's any particular scenario you'd like to see covered, add a comment and let me know.

(First post of the year! The diving vacation was excellent - 100 foot visibility, 85F water, and amazing creatures. I managed to get in 28 dives in the 8 diving days I had. Here's a picture of me about 40ft underwater in Indonesia to close.)

Leave a Comment
  • Please add 5 and 8 and type the answer here:
  • Post
  • Hello. This article was very helpful. I had two questions. First is, if possible, can you tell me how to make a SQL 2000 and SQL 2005 db suspect using a hexeditor?

    Also, if you know of any site or article which helps in actually understanding the bytes and offset etc.. It's very interesting but quite dangerous if I dont know what I'm doing. So, I would like to understand how to read it and so on. Thank you very much for all your help.

  • Great howto indeed :)

    It would be cool to have some DBCC PAGE option to return all rows from the page, like with "select * ..." thing.

  • That's a great article, thank you very much indeed.

  • Hi Tejas,

    Easiest way to make a database suspect is start a transaction, shut down the database, delete th transaction log and startup the database again.

    For a discussion on bytes and offsets within a page, look at my earlier blog posts on using DBCC PAGE (http://blogs.msdn.com/sqlserverstorageengine/archive/tags/On-Disk+Structures/default.aspx)

    Thanks

  • Hi, i am windering that using the Hex Editior, how i can go to the page no 158. and also i didnt understand  this line.  

    "using a hex editor to zero out page 158 of the database. (This translates to byte offset 1294336 of the file being zero'd for 8192 bytes)."

    Thanks

    Zia

  • Hi Zia,

    What I'm explaining is what to do with the hex editor to simulate hardware filling page 158 of file 1 with zeros. Use the hex editor to move to byte offset 1294336 in the data file (this corresponds to the start of page 158) and write 8192 zeros (this corresponds to the length of a page).

    I'm not going to get into how to find or use a hex editor. If you're uncomfortable doing this, shoot me email and I'll send you a corrupt SQL Server 2005 database.

    Thanks and regards.

  • Hi,

    does this procedure work for Databases with Allocation Errors too? Or is there any other way to repair such errors (Restoring does not work; Exporting into an empty Database causes an Error while copying the server objects)?

    Thanks

    MCCBasis

  • Hi MCCBasis,

    You can use page restore to recover from allocation errors but not the manual inserts method. You can also use repair but that will usually result in a lot of data loss as repairs of allocation errors are usually pretty intrusive.

    I can't really advise you on your situation without seeing all the info - I recommend you call Product Support to help you if you're having problems.

    Thanks

  • Thank you for your valuable post .I have a question -

    If there are more than one table then applying the Tlog after PAGE restore will not rollforward transactions on other tables ??

  • its very help ful link to solve my page erroe problem

    SQL Server detected a logical consistency-based I/O error: incorrect pageid.

    sureshbabum.blogspot.in/.../sql-dba-corrupted-db-logical.html

  • How to do this step? I can put database offline but the other part not sure.

    'I'm going to simulate our scenario by shutting down the database and using a hex editor to zero out page 158 of the database. (This translates to byte offset 1294336 of the file being zero'd for 8192 bytes).'

Page 1 of 1 (11 items)