<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>SQL Server Storage Engine : CHECKDB Series</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx</link><description>Tags: CHECKDB Series</description><dc:language>en</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>How long does *your* CHECKDB take?</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/2007/02/13/how-long-does-your-checkdb-take.aspx</link><pubDate>Tue, 13 Feb 2007 04:29:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1666207</guid><dc:creator>Paul Randal - MSFT</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/sqlserverstorageengine/comments/1666207.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlserverstorageengine/commentrss.aspx?PostID=1666207</wfw:commentRss><description>&lt;P&gt;Following on from my post a couple of weeks ago (&lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2007/01/24/how-long-will-checkdb-take-to-run.aspx"&gt;https://blogs.msdn.com/sqlserverstorageengine/archive/2007/01/24/how-long-will-checkdb-take-to-run.aspx&lt;/A&gt;), I'm very interested to know how long it takes for your CHECKDBs to run, so I can get an idea of the distribution of run-times on various kinds of hardware for various size databases.&lt;/P&gt;
&lt;P&gt;So, if you have a couple of minutes, I'd be grateful if you could c&amp;amp;p the following into an email to me, &lt;A href="mailto:prandal@microsoft.com"&gt;prandal@microsoft.com&lt;/A&gt;, and answer the questions for one or more of your databases. I'd especially like to hear from those of you who work for Microsoft and look after multiple customers. If I get a worthwhile number of responses (say 100 or so) then I'll collate the responses and post a summary. I'd love to hear about databases larger than 1TB.&lt;/P&gt;
&lt;P&gt;For the&amp;nbsp;first 10 Microsoft and the first 10 non-Microsoft people to respond, I'll send you the Always-On DVD loaded with Hands-On Labs that walk you through a bunch of the various HA/DR technologies. (I'll let you know if you've won and need to send me your address.)&lt;/P&gt;
&lt;P&gt;Many thanks!&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;How large is the database?&lt;/LI&gt;
&lt;LI&gt;What's the size of the largest table?&lt;/LI&gt;
&lt;LI&gt;Do you have any indexed views?&lt;/LI&gt;
&lt;LI&gt;Which command do you run, including options?&lt;/LI&gt;
&lt;LI&gt;How long does the command take to run?&lt;/LI&gt;
&lt;LI&gt;How often do you run it?&lt;/LI&gt;
&lt;LI&gt;Which version &amp;amp; edition of SQL Server are you running?&lt;/LI&gt;
&lt;LI&gt;(If SQL Server 2005, did you notice an increase or decrease in runtime?)&lt;/LI&gt;
&lt;LI&gt;How many CPUs/cores does the server have?&lt;/LI&gt;
&lt;LI&gt;How much memory does the server have?&lt;/LI&gt;
&lt;LI&gt;What platform is the server?&lt;/LI&gt;
&lt;LI&gt;Describe your IO subsystem?&lt;/LI&gt;
&lt;LI&gt;Any comments?&lt;/LI&gt;&lt;/UL&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1666207" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/DBCC/default.aspx">DBCC</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx">CHECKDB Series</category></item><item><title>CHECKDB (Part 8): Can repair fix everything?</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/2007/02/04/checkdb-part-8-did-repair-fix-everything.aspx</link><pubDate>Sun, 04 Feb 2007 16:53:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1598373</guid><dc:creator>Paul Randal - MSFT</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/sqlserverstorageengine/comments/1598373.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlserverstorageengine/commentrss.aspx?PostID=1598373</wfw:commentRss><description>&lt;P&gt;I was teaching at a Microsoft-internal class last week and there was a discussion on what corruptions can't be repaired using DBCC. At the same time, several threads popped up on forums and newsgroups with people hitting some of these unrepairable corruptions so I thought that would make a good topic for the next post in the CHECKDB series.&lt;/P&gt;
&lt;P&gt;Before anyone takes this the wrong&amp;nbsp;way - what do I mean by&amp;nbsp;"can't be repaired"? Remember that the purpose of repair is to make the database structurally consistent, and that to do this usually means deleting the corrupt data/structure (that's why the option to do this was aptly named REPAIR_ALLOW_DATA_LOSS - see &lt;A class="" href="http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/16/633645.aspx" mce_href="http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/16/633645.aspx"&gt;this post&lt;/A&gt; for more explanation of why repair can be bad). A corruption is deemed unrepairable when it doesn't make sense to repair it given the damage the repair would cause, or the corruption is so rare and so complicated to repair correctly that it's not worth the engineering effort to provide a repair. Remember also that &lt;U&gt;recovery from corruptions should be based on a sound backup strategy&lt;/U&gt;, not on running repair, so making this trade-off in functionality makes perfect sense.&lt;/P&gt;
&lt;P&gt;Here are a few of the more common unrepairable corruptions that people run into, along with the reasons they can't be repaired by DBCC.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;PFS page header corruption&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;An example of this is on SQL Server 2005:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;Msg 8946, Level 16, State 12, Line 1&lt;BR&gt;Table error: Allocation page (1:13280496) has invalid PFS_PAGE page header values.&lt;BR&gt;Type is 0. Check type, alloc unit ID and page ID on the page.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;CHECKDB uses the PFS pages to determine which pages are allocated - and so which pages to read to drive the various consistency checks. The only repair for a PFS page is to reconstruct it - they can't simply be deleted as they're a fixed part of the fabric of the database. PFS pages cannot be rebuilt because there is no infallible way to determine which pages are allocated or not. There are various algorithms I've experimented with to rebuild them with optimistic or pessimistic setting of page allocation statuses and then re-run the various consistency checks to try to sort out the incorrect choices, but they all require very long run-times. Given the frequency with which we see these corruptions, and the engineering effort required to come up with an (imperfect) solution, I made the choice to leave this as unrepairable.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Critical system table clustered-index leaf-page corruption&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;An example of this is on SQL Server 2000:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;Server: Msg 8966, Level 16, State 1, Line 1&lt;BR&gt;Could not read and latch page (1:18645) with latch type SH. sysindexes failed.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;And on SQL Server 2005:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;&lt;FONT size=1&gt;Msg 7985, Level 16, State 2, Server SUNART, Line 1&lt;BR&gt;System table pre-checks: Object ID 4. Could not read and latch page (1:51) with&lt;BR&gt;latch type SH. Check statement terminated due to unrepairable error.&lt;/FONT&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;In a &lt;A class="" href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/19/636410.aspx" mce_href="https://blogs.msdn.com:443/sqlserverstorageengine/archive/2006/06/19/636410.aspx"&gt;previous post&lt;/A&gt; I described how and why we do special checks of the clustered indexes of the critical system tables. If any of the pages at the leaf-level of these indexes are corrupt, we cannot repair them. Repairing would mean deallocating the page, wiping out the most important metadata for potentially hundreds of user tables and so effectively deleting all of these tables. That's obviously an unpalatable repair for anyone to allow and so we don't do it.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Column value&amp;nbsp;corruption&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;Here's an example of this on SQL Server 2005:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;Msg 2570, Level 16, State 3, Line 1&lt;BR&gt;Page (1:152), slot 0 in object ID 2073058421, index ID 0, partition ID 72057594038321152, alloc unit ID 72057594042318848 (type "In-row data"). Column "c1" value is out of range for data type "datetime".&amp;nbsp; Update column to a legal value.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;This is where a column has a stored value that is outside the valid range for the column type. There are a couple of repairs we &lt;EM&gt;could&lt;/EM&gt; do for this:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;delete the entire record&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;insert a dummy value&lt;/DIV&gt;&lt;/LI&gt;&lt;/OL&gt;
&lt;P mce_keep="true"&gt;#1 isn't very palatable because then we've lost data, and it's not a &lt;EM&gt;structural&lt;/EM&gt; problem in the database so it doesn't &lt;EM&gt;have&lt;/EM&gt; to be repaired. #2 is dangerous - what value should we choose as the dummy value? Any value we put in may adversely affect business logic, or fire a trigger, or have some unwelcome meaning in the context of the table - even a NULL. Given these problems, we chose to allow people to fix the corrupt values themselves.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Metadata corruption&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;Here's an example of this on SQL Server 2005:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;Msg 3854, Level 16, State 1, Line 2&lt;BR&gt;Attribute (referenced_major_id=2089058478) of row (class=0,object_id=2105058535,column_id=0,referenced_major_id=2089058478,referenced_minor_id=0) in sys.sql_dependencies has a matching row (object_id=2089058478) in sys.objects (type=SN) that is invalid.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P mce_keep="true"&gt;This example is relatively benign. There are other examples that will cause CHECKDB to terminate - not as bad as the critical system table corruption example above, but enough that CHECKDB doesn't trust the metadata enough to use it to drive consistency checks. Repairing metadata corruption has the same problems as repairing critical system table corruption - any repair means deleting metadata about one or more tables, and hence deleting the tables themselves. It's far better to leave the corruption unrepaired so that as much data as possible can be extracted from the remaining tables.&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Summary&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;Repair can't fix everything. You may end up having to perform manual and time-consuming data extraction from the corrupt database and losing lots of data because of, say,&amp;nbsp;a critical system table corruption. Bottom line (as usual) - make sure you have valid backups so you don't get into this state!&lt;/P&gt;
&lt;P mce_keep="true"&gt;&lt;BR&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1598373" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/DBCC/default.aspx">DBCC</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/Disaster+Recovery/default.aspx">Disaster Recovery</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx">CHECKDB Series</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/On-Disk+Structures/default.aspx">On-Disk Structures</category></item><item><title>CHECKDB (Part 7): How long will CHECKDB take to run?</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/2007/01/24/how-long-will-checkdb-take-to-run.aspx</link><pubDate>Wed, 24 Jan 2007 13:12:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:1521725</guid><dc:creator>Paul Randal - MSFT</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/sqlserverstorageengine/comments/1521725.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlserverstorageengine/commentrss.aspx?PostID=1521725</wfw:commentRss><description>&lt;P&gt;This is a question I see every so often and it cropped up again this morning so I'll use it as the subject for this week's blog post.&lt;/P&gt;
&lt;P&gt;There are several ways I &lt;EM&gt;could&lt;/EM&gt; answer this:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;the&amp;nbsp;unhelpful answer&lt;/STRONG&gt; - I've got no idea.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;the almost-helpful answer&lt;/STRONG&gt; - how long did it take to run last time and are the conditions exactly the same?&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;the answer I usually give&lt;/STRONG&gt; - it depends.&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;Now, many people would see the third answer as being somewhat equivalent to the first answer - unhelpful. The problem is that there are many factors which influence how long CHECKDB will take to run. Let me explain the ten most important factors so you get an idea why this is actually a helpful answer. These aren't in any particular order of importance.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;1) The size of the database&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Pretty obvious...&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;2) Concurrent IO load on the server&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;At the simplest level, what is CHECKDB going to do? It reads every allocated page in the database. That's a lot of IO. CHECKDB takes great pains to do the most efficient IO it can and read the database pages in their physical order with plenty of readahead so that the disk heads move smoothly across the disks (rather than jumping around randomly and incurring disk head seek delays). If there's no concurrent IO load on the server, then the IOs will be as efficient as CHECKDB can make them. However, introducing any additional IO from SQL Server means that the disk heads will be jumping around - slowing down the CHECKDB IOs. If the IO subsystem is at capacity already from CHECKDB's IO demands, any additional IO is going to reduce the IO bandwidth available to CHECKDB - slowing it down.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;3) Concurrent CPU activity on the server&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;At the next level of simplicity, CHECKDB is going to process every page it reads in some way. Depending on the various options you've specified and the database schema (details below), that's going to use a lot of CPU - it's possible that the server may be pegged at 100% CPU when CHECKDB is running. If there's any additional workload on the server, that's going to take CPU cycles away from CHECKDB and slow it down.&lt;/P&gt;
&lt;P mce_keep="true"&gt;Basically what these last two points are saying is that CHECKDB is &lt;EM&gt;very resource intensive!&lt;/EM&gt; It's probably one of the most resource intensive things you can ask SQL Server to do and so it's usually a good idea to not run it during peak workload times, as you'll not only cause CHECKDB to take longer to run, you will slow down the concurrent workload, possibly unacceptably.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;4) Concurrent update activity on the database&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This is relevant for both SQL 2000 and SQL 2005, but for different reasons.&lt;/P&gt;
&lt;P&gt;In SQL 2000, CHECKDB gets its consistent view of the database from transaction log analysis of concurrent DML transactions (see &lt;A class="" href="http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/09/623789.aspx" target=_blank mce_href="http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/09/623789.aspx"&gt;CHECKDB (Part 1): How does CHECKDB get a consistent view of the database&lt;/A&gt; for full details). The more concurrent DML there is while CHECKDB is running, the more transaction log will be generated - and so the longer it will take for CHECKDB to analyze that transaction log. It's possible that on a large multi-CPU box with a ton of concurrent DML and CHECKDB limited to a single CPU that this phase of CHECKDB could take several times longer than the reading and processing of the database pages! (I've seen this in real-life several times.)&lt;/P&gt;
&lt;P&gt;In SQL 2005, CHECKDB gets its consistent view of the database from a database snapshot, which is stored on the same disk volumes as the database itself. If there are&amp;nbsp;a lot of changes in the database while CHECKDB is running, the changed pages are pushed to the snapshot so that it remains consistent. As the snapshot is stored on the same disks as the database, every time a page is pushed to the snapshot, the disk head has to move, which interrupts the efficient IO described in #2. Also, whenever CHECKDB goes to read a page and it needs to read the page from the snapshot files instead of the database files, that's another disk head move, and another efficient IO interruption. The more concurrent changes to the database, the more interruptions to efficient IO and the slower CHECKDB runs.&lt;/P&gt;
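&lt;P&gt;To make those two extra disk-head moves concrete, here's a toy copy-on-write sketch in Python. It's only a model of the behavior described above, not how SQL Server implements database snapshots - the class name, the method names and the page dictionary are all made up for illustration.&lt;/P&gt;
&lt;PRE&gt;
```python
class SnapshotSketch:
    """Toy copy-on-write snapshot: a page is copied out just before its first change."""

    def __init__(self, database_pages):
        self.database = database_pages  # live pages, keyed by page ID
        self.pushed = {}                # original images of pages changed since creation

    def write(self, page_id, new_contents):
        # The first change to a page pushes its original image to the snapshot
        # (an extra write, so an extra disk-head move in the text's terms).
        if page_id not in self.pushed:
            self.pushed[page_id] = self.database[page_id]
        self.database[page_id] = new_contents

    def read(self, page_id):
        # A snapshot reader sees the pushed image if there is one,
        # otherwise the page comes straight from the database files.
        return self.pushed.get(page_id, self.database[page_id])


# Hypothetical two-page database with one concurrent update.
snapshot = SnapshotSketch({1: "page one v1", 2: "page two v1"})
snapshot.write(1, "page one v2")
print(snapshot.read(1))  # the image from when the snapshot was created
print(snapshot.read(2))  # unchanged, so read from the database itself
```
&lt;/PRE&gt;
&lt;P&gt;Every entry that ends up in the pushed dictionary stands for one interruption on the write side and potentially another on the read side, which is why heavy concurrent update activity hurts.&lt;/P&gt;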
&lt;P&gt;&lt;STRONG&gt;5) Throughput capabilities of the IO subsystem&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This one's simple. CHECKDB is going to do a boat-load of IOs and it could even end up being IO-bound (meaning that the CPUs are idle periodically waiting for IOs to complete) depending on the options specified and the database schema. This means that the throughput of the IO subsystem is going to have a direct effect on the run-time of CHECKDB. So, if you have a 1TB database and the IO subsystem can only manage 100MB/sec, it's going to take almost 3 hours just to read the database (1TB / 100MB per sec / 3600 secs per hour) and there's nothing you can do to speed that up except upgrade the IO subsystem.&lt;/P&gt;
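&lt;P&gt;The same back-of-the-envelope calculation as a short Python sketch, in case you want to plug in your own numbers. The function name is made up, and it assumes binary units (1TB = 1024 * 1024 MB) and a sustained throughput figure - it gives a lower bound on the read time, not a prediction of the total CHECKDB run-time.&lt;/P&gt;
&lt;PRE&gt;
```python
def hours_to_read(db_size_mb, throughput_mb_per_sec):
    """Lower bound on run-time: hours needed just to read every page once."""
    seconds = db_size_mb / throughput_mb_per_sec
    return seconds / 3600


# 1TB expressed in MB, read at a sustained 100MB/sec.
one_tb_in_mb = 1024 * 1024
print(f"{hours_to_read(one_tb_in_mb, 100):.2f} hours")  # prints "2.91 hours"
```
&lt;/PRE&gt;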
&lt;P&gt;I've lost count of the number of times I've heard customers complain that CHECKDB (or index rebuilds or other IO-heavy operations) are running sloooowly only to find that the disk queue lengths are enormous and the IO subsystem is entirely unmatched to the server and workload.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;6) The number of CPUs on the box&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This also really encompasses the Edition of SQL Server that's being run. In Enterprise Edition, CHECKDB can run in parallel across all the CPUs in the box (or as many as the query processor decides to parallelize over when the CHECKDB internal queries are compiled). Running in parallel can give a significant performance boost to CHECKDB and lower run times, as long as the database is also spread over multiple files (so the IOs can be parallelized). There's a nifty algorithm we use that allows us to run in parallel which I'll explain in detail in a future post.&lt;/P&gt;
&lt;P&gt;On the other hand, however, the fact that CHECKDB can run in parallel in Enterprise Edition can be bad for some scenarios, and so some DBAs choose to force CHECKDB to be single-threaded. The way to do this is to turn on the documented trace flag 2528.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;7) The speed of the disks where tempdb is placed&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Running CHECKDB against a VLDB uses lots of memory for internal state and for VLDBs the memory requirement usually exceeds the amount of memory available to SQL Server. In this case, the state is spooled out to tempdb and so the performance of tempdb can be a critical factor in CHECKDB performance.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;8) The complexity of the database schema&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This can have a really high impact on the run-time of CHECKDB because it impacts the amount of CPU that CHECKDB requires. For example, the most expensive checks that CHECKDB does are for non-clustered indexes. We need to check that each row in a non-clustered index maps to exactly one row in the heap or clustered index for the table, and that every heap/clustered index row has exactly one matching row in each non-clustered index. Although there's a highly efficient algorithm for doing this, it still takes around 30% of the total CPU that CHECKDB uses!&lt;/P&gt;
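&lt;P&gt;To show what that one-to-one mapping means, here's a deliberately naive cross-check in Python. This is &lt;EM&gt;not&lt;/EM&gt; the algorithm CHECKDB uses (which isn't described in this post) - it's just the rule itself written out, with made-up names and an in-memory stand-in for the table and its index.&lt;/P&gt;
&lt;PRE&gt;
```python
from collections import Counter


def check_nonclustered_index(base_rows, index_rows, key_columns):
    """Naive cross-check: each base row needs exactly one index row, and vice versa."""
    # What each base row should look like in the index: (row locator, key values).
    expected = Counter(
        (locator, tuple(row[column] for column in key_columns))
        for locator, row in base_rows.items()
    )
    actual = Counter(index_rows)
    missing = expected - actual  # base rows with no matching index row
    extra = actual - expected    # index rows with no matching base row, or duplicates
    return sorted(missing), sorted(extra)


# Hypothetical heap of two rows, with an index on c1 whose second entry is corrupt.
base_rows = {1: {"c1": "a", "c2": 10}, 2: {"c1": "b", "c2": 20}}
index_rows = [(1, ("a",)), (2, ("x",))]
missing, extra = check_nonclustered_index(base_rows, index_rows, ["c1"])
print(missing)  # [(2, ('b',))]
print(extra)    # [(2, ('x',))]
```
&lt;/PRE&gt;
&lt;P&gt;Even this toy version has to touch every base row and every index row, which gives a feel for why these checks account for so much of the CPU.&lt;/P&gt;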
&lt;P&gt;There are&amp;nbsp;a bunch of other checks that are only done if the features have been used in the database - e.g. computed column evaluation, links between off-row LOB values, Service Broker, XML indexes, indexed views - so you can see that empirical factors alone aren't enough to determine the run-time.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;9) Which options you've specified&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This is almost the same as #8 in that by specifying various options you're limiting what checks CHECKDB actually performs. For instance, using the NOINDEX option will turn off the non-clustered index checks that I described in #8 and using the WITH PHYSICAL_ONLY option will turn off &lt;EM&gt;all&lt;/EM&gt; logical checks, vastly decreasing the run-time of CHECKDB and making it nearly always IO-bound rather than CPU-bound (in fact this is the most common option that DBAs of VLDBs use to make the run-time of CHECKDB manageable).&lt;/P&gt;
&lt;P&gt;One thing to be aware of - if you specify any repair options, CHECKDB &lt;EM&gt;always&lt;/EM&gt; runs single-threaded, even on a multi-proc box.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;10) The number and type of corruptions that exist in the database&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Again, this is similar to #8 and #9. If there are any corruptions present, there may be extra checks triggered to try to figure out more details of the corruptions. For instance, for the non-clustered index checks, the algorithm is tuned very heavily for the case when there are no corruptions present (the overwhelming majority of cases considering the millions of times CHECKDB is run every day around the world). When a non-clustered index corruption is detected, a more in-depth algorithm has to be used to figure out exactly where the corruption is, which involves re-scanning a bunch of data and so taking a bunch more time. There are a few other algorithms like this too.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Summary&lt;/STRONG&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So you can see that there's no simple answer.&lt;/P&gt;
&lt;P&gt;Now, there's only one time when you should be trying to work out how long a CHECKDB is going to take - when you're planning your regular database maintenance (if you find that it's going to take too long, check out some of the options for splitting up the checks in my previous post on &lt;A class="" href="http://blogs.msdn.com/sqlserverstorageengine/archive/2006/10/20/consistency-checking-options-for-a-vldb.aspx" target=_blank mce_href="http://blogs.msdn.com/sqlserverstorageengine/archive/2006/10/20/consistency-checking-options-for-a-vldb.aspx"&gt;CHECKDB (Part 6): Consistency checking options for a VLDB&lt;/A&gt;). If you're faced with a corrupt (or suspected corrupt)&amp;nbsp;database and you're only just starting to think about how long a CHECKDB is going to take - you've screwed up. You should at least know how long a CHECKDB &lt;EM&gt;usually&lt;/EM&gt; takes.&lt;/P&gt;
&lt;P&gt;Of course, there's one more &lt;EM&gt;really&lt;/EM&gt; unhelpful answer I could have given - &lt;EM&gt;how long is a piece of string?!? :-)&lt;/EM&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=1521725" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/DBCC/default.aspx">DBCC</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/Disaster+Recovery/default.aspx">Disaster Recovery</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx">CHECKDB Series</category></item><item><title>CHECKDB (Part 6): Consistency checking options for a VLDB</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/2006/10/20/consistency-checking-options-for-a-vldb.aspx</link><pubDate>Fri, 20 Oct 2006 02:57:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:852006</guid><dc:creator>Paul Randal - MSFT</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/sqlserverstorageengine/comments/852006.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlserverstorageengine/commentrss.aspx?PostID=852006</wfw:commentRss><description>&lt;P&gt;&lt;EM&gt;(Yippee - just finished my certification dives and got my PADI Open Water certification - just in time for our dive trip to Indonesia in December :-)&lt;/EM&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is a question that comes up a lot - in fact 3 times this week already - most recently during a guest lecture I did on DBCC for one of Kimberly's popular "&lt;A class="" href="http://www.sqlskills.com/default.asp" mce_href="http://www.sqlskills.com/default.asp"&gt;Immersion Events&lt;/A&gt;". The question is: &lt;EM&gt;how can I run consistency checks on a VLDB?&lt;/EM&gt; &lt;/P&gt;
&lt;P&gt;We're talking hundreds of&amp;nbsp;GBs or&amp;nbsp;1 TB or more. These databases are&amp;nbsp;now common on SQL Server 2000 and 2005, with more migrations happening all the time.&amp;nbsp;Any sensible DBA knows the value of running consistency checks, even when the system is behaving perfectly and the hardware is rock-solid. The two problems that people have with running a full DBCC CHECKDB on their VLDB are:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;It takes a long time to run (proportional to the database size and schema complexity).&lt;/LI&gt;
&lt;LI&gt;It uses lots of resources.&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;So it uses lots of resources for a long time.&amp;nbsp;Even with a decent sized maintenance window, the CHECKDB may well run over into normal operations.&amp;nbsp;There's also the case of a system that's&amp;nbsp;already pegged in one or more resource dimensions (memory, CPU, IO bandwidth).&amp;nbsp;Whatever the case, there are a number of options:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Don't run consistency checks&lt;/LI&gt;
&lt;LI&gt;Run a DBCC CHECKDB using the WITH PHYSICAL_ONLY option&lt;/LI&gt;
&lt;LI&gt;Use SQL Server 2005's partitioning feature and devise a consistency checking plan around that&lt;/LI&gt;
&lt;LI&gt;Figure out your own scheme to divide up the consistency checking work over several days&lt;/LI&gt;
&lt;LI&gt;Use a separate system&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;Let's look at each in turn.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Don't run consistency checks&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Don't be daft. Don't even think about using this option. If you absolutely&amp;nbsp;cannot figure out a way to get consistency checks on your system, send me email and I'll help you. Now let's move on to serious options...&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Use WITH PHYSICAL_ONLY&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;A full CHECKDB does a lot of stuff - see my &lt;A class="" href="https://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx" mce_href="https://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx"&gt;CHECKDB internals series&lt;/A&gt; for more details. You can vastly reduce the run-time and resource-usage of CHECKDB by using the WITH PHYSICAL_ONLY option. With this option, CHECKDB will&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Run the equivalent of&amp;nbsp;DBCC CHECKALLOC (i.e. check all the allocation structures)&lt;/LI&gt;
&lt;LI&gt;Read and audit every allocated page in the database&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;So it skips all the logical checks, inter-page checks, and things like DBCC CHECKCATALOG. The fact that all allocated pages are read means that:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Any pages that cannot be read at all (i.e. 823 errors) will be discovered&amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;If&amp;nbsp;page checksums are enabled in SQL Server 2005, any corruptions caused by storage hardware will be discovered (as the checksum stored on the page will no longer match the page contents).&lt;/LI&gt;&lt;/UL&gt;
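&lt;P&gt;Here's a small Python sketch of why simply reading a checksummed page is enough to catch that kind of damage. It uses CRC32 from the standard library purely as a stand-in - it is not SQL Server's page checksum algorithm - and the page layout and function names are made up.&lt;/P&gt;
&lt;PRE&gt;
```python
import zlib


def write_page(contents):
    """Store a page together with a checksum of its contents (CRC32 as a stand-in)."""
    return {"data": bytes(contents), "checksum": zlib.crc32(contents)}


def audit_page(page):
    """Recompute the checksum on read; a mismatch means the data changed after the write."""
    return zlib.crc32(page["data"]) == page["checksum"]


page = write_page(b"row data exactly as it was written")
print(audit_page(page))  # True

# Simulate the IO subsystem flipping a single bit after the page was written.
damaged = bytearray(page["data"])
damaged[0] ^= 0x01
page["data"] = bytes(damaged)
print(audit_page(page))  # False
```
&lt;/PRE&gt;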
&lt;P&gt;So there's a trade-off of consistency checking depth against runtime and resource usage.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Use the partitioning feature to your advantage&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;One of the obvious ways to reduce the time/resources issue is to partition the load. If you're using the partitioning feature in SQL Server 2005 then you're already set up for this. Given that you've hopefully got your partitions stored on separate filegroups, you can use the DBCC CHECKFILEGROUP command.&lt;/P&gt;
&lt;P&gt;Consider this example -&amp;nbsp;you have the database partitioned by date such that the current month is on a read-write filegroup and the past 11 months are on read-only filegroups (data from more than a year ago is on some offline storage medium). The prior months also have multiple backups on various media so are considered much 'safer' than the current month. It makes sense then that you don't need to check these filegroups as often as the current month's filegroup so an example consistency checking scheme would be:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Run a DBCC CHECKFILEGROUP on each read-only filegroup every week or two&lt;/LI&gt;
&lt;LI&gt;Run a DBCC CHECKFILEGROUP on the read-write filegroup every day or two (depending on the stability of the hardware, the criticality of the data, and the frequency and comprehensiveness of your backup strategy).&lt;/LI&gt;&lt;/UL&gt;
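&lt;P&gt;As a rough sketch of how a scheme like that could be driven, here's a little Python that works out which filegroups are due and builds the command for each. The filegroup names, the tuple layout and the exact intervals (14 days and 1 day) are assumptions picked from the ranges above - adjust them to your own situation.&lt;/P&gt;
&lt;PRE&gt;
```python
from datetime import date, timedelta

# Assumed intervals, picked from the ranges suggested in the list above.
READ_ONLY_INTERVAL = timedelta(days=14)
READ_WRITE_INTERVAL = timedelta(days=1)


def filegroups_due(filegroups, today):
    """Build a DBCC CHECKFILEGROUP command for each filegroup that is due a check."""
    commands = []
    for name, read_only, last_checked in filegroups:
        interval = READ_ONLY_INTERVAL if read_only else READ_WRITE_INTERVAL
        if today - last_checked >= interval:
            commands.append(f"DBCC CHECKFILEGROUP ('{name}')")
    return commands


# Hypothetical filegroups: (name, is read-only, date of last check).
filegroups = [
    ("FG_2007_02", False, date(2007, 2, 12)),
    ("FG_2007_01", True, date(2007, 2, 1)),
    ("FG_2006_12", True, date(2007, 1, 28)),
]
for command in filegroups_due(filegroups, today=date(2007, 2, 13)):
    print(command)
```
&lt;/PRE&gt;
&lt;P&gt;With these made-up dates, the read-write filegroup and the oldest read-only filegroup are due, while the one checked 12 days ago is skipped.&lt;/P&gt;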
&lt;P&gt;I know of several companies who've made the decision to move to SQL Server 2005 in part because of this capability to easily divide up the consistency checking.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Figure out your own way to partition the checks&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;If you're on SQL Server 2000, or you just haven't partitioned your database, then there are ways you can split up the consistency checking workload so that it fits within a maintenance window. Here's one scheme that I've recommended to several customers:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Run a twice-weekly DBCC CHECKALLOC&lt;/LI&gt;
&lt;LI&gt;Figure out your largest tables (by number of pages) and split the total number into 7 buckets, such that there are a roughly equal number of database pages in each bucket.&lt;/LI&gt;
&lt;LI&gt;Take all the remaining tables in the database and divide them equally between the 7 buckets (using number of pages again)&lt;/LI&gt;
&lt;LI&gt;On Sunday:&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;Run a DBCC CHECKALLOC&lt;/LI&gt;
&lt;LI&gt;Run a DBCC CHECKCATALOG&lt;/LI&gt;
&lt;LI&gt;Run a DBCC CHECKTABLE on each table in the first bucket&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;On Monday, Tuesday, Wednesday:&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;Run a DBCC CHECKTABLE on each table in the 2nd, 3rd, 4th buckets, respectively&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;On Thursday:&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;Run a DBCC CHECKALLOC&lt;/LI&gt;
&lt;LI&gt;Run a DBCC CHECKTABLE on each table in the 5th bucket&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;On Friday and Saturday:&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;Run a DBCC CHECKTABLE on each table in the 6th and 7th buckets, respectively&lt;/LI&gt;&lt;/UL&gt;&lt;/UL&gt;
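&lt;P&gt;The bucket-splitting and the weekly schedule above can be sketched in Python like this. Note that it's a simplification: instead of handling the largest tables first and then the rest as two separate steps, it does a single greedy pass (largest table first, always into the lightest bucket), which ends up in roughly the same place. The function names, table names and page counts are made up for illustration.&lt;/P&gt;
&lt;PRE&gt;
```python
import heapq

DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
# Extra commands from the scheme above, run before that day's table checks.
EXTRA = {"Sunday": ["DBCC CHECKALLOC", "DBCC CHECKCATALOG"], "Thursday": ["DBCC CHECKALLOC"]}


def split_into_buckets(table_pages, bucket_count=7):
    """Greedy split: place each table, largest first, into the lightest bucket so far."""
    # Heap entries are (total_pages, bucket_index), so the lightest bucket pops first.
    heap = [(0, index) for index in range(bucket_count)]
    heapq.heapify(heap)
    buckets = [[] for _ in range(bucket_count)]
    for name, pages in sorted(table_pages.items(), key=lambda item: (-item[1], item[0])):
        total, index = heapq.heappop(heap)
        buckets[index].append(name)
        heapq.heappush(heap, (total + pages, index))
    return buckets


def weekly_plan(table_pages):
    """Map each day of the week to the list of DBCC commands to run on that day."""
    plan = {}
    for day, bucket in zip(DAYS, split_into_buckets(table_pages)):
        commands = list(EXTRA.get(day, []))
        commands += [f"DBCC CHECKTABLE ('{name}')" for name in bucket]
        plan[day] = commands
    return plan


# Hypothetical tables with their sizes in pages.
table_pages = {"t1": 70, "t2": 60, "t3": 50, "t4": 40, "t5": 30, "t6": 20, "t7": 10, "t8": 5}
for day, commands in weekly_plan(table_pages).items():
    print(day, commands)
```
&lt;/PRE&gt;
&lt;P&gt;Changing bucket_count (and the list of days) is all it takes to try a different number of buckets, if seven doesn't fit your window.&lt;/P&gt;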
&lt;P&gt;In pre-RTM builds of SQL Server 2005, DBCC CHECKTABLE could not bind to the critical system tables, just like with T-SQL - but that's fixed so you can cover all system tables in SQL Server 2000 and 2005 using the method above. Here's what I mean:&lt;/P&gt;
&lt;P&gt;C:\Documents and Settings\prandal&amp;gt;osql /E&lt;BR&gt;1&amp;gt; select * from sys.sysallocunits&lt;BR&gt;2&amp;gt; go&lt;BR&gt;Msg 208, Level 16, State 1, Server SUNART, Line 1&lt;BR&gt;Invalid object name 'sys.sysallocunits'.&lt;BR&gt;1&amp;gt; dbcc checktable ('sys.sysallocunits')&lt;BR&gt;2&amp;gt; go&lt;BR&gt;DBCC results for 'sysallocunits'.&lt;BR&gt;There are 112 rows in 2 pages for object "sysallocunits".&lt;BR&gt;DBCC execution completed. If DBCC printed error messages, contact your system&lt;BR&gt;administrator.&lt;BR&gt;1&amp;gt;&lt;/P&gt;
&lt;P&gt;There's one drawback to this method -&amp;nbsp;a new internal database snapshot is created each time you start a new DBCC command, even for a DBCC CHECKTABLE. If the update workload on the database is significant, then there could be a lot of transaction log to recover each time the database snapshot is created - leading to a long total run-time. In this case, you may need to alter the number of buckets you use to make the total operation fit within your available window.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Use a separate system&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;This alternative is relatively simple - restore your backup (you are taking regular backups, right?) on another system and run a full CHECKDB on the restored database. This offloads the consistency checking burden from the production system and also allows you to check that your backups are valid (which&amp;nbsp;you're already checking though, right?). There are some drawbacks to this however:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;You need to have sufficient disk space on the spare system to be able to restore the backup onto. If the production database is several TB, you need the same several TB on the spare box. This equates to a non-trivial amount of money - initial capital investment plus ongoing storage mgmt costs. (I'm working on this though - I have a patent on consistency checking a database in a backup without restoring it - unclear at this time whether it will make it into Katmai.)&lt;/DIV&gt;&lt;/LI&gt;
&lt;LI&gt;
&lt;DIV mce_keep="true"&gt;If the consistency checks find an error, you don't know for sure that the database is corrupt on the production system. It could be a problem with the spare box that's caused the corruption. The only way to know for sure is to run a consistency check on the production system. This is a small price to pay though, because most of the time the consistency checks on the spare system will be ok, so you know the production database was clean at the time the backup was taken.&lt;/DIV&gt;&lt;/LI&gt;&lt;/UL&gt;
&lt;P mce_keep="true"&gt;&lt;STRONG&gt;Summary&lt;/STRONG&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;You've got a bunch of choices to allow you to run consistency checks, so there's really no excuse for not knowing (within a reasonable timeframe) that something's gone wrong with your database. If you need further help working out what to do, or just want a critical eye cast over the plan you've come up with, send me an email at &lt;A href="mailto:prandal@microsoft.com"&gt;prandal@microsoft.com&lt;/A&gt;&lt;/P&gt;
&lt;P mce_keep="true"&gt;(One of the other questions that also keeps coming up is: &lt;EM&gt;when are you going to write the whitepaper you promised?&lt;/EM&gt; Q1CY07 - honestly!)&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=852006" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/DBCC/default.aspx">DBCC</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/Disaster+Recovery/default.aspx">Disaster Recovery</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx">CHECKDB Series</category></item><item><title>CHECKDB (Part 5): What does CHECKDB really do? (part 4 of 4)</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/2006/09/20/763447.aspx</link><pubDate>Wed, 20 Sep 2006 09:03:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:763447</guid><dc:creator>Paul Randal - MSFT</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/sqlserverstorageengine/comments/763447.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlserverstorageengine/commentrss.aspx?PostID=763447</wfw:commentRss><description>&lt;P&gt;&lt;EM&gt;(Another airport, another blog post - I really must make an effort to come up with more &lt;/EM&gt;&lt;EM&gt;original banter - I'm sure I've used that line before. TechEd Guangzhou has finished and I'm on the &lt;/EM&gt;&lt;EM&gt;way to Beijing for TechEd #3. Checking in and getting through security were challenging to say the &lt;/EM&gt;&lt;EM&gt;least this morning - there seemed to be problems at every turn, exacerbated by my China Southern &lt;/EM&gt;&lt;EM&gt;airlines chaperone who spoke limited English (or what I should really say is that the problem was I &lt;/EM&gt;&lt;EM&gt;don't speak Mandarin). 
Finally at the security checkpoint they asked for my ticket as well as my &lt;/EM&gt;&lt;EM&gt;boarding pass - try explaining "e-ticket" to someone who doesn't speak the same language...)&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;So part 4 of the high-level explanation of what &lt;FONT face="Courier New"&gt;DBCC CHECKDB&lt;/FONT&gt; does will focus on the three completely new sets of checks that are in CHECKDB in SQL Server 2005. Well, the metadata checks (i.e. &lt;FONT face="Courier New"&gt;DBCC CHECKCATALOG&lt;/FONT&gt;) used to exist but so many people used to run both CHECKDB and &lt;FONT face="Courier New"&gt;DBCC CHECKCATALOG&lt;/FONT&gt; that I thought I'd make it easier in SQL Server 2005 by running &lt;FONT face="Courier New"&gt;DBCC CHECKCATALOG&lt;/FONT&gt; as part of CHECKDB. Maybe in Katmai we'll add in &lt;FONT face="Courier New"&gt;DBCC CHECKCONSTRAINTS&lt;/FONT&gt; - haven't decided yet.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Primitive checks of critical system tables&lt;/STRONG&gt;&lt;BR&gt;(Part 1...)&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Allocation checks&lt;/STRONG&gt;&lt;BR&gt;(Part 2...)&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Logical checks of critical system tables&lt;/STRONG&gt;&lt;BR&gt;&lt;STRONG&gt;Logical checks of all tables&lt;/STRONG&gt;&lt;BR&gt;(Part 3...)&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Service Broker checks&lt;/STRONG&gt;&lt;BR&gt;&lt;STRONG&gt;Metadata cross-checks&lt;/STRONG&gt;&lt;BR&gt;&lt;STRONG&gt;Indexed view and XML index checks&lt;/STRONG&gt;&lt;BR&gt;It's important that these checks are done after the per-table logical checks. This is because any repairs done as part of the per-table checks may affect the outcome of these checks quite substantially.&lt;/P&gt;
&lt;P&gt;Let's look at an example of this. Imagine the case where an indexed view is based on a join between two tables, &lt;FONT face="Courier New"&gt;foo&lt;/FONT&gt; and &lt;FONT face="Courier New"&gt;bar&lt;/FONT&gt;. Table &lt;FONT face="Courier New"&gt;foo&lt;/FONT&gt; has a damaged page and is damaged in a way that reading the page as part of a query would not recognize - because full page audits are not done as part of regular page reads. The damage is such that the page has to be deleted as part of repair - thereby changing the results of the view. If the indexed view was checked &lt;EM&gt;before&lt;/EM&gt; tables &lt;FONT face="Courier New"&gt;foo&lt;/FONT&gt; and &lt;FONT face="Courier New"&gt;bar&lt;/FONT&gt;, then the repair that was done would not get reflected in the indexed view and so the indexed view would essentially be corrupt. The same logic holds for checking XML indexes and Service Broker tables.&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;(Ok - in the air now with some serious turbulence - luckily I popped a Dramamine before we took off. And, more jellyfish in the in-flight meal - very cool :-) The 1st class cabin has what looks like a fully stocked bar - a little too early at 10am to be starting, although it could make for some interesting blog posts...)&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Service Broker checks&lt;/STRONG&gt;&lt;BR&gt;The Service Broker dev team wrote a comprehensive set of checks of the data stored in their internal on-disk structures. The checks validate the relationships between conversations, endpoints, messages and queues. For example:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;a conversation must have two endpoints 
&lt;LI&gt;a service must be related to a valid contract 
&lt;LI&gt;a service must be related to a valid queue 
&lt;LI&gt;a message must have a valid message type&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;What's even cooler is that they implemented a set of logical repairs, so if one of their internal tables was damaged, and repaired by the earlier logical checks, then the Service Broker repair code can clean up any Service Broker entities and entity relationships that were damaged too.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Metadata cross-checks&lt;/STRONG&gt;&lt;BR&gt;The Metadata team also wrote a great set of checks for the relational metadata stored in the system tables. The checks run here use the same code that's run for &lt;FONT face="Courier New"&gt;DBCC CHECKCATALOG&lt;/FONT&gt;. The actual checks themselves in SQL Server 2005 are far more involved than in SQL Server 2000, and they're done way more efficiently too. However, they're not comprehensive by any means - we'll add some more checks during Katmai.&lt;/P&gt;
&lt;P&gt;The checks only cover the &lt;EM&gt;relational&lt;/EM&gt; metadata - i.e. the relationships between the system tables storing relational metadata. There are no such checks for the tables storing the storage engine metadata (also known as the &lt;EM&gt;critical system tables&lt;/EM&gt; that were described in &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/19/636410.aspx"&gt;Part 1&lt;/A&gt;&amp;nbsp;of this 4 part sub-series). At present, the relationships between these tables are checked implicitly by the metadata checks I described in the &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/09/18/761287.aspx"&gt;last post&lt;/A&gt;. The checks are a series of validations between the various system tables, such as checking each rowset referenced in the &lt;FONT face="Courier New"&gt;sysrowsetcolumns&lt;/FONT&gt; system table exists in the &lt;FONT face="Courier New"&gt;sysrowsets&lt;/FONT&gt; system table.&lt;/P&gt;
&lt;P&gt;There are no repairs for metadata corruptions. Metadata corruptions are extremely difficult to deal with because changing/deleting metadata has an effect on the entire table the corrupt metadata describes and could potentially be as bad as deleting the table - a table without metadata is just a collection of pages with indecipherable rows (well, not quite true - it's &lt;EM&gt;possible&lt;/EM&gt; to decipher any record without metadata but it requires human intervention and is incredibly hard).&lt;/P&gt;
&lt;P&gt;It's possible we may put in some limited metadata repairs in a future release, but the frequency of their occurrence in the field is so low that I decided that the engineering investment was not justified for SQL Server 2005 and so didn't push it. So - if you get any metadata corruption, you need to restore from your backups which, of course, after reading through my blog, you've been scared into making sure you have, right?...&lt;/P&gt;
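&lt;P&gt;To give a flavor of this style of validation, here's an analogous query against the public catalog views. It's only an illustration and not the query DBCC runs; the internal system tables named above aren't queryable from regular T-SQL. It looks for columns whose owning object has no row in &lt;FONT face="Courier New"&gt;sys.objects&lt;/FONT&gt;, and on a healthy database it would be expected to return no rows:&lt;/P&gt;
&lt;PRE&gt;-- Illustration only: an "every referenced row must exist" check, written as an anti-join
SELECT c.object_id, c.column_id, c.name
FROM sys.columns AS c
WHERE NOT EXISTS (SELECT 1
                  FROM sys.objects AS o
                  WHERE o.object_id = c.object_id);
&lt;/PRE&gt;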
&lt;P&gt;&lt;STRONG&gt;Indexed view and XML index checks&lt;/STRONG&gt;&lt;BR&gt;These are very cool and many thanks to &lt;EM&gt;Conor Cunningham&lt;/EM&gt; (dev lead of one of the two Query Optimizer teams) for helping work out how to do this. The indexed view contains the persisted results of a query, and we have the actual query it persists stored in metadata - so the easiest way to check whether the indexed view is accurate is to recalculate the view results into a temp table in TEMPDB and then compare the calculated values with the values persisted in the indexed view. The view is regenerated into a temporary table and then two concurrent left-anti-semi-joins are run, that basically return all the rows in the indexed view that are not in the recalculated view results, and vice-versa. This gives us all the rows that are extraneous in the indexed view and all the rows that are missing from it.&lt;/P&gt;
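&lt;P&gt;The same idea can be sketched in user-level T-SQL. This is an analogue of the technique, not the internal query DBCC generates. It assumes a hypothetical indexed view &lt;FONT face="Courier New"&gt;dbo.vOrderCounts&lt;/FONT&gt; defined as &lt;FONT face="Courier New"&gt;SELECT CustomerID, COUNT_BIG(*) AS OrderCount FROM dbo.Orders GROUP BY CustomerID&lt;/FONT&gt;, and it uses &lt;FONT face="Courier New"&gt;EXCEPT&lt;/FONT&gt; to stand in for the two anti-semi-joins:&lt;/P&gt;
&lt;PRE&gt;-- Recalculate the view's query from the base table into a temp table in tempdb
SELECT CustomerID, COUNT_BIG(*) AS OrderCount
INTO #recalculated
FROM dbo.Orders
GROUP BY CustomerID
OPTION (EXPAND VIEWS);   -- don't let the optimizer answer this from the indexed view

-- Extra rows: persisted in the indexed view but not in the recalculated results
SELECT CustomerID, OrderCount FROM dbo.vOrderCounts WITH (NOEXPAND)
EXCEPT
SELECT CustomerID, OrderCount FROM #recalculated;

-- Missing rows: in the recalculated results but not persisted in the indexed view
SELECT CustomerID, OrderCount FROM #recalculated
EXCEPT
SELECT CustomerID, OrderCount FROM dbo.vOrderCounts WITH (NOEXPAND);

DROP TABLE #recalculated;
&lt;/PRE&gt;
&lt;P&gt;Both comparisons should come back empty. Unlike CHECKDB, this sketch doesn't run against a consistent snapshot, so on a busy system a difference may just mean the base table changed between statements.&lt;/P&gt;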
&lt;P&gt;Indexed-view problems can also be repaired. The repair for extra rows is to delete them one by one (using internal query syntax that only works from DBCC), and the repair for missing rows is to rebuild the indexed view. This is done by simply disabling the indexed view and then bringing it back online (which rebuilds it).&lt;/P&gt;
&lt;P&gt;There are two drawbacks to the indexed view checks if the views are large:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;it can take up a lot of space in tempdb 
&lt;LI&gt;it can take a lot of time to run the regeneration of the indexed views and to run the left-anti-semi-joins&lt;/LI&gt;&lt;/OL&gt;
&lt;P&gt;So if you upgraded your database from SQL Server 2000 and you regularly run full CHECKDBs (i.e. without using &lt;FONT face="Courier New"&gt;WITH PHYSICAL_ONLY&lt;/FONT&gt;), then you may see a run-time increase for CHECKDB on SQL Server 2005 - this is documented in BOL and the README.&lt;/P&gt;
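&lt;P&gt;For reference, the two variants look like this (the database name is hypothetical). Only the full check includes the indexed view and XML index checks described here:&lt;/P&gt;
&lt;PRE&gt;-- Physical checks only: faster, but skips the logical checks
DBCC CHECKDB (N'SalesDB') WITH PHYSICAL_ONLY;
GO
-- Full check
DBCC CHECKDB (N'SalesDB') WITH NO_INFOMSGS;
GO
&lt;/PRE&gt;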
&lt;P&gt;XML index checks work in a similar way. The XML blobs in the table are re-shredded and checked against the shredding that's persisted in the primary XML index. If anything is wrong, the primary XML index is recreated.&lt;/P&gt;
&lt;P&gt;And that's it. Now you have a complete picture of what's going on with CHECKDB when it runs. In the next few posts I want to give some insight into how CHECKDB manages to do all these checks while only making a single pass through the database and shed some light on how repair works. Until then - time to chill out before doing more sessions in Beijing tomorrow... &lt;EM&gt;qing lai yiping pijiu&lt;/EM&gt;!!!&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=763447" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/DBCC/default.aspx">DBCC</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx">CHECKDB Series</category></item><item><title>CHECKDB (Part 4): What does CHECKDB really do? (part 3 of 4)</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/2006/09/18/761287.aspx</link><pubDate>Mon, 18 Sep 2006 21:43:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:761287</guid><dc:creator>Paul Randal - MSFT</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/sqlserverstorageengine/comments/761287.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlserverstorageengine/commentrss.aspx?PostID=761287</wfw:commentRss><description>&lt;DIV&gt;&lt;EM&gt;(OK - I was wrong about the posting frequency - events overtook me and I got bogged down preparing for these TechEds. China is an amazing place and I wish I had more time to soak up some of its culture while I'm here but with the TechEds being so close together there's little time for anything except sessions, email, taxis, flying, sleeping, jellyfish... I did manage to go to the 400 year old Yuyuan Gardens in Shanghai, which are well worth a visit. 
Now I'm sitting here next to the Pearl River in Guangzhou waiting to do DAT305: Choosing a High Availability Solution -&amp;nbsp;so time to squeeze in another post.)&lt;/EM&gt;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;P&gt;In the&amp;nbsp;previous posts of this series, I covered the &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/19/636410.aspx"&gt;system table checks&lt;/A&gt;&amp;nbsp;that have to be done before anything else can be checked by CHECKDB, and the &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/07/18/670341.aspx"&gt;allocation checks&lt;/A&gt;&amp;nbsp;that have to be done before any of the logical checks. Now I want to cover the meat of the functionality in CHECKDB -&amp;nbsp;the logical checks. Note - this is a description of what happens for SQL Server 2005 - it's pretty similar in SQL Server 2000.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Primitive checks of critical system tables&lt;/STRONG&gt;&lt;BR&gt;(Part 1...)&lt;/P&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;STRONG&gt;Allocation checks&lt;/STRONG&gt;&lt;/DIV&gt;
&lt;DIV&gt;(Part 2...)&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;STRONG&gt;Logical checks of critical system tables&lt;/STRONG&gt;&lt;BR&gt;&lt;STRONG&gt;Logical checks of all tables&lt;BR&gt;&lt;/STRONG&gt;Although this section has two titles, the actual logical checks performed in each stage are the same - i.e. everything I'm going to describe below. If any errors are found in the critical system tables (remember these from &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/19/636410.aspx"&gt;Part 1&lt;/A&gt;?) and:&lt;/DIV&gt;
&lt;UL&gt;
&lt;LI&gt;repair is not specified; or, 
&lt;LI&gt;repair is specified, but not all the errors can be repaired&lt;/LI&gt;&lt;/UL&gt;
&lt;DIV&gt;then the CHECKDB finishes. An example of an unrepairable system table error is something that would require deleting data from one of the system tables - e.g. a corrupt key value in a clustered index data page of &lt;EM&gt;sysallocunits&lt;/EM&gt; (remember that this is the actual table I'm talking about, not the &lt;EM&gt;sys.allocation_units&lt;/EM&gt; catalog view you may have seen or used).&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;So if the system tables are clean, all the tables in the database are checked. This&amp;nbsp;includes indexed views and primary XML indexes (which are both stored as clustered indexes - and&amp;nbsp;as far as the storage engine is concerned&amp;nbsp;are objects in their own right - it's the relational layer that knows that they're not &lt;EM&gt;really&lt;/EM&gt; separate objects).&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;The following checks are performed:&lt;/DIV&gt;
&lt;UL&gt;
&lt;LI&gt;Validate each table's storage engine metadata 
&lt;LI&gt;Read and check all data, index and text pages, depending on the page type 
&lt;LI&gt;Check all inter-page relationships 
&lt;LI&gt;Check the page header counts 
&lt;LI&gt;Perform any necessary repairs&lt;/LI&gt;&lt;/UL&gt;
&lt;DIV&gt;Let's look at each of these stages in more detail.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;STRONG&gt;Validate each table's storage engine metadata&lt;/STRONG&gt;&lt;/DIV&gt;
&lt;DIV&gt;Why do we need to do this? Well, the storage engine metadata is what tells CHECKDB how to crack open records on pages in particular rowsets - so if there's something wrong with it then CHECKDB will generate a bunch of&amp;nbsp;misleading errors or, worse yet,&amp;nbsp;miss some errors. Here's roughly what happens while the metadata is parsed:&lt;/DIV&gt;
&lt;UL&gt;
&lt;LI&gt;Loop through all the indexes, and then all the rowsets of each index, building a list of known allocation units for the index (including all the DATA, LOB, and SLOB allocation units). This allows reverse lookups - when we read a page, because the information stamped on the page is the allocation unit, we need a very fast way to convert between an allocation unit and an index/object ID. 
&lt;LI&gt;Make sure we skip any indexes that are in the middle of being built/rebuilt online, because they will not have a complete set of data. 
&lt;LI&gt;Build a list of computed columns and generate the necessary code to enable the column values to be recomputed. 
&lt;LI&gt;Build a list of columns that are used in non-clustered indexes 
&lt;LI&gt;Make sure the various USED, DATA, RSVD counts are not negative&amp;nbsp;(see the BOL for DBCC CHECKDB for an explanation of how this can happen - yes, I'm deliberately not explaining here so that you read through the new BOL entry :-) 
&lt;LI&gt;Figure out what each kind of column is (e.g. a regular column, a generated uniquifier, a dropped column) 
&lt;LI&gt;Build a series of mappings to allow conversion between column IDs at different levels of storage abstraction 
&lt;LI&gt;Check that the relational and storage-engine nullability flags for a column agree 
&lt;LI&gt;Make sure the columns counters in metadata match what we've just seen&lt;/LI&gt;&lt;/UL&gt;
&lt;DIV&gt;&lt;STRONG&gt;Read and check all data, index and text pages&lt;/STRONG&gt;&lt;/DIV&gt;
&lt;DIV&gt;No matter what type a page is, the page is audited and then all the records on it are audited.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;Page audit checks first of all for IO errors when the page is read (e.g. &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/29/Enabling_CHECKSUM_in_SQL2005.aspx"&gt;page checksums or torn-page detection&lt;/A&gt;). If there are some, then the page is not processed any further. This is what can lead to errors like 8976 being reported (e.g. &lt;FONT face="Courier New"&gt;Table error: Object ID 132765433, index ID 1, partition ID 72057594038321152, alloc unit ID 72057594042318848 (type DATA). Page (1:3874) was not seen in the scan although its parent (1:3999) and previous (1:3873) refer to it. Check any previous errors.&lt;/FONT&gt;). Then it checks for page header correctness and that the page has an appropriate type for the allocation unit it's in (e.g. a DATA page should not be found in an allocation unit for a non-clustered index).&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;Record audits include checking the various fields in the record header and that the various offsets in the record make sense (e.g. the offset to the variable length columns section of the record should not point off the end of the record). See the &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/08/09/692806.aspx"&gt;post&lt;/A&gt;&amp;nbsp;on cracking records using DBCC PAGE for more info on the record format.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;The more complex checks that are done per-page depend on what type the page is. As an example, here's what is done for a DATA page in the leaf level of a clustered index (excluding the inter-page relationships - I'll list those in the next section):&lt;/DIV&gt;
&lt;UL&gt;
&lt;LI&gt;Records in a page must be strictly ordered by the defined keys of the index (although the records themselves aren't necessarily stored in sorted order in the data portion of the page, accessing the records through the slot array must yield them in the correct order) 
&lt;LI&gt;No two records can have duplicate key values (remember that non-unique indexes have a hidden, automatically-generated uniquifier column added to the key - to make or extend the composite key - so that record uniqueness is guaranteed) 
&lt;LI&gt;If the index is partitioned, each record is run through the partitioning function to ensure it's stored in the correct partition 
&lt;LI&gt;All the complex columns in each record are checked: 
&lt;UL&gt;
&lt;LI&gt;Complex columns are those storing legacy text or LOB values (&lt;FONT face="Courier New"&gt;text&lt;/FONT&gt;, &lt;FONT face="Courier New"&gt;ntext,&lt;/FONT&gt; &lt;FONT face="Courier New"&gt;image&lt;/FONT&gt;, &lt;FONT face="Courier New"&gt;XML&lt;/FONT&gt;, &lt;FONT face="Courier New"&gt;nvarchar(max)&lt;/FONT&gt;, &lt;FONT face="Courier New"&gt;varchar(max)&lt;/FONT&gt;, &lt;FONT face="Courier New"&gt;varbinary(max)&lt;/FONT&gt;) or in-row pointers to variable length columns that have been pushed off-row in rows that are longer than 8060 bytes 
&lt;LI&gt;The column is checked to make sure it's storing the right kind of data - either the value itself or a text pointer or some kind of in-row root containing pointers to portions of an off-row value. 
&lt;LI&gt;The linkages between what is stored in-row and the off-row values stored in other pages are eventually checked too&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;Check computed columns 
&lt;UL&gt;
&lt;LI&gt;If the column is persisted (either because it's defined as a persisted computed column or because it's used as a non-clustered index key), its value is recomputed and checked against the persisted value 
&lt;LI&gt;This is also important when we come to do the non-clustered index cross-checks - as any discrepancy in the stored column values will cause mismatches.&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;Data purity checks 
&lt;UL&gt;
&lt;LI&gt;The column value is checked to ensure it's within the bounds for its data-type (e.g. the minutes-past-midnight portion of the internal representation of a datetime value must be less than 1440 - 24 hours x 60 minutes)&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;Non-clustered index cross-checks &lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;This is probably my favorite part of CHECKDB and is one of the most complicated bits of code. 
&lt;LI&gt;What CHECKDB is trying to do is make sure that each record in a heap or clustered index has exactly one matching record in each non-clustered index, and vice-versa. The brute-force (n-squared complexity) way to do this is to do a physical lookup of all the matching rows - but that's incredibly time consuming so instead we have a fast algorithm to detect problems. 
&lt;LI&gt;Imagine a table defined by the following DDL:&lt;/LI&gt;&lt;/UL&gt;&lt;/UL&gt;
&lt;BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px"&gt;
&lt;P&gt;&lt;SPAN style="COLOR: yellow; FONT-FAMILY: Courier; text-shadow: auto"&gt;&lt;FONT face="Courier New" color=#000000&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;CREATE TABLE t (c1 INT, c2 char(10), c3 varchar(max))&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="COLOR: yellow; FONT-FAMILY: Courier; text-shadow: auto"&gt;&lt;/SPAN&gt;&lt;SPAN style="COLOR: yellow; FONT-FAMILY: Courier; text-shadow: auto"&gt;&lt;FONT face="Courier New" color=#000000&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;CREATE INDEX index1 ON t (c1)&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="COLOR: yellow; FONT-FAMILY: Courier; text-shadow: auto"&gt;&lt;/SPAN&gt;&lt;SPAN style="COLOR: yellow; FONT-FAMILY: Courier; text-shadow: auto"&gt;&lt;FONT face="Courier New" color=#000000&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;CREATE INDEX index2 ON t (c2) INCLUDE (c3)&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;UL&gt;
&lt;UL&gt;
&lt;LI&gt;Each row in the heap has to have two matching non-clustered index rows, one in each of &lt;FONT face="Courier New"&gt;index1&lt;/FONT&gt; and &lt;FONT face="Courier New"&gt;index2&lt;/FONT&gt;. But how can we tell without doing the direct lookup or having to store tremendous amounts of state? We use a hash-bitmap algorithm. 
&lt;LI&gt;Imagine a large bitmap - say half a million bits. Initially all the bits are zero. 
&lt;LI&gt;For each record in the heap that we come across, we generate the matching non-clustered index records and hash them to a value. This is very quick because we have all the column values we need. We then take the hash value, map it directly to a bit in the bitmap and flip the bit. So in our example above, each record in the heap will produce two hash values (one for the record in each non-clustered index) and so will cause two bits to be flipped. 
&lt;LI&gt;For each record in each non-clustered index, just hash the record and flip the corresponding bit. 
&lt;LI&gt;The idea is that the bit-flips should cancel each other out and the bitmap should be left with all zeroes at the end of the checks. 
&lt;LI&gt;Taking the view that corrupt databases are orders of magnitude rarer than clean ones, we went a step further and allowed the checks for multiple tables to use the same bitmap. If you think about it, this won't cause any problems, even if two sets of records map to the same bit in the bitmap - as long as each bit is flipped an even number of times (i.e. each record really does have its correct matching record) then there should be no problem. 
&lt;LI&gt;Here's the catch - what happens when there's a bit left on in the bitmap? Well, this is where the trade-off comes into play. If there's a bit left on, we can't tell which records in which table or index mapped to it, so we have to re-scan the tables and indexes that used the bitmap to see which records map to the bit. For every one we find, we actually do the physical lookup of the matching record and then do a comparison of all the columns, including any LOB columns used as &lt;FONT face="Courier New"&gt;INCLUDE'&lt;/FONT&gt;d columns in non-clustered indexes. This process is called the &lt;EM&gt;deep-dive&lt;/EM&gt; and can add a significant amount to the run-time of CHECKDB if it occurs. &lt;/LI&gt;&lt;/UL&gt;&lt;/UL&gt;
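&lt;DIV&gt;The hash-bitmap itself is internal to CHECKDB, but a loosely analogous check can be sketched in user-level T-SQL against the example table &lt;FONT face="Courier New"&gt;t&lt;/FONT&gt; above. This swaps in a different technique: instead of cancelling bit-flips, it compares an order-independent aggregate of hashed key values computed from a heap scan with the same aggregate computed from a scan of each non-clustered index. Unlike the real check, it only hashes the index key columns and knows nothing about row locators.&lt;/DIV&gt;
&lt;PRE&gt;-- index1 on (c1): force a heap scan, then a scan of the non-clustered index
SELECT COUNT_BIG(*) AS row_count, CHECKSUM_AGG(CHECKSUM(c1)) AS key_hash
FROM t WITH (INDEX(0));
SELECT COUNT_BIG(*) AS row_count, CHECKSUM_AGG(CHECKSUM(c1)) AS key_hash
FROM t WITH (INDEX(index1));

-- index2 on (c2): same comparison on its key column
SELECT COUNT_BIG(*) AS row_count, CHECKSUM_AGG(CHECKSUM(c2)) AS key_hash
FROM t WITH (INDEX(0));
SELECT COUNT_BIG(*) AS row_count, CHECKSUM_AGG(CHECKSUM(c2)) AS key_hash
FROM t WITH (INDEX(index2));
&lt;/PRE&gt;
&lt;DIV&gt;If the counts and hashes in each pair agree, the two structures very probably hold the same key values. As with the bitmap, hash collisions mean a match isn't proof, and a mismatch tells you only that something is wrong, not which row - finding that is the job of the deep-dive.&lt;/DIV&gt;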
&lt;DIV&gt;&lt;STRONG&gt;Check all inter-page relationships&lt;/STRONG&gt;&lt;/DIV&gt;
&lt;DIV&gt;Inter-page relationships are relevant for:&lt;/DIV&gt;
&lt;UL&gt;
&lt;LI&gt;pages in heaps that have forwarding or forwarded records 
&lt;LI&gt;pages in indexes 
&lt;LI&gt;pages that have records with LOB columns that have their data stored off-row (these can be heap pages, clustered index data pages or non-clustered index leaf&amp;nbsp;pages in SQL Server 2005) 
&lt;LI&gt;text pages that have records with child nodes on other pages&lt;/LI&gt;&lt;/UL&gt;
&lt;DIV&gt;Continuing the example above, here are the checks that are done for index pages:&lt;/DIV&gt;
&lt;UL&gt;
&lt;LI&gt;All pages in an index level should be pointed to by a page in the next level higher in the b-tree, and also by the left- and right-hand neighboring pages in the same level of the b-tree. Exceptions are made for the left-most and right-most pages in a level (where the &lt;FONT face="Courier New"&gt;m_prevPage&lt;/FONT&gt; and &lt;FONT face="Courier New"&gt;m_nextPage&lt;/FONT&gt; fields are &lt;FONT face="Courier New"&gt;(0:0)&lt;/FONT&gt;) and for the root page of the b-tree, where the parent page link comes from the storage-engine metadata. The 8976 error message that I referenced above is generated from this set of checks. 
&lt;LI&gt;Key ranges in neighboring pages in a level should not overlap 
&lt;LI&gt;Key ranges of pages should be correctly defined by the parent pages in the next level up in the b-tree (the parent pages contain the minimum key value that can exist on a child page)&lt;/LI&gt;&lt;/UL&gt;
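&lt;DIV&gt;If you want to look at these linkages yourself, the page header can be dumped with the undocumented DBCC PAGE command. A minimal sketch, assuming a hypothetical database called SalesDB and reusing the page ID (1:3874) from the example error message above:&lt;/DIV&gt;
&lt;PRE&gt;DBCC TRACEON (3604);                  -- send DBCC PAGE output to the client session
DBCC PAGE (N'SalesDB', 1, 3874, 0);   -- file 1, page 3874, print option 0 = page header only
&lt;/PRE&gt;
&lt;DIV&gt;The header output includes the m_prevPage and m_nextPage fields, which should point at the neighboring pages in the same level of the b-tree.&lt;/DIV&gt;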
&lt;DIV&gt;I'll go into details of how the inter-page checks are done in a future post. For now, it's enough to say that it's not an n-squared complexity algorithm.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;STRONG&gt;Check the page header counts&lt;/STRONG&gt;&lt;/DIV&gt;
&lt;DIV&gt;The page header contains a bunch of counters - the ones we need to check are:&lt;/DIV&gt;
&lt;UL&gt;
&lt;LI&gt;slot count 
&lt;LI&gt;ghost record count 
&lt;LI&gt;free space count&lt;/LI&gt;&lt;/UL&gt;
&lt;DIV&gt;The first two are obvious - count the rows as they're processed and make sure page header counts are valid. The free space count is only checked for text pages and data pages in a heap (the only pages for which free space&amp;nbsp;is tracked in a PFS page).&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;STRONG&gt;Perform any necessary repairs&lt;/STRONG&gt;&lt;/DIV&gt;
&lt;DIV&gt;I want to leave discussing repairs for another post as there's a ton of info there.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;So, that's the majority of the work in running a DBCC CHECKDB and now it's time for my session on choosing an HA solution in SQL Server 2005. Next time I'll post the next part of the &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/category/13747.aspx"&gt;Fragmentation series&lt;/A&gt;.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;EM&gt;(So I mentioned jellyfish above... I was having a meal last night with Scott Schnoll from the Exchange team and we had what we thought were noodles on the side of one dish. Scott asked what kind of noodles they were and then we found out we'd been eating cold, sliced jellyfish - it was really very tasty!)&lt;/EM&gt;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;STRONG&gt;Service Broker checks&lt;/STRONG&gt;&lt;BR&gt;&lt;STRONG&gt;Metadata cross-checks&lt;/STRONG&gt;&lt;BR&gt;&lt;STRONG&gt;Indexed view and XML index checks&lt;/STRONG&gt;&lt;BR&gt;(Part 4...)&lt;/DIV&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=761287" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/DBCC/default.aspx">DBCC</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx">CHECKDB Series</category></item><item><title>CHECKDB (Part 3): What does CHECKDB really do? (2 of 4)</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/2006/07/18/670341.aspx</link><pubDate>Tue, 18 Jul 2006 22:16:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:670341</guid><dc:creator>Paul Randal - MSFT</dc:creator><slash:comments>2</slash:comments><comments>http://blogs.msdn.com/sqlserverstorageengine/comments/670341.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlserverstorageengine/commentrss.aspx?PostID=670341</wfw:commentRss><description>&lt;P&gt;In the &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/19/636410.aspx"&gt;previous post&lt;/A&gt; of this series, I covered the system table checks that have to be done before anything else can be checked by CHECKDB. Now that I've described &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/26/647005.aspx"&gt;pages&lt;/A&gt;, &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/28/649884.aspx"&gt;extents&lt;/A&gt;, &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/25/646865.aspx"&gt;IAM chains/allocation units&lt;/A&gt;&amp;nbsp;and the &lt;A href="https://blogs.msdn.com/sqlserverstorageengine/archive/2006/07/08/660071.aspx"&gt;major allocation bitmaps&lt;/A&gt;, in this post I'll cover the allocation checks.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;1. Primitive checks of critical system tables&lt;BR&gt;&lt;/STRONG&gt;(Part 1...)&lt;/P&gt;
&lt;P&gt;&lt;BR&gt;&lt;STRONG&gt;2. Allocation checks&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This stage is in SQL Server 2000 and 2005. Allocation checks are checks of the various structures (IAM pages, IAM chains, allocation units, GAM/SGAM pages, PFS pages) that describe the allocations (pages, extents) that have been done within a database.&lt;/P&gt;
&lt;P&gt;I'll describe how we go about collecting information from the various pages and then describe what some of the actual checks are.&lt;/P&gt;
&lt;P&gt;The allocation checks are very fast (orders of magnitude faster than the logical checks, so fast in fact that they happen in the blink of an&amp;nbsp;eye! Well, perhaps I'm&amp;nbsp;getting carried away&amp;nbsp;a little as usual but you get the idea) because the number of database pages that have to be read is very small (so small in fact that... ok, I'll shut up).&amp;nbsp; The algorithm for gathering allocation data is as follows:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;For each file in each online filegroup in the database (except transaction log files):&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;read all PFS pages&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;This gives us a bitmap showing all IAM pages, plus another one showing all mixed pages&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;read the GAM pages&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;This gives us bitmaps of all allocated extents&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;read the SGAM pages&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;This gives us bitmaps of all mixed extents with at least one unallocated page&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;read the DIFF_MAP pages (a &lt;EM&gt;'differential bitmap'&lt;/EM&gt; page shows which extents in the GAM interval have been modified since the last full backup - a differential backup only needs to back up those extents marked modified in the various DIFF_MAP pages)&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;This is just to make sure the pages can be read&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;read the ML_MAP pages (a &lt;EM&gt;'minimally-logged bitmap' &lt;/EM&gt;page shows which extents in the GAM interval have been modified in bulk-logged recovery mode since the last log backup - a log backup must also back up all such extents to ensure that all changes to the database have been backed up. This can make the log backup quite large (although the log itself stays much smaller) - but that's a topic for another blog post.)&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;This is just to make sure the pages can be read&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;read all IAM pages&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;This gives us:&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;a list of all the mixed pages in the file, and by derivation, a list of all mixed extents in the file (remember that the first IAM page in an IAM chain/allocation unit&amp;nbsp;contains an array to hold up to 8 mixed pages for the object/index/partition it represents)&lt;/LI&gt;
&lt;LI&gt;a list of all the valid IAM pages in the file&lt;/LI&gt;
&lt;LI&gt;a list of all the allocated dedicated extents in the file&lt;/LI&gt;
&lt;LI&gt;linkage information for IAM chains&lt;/LI&gt;&lt;/UL&gt;&lt;/UL&gt;&lt;/UL&gt;
&lt;LI&gt;After all the per-file stuff, read the Storage Engine metadata&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;This gives us:&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;information about the &lt;EM&gt;root&lt;/EM&gt; of each IAM chain. Each row in the sys.allocation_units hidden metadata table contains the page Id of the first IAM page in the IAM chain for that allocation unit.&lt;/LI&gt;
&lt;LI&gt;information about IAM chains currently waiting to be&lt;EM&gt; deferred-dropped.&lt;/EM&gt; (This is the process by which an IAM chain with &amp;gt; 128 extents that is dropped - by dropping/rebuilding an index or dropping/truncating a table - does not have its actual pages and extents deallocated until after the transaction has committed. The IAM chain is unhooked from sys.allocation_units though and hooked into an internal queue - if we didn't scan that queue too as part of the allocation checks, we'd see all kinds of inconsistencies with the various allocation bitmaps)&lt;/LI&gt;&lt;/UL&gt;&lt;/UL&gt;&lt;/UL&gt;
&lt;P&gt;So, now we've got a whole bunch of allocation data that we're juggling and we need to make sense of it all to see if all the allocation structures are correct. Here's a non-exhaustive list of checks that we do with all this data:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Check that each extent is either allocated to:&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;the GAM page for the GAM interval&lt;/LI&gt;
&lt;LI&gt;or, the SGAM page for the GAM interval, as a non-full mixed extent&lt;/LI&gt;
&lt;LI&gt;or, exactly one IAM page that covers the GAM interval&lt;/LI&gt;
&lt;LI&gt;or, to no bitmap page, but all pages in the extent must be allocated to IAM pages as mixed pages&lt;/LI&gt;
&lt;LI&gt;This could result in an 8903 (GAM and SGAM), 8904 (multiple IAMs), or 8905 (no&amp;nbsp;page)&amp;nbsp;error, depending on the combination of bitmaps that have the extent allocated&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;Check that all pages marked as being IAM pages in PFS pages really are IAM pages when they're read&lt;/LI&gt;
&lt;LI&gt;Check that all pages marked as being mixed pages in PFS pages appear somewhere in a mixed page array on an IAM page&lt;/LI&gt;
&lt;LI&gt;Check that each mixed page is only allocated in a single IAM page&lt;/LI&gt;
&lt;LI&gt;Check that the IAM pages in an IAM chain have monotonically increasing sequence numbers&lt;/LI&gt;
&lt;LI&gt;Check that the first IAM page in an IAM chain has a reference from a row in sys.allocation_units&lt;/LI&gt;
&lt;LI&gt;Check that no two IAM pages within the same IAM chain map the same GAM interval&lt;/LI&gt;
&lt;LI&gt;Check that all IAM pages within an IAM chain belong to the same object/index/partition&lt;/LI&gt;
&lt;LI&gt;Check that the linkages within an IAM chain are correct (no missing pages for instance)&lt;/LI&gt;
&lt;LI&gt;Check that all IAM/GAM/SGAM pages that map the final GAM interval in a file do not have extents marked allocated that are beyond the physical end of the file&lt;/LI&gt;&lt;/UL&gt;
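&lt;P&gt;A couple of the checks in the list above can be sketched in Python as follows. This is a simplified illustration, not the real code: the function names, the boolean and list inputs, and the &lt;FONT face="Courier New"&gt;"unclassified"&lt;/FONT&gt; result are assumptions made for the example. Only the three error numbers (8903, 8904, 8905) come from the list above.&lt;/P&gt;
&lt;PRE&gt;
```python
def classify_extent(in_gam, in_sgam, iam_owners, all_pages_mixed):
    """in_gam, in_sgam: whether the GAM / SGAM page claims the extent.
    iam_owners: list of IAM page ids whose bitmaps claim the extent.
    all_pages_mixed: True if every page in the extent is allocated to
    IAM pages as a mixed page.
    Return None when the extent has a consistent owner, otherwise the
    error number mentioned in the post."""
    if in_gam and in_sgam:
        return 8903                # GAM and SGAM both claim the extent
    if len(iam_owners) > 1:
        return 8904                # more than one IAM page claims it
    claims = int(in_gam) + int(in_sgam) + len(iam_owners)
    if claims > 1:
        # Placeholder: the post gives no error number for this mix.
        return "unclassified"
    if claims == 0 and not all_pages_mixed:
        return 8905                # no bitmap page claims the extent
    return None


def check_iam_chain(chain):
    """chain: list of dicts with hypothetical keys 'seq', 'gam_interval'
    and 'owner', one per IAM page, in chain order.
    Return a list of (problem, position) tuples."""
    problems = []
    seen_intervals = set()
    for i, page in enumerate(chain):
        # Sequence numbers must strictly increase along the chain.
        if i > 0 and chain[i - 1]["seq"] >= page["seq"]:
            problems.append(("sequence", i))
        # No two IAM pages in one chain may map the same GAM interval.
        if page["gam_interval"] in seen_intervals:
            problems.append(("duplicate_interval", i))
        seen_intervals.add(page["gam_interval"])
        # Every IAM page must belong to the same object/index/partition.
        if page["owner"] != chain[0]["owner"]:
            problems.append(("owner", i))
    return problems
```
&lt;/PRE&gt;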
&lt;P&gt;Any errors found here will require REPAIR_ALLOW_DATA_LOSS to repair and some of the repairs are very complicated (e.g. multiply-allocated extents) - topic for a future blog post.&lt;/P&gt;
&lt;P&gt;So, the allocation checks lay the next foundation level over the system table primitive checks and we're ready to move on to the logical checks.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;3. Logical checks of critical system tables&lt;BR&gt;4. Logical checks of all tables&lt;/STRONG&gt;&lt;BR&gt;(Part 3...)&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;5. Service Broker checks&lt;BR&gt;6. Metadata cross-checks&lt;BR&gt;7. Indexed view and XML index checks&lt;/STRONG&gt;&lt;BR&gt;(Part 4...)&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=670341" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/DBCC/default.aspx">DBCC</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx">CHECKDB Series</category></item><item><title>CHECKDB (Part 2): What does CHECKDB really do? (1 of 4)</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/19/636410.aspx</link><pubDate>Mon, 19 Jun 2006 01:09:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:636410</guid><dc:creator>Paul Randal - MSFT</dc:creator><slash:comments>4</slash:comments><comments>http://blogs.msdn.com/sqlserverstorageengine/comments/636410.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlserverstorageengine/commentrss.aspx?PostID=636410</wfw:commentRss><description>&lt;P&gt;Hmmm - I sat for 5 minutes thinking of something amusing to say to start this one off and nothing came to mind, so I'm afraid this will be a humor-free post. Maybe I'm jet-lagged from being on the East coast all last week.&lt;/P&gt;
&lt;P&gt;As with all things related to DBCC, this topic has its share of misinformation. In this post I'll set the record straight by running through all the stages of CHECKDB in SQL Server 2000 and 2005. I'll need to split this up into separate posts, otherwise I'll be writing a book. I'll also introduce a whole raft of new terms, which will be subjects for future posts (my list is already getting pretty long!)&lt;/P&gt;
&lt;P&gt;So the very first thing it does is work out how to get the transactionally consistent view it requires (see &lt;A href="http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/09/623789.aspx"&gt;CHECKDB Part 1&lt;/A&gt;) and then, if needed, either record the relevant LSN and switch to full-logging (for SQL Server 2000) or create a database snapshot (for SQL Server 2005).&lt;/P&gt;
&lt;P&gt;Then it runs through the checks in the order shown below:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;1. Primitive checks of critical system tables&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This stage is in SQL Server 2000 and 2005. First of all, what are critical system tables? These are the system tables that hold Storage Engine metadata. Without these we'd have no idea where any data was stored in the database files or how to interpret records.&lt;/P&gt;
&lt;P&gt;In SQL Server 2000, the critical system tables are:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;sysindexes&lt;/LI&gt;
&lt;LI&gt;sysobjects&lt;/LI&gt;
&lt;LI&gt;syscolumns&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;These tables have to be checked first because we use the metadata they contain to access all the other tables and indexes in the database. These tables are freely queryable so poke about and see what's stored in there.&lt;/P&gt;
&lt;P&gt;In SQL Server 2005, the metadata layer has been rewritten and the critical system tables are:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;sysallocunits&lt;/LI&gt;
&lt;LI&gt;syshobts&lt;/LI&gt;
&lt;LI&gt;syshobtcolumns&lt;/LI&gt;
&lt;LI&gt;sysrowsets&lt;/LI&gt;
&lt;LI&gt;sysrowsetcolumns&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;More on allocation units, hobts, and rowsets later in the week - for now you can assume they serve the same function as the three critical system tables in SQL Server 2000. You can't see these system tables because they're 'hidden' - the parser won't allow them to be bound to in a query. Try running '&lt;FONT face="Courier New"&gt;select * from sysallocunits&lt;/FONT&gt;' to see what I mean.&lt;/P&gt;
&lt;P&gt;The primitive checks are designed to check that internal queries on the metadata tables won't throw errors. Each of the critical system tables has a clustered index. The primitive checks just check the leaf-level data pages of the clustered indexes. For every one of these pages, the following is done:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Read and latch the page (a latch is a lightweight internal version of a lock).&amp;nbsp; This makes sure that there aren't any IO problems with the page such as a torn-page or bad page checksum and&amp;nbsp;ensures that we can put the page in the buffer pool correctly. This is the most common cause of failure of the primitive system table checks and results in error 8966, which in SQL Server 2000 could look something like:&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;FONT face="Courier New"&gt;Server: Msg 8966, Level 16, State 1, Line 1&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Could not read and latch page (1:33245) with latch type SH. sysobjects failed.&lt;/FONT&gt;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Audit the page. This is a series of checks of the page structures which I'll cover in a separate post. If these pass, the page looks like a SQL Server page of the type it's supposed to be.&lt;/LI&gt;
&lt;LI&gt;Check the basic page linkage. Pages in each level of a clustered index are linked together in a doubly-linked list to allow range scans to work. At this stage we only check the left-to-right linkage to ensure the linked-to page actually exists.&lt;/LI&gt;
&lt;LI&gt;Check the page linkage for loops.&amp;nbsp;This is simple to do - have two pointers into the page linked-list with one advancing at every step and one advancing at every second step. If they ever point to the same thing before the faster-advancing pointer reaches the right-hand side of the leaf level then there's a loop. It's important that there are no linkage loops, otherwise a range scan may turn into an infinite loop. I've never seen this occur in the field.&lt;/LI&gt;&lt;/UL&gt;
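&lt;P&gt;The two-pointer loop check described in the last bullet is the classic tortoise-and-hare technique. Here's a minimal Python sketch of it, assuming the left-to-right page links have already been collected into a dictionary; the function name and the dictionary model are illustrative only and are not how the storage engine holds page linkage.&lt;/P&gt;
&lt;PRE&gt;
```python
def has_linkage_loop(next_page, first_page):
    """next_page: dict mapping a page id to the id of the page on its
    right, or None at the right-hand edge of the level.
    Return True if following the links from first_page revisits a page."""
    slow = first_page
    fast = first_page
    while fast is not None and next_page[fast] is not None:
        slow = next_page[slow]               # advances one page
        fast = next_page[next_page[fast]]    # advances two pages
        if fast is not None and slow == fast:
            return True
    # The fast pointer fell off the right-hand side: no loop.
    return False
```
&lt;/PRE&gt;
&lt;P&gt;With links 1 to 2, 2 to 3 and 3 back to 2, the two pointers meet on page 3 and the function returns True. With page 3 linking to nothing, the fast pointer runs off the end and it returns False. The check needs no extra memory beyond the two pointers.&lt;/P&gt;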
&lt;P&gt;Any error found at this stage cannot be repaired&amp;nbsp;so you must restore from a backup. This is because the&amp;nbsp;repair would have to deallocate the page, effectively deleting metadata for a whole lot of tables and indexes. As people's databases get larger and more complex (thousands of tables and indexes), the percentage of pages that comprise these critical system tables rises and so the chance of a hardware problem corrupting one of these pages also rises - I see several of these a month on the forums. Without a backup, the only alternative is to try to export as much data as you can - not good.&lt;/P&gt;
&lt;P&gt;If all the pages are ok then we know we've got solid enough metadata on which to base the next set of checks.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;2. Allocation checks&lt;BR&gt;&lt;/STRONG&gt;(Part 2...)&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;3. Logical checks of critical system tables&lt;BR&gt;4. Logical checks of all tables&lt;BR&gt;&lt;/STRONG&gt;(Part 3...)&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;5. Service Broker checks&lt;BR&gt;6. Metadata cross-checks&lt;BR&gt;7. Indexed view and XML index checks&lt;BR&gt;&lt;/STRONG&gt;(Part 4...)&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=636410" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/DBCC/default.aspx">DBCC</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx">CHECKDB Series</category></item><item><title>CHECKDB (Part 1): How does CHECKDB get a consistent view of the database?</title><link>http://blogs.msdn.com/sqlserverstorageengine/archive/2006/06/09/623789.aspx</link><pubDate>Fri, 09 Jun 2006 01:44:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:623789</guid><dc:creator>Paul Randal - MSFT</dc:creator><slash:comments>7</slash:comments><comments>http://blogs.msdn.com/sqlserverstorageengine/comments/623789.aspx</comments><wfw:commentRss>http://blogs.msdn.com/sqlserverstorageengine/commentrss.aspx?PostID=623789</wfw:commentRss><description>&lt;P&gt;As you can guess from the title, I'm planning a long series over the summer to go into the guts of how CHECKDB works (both the consistency checks part and the repair part). And as you can guess from 'CHECKDB', I'm already bored with putting DBCC in front of it all the time and changing the font to Courier New to make it stand out. I don't do that in real life so why on the blog?&lt;/P&gt;
&lt;P&gt;I have no idea how&amp;nbsp;long it'll be&amp;nbsp;- easily more than 20, probably less than 50, but there's a ton of info in my head that's desperate to get out (ever seen the scene in the movie Scanners where that guy's head explodes? Well, it's not quite that bad but remembering that scene was fun). As part of this I'll need to go into some of the low-level structural details of the database, which will hopefully be interesting too. (How do I know all this stuff? Read my bio). If there's something I need to explain, post a comment with a question in and I'll do a post to answer it. I may even do a post on how to use DBCC PAGE...&lt;/P&gt;
&lt;P&gt;So here's the problem statement: CHECKDB needs a consistent view of the database.&lt;/P&gt;
&lt;P&gt;Why? Well, usually it's running on a live database with all kinds of stuff going on. It needs to read and analyze the whole database but it can't do it instantaneously (this isn't Star Trek) so it has to take steps to ensure that what it reads is transactionally consistent.&lt;/P&gt;
&lt;P&gt;Here's an example. Consider a transaction to insert a record into a table&amp;nbsp;that is a heap and has a non-clustered index, with a concurrent CHECKDB that doesn't enforce a consistent view.&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;The table record is inserted first, and then the non-clustered index record is inserted (that's just the way the operations are split up in the database engine)&lt;/LI&gt;
&lt;LI&gt;Because this hypothetical CHECKDB doesn't have a consistent view, it could see the record in the table but not the one in the index, conclude that the non-clustered index is out of sync with the table and flag an 8951 error (missing index row).&lt;/LI&gt;
&lt;LI&gt;How could this happen? Depending on the order in which the pages are read, the page on which the new non-clustered index record should go could be read before the page on which the new heap record should go. (I use record and row somewhat interchangeably to mean the physically-stored contents of a table or index row). If the index page read happens just before the record is inserted into the table page, and then the table page is read, then we see the inconsistent state.&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;The easy way to get the consistent state is through locking, which is what SQL Server 7.0 did. You can still do that in SQL Server 2000 and 2005 using the TABLOCK option. Another way to do it is to put the database into single-user or read-only mode.&lt;/P&gt;
&lt;P&gt;However, excessive locking is a drag and taking the database essentially offline tends to irritate users so with SQL Server 2000 we came up with a neat way to get the consistent view&amp;nbsp;and be able to run CHECKDB online - log analysis. In a nutshell, after we've read through all the database, we read the transaction log to make sure we didn't miss anything. Sounds simple, right? Dream on. Here's how it works:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;The log is read from the LSN of the 'begin tran' log record of the oldest transaction that is active at the time the database scan started, to the LSN at the time the database scan stops.&lt;/LI&gt;
&lt;LI&gt;Log records from transactions that commit during that time are used to generate REDO facts. (We have a chicken-and-egg situation here - it's difficult to explain how CHECKDB works without referencing some mechanisms that haven't been explained yet - I'll get to what facts are in part &amp;lt;single-digits-I-promise&amp;gt;.) The REDO facts either reinforce something we've already seen (in which case we ignore them) or provide information on something we haven't seen. For example:&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;a page allocation log record would produce a REDO fact of 'page X is allocated to IAM chain Y' (yes, I'm throwing around unexplained terms again - unavoidable I'm afraid and I'll explain them later - or read Kalen's books)&lt;/LI&gt;
&lt;LI&gt;a row insertion record (such as from the index example above) would produce a REDO fact of 'a row with these index keys was inserted into page A of table B, index C at slot position S'&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;Log records from transactions that roll back or don't commit during that time are used to generate UNDO facts. The UNDO facts either cancel something that we've already seen (e.g. the first half of the index example above, if it didn't commit while CHECKDB was doing the database scan) or reference something we haven't seen (in which case we ignore them). For example:&lt;/LI&gt;
&lt;UL&gt;
&lt;LI&gt;a page allocation log record would produce an UNDO fact of 'page X was deallocated from IAM chain Y'&lt;/LI&gt;
&lt;LI&gt;a row insert record would produce an UNDO fact of 'a row with these index keys was removed from page A of table B, index C at slot position S'&lt;/LI&gt;&lt;/UL&gt;
&lt;LI&gt;As you may have realized, what we're essentially doing is our own log recovery, inside CHECKDB, but without actually affecting the database.&lt;/LI&gt;
&lt;LI&gt;This can get excruciatingly complicated (e.g. having to generate UNDO facts from the compensation log records that wrap sections of a cancelled index rebuild transaction...) I spent too many days of 2000 working out what was going on in the log and making tweaks to this code. However, it worked really well and we had online CHECKDB finally. The kudos for writing most of this stuff goes to &lt;U&gt;&lt;STRONG&gt;Steve Lindell&lt;/STRONG&gt;&lt;/U&gt; - while he was busy writing the online code I was up to my eyes writing DBCC INDEXDEFRAG (another post).&lt;/LI&gt;&lt;/UL&gt;
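&lt;P&gt;To give a feel for how REDO and UNDO facts adjust what the scan saw, here's a heavily simplified Python sketch. It is not the SQL Server 2000 code: facts are modelled as plain tuples in a set, only "something was added" log records (allocations and inserts) are handled, and the function and parameter names are invented for the example.&lt;/P&gt;
&lt;PRE&gt;
```python
def apply_log_facts(scan_facts, log_records, committed):
    """scan_facts: set of facts observed while reading the database.
    log_records: iterable of (txn_id, fact) pairs for changes logged
    between the start and the end of the scan.
    committed: set of txn ids that committed during that window.
    Return the adjusted set of facts; the input set is left untouched."""
    facts = set(scan_facts)
    for txn_id, fact in log_records:
        if txn_id in committed:
            # REDO fact: supplies something the scan missed, and is a
            # no-op if the scan already saw it.
            facts.add(fact)
        else:
            # UNDO fact: cancels something the scan saw, and is a no-op
            # if the scan never saw it.
            facts.discard(fact)
    return facts
```
&lt;/PRE&gt;
&lt;P&gt;Taking the heap and non-clustered index example from earlier: if the scan saw the heap row but not the index row, a committed transaction produces REDO facts that add the missing index row, while an uncommitted one produces UNDO facts that cancel the heap row. Either way the two structures agree again.&lt;/P&gt;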
&lt;P&gt;Back in late 2000, it became apparent that with all the new features we were planning for 'Yukon' (we had no idea it would be called SQL Server 2005 back then), including some changes to the transaction log to allow for fast recovery and deferred transactions and stuff like versioning and online index build, the transaction log analysis was a non-starter.&amp;nbsp;While it had given us the holy-grail of online consistency checks,&amp;nbsp;with all the added complications of Yukon it&amp;nbsp;would become impossible to maintain and get right.&lt;/P&gt;
&lt;P&gt;But what to use instead? Who would come to my rescue? Turns out that database snapshots would be my savior. (That is their eventual name. I preferred their first name COW databases - Copy-On-Write databases - and my TechEd slide deck has a nice animated cow in homage). The in-depth details of database snapshots are beyond the scope of this post. To put it simply:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;they use NTFS sparse-file technology&lt;/LI&gt;
&lt;LI&gt;database recovery is run when the snapshot is created, but the recovered database is stored in the snapshot, not the source database&lt;/LI&gt;
&lt;LI&gt;they only hold pages from the source database that have been changed since the database snapshot was created (either by the recovery process, or as part of normal operations on the source database)&lt;/LI&gt;
&lt;LI&gt;Books Online has a bunch more info about their use by DBCC - look in the 'DBCC Statements' section.&lt;/LI&gt;&lt;/UL&gt;
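&lt;P&gt;The copy-on-write idea behind the third bullet can be sketched in a few lines of Python. This is a toy model under stated assumptions: pages live in a dictionary, the class and method names are made up, and recovery and sparse files are ignored entirely.&lt;/P&gt;
&lt;PRE&gt;
```python
class SnapshotSketch:
    """Toy copy-on-write snapshot over a dict of page id -> contents."""

    def __init__(self, source_pages):
        self.source = source_pages   # the live database pages
        self.saved = {}              # sparse store of pre-change copies

    def write_to_source(self, page_id, new_contents):
        # The first time a page changes after the snapshot was created,
        # keep its original contents in the snapshot.
        if page_id not in self.saved:
            self.saved[page_id] = self.source.get(page_id)
        self.source[page_id] = new_contents

    def read(self, page_id):
        # Snapshot readers get the saved copy if one exists, otherwise
        # the unchanged page straight from the source.
        if page_id in self.saved:
            return self.saved[page_id]
        return self.source.get(page_id)
```
&lt;/PRE&gt;
&lt;P&gt;However many times a page is changed in the source, the snapshot keeps a single copy of it as it was when the snapshot was created, which is why the snapshot grows with the amount of changed data rather than with the size of the database.&lt;/P&gt;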
&lt;P&gt;By moving to database snapshots we changed to using mainline server code to get our transactionally consistent view. This vastly reduced the complexity of the code and meant that someone else was responsible for finding and fixing its bugs during development :-)&lt;/P&gt;
&lt;P&gt;So when CHECKDB starts, the first thing we do is work out whether we'd like to run online - if so we create a hidden database snapshot of the source database (i.e. CHECKDB's target database). That could cause you a problem - depending on your transaction load concurrent with the CHECKDB, the database snapshot can grow in size. As we create a hidden one, you have no control over where we place the files - we just place them as alternate streams of the files comprising the source database. If you don't have room for this, just create your own database snapshot and check that.&lt;/P&gt;
&lt;P&gt;Once the database snapshot is created, we're guaranteed a transactionally consistent view of the database and can merrily run our check algorithms against the database snapshot. Ah, you might say, but that means CHECKDB is checking the database as it was at some point in the past! Yes, I'd say, but that point is&amp;nbsp;the start time of the CHECKDB, just as it was (essentially) with the log analysis mechanism in SQL Server 2000.&lt;/P&gt;
&lt;P&gt;There are&amp;nbsp;a few slight gotchas (all documented) with this approach:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;sparse files are only available with NTFS so online checks can't be run on databases stored on FAT or FAT32 volumes&lt;/LI&gt;
&lt;LI&gt;recovery cannot be run on TEMPDB, so online checks can't be run on TEMPDB (CHECKDB automatically switches to locking in that case)&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;And that's that. Now it's time for breakfast - no oatmeal hopefully...&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=623789" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/DBCC/default.aspx">DBCC</category><category domain="http://blogs.msdn.com/sqlserverstorageengine/archive/tags/CHECKDB+Series/default.aspx">CHECKDB Series</category></item></channel></rss>