My name is Harsh Deep Singh, and I am a SQL Server Premier Field Engineer with Microsoft. I've been working on SQL Server for quite some time, and the intention behind starting this blog is to try and share my knowledge with the SQL Server community. Any feedback or suggestions on the blog are both welcome and solicited. Please feel free to drop me a line on email@example.com anytime.
Disclaimer: The information in this weblog is provided “AS IS” with no warranties, and confers no rights. This weblog does not represent the thoughts, intentions, plans or strategies of my employer. It is solely my opinion. Feel free to challenge me, disagree with me, or tell me I’m completely nuts in the comments section of each blog entry, but I reserve the right to delete any comment for any reason whatsoever (abusive, profane, rude, or anonymous comments) – so keep it polite, please.
This is an issue that I had been busy working on these past few days. We were getting AV dumps on the DB, and when I looked at the stack, I found that SQL was calling a function to get the name of a column(given a table name), an index id and key id. This function call resulted in a an exception being generated, which is what caused the AV dump:
Child-SP RetAddr Call Site00000000`249676c8 00000000`76ecc0b0 ntdll!ZwWaitForSingleObject+0xa00000000`249676d0 00000000`01596369 kernel32!WaitForSingleObjectEx+0x9c00000000`24967790 00000000`01595d2b sqlservr!CDmpDump::DumpInternal+0x4d900000000`24967890 00000000`01f95080 sqlservr!CDmpDump::Dump+0x3b00000000`249678e0 00000000`0204ebae sqlservr!SQLDumperLibraryInvoke+0x1a000000000`24967910 00000000`021968d5 sqlservr!CImageHelper::DoMiniDump+0x3ce00000000`24967af0 00000000`0219728c sqlservr!ContextDumpNoStackOverflow+0x32500000000`24968340 00000000`021978ea sqlservr!ContextDump+0x7bc00000000`24968da0 00000000`01f6db08 sqlservr!stackTraceExceptionFilter+0x24a00000000`24968df0 00000000`01f820d8 sqlservr!SOS_OS::ExecuteDumpExceptionHandlerRoutine+0x2800000000`24969080 00000000`0267fda5 sqlservr!GenerateExceptionDump+0x4800000000`249690b0 00000000`0267fe4c sqlservr!ex_trans_cexcept+0x4500000000`249690f0 00000000`74f6acf0 sqlservr!SOS_SEHTranslator+0x4c00000000`24969120 00000000`74f69e0b msvcr80!_CallSETranslator+0x4000000000`24969190 00000000`74f6a62b msvcr80!FindHandlerForForeignException+0x9b00000000`24969230 00000000`74f6a86b msvcr80!FindHandler+0x63b00000000`249697e0 00000000`74f6abe7 msvcr80!__InternalCxxFrameHandler+0x1fb00000000`24969830 00000000`00d10cf3 msvcr80!__CxxFrameHandler+0x7700000000`24969880 00000000`770058dd sqlservr!__GSHandlerCheck_EH+0x6300000000`249698b0 00000000`770096d7 ntdll!RtlpExecuteHandlerForException+0xd00000000`249698e0 00000000`77016e08 ntdll!RtlDispatchException+0x20c00000000`24969f80 00000000`025ff563 ntdll!KiUserExceptionDispatch+0x2e00000000`2496a520 00000000`00bc516b sqlservr!WstrIndkeyWstrI4I4+0x32300000000`2496a670 00000000`00869f6c sqlservr!CQScanNLJoinNew::GetRowHelper+0x119b00000000`2496abd0 00000000`00c72c2b sqlservr!CQScanSortNew::BuildSortTable+0x18c00000000`2496ac90 00000000`0086cc15 sqlservr!CQScanTopSortNew::Open+0x4700000000`2496acc0 00000000`0086cb2e sqlservr!CQueryScan::Startup+0xcd00000000`2496ad10 00000000`0086bdea sqlservr!CXStmtQuery::SetupQueryScanAndExpression+0x41200000000`2496ad70 00000000`0087389b sqlservr!CXStmtQuery::ErsqExecuteQuery+0x2f800000000`2496dd80 00000000`0086fe6b sqlservr!CMsqlExecContext::ExecuteStmts<1,1>+0xcc200000000`2496e030 00000000`0086f789 sqlservr!CMsqlExecContext::FExecute+0x58b00000000`2496e1b0 00000000`0245bcfd sqlservr!CSQLSource::Execute+0x31900000000`2496e2e0 00000000`02460b34 sqlservr!ExecuteSql+0x72d00000000`2496ed60 00000000`02e43271 sqlservr!CSpecProc::ExecuteSpecial+0x23400000000`2496ee80 00000000`00871270 sqlservr!CSpecProc::Execute+0x1f100000000`2496eff0 00000000`008cf87a sqlservr!process_request+0x37000000000`2496f2b0 00000000`0080b29b sqlservr!process_commands+0x1ba00000000`2496f4b0 00000000`0080af5a sqlservr!SOS_Task::Param::Execute+0x11b00000000`2496f5d0 00000000`0080ac35 sqlservr!SOS_Scheduler::RunTask+0xca00000000`2496f660 00000000`00dbc560 sqlservr!SOS_Scheduler::ProcessTasks+0x9500000000`2496f6d0 00000000`00dbaca0 sqlservr!SchedulerManager::WorkerEntryPoint+0x11000000000`2496f790 00000000`00dba640 sqlservr!SystemThread::RunWorker+0x6000000000`2496f7c0 00000000`00dbc6ff sqlservr!SystemThreadDispatcher::ProcessWorker+0x12c00000000`2496f850 00000000`74f337d7 sqlservr!SchedulerManager::ThreadEntryPoint+0x12f00000000`2496f8e0 00000000`74f33894 msvcr80!_callthreadstartex+0x1700000000`2496f910 00000000`76ebbe3d msvcr80!_threadstartex+0x8400000000`2496f940 00000000`76ff6861 kernel32!BaseThreadInitThunk+0xd00000000`2496f970 00000000`00000000 ntdll!RtlUserThreadStart+0x1d
From the dump, I was able to extract the dbid and the object id, and when we tried to run update statistics on the table in question, it failed with the error:-
Msg 0, Level 11, State 0, Line 0 A severe error occurred on the current command. The results, if any, should be discarded.
So definitely there was some Metadata corruption here. However, we were able to update statistics on all the indexes and stats on the table explicitly without any issues, but the update table continued to fail with the same error.
It was then that we stumbled upon another interesting piece of the puzzle. When we ran a “Select * from sysindexes”, the query failed with the same error…!!! Based on this, and some more research, we summed up that there was a statistic against this table (and a system statistic at that), which was present in sysindexes. This makes sense, as for individual index rebuilds, we would not need to scan the sysindexes dmv, but when running update statistics against the table, we would need to scan the dmv based on the object id, which causes an AV. We were able to confirm this by querying only the status, id, name and indid columns of sysindexes for the object id in question. We saw that the statistic was mapped to an index id which was not present in any of the other dmv’s such as sys.stats, sys.stats_columns, sys.sysindexkeys, etc.
We even found a KB explaining the issue, and a fix for it, which would prevent the issue from occurring in the future (see here).
But the question remained, how do we get rid of it now? As you probably guessed, we just need to delete the offending statistic, right? But when trying to run the Drop statistics statement, we were getting the error:-
Msg 3701, Level 11, State 6, Line 1
Cannot drop the statistics 'Fielders._WA_Sys_08000002_1FA46B10', because it does not exist or you do not have permission.
So we connected through the DAC Connection (simple, just type Admin: Servername\Instancename in the connection string in SSMS, but remember you have to be a sysadmin for this), and enclosed the statistic name in square brackets, and ran the drop statistics command, and would you believe it, it worked like a charm.
Hope that the next time you run into a metadata corruption issue like this, you know what to do.
As always, feedback/comments/suggestions are both welcome and solicited.