My name is Ryan Mangipano (ryanman) and I am a Sr. Support Escalation Engineer at Microsoft. Today I will be blogging about how I used the SOS .Net Framework debugging extension (and !analyze -v) to easily troubleshoot a .Net Framework exception. This exception was preventing Event Viewer from displaying properly. Event Viewer was returning an error that provided very little information about what was actually causing the issue. I will demonstrate how, in this case, it was very easy to use windbg to obtain information about what went wrong. I did not have to perform a live debug on the issue. Instead, I used a process dump to obtain very exact information, which was returned by the debugger, relating to the root cause. I was then able to use Process Monitor to identify the file that needed to be examined. These actions led me to the source of the problem, which was easily corrected. Also, the issue discussed in this blog was easily reproduced on Windows Server 2008. This means that you should be able to practice this debug exercise, on your non-production Windows 2008 SP1 Server, for learning purposes if you are interested in doing so.
Issue Reported: The following error was encountered when opening eventvwr.msc (Event Viewer) on a Windows 2008 Server system:
"MMC could not create the snap-in."
MMC could not create the snap-in.
The snap-in might not have been installed correctly
Name: Event Viewer
First Step- Research & Data Gathering: After ensuring I first understood the problem reported, I searched for known issues. I found out that we have seen this error before. It may occur when the following registry key gets deleted or corrupted:
I had the customer export this key and found that it was not corrupted in any way. I verified that all data was as expected.
Next, a memory dump of the mmc.exe process was collected. The mmc.exe process is used to host the eventvwr.msc snap-in. This was easily obtained using the built-in Windows 2008 Server "Windows Task Manager" feature: "Create Dump File". If you have several mmc console instances executing on your system, you can use the Task Manager context menu shortcuts "Switch To" and "Go To Process" to help you identify the correct instance.
Note: We also collected a process monitor logfile during the startup of eventvwr.msc. This log file later proved very helpful in resolving the issue (as I will show below). Process monitor can be obtained at the following URL: http://technet.microsoft.com/en-us/sysinternals/bb896645.aspx
Now let's have a look at the debug.
1. First, I navigated Windows Explorer to the location of the dump file and double-clicked it to open it in windbg.exe.
It opened in windbg because I had previously run the command windbg -IA, which associates .dmp files with windbg. You can read more about the command line options in windbg in the help file that is included with the debugging tools.
2. I noticed the following output from the debugger after it loaded the dump file:
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(ff8.a2c): CLR exception - code e0434f4d (first/second chance not available)
3. Next, I wanted to ensure my symbol path was set correctly. I could have set it using the .sympath command:
0:011> .sympath SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols
Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols
Expanded Symbol search path is: srv*c:\websymbols*http://msdl.microsoft.com/download/symbols
However, when your goal is to simply point to the default symbol server, .symfix is a very nice shortcut. It prevents one from having to try to remember the URL. Here’s the syntax:
0:011> .symfix c:\websymbols
4. To ensure that I didn't waste time reviewing the wrong data, I performed a quick check to ensure that we collected a dump of the requested snap-in.
PEB at 000007fffffdb000
CommandLine: '"C:\Windows\system32\mmc.exe" "C:\Windows\system32\eventvwr.msc" '
You could alternatively dump the CommandLine from the nt!_PEB using the dt command
0:005> dt nt!_PEB ProcessParameters->CommandLine 000007fffffdb000
+0x020 ProcessParameters :
+0x070 CommandLine : _UNICODE_STRING ""C:\Windows\system32\mmc.exe" "C:\Windows\system32\eventvwr.msc" "
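If you capture the CommandLine string into a log, the same sanity check can be scripted. The following is a minimal Python sketch, not part of the original debug session; the helper names quoted_arguments and hosts_snap_in are hypothetical, and it assumes every argument is wrapped in double quotes as in the output above.

```python
import re
from typing import List


def quoted_arguments(command_line: str) -> List[str]:
    """Return the double-quoted arguments found in a CommandLine string."""
    return re.findall(r'"([^"]*)"', command_line)


def hosts_snap_in(command_line: str, snap_in: str) -> bool:
    """True if any quoted argument is a path ending in the snap-in file name."""
    suffix = "\\" + snap_in.lower()
    return any(arg.lower().endswith(suffix) for arg in quoted_arguments(command_line))


# CommandLine value copied from the PEB output above.
command_line = r'"C:\Windows\system32\mmc.exe" "C:\Windows\system32\eventvwr.msc" '

print(hosts_snap_in(command_line, "eventvwr.msc"))
```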
5. Next, I dumped out all of the threads in this process and found that the following thread contained a stack that was raising a .Net Framework exception:
0:011> ~* kL
... (omitted the non-relevant threads)
# 11 Id: ff8.a2c Suspend: 1 Teb: 7ffd3000 Unfrozen
0691f03c 7343a91c kernel32!RaiseException+0x58
0691f09c 7343d81a mscorwks!RaiseTheExceptionInternalOnly+0x2a8
*** WARNING: Unable to verify checksum for MMCEx.ni.dll
0691f140 6bfe0b5a mscorwks!JIT_Rethrow+0xbf
*** WARNING: Unable to verify checksum for mscorlib.ni.dll
0691f1e8 69926cf6 MMCEx_ni+0xd0b5a
0691f1f4 6993019f mscorlib_ni+0x216cf6
0691f208 69926c74 mscorlib_ni+0x22019f
0691f220 733d1b4c mscorlib_ni+0x216c74
0691f230 733e21b1 mscorwks!CallDescrWorker+0x33
0691f2b0 733f6501 mscorwks!CallDescrWorkerWithHandler+0xa3
0691f3e8 733f6534 mscorwks!MethodDesc::CallDescr+0x19c
0691f404 733f6552 mscorwks!MethodDesc::CallTargetWorker+0x1f
0691f41c 7349d803 mscorwks!MethodDescCallSite::CallWithValueTypes+0x1a
0691f604 733f845f mscorwks!ThreadNative::KickOffThread_Worker+0x192
0691f618 733f83fb mscorwks!Thread::DoADCallBack+0x32a
0691f6ac 733f8321 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3
0691f6e8 733f84ad mscorwks!Thread::ShouldChangeAbortToUnload+0x30a
0691f710 7349d5d4 mscorwks!Thread::ShouldChangeAbortToUnload+0x33e
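When a process has many threads, scanning the ~* kL output by eye gets tedious. Here is a rough Python sketch that searches a saved copy of that output for threads whose stack contains a given frame. It assumes the thread header layout shown above; the function name threads_with_frame is hypothetical, and the frame shown for thread 0 in the sample is made up purely for illustration.

```python
import re
from typing import List, Optional, Tuple

# Matches thread header lines such as "# 11 Id: ff8.a2c Suspend: 1 ...".
THREAD_HEADER = re.compile(r"^\s*[.#]?\s*(\d+)\s+Id:\s+([0-9a-fA-F]+)\.([0-9a-fA-F]+)")


def threads_with_frame(stack_log: str, needle: str) -> List[Tuple[int, str]]:
    """Return (thread number, hex TID) for each thread whose stack mentions needle."""
    current: Optional[Tuple[int, str]] = None
    hits: List[Tuple[int, str]] = []
    for line in stack_log.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = (int(header.group(1)), header.group(3))
            continue
        if current is not None and needle in line and current not in hits:
            hits.append(current)
    return hits


# Thread 11 is copied from the output above; the thread 0 frame is illustrative only.
sample_log = """\
   0  Id: ff8.c84 Suspend: 1 Teb: 7ffdf000 Unfrozen
0012f000 77000000 ntdll!KiFastSystemCallRet
#  11  Id: ff8.a2c Suspend: 1 Teb: 7ffd3000 Unfrozen
0691f03c 7343a91c kernel32!RaiseException+0x58
0691f09c 7343d81a mscorwks!RaiseTheExceptionInternalOnly+0x2a8
"""

print(threads_with_frame(sample_log, "mscorwks!RaiseTheExceptionInternalOnly"))
```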
6. Out of curiosity, I also ran the Get Last Error command
LastErrorValue: (Win32) 0 (0) - The operation completed successfully.
LastStatusValue: (NTSTATUS) 0 - STATUS_WAIT_0
7. After this, I ran !analyze -v to see what helpful information the debugger would provide. The debugger did output exception information but informed me that I needed to use the x86 debugger instead.
0:011> !analyze -v
EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 771a42eb (kernel32!RaiseException+0x00000058)
ExceptionCode: e0434f4d (CLR exception)
Managed code needs matching platform of sos.dll for proper analysis. Use 'x86' debugger.
8. I fired up the x86 debugger and loaded the appropriate version of the SOS .Net Framework debugger extension. This extension ships in the Operating System along with the .Net Framework. On most occasions, I would have initiated the loading of the extension through the use of the following syntax:
0:011> .load C:\Windows\Microsoft.NET\Framework\v2.0.50727\sos.dll
0:011> .load c:\Windows\Microsoft.NET\Framework64\v2.0.50727\sos.dll
However, once you realize that managed debugging will be necessary and that you need the services of the SOS extension, it’s best to use the .loadby command rather than .load. This is due to the fact that the version of SOS must match the version of the CLR loaded into that process. Here’s the recommended syntax:
0:011> .loadby sos mscorwks
I always verify that my extensions are loaded properly by using the .chain command.
... Extension DLL chain:
C:\Windows\Microsoft.NET\Framework\v2.0.50727\sos.dll: image 2.0.50727.1434, API 1.0.0, built Wed Dec 05 22:42:38 2007
9. Running !help printed out the following helpful information about the SOS extension since sos.dll was at the top of the .chain output:
SOS is a debugger extension DLL designed to aid in the debugging of managed
programs. Functions are listed by category, then roughly in order of
importance. Shortcut names for popular functions are listed in parenthesis.
Type "!help <functionname>" for detailed info on that function.
Object Inspection Examining code and stacks
DumpObj (do) Threads
DumpArray (da) CLRStack
DumpStackObjects (dso) IP2MD
PrintException (pe) COMState
10. Using the exception address, displayed by the debugger when opening the dump, and the !pe command listed above, I obtained more information about the exception:
0:011> !pe 771a42eb
There are nested exceptions on this thread. Run with -nested for details
0:011> !pe -nested 771a42eb
Nested exception -------------------------------------------------------------
Exception object: 040a676c
Exception type: System.Reflection.TargetInvocationException
Message: Exception has been thrown by the target of an invocation.
InnerException: System.Reflection.TargetInvocationException, use !PrintException 040a6a20 to see more
SP IP Function
0:011> !PrintException 040a6a20
Exception object: 040a6a20
InnerException: System.Configuration.ConfigurationErrorsException, use !PrintException 040a6cf8 to see more
0:011> !PrintException 040a6cf8
Exception object: 040a6cf8
Exception type: System.Configuration.ConfigurationErrorsException
Message: Configuration system failed to initialize
InnerException: System.Configuration.ConfigurationErrorsException, use !PrintException 040a7174 to see more
0:011> !PrintException 040a7174
Exception object: 040a7174
Message: Unrecognized configuration section system.web/myInvalidData
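Following the chain of inner exceptions by hand means copying the address from each "use !PrintException ... to see more" hint into the next command. If you are working from a saved debugger log, that lookup can be sketched in Python as below. This only parses text that the debugger already printed; the helper name next_inner_exception is hypothetical, and the regular expression assumes the hint wording shown in the output above.

```python
import re
from typing import Optional

# Wording taken from the !pe output above.
INNER_HINT = re.compile(r"use !PrintException ([0-9a-fA-F]+) to see more")


def next_inner_exception(pe_output: str) -> Optional[str]:
    """Return the address of the inner exception named in !pe output, or None."""
    match = INNER_HINT.search(pe_output)
    return match.group(1) if match else None


outer = """\
Exception object: 040a676c
Exception type: System.Reflection.TargetInvocationException
Message: Exception has been thrown by the target of an invocation.
InnerException: System.Reflection.TargetInvocationException, use !PrintException 040a6a20 to see more
"""

innermost = """\
Exception object: 040a7174
Message: Unrecognized configuration section system.web/myInvalidData
"""

print(next_inner_exception(outer))
print(next_inner_exception(innermost))
```

When the function returns None, there is no further hint to follow, and the Message line of that output is the one to read closely.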
11. Based on the exception information listed above, it appeared that a .Net Framework configuration section, system.web, contained an invalid configuration section named myInvalidData. I then re-ran !analyze -v against the dump (now that I was using the x86 debugger) and found that !analyze -v will load the sos.dll extension and even run the !pe extension automatically. It also displayed the exception record information for me. Notice that the thread listed by !analyze -v matches the thread I examined earlier.
0:011> !analyze -v
EXCEPTION_MESSAGE: Unrecognized configuration section system.web/myInvalidData.
0 Id: ff8.c84 Suspend: 1 Teb: 7ffdf000 Unfrozen
1 Id: ff8.96c Suspend: 1 Teb: 7ffde000 Unfrozen
2 Id: ff8.d10 Suspend: 1 Teb: 7ffdd000 Unfrozen
3 Id: ff8.d94 Suspend: 1 Teb: 7ffdc000 Unfrozen
4 Id: ff8.a14 Suspend: 1 Teb: 7ffda000 Unfrozen
5 Id: ff8.fbc Suspend: 1 Teb: 7ffd9000 Unfrozen
6 Id: ff8.f88 Suspend: 1 Teb: 7ffd8000 Unfrozen
7 Id: ff8.a64 Suspend: 1 Teb: 7ffd6000 Unfrozen
8 Id: ff8.bf8 Suspend: 1 Teb: 7ffd5000 Unfrozen
9 Id: ff8.d24 Suspend: 1 Teb: 7ffd4000 Unfrozen
10 Id: ff8.ff0 Suspend: 1 Teb: 7ffd7000 Unfrozen
. 11 Id: ff8.a2c Suspend: 1 Teb: 7ffd3000 Unfrozen
12. At this point I was interested in identifying the source of this unrecognized configuration. Instead of engaging our .Net support team, I started with a quick search using www.live.com for
"unrecognized configuration section" system.web site:microsoft.com
This returned the following results http://search.live.com/results.aspx?q=%22unrecognized+configuration+section%22+system.web+site%3Amicrosoft.com&form=QBRE
By quickly reviewing some of the hits returned, I found that others had encountered this exception in their own applications. This is due to invalid entries in the various .config files used in .Net. Looking through the posts, different configuration file names and paths were observed.
So, I opened up the Process Monitor log file to see which configuration files we were reading data from. I added filter criteria to match entries from the mmc.exe process, the TID from the FAULTING_THREAD listed in the exception data, path data containing .config, and a successful status result. It's best to be as specific as possible.
I found that we were reading in a large number of settings over and over again from the .Net Framework global configuration file:
(on x64 this would be C:\Windows\Microsoft.NET\Framework64\v2.0.50727\CONFIG\machine.config)
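The same filter can be applied offline to a Process Monitor log that has been saved as CSV. The Python sketch below is an illustration only: the column names ("Process Name", "TID", "Path", "Result") are my assumption about the export and depend on which columns are enabled, the sample rows are invented for the example, and config_reads is a hypothetical helper name. Note that the debugger prints the thread ID in hex (a2c), while Process Monitor, as far as I know, shows it in decimal, so the value is converted first.

```python
import csv
import io
from typing import Dict, Iterable, List


def config_reads(rows: Iterable[Dict[str, str]], process: str, tid: int) -> List[Dict[str, str]]:
    """Keep successful .config accesses made by one thread of one process."""
    return [
        row
        for row in rows
        if row["Process Name"].lower() == process.lower()
        and int(row["TID"]) == tid
        and ".config" in row["Path"].lower()
        and row["Result"] == "SUCCESS"
    ]


# Invented sample rows; a real export would come from Process Monitor's CSV save option.
sample_csv = r"""
"Process Name","TID","Operation","Path","Result"
"mmc.exe","2604","ReadFile","C:\Windows\Microsoft.NET\Framework\v2.0.50727\CONFIG\machine.config","SUCCESS"
"mmc.exe","2604","CreateFile","C:\Windows\system32\mmc.exe.config","NAME NOT FOUND"
"mmc.exe","3204","ReadFile","C:\Windows\system32\eventvwr.msc","SUCCESS"
"explorer.exe","2604","ReadFile","C:\Users\example\app.config","SUCCESS"
""".strip()

faulting_tid = int("a2c", 16)  # thread ID from "ff8.a2c", converted from hex
matches = config_reads(csv.DictReader(io.StringIO(sample_csv)), "mmc.exe", faulting_tid)

for row in matches:
    print(row["Operation"], row["Path"])
```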
Final Step- Putting it all together, Reproducing the issue, & confirming resolution : Using notepad, a quick search of the suspect xml file (C:\Windows\Microsoft.NET\Framework64\v2.0.50727\CONFIG\machine.config) on my system revealed a <system.web> section. At this point, I suspected that this section contained an invalid section which may have been related to the problem. To verify this, and since I like to break things, I added an invalid configuration setting <myInvalidData/> to my global configuration file. Doing so, I successfully reproduced the issue on my system. I then contacted the customer and asked if they had by any chance added any settings under the <system.web> in the configuration file: c:\Windows\Microsoft.NET\Framework\v2.0.50727\CONFIG\machine.config.
The customer informed me that, per the request of their ASP.net developer, they had in fact added settings to that section of the file. By researching http://msdn.microsoft.com/en-us/library/system.web.aspx and the schema documentation at http://msdn.microsoft.com/en-us/library/dayb112d.aspx, we were able to determine that the settings that were present in this file should not have been present inside of <system.web>. The settings were moved to the proper location per the developer and the issue was resolved.
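To illustrate why an unexpected element under <system.web> is rejected, here is a simplified Python sketch that compares the child elements of a section group against the sections declared for that group in <configSections>. This is only an approximation of the idea, not how the .Net Framework configuration system is actually implemented: it ignores nested section groups and inherited configuration files, the sample XML is a cut-down stand-in rather than a real machine.config, and unrecognized_sections is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import List

# Cut-down stand-in for a configuration file; not a real machine.config.
SAMPLE_CONFIG = """<configuration>
  <configSections>
    <sectionGroup name="system.web">
      <section name="compilation" />
      <section name="httpRuntime" />
    </sectionGroup>
  </configSections>
  <system.web>
    <myInvalidData />
    <compilation />
  </system.web>
</configuration>"""


def unrecognized_sections(config_xml: str, group: str) -> List[str]:
    """List children of the group element that are not declared in configSections."""
    root = ET.fromstring(config_xml)  # raises ParseError if the XML is not well formed
    declared = {
        section.get("name")
        for section_group in root.findall("./configSections/sectionGroup")
        if section_group.get("name") == group
        for section in section_group.findall("section")
    }
    group_element = next((child for child in root if child.tag == group), None)
    if group_element is None:
        return []
    return [f"{group}/{child.tag}" for child in group_element if child.tag not in declared]


print(unrecognized_sections(SAMPLE_CONFIG, "system.web"))
```

Run against the sample, the only entry reported is the myInvalidData element, in the same group/section form that the exception message uses.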
Here are the steps I used to reproduce the issue in case you are attempting to replicate this at home:
A. Using notepad, open the following configuration file on a non-production Windows Server 2008 SP1 system:
(please make a backup copy first in case you make a mistake)
OR (Open the version that matches the architecture of your platform )
B. Find the section <system.web> in this file (you can use the find function in notepad):
C. Add the following line directly after <system.web> as shown in the example below:
D. Save the file and then open eventvwr.msc and verify that the following error is displayed:
Hopefully this blog has demonstrated an example of how you can use the "create dump file" feature of Windows 2008, windbg, and other related tools in an attempt to gain more specific data when your error message is not revealing the source of the problem. Feel free to post any questions, comments, or concerns.
good post, i like how you included how we can reproduce this and have a look ourselves
thanks ryanman for posting a good article
I found this blog site during my break time.
I learned a lot more than I knew before.
thanks very much
Great post. I really liked your step by step analysis along with your reasoning. Thank you.
Sweet info. I always wanted to see how this works. Thanks.
Awesome post. it saved me a lot of troubleshooting. thx
tnx. I liked ur step by step explanation...easy enough for a rookie...
In my case it was the files C:\Windows\Microsoft.NET\Framework\v2.0.50727\CONFIG\machine.config and C:\Windows\Microsoft.NET\Framework\v2.0.50727\CONFIG\machine.config which were corrupt. The tag <DbProviderFactories> had no end tag. Adding the end tag in both files fixed it. I suspect the installation of SQL Anywhere caused the error.
Ryan, you are awesome! what a great effort and clear debugging knowledge!! so nice. keep rocking...
I also had problems viewing the Eventlog, but a lot of other applications had problems as well. Most MMC Snap-Ins worked and I could view the Eventlog from a remote computer. VS2010 was also installed on the machine but did not run either. A .NET stack trace from another application gave me a hint. There was something wrong with Fonts on the machine. Direct comparison of installed Fonts in the Windows\Fonts directory did not bring a solution. But this KB article did help me: support.microsoft.com/.../943140
The Tahoma Font was not pointing to the ttf file. After fixing this in the registry I could view the Eventlog and also start VS2010 again. No clue on which Application changed the Tahoma-Font entry but glad I don't have to reinstall the machine.