Since I already posted a challenge for this lab earlier I didn't want to wait too long with publishing the review...
Previous demos and setup instructions
If you are new to the debugging labs, here you can find information on how to set up the labs as well as links to the previous labs in the series.
Information and setup instructions
Lab 1: Hang
Lab 1: Hang - review
Lab 2: Crash
Lab 2: Crash - review
Lab 3: Memory
Lab 3: Memory - review
Lab 4: High CPU hang
Lab 4: High CPU hang - review
Reproducing the issue:
1. Recycle IIS (IISReset)
2. Browse to http://localhost/BuggyBits/CompanyInformation.aspx
3. Type some message and click the Send button
Note: If just-in-time debugging is enabled a message box might pop up because the w3wp.exe process crashed. If this comes up, please ignore it.
Q: What do you see in the browser?
A: A "Internet Explrer cannot display the webpage" page indicating that something is wrong on the server.
Explore the eventlogs
1. Open the eventlogs for system an application
Q: What events do you notice relating to the crash? (note: you may see different events on different operating systems)
A:
Application
Event Type: Error
Event Source: .NET Runtime 2.0 Error Reporting
Event Category: None
Event ID: 1000
Date: 2008-03-04
Time: 12:57:33
User: N/A
Computer: MYMACHINE
Description:
Faulting application w3wp.exe, version 6.0.3790.3959, stamp 45d6968e, faulting module kernel32.dll, version 5.2.3790.4062, stamp 4626467c, debug? 0, fault address 0x00022366.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
...
System
Event Type: Warning
Event Source: W3SVC
Event Category: None
Event ID: 1009
Date: 2008-03-04
Time: 12:57:19
User: N/A
Computer: MYMACHINE
Description:
A process serving application pool 'DefaultAppPool' terminated unexpectedly. The process id was '5764'. The process exit code was '0x800703e9'.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Event Type: Warning
Event Source: W3SVC
Event Category: None
Event ID: 1011
Date: 2008-03-04
Time: 12:57:48
User: N/A
Computer: MYMACHINE
Description:
A process serving application pool 'DefaultAppPool' suffered a fatal communication error with the World Wide Web Publishing Service. The process id was '5764'. The data field contains the error number.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 6d 00 07 80 m..
Q: What does exit code '0x800703e9' mean?
A: Searching on the internet or looking it up in Error Lookup we find that 0x800703E9 means "Recursion too deep; the stack overflowed."
Q: Based on the eventlogs, what caused the crash?
A: A stackoverflow exception in the application. The most likely cause for this is infinite recursion, i.e. function a calls function b calls function a calls function b etc. until the memory allocated for the stack is exhausted.
Reproduce the issue again and gather a memory dump:
1. Restart iis by running iisreset
2. Browse to http://localhost/BuggyBits/CompanyInformation.aspx again without hitting the send button
3. Open a command prompt, move to the debuggers directory and type the following command to attach the debugger and wait for the crash
adplus -crash -pn w3wp.exe -quiet
4. Enter a message on the page and click send
Q: What files get created in the dump folder (i.e. the folder named crash_mode current date and time under the debuggers directory)?
A:
2008-03-04 14:28 4 415 ADPlus_report.txt
2008-03-04 14:28 <DIR> CDBScripts
2008-03-04 14:28 151 874 536 PID-2548__W3WP.EXE__1st_chance_Process_Shut_Down__full_06f0_2008-03-04_14-28-17-145_09f4.dmp
2008-03-04 14:28 111 648 PID-2548__W3WP.EXE__Date_03-04-2008__Time_14-28-0606.log
2008-03-04 14:28 13 156 Process_List.txt
Open the dump to figure out what it is doing:
1. Open the process_shut_down dump (crash_mode with todays date and time in the debuggers directory) in windbg using file/open crash dump
2. Load up the symbols and sos
Examine the stacks:
Note: In a dump file the active thread is the thread that caused the debugger to dump the process, i.e. a process exit exception or other exception.
One of a few things can cause the process to shut down.
1. Run kb to see which of the above categories this crash/process exit falls into
Q: Which category did it fall into and why?
A: In my particular dump I got this stack for the last remaining executing thread. Since this is not the main thread (didn't start with w3wp!wmainCRTStartup) this means that the process died and shut down the threads. The fact that we didn't get a 2nd chance dump means we didnt die from an unhandled exception, and the fact that it wasn't the main thread means that it isn't a normal recycle.
0:000> kL
ChildEBP RetAddr
0290fe54 7d4d8c82 ntdll!NtWaitForSingleObject+0x15
0290fec4 79e789c6 kernel32!WaitForSingleObjectEx+0xac
0290ff08 79e7898f mscorwks!CLREventWaitHelper+0x2f
0290ff58 79e78944 mscorwks!CLREvent::WaitEx+0x117
0290ff6c 79efa706 mscorwks!CLREvent::Wait+0x17
0290ff98 79fc8583 mscorwks!SVR::gc_heap::gc_thread_function+0x30
0290ffb8 7d4dfe21 mscorwks!SVR::gc_heap::gc_thread_stub+0x92
0290ffec 00000000 kernel32!BaseThreadStart+0x34
Examine the log file to find out what happened just prior to the crash
Q: What events occurred just prior to the crash?
A: Right before the process exit event I see the following exception (Unknown exception - code e053534f). Since I don't have the symbol path set up in the environment variables I can't really trust the stack, but what I can see is that we have some .net code (the mscorlib_ni frames) followed by some calls into mscorwks and eventually the kernel32!RaiseException that raises the e053534f exception. Since this exception code is not a known exception code adplus reports it as unknown exception.
0:025> g
Tue Mar 4 14:28:16.551 2008 (GMT+1): (9f4.30c): Unknown exception - code e053534f (first chance)
---
--- 1st chance UnknownException exception ----
---------------------------------------------------------------
Occurrence happened at:
Debug session time: Tue Mar 4 15:20:44.687 2008 (GMT+1)
System Uptime: 19 days 12:10:42.146
Process Uptime: 0 days 0:00:17.546
Kernel time: 0 days 0:00:00.578
User time: 0 days 0:00:00.640
Faulting stack below ---
# ChildEBP RetAddr Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
00 0ff3aecc 7a0ac3d8 e053534f 00000000 00000000 kernel32!RaiseException+0x4e
01 0ff3aee8 79fff01a 0024a900 6ea810e3 0024a900 mscorwks!GetCompileInfo+0x1d896
02 0ff3b06c 79f067d3 06f26a08 6ea81197 06fd9788 mscorwks!NGenCreateNGenWorker+0x213a7
03 0ff3b118 7948d92a 793826f1 06fd98c4 79e82084 mscorwks!GetCLRFunction+0x2371f
04 0ff3b214 793983c8 00000000 06fd975c 00000000 mscorlib_ni+0x3cd92a
05 0ff3b254 793984e3 08000000 00001000 00000001 mscorlib_ni+0x2d83c8
06 0ff3b278 79398550 0ff3b2d8 06fd9470 06f2657c mscorlib_ni+0x2d84e3
07 0ff3b28c 7949959c 00000400 06f26960 00000000 mscorlib_ni+0x2d8550
08 0ff3b2c4 793983c8 06fd8d90 06fd8fe4 06f2657c mscorlib_ni+0x3d959c
09 0ff3b33c 7949959c 00000400 06f26960 00000000 mscorlib_ni+0x2d83c8
0a 00000000 00000000 00000000 00000000 00000000 mscorlib_ni+0x3d959c
We can rebuild the native piece of the stack by using ln (list near) on the mscorwks methods in the process exit dump.0:000> ln mscorwks!GetCLRFunction+0x2371f
(79f06728) mscorwks!GetResourceFromDefault+0xb6 | (79f0681a) mscorwks!GetResourceStringFromManaged
0:000> ln mscorwks!NGenCreateNGenWorker+0x213a7
(79f0681a) mscorwks!GetResourceStringFromManaged+0x44 | (79f069e5) mscorwks!NativeCompareInfo::CompareStringW
0:000> ln mscorwks!GetCompileInfo+0x1d896
(7a0ac375) mscorwks!ReportStackOverflow+0x61 | (7a0ac3dc) mscorwks!ChangeThreadToSOTolerant
Substituting the unresolved method names (method names with bad symbols) we get this... # ChildEBP RetAddr Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
00 0ff3aecc 7a0ac3d8 e053534f 00000000 00000000 kernel32!RaiseException+0x4e
01 0ff3aee8 79fff01a 0024a900 6ea810e3 0024a900 mscorwks!ReportStackOverflow+0x61
02 0ff3b06c 79f067d3 06f26a08 6ea81197 06fd9788 mscorwks!GetResourceStringFromManaged+0x44
03 0ff3b118 7948d92a 793826f1 06fd98c4 79e82084 mscorwks!GetResourceFromDefault+0xb6
04 0ff3b214 793983c8 00000000 06fd975c 00000000 mscorlib_ni+0x3cd92a
05 0ff3b254 793984e3 08000000 00001000 00000001 mscorlib_ni+0x2d83c8
06 0ff3b278 79398550 0ff3b2d8 06fd9470 06f2657c mscorlib_ni+0x2d84e3
07 0ff3b28c 7949959c 00000400 06f26960 00000000 mscorlib_ni+0x2d8550
08 0ff3b2c4 793983c8 06fd8d90 06fd8fe4 06f2657c mscorlib_ni+0x3d959c
09 0ff3b33c 7949959c 00000400 06f26960 00000000 mscorlib_ni+0x2d83c8
0a 00000000 00000000 00000000 00000000 00000000 mscorlib_ni+0x3d959c
So we can see that it was a stackoverflow exception that brought the process down. If we wanted to we could also resolve the top of the .net stack by running !ip2md 790c0000+0x3cd92a for example where 790c0000 is the base address for mscorlib.dll
0:000> !ip2md 790c0000+0x3d959c
MethodDesc: 7925bc40
Method Name: System.IO.StreamWriter..ctor(System.String)
Class: 7910381c
MethodTable: 7910389c
mdToken: 06003696
Module: 790c2000
IsJitted: yes
m_CodeOrIL: 79499580
But I will leave that as an excercise for you since we only have a short piece of the stack so we don't get deep enough to find the recursion. Note here that the recursion doesn't happen in the streamwriter constructor. This is just the last method call when we are running out of stack space as you'll see later
Configure adplus to gather full dumps on the exception in the log file
1. Restart iis by running iisreset
2. Browse to http://localhost/BuggyBits/CompanyInformation.aspx again without hitting the send button
3. Copy the attached adplus configuration file to your debuggers directory (Note: this configuration file causes adplus to take a full dump on the unknown exception)
4. Open a command prompt, move to the debuggers directory and type the following command to attach the debugger and wait for the crash
adplus -pn w3wp.exe -c Unknown.cfg
4. Enter a message on the page and click send
Open the dump to figure out what it is doing:
1. Open the 1st_chance_unknown_exception dump in windbg using file/open crash dump
2. Load up the symbols and sos
Examine the active stack:
1. Run kb 2000 and !clrstack to see what the active thread is doing (i.e. the thread that caused the stackoverflow)
0:014> kb
ChildEBP RetAddr Args to Child
0250ae9c 7a0ac3d8 e053534f 00000000 00000000 kernel32!RaiseException+0x53
0250aeb8 79fff01a 001a67c0 c29e1ac8 001a67c0 mscorwks!ReportStackOverflow+0x61
0250b03c 79f067d3 02f367d4 c29e1a1c 0309dfe0 mscorwks!GetResourceStringFromManaged+0x44
0250b0e8 7948d92a 793826f1 0309e11c 79e82084 mscorwks!GetResourceFromDefault+0xb6
0250b1e4 793983c8 00000000 0309dfb4 00000000 mscorlib_ni+0x3cd92a
0250b224 793984e3 08000000 00001000 00000001 mscorlib_ni+0x2d83c8
0250b248 79398550 0250b2a8 0309dcc8 02f35fd0 mscorlib_ni+0x2d84e3
0250b25c 7949959c 00000400 02f3640c 00000000 mscorlib_ni+0x2d8550
0250b294 793983c8 0309d5e8 0309d83c 02f35fd0 mscorlib_ni+0x3d959c
0250b30c 7949959c 00000400 02f3640c 00000000 mscorlib_ni+0x2d83c8
00000000 00000000 00000000 00000000 00000000 mscorlib_ni+0x3d959c
0:014> !clrstack
OS Thread Id: 0x177c (14)
ESP EIP
0250b090 7d4e2366 [HelperMethodFrame_2OBJ: 0250b090] System.Environment.GetResourceFromDefault(System.String)
0250b0f0 7948d92a System.IO.__Error.WinIOError(Int32, System.String)
0250b11c 79395557 System.IO.FileStream.Init(System.String, System.IO.FileMode, System.IO.FileAccess, Int32, Boolean, System.IO.FileShare, Int32, System.IO.FileOptions, SECURITY_ATTRIBUTES, System.String, Boolean)
0250b214 793983c8 System.IO.FileStream..ctor(System.String, System.IO.FileMode, System.IO.FileAccess, System.IO.FileShare, Int32, System.IO.FileOptions)
0250b240 793984e3 System.IO.StreamWriter.CreateFile(System.String, Boolean)
0250b250 79398550 System.IO.StreamWriter..ctor(System.String, Boolean, System.Text.Encoding, Int32)
0250b270 7949959c System.IO.StreamWriter..ctor(System.String)
0250b27c 0fe90b52 Utility.WriteToLog(System.String, System.String)
0250b2b0 0fe90b0b ExceptionHandler.LogException(System.Exception)
...
0252f0c4 0fe90bd4 Utility.WriteToLog(System.String, System.String)
0252f170 0fe90b0b ExceptionHandler.LogException(System.Exception)
0252f174 0fe90bd4 Utility.WriteToLog(System.String, System.String)
0252f220 0fe90b0b ExceptionHandler.LogException(System.Exception)
0252f224 0fe90a81 BuggyMail.SendEmail(System.String, System.String)
0252f2c8 0fe90a28 CompanyInformation.btnContact_Click(System.Object, System.EventArgs)
0252f2d8 6619004e System.Web.UI.WebControls.Button.OnClick(System.EventArgs)
0252f2ec 6619023c System.Web.UI.WebControls.Button.RaisePostBackEvent(System.String)
0252f300 661901b8 System.Web.UI.WebControls.Button.System.Web.UI.IPostBackEventHandler.RaisePostBackEvent(System.String)
0252f304 6614b47c System.Web.UI.Page.RaisePostBackEvent(System.Web.UI.IPostBackEventHandler, System.String)
0252f30c 6614b3d2 System.Web.UI.Page.RaisePostBackEvent(System.Collections.Specialized.NameValueCollection)
0252f31c 6614e263 System.Web.UI.Page.ProcessRequestMain(Boolean, Boolean)
0252f518 6614d8c3 System.Web.UI.Page.ProcessRequest(Boolean, Boolean)
0252f550 6614d80f System.Web.UI.Page.ProcessRequest()
0252f588 6614d72f System.Web.UI.Page.ProcessRequestWithNoAssert(System.Web.HttpContext)
0252f590 6614d6c2 System.Web.UI.Page.ProcessRequest(System.Web.HttpContext)
0252f5a4 0fe90205 ASP.companyinformation_aspx.ProcessRequest(System.Web.HttpContext)
0252f5a8 65fe6bfb System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
0252f5dc 65fe3f51 System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
0252f61c 65fe7733 System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception)
0252f66c 65fccbfe System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)
0252f688 65fd19c5 System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
0252f6bc 65fd16b2 System.Web.HttpRuntime.ProcessRequestNoDemand(System.Web.HttpWorkerRequest)
0252f6c8 65fcfa6d System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
0252f8d8 79f047fd [ContextTransitionFrame: 0252f8d8]
0252f90c 79f047fd [GCFrame: 0252f90c]
0252fa68 79f047fd [ComMethodFrame: 0252fa68]
Q: Can you tell what the issue is from here?
A: From !clrstack it appears that we are processing companyinformation.aspx and in the page_load method it calls into BuggyMail.SendEmail. From there we go into some type of recursion between ExceptionHandler.LogException(System.Exception) and Utility.WriteToLog(System.String, System.String), and eventually we run out of stackspace and throw the stackoverflow exception that kills the process... To resolve the issue we need to figure out a way to stop the recursive loop between LogException and WriteToLog. You may be tempted to believe that it is the System.IO.__Error.WinIOError or the GetResourceFromDefault methods that cause the stackoverflow but in reality those methods are just the straws that broke the camels back... They are called in each
Verify your assumption in the code and modify the code to resolve the issue
1. Open ExceptionHandler.cs and Utility.cs and device a plan for how to fix the problem
public static void LogException(Exception ex)
{
Utility.WriteToLog(ex.Message, "c:\\log.txt");
}
public static void WriteToLog(string message, string fileName)
{
try
{
using (StreamWriter sw = new StreamWriter(fileName))
{
//Log the event with date and time.
sw.WriteLine("--------------------------");
sw.WriteLine(DateTime.Now.ToLongTimeString());
sw.WriteLine("-------------------");
sw.WriteLine(message);
}
}
catch (Exception ex)
{
ExceptionHandler.LogException(ex);
}
}
A: From this and the stack we can backtrack what happened... We got an exception in BuggyMail.SendMail and asked to log the Exception which calls WriteToLog, unfortunately we also got an exception in WriteToLog when actually writing to the log (for those of you interested, you can find these exceptions with !dso and the reason we get an exception in WriteToLog is because we (the w3wp.exe process) don't have access to write to c:\log.txt. The solution shold be to have some other exceptionhandling mechanism in WriteToLog.
Have fun,
Tess