
Whitepaper on the Debug Diagnostic Tool v1.1

I have recently published a whitepaper on the Debug Diagnostic Tool v1.1 (Debugdiag) at the following location in the download center.

The whitepaper gives a detailed description of the tool. It explains the main components of the tool and their usage. It also provides walkthroughs on how to configure the tool to collect the right data and how to analyze it. Debug Diagnostic Tool v1.1 is currently available from this Microsoft web site.

Posted by Mouradl | 3 Comments

Post-mortem Analysis of Userdumps with Debugdiag

This post is an extract of an MSDN whitepaper on debugdiag that will be released soon!

 

Post-mortem analysis with the Windows core debuggers (windbg.exe or cdb.exe) is a time-consuming process and requires a lot of debugging skills.

Automated post-mortem analysis of userdumps is one of the main goals of debugdiag. It is delivered via the analysis module of the tool, which aims to give accurate results by:

· Separating the raw data extraction from analysis algorithms.

· Providing a script-based solution for building analysis algorithms, thus reducing the debugging skills necessary for implementing such analysis scripts.

· Providing an extensible object model solution to meet the demands of future unidentified requirements.

· Providing a built-in HTML-based report generation and formatting solution similar to ASP pages.

Debugdiag ships with 2 main analysis scripts: CrashHangAnalysis.asp and MemoryAnalysis.asp. The former is for analyzing crash and hang userdumps, while the latter is for memory and handle leak analysis.

Note: Debugdiag 1.0 was shipped with "IISAnalysis.asp" and "MemoryAnalysis.asp" instead.

Before you start the analysis, make sure that the debug symbols are set up properly. By default, Debugdiag uses the Microsoft public symbol server, which requires an Internet connection. To add or modify the symbol path, go to Tools --> Options and Settings... --> Symbol Search Path for Analysis.

To analyze a userdump, open the debugdiag UI and click the Advanced Analysis view. The following window is displayed:

 

image

Add the userdumps to be analyzed by clicking Add Data Files.

 

image

You can select multiple userdumps for analysis. Once the selection is made, click Open.

image

Choose the analysis category (Crash/Hang or Memory Pressure). You can also choose both categories to run against the userdump(s).

Once the selection is made, click Start Analysis. Debugdiag will show the analysis progress as follows.

 

image

 

Another way to start the analysis is from the rules view when the rule has already generated dumps. Right-click the rule and select Analyze data.

You can also start the analysis from Windows Explorer: right-click the userdump file and choose the type of analysis required.

Once the analysis is complete, debugdiag automatically saves the analysis report in the DebugDiag\Reports folder and opens it in Internet Explorer.

Every Analysis report is composed of 3 main sections:

Analysis Summary

The analysis summary is an event viewer type of message that records errors, warnings and information relevant to the userdump analysis along with their descriptions and recommendations to resolve the problem they show.

Analysis Details

The Analysis Details section starts with a table of contents which lists all the memory dumps that are analyzed. For each memory dump, there is a listing of report titles indicating the type of analysis that was performed.

Script Summary

This section reports the status of the script that was run to analyze the userdump. If any errors were encountered while running the script, this section shows the error code, source, description, and the line(s) that caused the error(s).

Posted by Mouradl | 0 Comments

Debugging Heap corruption with Application Verifier and Debugdiag

When user code does not handle dynamic allocation and deallocation of memory properly, memory blocks in the heap can become corrupted. There are many causes of heap corruption. Some of the common ones are buffer overrun (writing beyond the allocated memory), double free (freeing a pointer twice) and old pointer reuse (using a pointer after it has been freed). Heap corruption is difficult to troubleshoot because when a thread corrupts the heap, the process does not terminate or throw an error. As long as the corrupted block is not used, the process will not crash, but once a thread tries to use that corrupted block of memory in the heap, the process crashes. If a crash rule is active and the process crashes because of heap corruption, the thread that appears to be the “culprit” behind the crash is actually nothing more than a victim thread!
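To make the "victim versus culprit" point concrete, here is a minimal, portable sketch of why corruption tends to be noticed late. It is not how the Windows heap works internally; it only imitates the idea with a guard pattern placed after the usable bytes. The names guarded_alloc, guard_intact, careless_fill and demo_delayed_detection, as well as the guard size and fill byte, are made up for this illustration. The overrunning writes deliberately stay inside the real allocation so the sketch itself is well defined.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE 16    /* hypothetical guard size, chosen for illustration */
#define GUARD_BYTE 0xAB  /* hypothetical fill pattern */

/* Allocate `size` usable bytes followed by GUARD_SIZE pattern bytes. */
static char *guarded_alloc(size_t size)
{
    char *p = malloc(size + GUARD_SIZE);
    if (p != NULL)
        memset(p + size, GUARD_BYTE, GUARD_SIZE);
    return p;
}

/* Return 1 if the guard bytes after the usable area are intact, 0 otherwise. */
static int guard_intact(const char *p, size_t size)
{
    const unsigned char *g = (const unsigned char *)p + size;
    for (size_t i = 0; i < GUARD_SIZE; ++i)
        if (g[i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* The "culprit": writes `count` bytes without checking the capacity. */
static void careless_fill(char *p, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        p[i] = 'a';
}

/* Return 1 if an overrun is found at check time, 0 if the block is clean. */
static int demo_delayed_detection(size_t size, size_t count)
{
    char *p = guarded_alloc(size);
    assert(p != NULL);
    assert(count <= size + GUARD_SIZE);     /* stay inside the real allocation */
    careless_fill(p, count);                /* nothing fails here, even on overrun */
    int corrupted = !guard_intact(p, size); /* damage only shows up at this later check */
    free(p);
    return corrupted;
}
```

Calling demo_delayed_detection(16, 32) reports the corruption only at the later check, long after careless_fill has returned, which is the same reason a crash dump taken at the point of failure usually shows the victim rather than the culprit.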

So to get to the root of the problem and find the cause of the corruption, that is, the thread that corrupted the heap, Pageheap should be enabled. Pageheap can be enabled directly in debugdiag via a crash rule and would provide the desired results, but if you want more granular information about the corruption to simplify the code fix, Application Verifier can be used in conjunction with debugdiag.

Here is how to turn on pageheap for the worker process w3wp.exe and attach the debugdiag debugger host to it:

First, download and install both tools:

- Download and Install Debugdiag 1.1

- Download and Install Application Verifier v3.4

- Start Application Verifier (Start --> Programs --> Application Verifier --> Application Verifier).

- Click File --> Add Application and browse to C:\Windows\System32\Inetsrv\w3wp.exe

- In the Tests panel, expand Basics and uncheck everything except Heaps

AppVerif-w3wp 

- In the Tests panel again, select Heaps and click Edit --> Verifier Stop Options

VerifierStops

This dialog shows the stop codes that Application Verifier generates. The default actions apply to all stop codes. The most important action here is "Breakpoint" in the Error Reporting section, which means that Application Verifier will raise a breakpoint exception when it detects that the heap is being corrupted.

- Start Debugdiag (Start --> Programs --> Debug Diagnostic Tool 1.1 --> Debugdiag 1.1 (x86))

- Add a crash rule against a specific process.

- Type in "w3wp.exe" in the "Select Target" window and make sure the "This process instance only" check box is unchecked!

- In the "Advanced Configuration (Optional)" window, click Exceptions... and add the 80000003 (breakpoint) exception with an action type of Full Userdump.

- Finish the wizard and Activate the rule. 

- Restart IIS so that the new w3wp.exe loads both the pageheap layer and the Application Verifier DLLs.

Note: Since pageheap is enabled per executable image rather than per process instance, every instance of w3wp.exe running on the system will have pageheap on along with Application Verifier. Pageheap also has a performance impact: the extra heap verification slows processing down.

So basically, with the above configuration, Application Verifier raises a breakpoint exception when it detects that a heap operation is corrupting the heap. When the breakpoint exception is raised, debugdiag generates a full userdump. Post-mortem analysis of the userdump gives details about the corruption such as the call stack, the type of corruption, the heap address being corrupted, etc.
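To give a feel for how an overrun can be caught at the moment it happens, here is a rough sketch of the guard-page idea that full pageheap is built around: the block is placed so that it ends right where an inaccessible page begins, so the first byte written past the end faults on the culprit's own call stack. This is a simplified illustration, not the actual pageheap implementation. The guarded_block type and the functions page_size, guarded_block_alloc and guarded_block_free are hypothetical names, and the non-Windows branch (mmap/mprotect) is there only so the sketch can be tried outside Windows.

```c
#ifndef _WIN32
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#endif

#include <assert.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#include <unistd.h>
#endif

/* Hypothetical bookkeeping type, for illustration only. */
typedef struct {
    char  *base;  /* start of the two-page region */
    char  *user;  /* pointer handed to the caller */
    char  *guard; /* first byte of the inaccessible page */
    size_t page;  /* system page size */
} guarded_block;

static size_t page_size(void)
{
#ifdef _WIN32
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    return (size_t)si.dwPageSize;
#else
    return (size_t)sysconf(_SC_PAGESIZE);
#endif
}

/* Place `size` bytes (at most one page) so that they end exactly where a
   no-access page begins. Returns 0 on success, -1 on failure. */
static int guarded_block_alloc(guarded_block *b, size_t size)
{
    assert(b != NULL);
    size_t page = page_size();
    if (size == 0 || size > page)
        return -1;
#ifdef _WIN32
    DWORD old;
    char *base = VirtualAlloc(NULL, 2 * page, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (base == NULL)
        return -1;
    if (!VirtualProtect(base + page, page, PAGE_NOACCESS, &old)) {
        VirtualFree(base, 0, MEM_RELEASE);
        return -1;
    }
#else
    char *base = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;
    if (mprotect(base + page, page, PROT_NONE) != 0) {
        munmap(base, 2 * page);
        return -1;
    }
#endif
    b->base  = base;
    b->guard = base + page;
    b->user  = b->guard - size; /* the block ends right at the guard page */
    b->page  = page;
    return 0;
}

/* Release the whole two-page region. */
static void guarded_block_free(guarded_block *b)
{
    assert(b != NULL);
#ifdef _WIN32
    VirtualFree(b->base, 0, MEM_RELEASE);
#else
    munmap(b->base, 2 * b->page);
#endif
}
```

With a block obtained this way, writing to user[size] touches the guard page and raises an access violation immediately. The price is at least two pages per allocation, which gives an idea of why pageheap slows a process down and increases its memory usage.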

Here is a simple example of how Application Verifier raises the breakpoint exception after detecting a buffer overrun.

  

0:009> kb
ChildEBP RetAddr  Args to Child             
0685f71c 004c3933 139f8126 02206ff8 02206ff0 ntdll!DbgBreakPoint
0685f920 004c7487 004cb5d8 00000013 0a501000 vrfcore!VerifierStopMessageEx+0x4bd
0685f944 009030f9 00000013 008f33a8 0a501000 vrfcore!VfCoreRedirectedStopMessage+0x81
0685f974 008f97aa 00000013 008f33a8 0a501000 vfbasics!VfBasicsStopMessage+0x1c9
0685f9d8 008f8ed8 0685fa00 0685fa00 0685fa10 vfbasics!AVrfpCheckFirstChanceException+0x13a
0685f9e8 7c84f937 0685fa00 0685faac 0685faac vfbasics!AVrfpVectoredExceptionHandler+0x18
0685fa10 7c813fb5 00000000 02206ff0 7c888f68 ntdll!RtlpCallVectoredHandlers+0x57
0685fa24 7c814055 0685faac 0685fac8 77bd8930 ntdll!RtlCallVectoredExceptionHandlers+0x15
0685fa94 7c82ecc6 0685faac 0685fac8 0685faac ntdll!RtlDispatchException+0x19
0685fa94 09531614 0685faac 0685fac8 0685faac ntdll!KiUserExceptionDispatcher+0xe
0685fda4 095313ef 0686de18 00000001 0686de18 badEXT!doHC1+0x24
0685fdc4 5a322991 0686de18 0686cb60 0686d7a8 badEXT!HttpExtensionProc+0x108
0685fde4 5a3968ff 0686dd90 095312e7 0685fe10 w3isapi!ProcessIsapiRequest+0x214
0685fe18 5a3967e0 00000000 00000000 0686cb60 w3core!W3_ISAPI_HANDLER::IsapiDoWork+0x3fd
0685fe38 5a396764 0685fea8 0686cb60 00000000 w3core!W3_ISAPI_HANDLER::DoWork+0xb0
0685fe58 5a3966f4 0686cb60 00000000 0685fe84 w3core!W3_HANDLER::MainDoWork+0x16e
0685fe68 5a3966ae 0686cb68 0686cb60 00000001 w3core!W3_CONTEXT::ExecuteCurrentHandler+0x53
0685fe84 5a396648 00000001 0685fea8 07e84ff8 w3core!W3_CONTEXT::ExecuteHandler+0x51
0685feac 5a392264 00000000 00000000 00000000 w3core!W3_STATE_HANDLE_REQUEST::DoWork+0x9a
0685fed0 5a3965ea 00000000 00000000 00000000 w3core!W3_MAIN_CONTEXT::DoWork+0xa6

....

The code that Application Verifier is reporting here is the verifier stop code 00000013 (first-chance access violation for the current stack trace), which in this case is caused by a buffer overrun.

The code that I used to test this is:

{
    char *ptr, *tmp;
    int i;

    /* Allocate a fixed 16-byte block. */
    ptr = (char*)GlobalAlloc(GMEM_FIXED, 16);
    tmp = ptr;

    /* Deliberate bug: writes 32 bytes into the 16-byte block (buffer overrun). */
    for (i = 0; i < 32; ++i)
        *(tmp++) = 'a';

    GlobalFree(ptr);
}
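For comparison, here is one possible way to fix this kind of bug: make the writer aware of the block's capacity so it can never write past the end. This is only a sketch under a few assumptions. It uses malloc/free instead of GlobalAlloc/GlobalFree so that it compiles anywhere, and fill_bounded and demo_fixed are made-up names that are not part of the test extension above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Write at most `capacity` bytes into buf, even if more are requested.
   Returns the number of bytes actually written. */
static size_t fill_bounded(char *buf, size_t capacity, size_t requested)
{
    assert(buf != NULL);
    size_t n = requested < capacity ? requested : capacity;
    for (size_t i = 0; i < n; ++i)
        buf[i] = 'a';
    return n;
}

/* Same scenario as the test code: a 16-byte block and a request for 32 bytes.
   Returns the number of bytes written, or 0 if the allocation failed. */
static size_t demo_fixed(void)
{
    const size_t capacity = 16;
    char *ptr = malloc(capacity);
    if (ptr == NULL)
        return 0;
    size_t written = fill_bounded(ptr, capacity, 32); /* clamped to 16 */
    free(ptr);
    return written;
}
```

Passing the capacity along with the pointer keeps the bounds check next to the write, so the request for 32 bytes is clamped to the 16 bytes that were allocated.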
 