September, 2007

  • Never doubt thy debugger

    Remember to undo your impersonation

    A couple of weeks ago I got an interesting query from a customer who had a problem impersonating a service user account in code; the design was a bit more complicated, though:

    • Impersonation not set in web.config
    • By code they needed to impersonate the account logged on the client issuing the HTTP request (this worked fine)
    • By code they needed to impersonate a service account they used to access a backend database (here they were getting an access denied error)
    • Switch back to the previous user, the one logged on the client (again this was working fine)

    This was quite clearly an impersonation problem, and after some debugging we found the "Access Denied" was being thrown when executing the line highlighted in red in the following snippet, way before even trying to access the network to read the backend database:

     If CType(LogonUser(username, domain, password, LOGON32_LOGON_INTERACTIVE, LOGON32_PROVIDER_DEFAULT, token), Boolean) Then
        If DuplicateToken(token, 2, tokenDuplicate) Then
           Dim identity As New WindowsIdentity(tokenDuplicate)
           If System.Web.HttpContext.Current Is Nothing Then
              Dim mImpersonatedContext As WindowsImpersonationContext = identity.Impersonate
           Else
              System.Web.HttpContext.Current.Items("ImpersonationContext") = identity.Impersonate
           End If
        End If
     [...]

    In the screenshot below you can see the "Access Denied" message when trying to display the WindowsIdentity.Name property

    Autos: access is denied

    Since I was able to repro on my machine, I attached WinDbg to the worker process and set a breakpoint on advapi32!ImpersonateLoggedOnUser; after a look at the stack and the managed exceptions, the problem was quite clear. !gle shows the last error for the current thread:

       0:017> !gle
       LastErrorValue: (Win32) 0x5 (5) - Access is denied.
       LastStatusValue: (NTSTATUS) 0xc0000022 - {Access Denied}  A process has requested access to an object,
           but has not been granted those access rights.

    Also confirmed by the managed exceptions:

       Exception object: 021314ec
       Exception type: System.Web.HttpException
       Message: An error occurred while attempting to impersonate.  Execution of this request cannot continue.
       InnerException: <none>
       StackTrace (generated):
           SP       IP       Function
           01ECF4F0 044DCB73 System.Web.ImpersonationContext.GetCurrentToken()
           01ECF534 041E3BE9 System.Web.ImpersonationContext.get_CurrentThreadTokenExists()
           01ECF564 0417FB4E System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
           01ECF61C 041922CC System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception)
           01ECF66C 0417EEA6 System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)
           01ECF688 04183DB5 System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)

       Exception object: 021312a8
       Exception type: System.Security.SecurityException
       Message: Access is denied.
       InnerException: <none>
       StackTrace (generated):
           SP       IP       Function
           01ECF24C 79636928 System.Security.Principal.WindowsIdentity.GetCurrentInternal(System.Security.Principal.TokenAccessLevels, Boolean)
           01ECF26C 79389652 System.Security.Principal.WindowsIdentity.GetCurrent()
           01ECF278 06350A92 WebApplication1._Default.Page_Load(System.Object, System.EventArgs)
           01ECF318 04301954 System.Web.UI.Control.OnLoad(System.EventArgs)
           01ECF328 043019A0 System.Web.UI.Control.LoadRecursive()
           01ECF33C 043147C4 System.Web.UI.Page.ProcessRequestMain(Boolean, Boolean)
           01ECF50C 04312982 System.Web.UI.Page.ProcessRequest(Boolean, Boolean)
           01ECF544 0431285F System.Web.UI.Page.ProcessRequest()
           01ECF57C 0431277F System.Web.UI.Page.ProcessRequestWithNoAssert(System.Web.HttpContext)
           01ECF584 04312712 System.Web.UI.Page.ProcessRequest(System.Web.HttpContext)
           01ECF598 063502F6 ASP.default_aspx.ProcessRequest(System.Web.HttpContext)
           01ECF5A4 041BA93F System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
           01ECF5DC 0417FAD1 System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)

    As a test we added the Job Manager account to the local Administrators group and the problem went away, so this was clearly a lack of permissions for that account; specifically, the Job Manager user was not allowed to "take ownership" of the current thread, which was already impersonating the client account that issued the HTTP request. Since we were using a service account, we checked the permissions required in the article Process and request identity in ASP.NET, in particular the ASPNET account-specific permissions configurable through the Group Policy snap-in (gpedit.msc); we added the required permission, but the problem was still there.

    We then found a rather old (but still relevant) KB article which seemed to apply to this problem: LogonUser fails in ISAPI extensions. Here's the interesting part:

    CAUSE

    The code inside LogonUser tries to open the process token. It fails since the authenticated user may not have access to the process token (SYSTEM if it's an inproc ISAPI.)

    RESOLUTION

    As a temporary workaround, you can call RevertToSelf to return the thread to the security context of the process token before calling LogonUser.

    STATUS

    This behavior is by design.

    So, going back in the stack where the failing call started, here is what we find:

       Dim test As New customer
       Dim col As List(Of CustomerDetails)
       Dim p As WindowsPrincipal = HttpContext.Current.User
       Dim id As WindowsIdentity = p.Identity

       'impersonate the Windows Authenticated User
       Dim wic As WindowsImpersonationContext = id.Impersonate()
       col = test.getcustomers() ' Works okay to access database

       'Error Switching below, step into code
       CBPUser.SwitchToIOUser()   ' Access Denied, step through code to see issue arise.
       col = test.getcustomers()  ' Fails to access database

       CBPUser.SwitchFromIOUser() ' Switch back
       col = test.getcustomers()  ' Works okay to access database again

    I then had a look at the MSDN docs about impersonation, especially to find some sample code, for example How To: Using impersonation and delegation in ASP.NET 2.0 and How to implement impersonation in an ASP.NET application. Interestingly enough, all the samples in those articles revert the impersonation by calling the WindowsImpersonationContext.Undo() method (which under the covers ultimately calls RevertToSelf(), as you can guess)...

    Since testing the code in practice is easier and quicker, I added the Undo() call and ran it again:

       Dim test As New customer
       Dim col As List(Of CustomerDetails)
       Dim p As WindowsPrincipal = HttpContext.Current.User
       Dim id As WindowsIdentity = p.Identity

       ' impersonate the Windows Authenticated User
       Dim wic As WindowsImpersonationContext = id.Impersonate()
       col = test.getcustomers() ' Works okay to access database
       wic.Undo()

       ' Error Switching below, step into code
       CBPUser.SwitchToIOUser()   'Works fine now, no more access denied!
       col = test.getcustomers()  'Works ok to access database

       CBPUser.SwitchFromIOUser() ' Switch back
       col = test.getcustomers()  ' Works okay to access database again

    Much better! Just to be sure, SQL Profiler showed a connection with the service account, which was our final goal. So the final message is: remember to always call Undo() when you're done with your code impersonation.

    Case closed, Sherlock!

     

    Carlo


    Quote of the Day:
    Resentment is like taking poison and hoping the other person dies.
    --St. Augustine
  • Never doubt thy debugger

    My goodness, where has my Properties window gone?!?

    I had a funny half-hour this afternoon, when one of my colleagues took a new call from a customer who had some trouble with the Properties window in his Visual Studio 2005... The customer reported weird behavior in the Visual Studio IDE, in particular with the Properties window, which was not available despite his attempts to display it both by pressing the F4 keyboard shortcut and from the View > Properties Window menu command. Moreover, this had the effect of removing the focus from the Visual Studio IDE while apparently nothing else was getting it, and there was a weird Format menu appearing and disappearing...

    It was kind of fun to see Stefano on the phone with the customer, listening to his description with his face getting more and more puzzled... Until, after some quick research in our internal docs, he found a reference to a similar problem, and after a quick test he triumphantly called the customer back with the solution after just 20 minutes!

    As I guess you know, the Properties window (like other windows in Visual Studio) can be detached from the IDE and left floating around the screen; the point here was that the customer had somehow moved the window outside the desktop and it was not visible anymore... How to get it back?

    • Press F4 (or the corresponding keyboard shortcut) to move the focus to the Visual Studio window you're hunting
    • Press ALT+- (ALT and minus key) to open the small window menu
    • Press M
    • Press ENTER
    • Now the window will be "attached" to the mouse pointer so you can move it around, but most importantly we can see it now! Dock it to the IDE if you wish

    properties

    Carlo


    Quote of the Day:
    I like pigs. Dogs look up to us. Cats look down on us. Pigs treat us as equals.
    --Winston Churchill
  • Never doubt thy debugger

    Why should we care about symbols?

    I already touched on this topic a while ago, but since it's an important part of the debugging process (and your debugging techniques may vary a lot, depending on whether or not you have good symbols for your dump) I thought it would be a good idea to give some more details. And just to jump-start the topic, here's something I learnt a while ago after wasting a few hours typing commands and looking at inconsistent results... debugging with the wrong symbols can be much worse than debugging with no symbols at all.

    What are symbols?

    You can think of symbol files basically as small databases: files which contain source line information, data types, variables, functions and everything else needed to provide names for all lines of code, instead of hexadecimal addresses. They usually have a .pdb (or .dbg) extension, and are matched with the actual executable code using an internal timestamp, so it's very important to generate symbols every time you build the application, including release builds. If you ever have a problem with your live application and need to debug it, and you don't have the matching symbols (matching means the symbols obtained from the exact same build of the dlls you put in production), you could be in trouble... Rebuilding the application to create the symbol files, even without changing your code, does not really help because the timestamp will not match, and WinDbg will complain about missing symbols anyway...

    generate debugging information

    Note that in Visual Studio if you select "Release" in the "Standard" toolbar, the "Generate debugging information" checkbox will automatically be cleared, so remember to go to the Project properties and check it again before rebuilding the project.

    Since in ASP.NET 2.0 we have a new default page architecture, there is no real need to create the .pdb files unless you're using the "Web Application Project" template which comes with the Service Pack 1 for Visual Studio 2005, in which case you can find it from the project properties, "Compile" tab.

    advanced compiler settings

    The same applies for Visual Studio 2008.

    How can we use symbols?

    When you open a dump within WinDbg and you type in a command, for example to inspect the call stack, the debugger will start looking for matching symbols to show you output that is as detailed as possible: but how does it decide where to look for those symbols? It will use the path(s) specified in the "Symbol Search Path" dialog you can find under the File menu (CTRL+S is the keyboard shortcut).

    symbol search path dialog

    Here you can specify the symbol servers WinDbg can download the symbols from, and of course you can use more than one server at a time; WinDbg will simply access those servers in the order you put them in the search path, going through the list until it finds a match.

    Microsoft has a public symbol server accessible through the Internet which stores public .pdb files for almost all of our products: http://msdl.microsoft.com/download/symbols.

    I work a lot with WinDbg and memory dumps in my daily job, also on my laptop when not connected to the Internet or the corporate network, so I need to be able to debug while offline, and in any case I don't want to waste time waiting for the tool to download the same symbols over and over again from one dump to the next... For this reason it's also possible to create a local symbol cache (also referred to as a downstream store) on your hard disk. When looking for symbols, WinDbg first checks your local cache, and if a match is found there no additional check is done against the other symbol servers; if the match is not found, WinDbg goes on as usual until it finds the right match, then downloads the .pdb and stores it in your local symbol cache, so that next time it will be readily available.

    The symbol path is a string composed of multiple directory paths, separated by semicolons. For each directory in the symbol path, the debugger will look in three directories: for instance, if the symbol path includes the directory c:\MyDir, and the debugger is looking for symbol information for a dll, the debugger will first look in c:\MyDir\symbols\dll, then in c:\MyDir\dll, and finally in c:\MyDir. It will repeat this for each directory in the symbol path. Finally, it will look in the current directory, and then the current directory with \dll appended to it. (The debugger will append dll, exe, or sys, depending on what binaries it is debugging.)
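    To make the lookup order above a bit more concrete, here is a minimal Python sketch that just lists the directories in the order described. It is only an illustration, not what the debugger actually runs: the function name candidate_dirs and its parameters are made up for this example, and it assumes every entry in the path is a plain directory (no SRV* elements).

```python
import ntpath  # Windows path rules, regardless of the OS running the sketch


def candidate_dirs(symbol_path, binary_type, current_dir):
    """List the directories to probe, in the order described above.

    symbol_path -- semicolon-separated plain directories (assumption: no SRV* entries)
    binary_type -- "dll", "exe" or "sys", depending on the binary being debugged
    current_dir -- the debugger's current directory
    """
    dirs = []
    for entry in symbol_path.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty elements such as a trailing semicolon
        dirs.append(ntpath.join(entry, "symbols", binary_type))
        dirs.append(ntpath.join(entry, binary_type))
        dirs.append(entry)
    # Finally the current directory, then the current directory plus the type
    dirs.append(current_dir)
    dirs.append(ntpath.join(current_dir, binary_type))
    return dirs


for d in candidate_dirs(r"c:\MyDir", "dll", r"c:\work"):
    print(d)
```

    With c:\MyDir as the only entry, the first three directories listed are exactly the ones named in the paragraph above.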

    Here is a sample symbol path:

    SRV*c:\symbols*\\internalshare\symbols*http://msdl.microsoft.com/download/symbols

    The above is actually the symbol path (a bit simplified) I use on my machines: as you can see I have a local cache in C:\Symbols; if I've never downloaded a particular symbol before, WinDbg goes to an internal share where we have full symbols, and if that is still unsuccessful it finally tries the public Microsoft symbol server on the Internet.

    If you include two asterisks in a row where a downstream store would normally be specified, then the default downstream store is used. This store will be located in the sym subdirectory of the home directory. The home directory defaults to the debugger installation directory; this can be changed by using the !homedir extension. If the DownstreamStore parameter is omitted and no extra asterisk is included (i.e. if you use srv with exactly one asterisk or symsrv with exactly two asterisks) then no downstream store will be created and the debugger will load all symbol files directly from the server, without caching them locally. Note that if you are accessing symbols from an HTTP or HTTPS site, or if the symbol store uses compressed files, a downstream store is always used. If no downstream store is specified, one will be created in the sym subdirectory of the home directory.
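    If the asterisk syntax looks confusing, the following Python sketch may help to read it. It simply splits one SRV* element into the locations tried in order (caches first, server last) and replaces an empty slot with the default downstream store. The function name srv_lookup_order and the home_dir parameter are made up for this example, and the sketch deliberately ignores the HTTP/compressed-store rule mentioned above.

```python
import ntpath  # Windows path rules, regardless of the OS running the sketch


def srv_lookup_order(entry, home_dir):
    """Return the locations of one SRV* element in lookup order.

    entry    -- a single element such as 'SRV*c:\\symbols*http://server/symbols'
    home_dir -- the debugger home directory (hypothetical parameter; by default
                this is the debugger installation directory)
    """
    parts = entry.split("*")
    if parts[0].lower() != "srv" or len(parts) < 2 or not parts[-1]:
        raise ValueError("not a usable SRV* element: " + entry)
    *caches, server = parts[1:]
    # Two asterisks in a row leave an empty slot: use <home_dir>\sym instead
    default_store = ntpath.join(home_dir, "sym")
    caches = [cache or default_store for cache in caches]
    return caches + [server]


sample = r"SRV*c:\symbols*\\internalshare\symbols*http://msdl.microsoft.com/download/symbols"
for location in srv_lookup_order(sample, r"c:\debuggers"):
    print(location)
```

    For the sample path this lists the local cache, then the internal share, then the public server, which is the order described above.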

    The symbol server does not have to be the only entry in the symbol path. If the symbol path consists of multiple entries, the debugger checks each entry for the needed symbols; moreover the symbol path can contain several directories or symbol servers, separated by semicolons. This allows you to locate symbols from multiple locations (or even multiple symbol servers). If a binary has a mismatched symbol file, the debugger cannot locate it using the symbol server because it checks only for the exact parameters. However, the debugger may find a mismatched symbol file with the correct name, using the traditional symbol path, and successfully load it; in this case it's important to know whether our symbols match (see the next topic).

    You can set the symbol path in advance once and for all within WinDbg: open an empty instance, press CTRL+S, type in the path, click "OK" on the dialog and close WinDbg, accepting to save the workspace if prompted to do so (next time you open WinDbg the value will still be there). Or you can use the .sympath command within WinDbg with a dump open.

    Another option is to set the system-wide environment variable _NT_SYMBOL_PATH (the syntax is still the same), which is used by debuggers like WinDbg and also by adplus directly when it captures the dump.

    The same principle applies to the Visual Studio debugger (also have a look at the article http://support.microsoft.com/kb/311503/en-us):

    visual studio symbols options

    How can I check if my symbols match?

    Looking at a call stack sometimes it's clear you're having a problem with unmatched symbols because WinDbg tells you something like:

     ChildEBP RetAddr 
     0012f6dc 7c59a2d1 NTDLL!NtDelayExecution(void)+0xb
     0012f6fc 7c59a29c KERNEL32!SleepEx(unsigned long dwMilliseconds = 0xfa, int bAlertable = 0)+0x32
     *** ERROR: Symbol file could not be found. Defaulted to export symbols for aspnet_wp.exe - 
     0012f708 00442f5f KERNEL32!Sleep(unsigned long dwMilliseconds = 0x444220)+0xb
     WARNING: Stack unwind information not available. Following frames may be wrong.
     0012ff60 00444220 aspnet_wp+0x2f5f
     0012ffc0 7c5989a5 aspnet_wp!PMGetStartTimeStamp+0x676
     0012fff0 00000000 KERNEL32!BaseProcessStart(<function> * lpStartAddress = 0x004440dd)+0x3d

    Unfortunately you may not always be so lucky, and you'll find yourself wondering whether the stack you are looking at is genuine or whether there are some small (or maybe not so small) inconsistencies which may lead you down a completely wrong path. In such cases, you can first of all use the lm command to find which .pdb files have been loaded:

     kernel32 (pdb symbols) .sympath SRV\kernel32.pdb\CE65FAF896A046629C9EC86F626344302\kernel32.pdb
     ntdll (pdb symbols) .sympath SRV\ntdll.pdb\36515FB5D04345E491F672FA2E2878C02\ntdll.pdb
     shell32 (deferred)
     user32 (deferred)

    As you can see in the example above, two symbol files were loaded (for kernel32.dll and ntdll.dll), while shell32.dll and user32.dll were not part of the stack analyzed, so WinDbg has not loaded their symbols yet (deferred). A bad match will look like the following:

     ntdll M (pdb symbols) .sympath SRV\ntdll.pdb\36515FB5D04345E491F672FA2E2878C02\ntdll.pdb

    Notice the "M" right after the module name (it could also be a "#" pound sign)? That stands for "mismatch", and indicates there is a problem with that particular module (search for "Symbol Status Abbreviations" in WinDbg help for further details). Alternatively you can use the !sym noisy and .reload commands to reload symbols verbosely and get a detailed output. Look for "Symbols files and paths - Overview" in WinDbg help for more details.

    Trick: you have the right symbol, but WinDbg does not match it anyway...

    I'm not sure why this happens, and especially why I had this problem only with ntdll.dll (and its .pdb): I was not able to get the proper stack even with a matching symbol (and I checked more than once to be really sure)... until I got the idea to delete the ntdll.pdb folder in my local cache (if you have a dump open you must first unload the symbol from WinDbg or the file will be locked: use the .reload /u <module_name> command), then run a .reload /f <module_name> (/f forces immediate symbol load) and let WinDbg download it again... this usually does the trick and I finally get the correct stack.

    Debugging without symbols?

    It's not impossible, but it's harder than debugging with matching symbols; the main difference is that you'll not be able to see method names, variable names etc... and generally speaking the stack will be harder to read. To give you a quick example, here is an excerpt of the stack of a very simple application I wrote as a test (it has just a button which sets the text of a label to DateTime.Now.ToString()):

    without symbols:

     5 Id: 10d4.1204 Suspend: 1 Teb: 7ffd7000 Unfrozen
     ChildEBP RetAddr 
     WARNING: Frame IP not in any known module. Following frames may be wrong.
     9f524 71a6b7f8 0x7c90eb94
     9fa0c 03490657 0x71a6b7f8
     WARNING: Unable to verify checksum for System.dll
     ERROR: Module load completed but symbols could not be loaded for System.dll
     9fa40 7a603543 CLRStub[StubLinkStub]@3490657(<Win32 error 318>)
     a8240 032908ff System!System.Net.Sockets.Socket.Accept(<HRESULT 0x80004001>)+0xc7
     ERROR: Module load completed but symbols could not be loaded for WebDev.WebHost.dll
     9fab0 7940a67a WebDev_WebHost!Microsoft.VisualStudio.WebHost.Server.OnStart(<HRESULT 0x80004001>)+0x27
     WARNING: Unable to verify checksum for mscorlib.dll
     ERROR: Module load completed but symbols could not be loaded for mscorlib.dll
     bd1b4 7937d2bd mscorlib!System.Threading._ThreadPoolWaitCallback.WaitCallback_Context(<HRESULT 0x80004001>)+0x1a
     bd1b4 7940a7d8 mscorlib!System.Threading.ExecutionContext.Run(<HRESULT 0x80004001>)+0x81
     9fae0 7940a75c mscorlib!System.Threading._ThreadPoolWaitCallback.PerformWaitCallbackInternal(<HRESULT 0x80004001>)+0x44
     32010 79e79dd3 mscorlib!System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(<HRESULT 0x80004001>)+0x60
     9fb04 79e79d57 0x79e79dd3
     9fb84 79f71cba 0x79e79d57
     9fba4 79f71c64 0x79f71cba
     9fc08 79f71cf3 0x79f71c64
     9fc3c 7a0b0896 0x79f71cf3
     9fc9c 79f7ba4f 0x7a0b0896
     9fcb0 79f7b9eb 0x79f7ba4f
     9fd44 79f7b90c 0x79f7b9eb
     9fd80 79ef9887 0x79f7b90c
     9fda8 79ef985e 0x79ef9887
     9fdc0 7a0a32da 0x79ef985e
     9fe28 79ef938f 0x7a0a32da
     9fe94 79f7be67 0x79ef938f
     9ffb4 7c80b683 0x79f7be67
     9ffec 00000000 0x7c80b683

    with matching symbols:

     5 Id: 10d4.1204 Suspend: 1 Teb: 7ffd7000 Unfrozen
     ChildEBP RetAddr 
     9f4e4 7c90e9c0 ntdll!KiFastSystemCallRet
     9f4e8 71a54033 ntdll!ZwWaitForSingleObject+0xc
     9f524 71a6b7f8 mswsock!SockWaitForSingleObject+0x1a0
     9f9bc 71ac0e2e mswsock!WSPAccept+0x21f
     9f9f0 71ac103f ws2_32!WSAAccept+0x85
     9fa0c 03490657 ws2_32!accept+0x17
     WARNING: Unable to verify checksum for System.ni.dll
     9fa40 7a603543 CLRStub[StubLinkStub]@3490657(<Win32 error 318>)
     a8240 032908ff System_ni!System.Net.Sockets.Socket.Accept(<HRESULT 0x80004001>)+0xc7
     9fab0 7940a67a WebDev_WebHost!Microsoft.VisualStudio.WebHost.Server.OnStart(<HRESULT 0x80004001>)+0x27
     WARNING: Unable to verify checksum for mscorlib.ni.dll
     bd1b4 7937d2bd mscorlib_ni!System.Threading._ThreadPoolWaitCallback.WaitCallback_Context(<HRESULT 0x80004001>)+0x1a
     bd1b4 7940a7d8 mscorlib_ni!System.Threading.ExecutionContext.Run(<HRESULT 0x80004001>)+0x81
     9fae0 7940a75c mscorlib_ni!System.Threading._ThreadPoolWaitCallback.PerformWaitCallbackInternal(<HRESULT 0x80004001>)+0x44
     32010 79e79dd3 mscorlib_ni!System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(<HRESULT 0x80004001>)+0x60
     9fb04 79e79d57 mscorwks!CallDescrWorker+0x33
     9fb84 79f71cba mscorwks!CallDescrWorkerWithHandler+0xa3
     9fba4 79f71c64 mscorwks!DispatchCallBody+0x1e
     9fc08 79f71cf3 mscorwks!DispatchCallDebuggerWrapper+0x3d
     9fc3c 7a0b0896 mscorwks!DispatchCallNoEH+0x51
     9fc9c 79f7ba4f mscorwks!QueueUserWorkItemManagedCallback+0x6c
     9fcb0 79f7b9eb mscorwks!Thread::DoADCallBack+0x32a
     9fd44 79f7b90c mscorwks!Thread::ShouldChangeAbortToUnload+0xe3
     9fd80 79ef9887 mscorwks!Thread::ShouldChangeAbortToUnload+0x30a
     9fda8 79ef985e mscorwks!Thread::ShouldChangeAbortToUnload+0x33e
     9fdc0 7a0a32da mscorwks!ManagedThreadBase::ThreadPool+0x13
     9fe28 79ef938f mscorwks!ManagedPerAppDomainTPCount::DispatchWorkItem+0xdb
     9fe3c 79ef926b mscorwks!ThreadpoolMgr::ExecuteWorkRequest+0xaf
     9fe94 79f7be67 mscorwks!ThreadpoolMgr::WorkerThreadStart+0x223
     9ffb4 7c80b683 mscorwks!Thread::intermediateThreadProc+0x49
     9ffec 00000000 kernel32!BaseThreadStart+0x37

    The difference is quite obvious... The WinDbg help file also gives a few hints:

    1. To figure out what the addresses mean, you'll need a computer which matches the one with the error. It should have the same platform (x86, Intel Itanium, or x64) and be loaded with the same version of Windows
    2. When you have the computer configured, copy the user-mode symbols and the binaries you want to debug onto the new machine
    3. Start CDB or WinDbg on the symbol-less machine
    4. If you don't know which application failed on the symbol-less machine, issue an | (Process Status) command. If that doesn't give you a name, break into KD on the symbol-less machine and do a !process 0 0, looking for the process ID given by the CDB command
    5. When you have the two debuggers set up — one with symbols which hasn't hit the error, and one which has hit the error but is without symbols — issue a k (Display Stack Backtrace) command on the symbol-less machine
    6. On the machine with symbols, issue a u (Unassemble) command for each address given on the symbol-less stack. This will give you the stack trace for the error on the symbol-less machine
    7. By looking at a stack trace you can see the module and function names involved in the call

    I got symbols from my customer: what should I do now?

    The easiest thing you could do is use symstore.exe (you'll find it in WinDbg/adplus installation folder) with a command like the following:

    symstore add /f c:\temp\SymbolsTest\Bin\*.pdb /s c:\symbols /t "Symbols Test"

    Let's have a quick look at the syntax:

    • The "add" keyword is quite self-explanatory
    • /f tells symstore which file you want to add; note you can use wildcards, so you can add multiple files at once
    • /s is the path to your symbol store (this will most likely be your local cache, or your shared symbol server)
    • /t is a required description for the symbols to store

    Symstore will create a structure similar to the following:

    symstore

    Also note the "0000Admin" folder created by symstore, which contains one file for each transaction (every "add" or "delete" operation is recorded as a transaction), as well as the logs server.txt and history.txt; the former contains a list of all transactions currently on the server, while the latter contains a chronological history of all transactions run on the machine. For further information you can see the "Using SymStore" topic in WinDbg help (debugger.chm).
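    As a rough idea of the layout symstore produces, each file ends up under a folder named after the file, then a folder named after its identifier, the same shape you can see in the lm output earlier (kernel32.pdb\CE65FAF8...\kernel32.pdb). The small Python sketch below only builds that path as a string; the function name store_location is made up for this example and nothing here calls symstore itself.

```python
import ntpath  # Windows path rules, regardless of the OS running the sketch


def store_location(store_root, file_name, identifier):
    """Build <store_root>\\<file_name>\\<identifier>\\<file_name>.

    identifier is the folder name that tells builds apart, for example the
    long hexadecimal string shown next to each .pdb in the lm output.
    """
    return ntpath.join(store_root, file_name, identifier, file_name)


print(store_location(r"c:\symbols", "kernel32.pdb",
                     "CE65FAF896A046629C9EC86F626344302"))
```

    Because the identifier is part of the path, symbols from different builds of the same dll can sit side by side in the same store.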

    Conclusion

    While it is possible to debug without symbols (this is true especially for managed code), remember that your life will be much easier if you (and your customers) take care of your symbols and generate them every time the application is rebuilt, including in release mode.

    I deliberately simplified the topic (I just wanted to share what I learnt in my daily job and give some quick tips to get started); much more could be said, and if you're interested I encourage you to read the "Symbols" section in the WinDbg help, or search the Internet where you'll find good blog posts/articles on this subject.

    Carlo

  • Never doubt thy debugger

    SyncToy not working on Vista x64?

    I've been using SyncToy for quite a few months to keep some folders in sync between my laptop and the other two machines I have in the office, and it has always worked just great for me (I know, I should be using Groove instead, but I'm not happy to have services and programs running when they want, instead of when I tell them to run...).

    When I switched my primary desktop in the office to Vista x64, I very quickly discovered that SyncToy was crashing immediately after starting, with no error messages or clues about what was going wrong... I didn't have much time to spend debugging it to figure out what was going wrong (it's not a must-have tool for my work, after all...) so I simply used the laptop to synchronize folders between the two desktops, too...

    Until this morning, when I had a few minutes free and decided to get back to this problem and try to fix it once and for all (and write a blog post on it, too); anyway, before even opening WinDbg, a search on the Internet brought me to this blog: http://joshmouch.wordpress.com/2007/03/27/synctoy-14-and-vista-x64-error-fixed/. I tried it, and it works like a charm! Wonderful, thanks a lot Josh!

     

    Carlo


    Quote of the Day:
    The conventional view serves to protect us from the painful job of thinking.
    --John Kenneth Galbraith
  • Never doubt thy debugger

    Something you need to know before you start debugging

    It may appear to contradict my previous post, but the first thing to do before you start analyzing a memory dump is to ask yourself: do I really need a dump?!?

    Let me explain: when you need to troubleshoot an error there are a number of things to do before really going down the dump path, simply because not all problems can be resolved that way... Keep in mind that a dump is nothing more than a snapshot of a process at a certain point in time, exactly like a picture you take with your digital camera: we can try to understand what happened in the past, but with some limitations (only as long as the details we are looking for are still in memory), and of course we can't know what happened to the process after the dump was taken. For example, if you are having problems remotely debugging your application, I hardly think a dump can add any value to your troubleshooting... while in case of a memory leak a dump is one of the first things I ask the customer to provide (but again there are some things to do before this step).

    What's first, then? Well, as you can guess, the first step is to understand the problem and know the scenario where it reproduces. If you are troubleshooting your own application you should already know most of the details, but if you are a consultant helping one of your customers with a weird exception thrown once in a while, you must have an open and ongoing discussion with the people who developed the application, and maybe with the IT pros who maintain the application and the environment day by day.

    Let's assume we have the information we need to start, and we have decided we need to capture a dump. But which kind? How? When? Moreover, are you sure you and your customer are speaking the same language and using the same terms to name things? I say this because I learnt this lesson the hard way... the customer was describing a crash in his application so we configured adplus to run in crash mode, but for some reason we were unable to get a dump when the crash reproduced; we kept trying, but after 3-4 runs we gave up. Finally it turned out that the crash the customer was reporting was "just" an exception not handled in a try...catch block and shown to the end user (do you know the yellow/orange ASP.NET error page?) while the worker process was still happily serving requests for other users...

    Some terminology

    So, here is some basic terminology: if your customer is not an expert in this area, make sure they understand these terms and stick to them to avoid confusion. This still applies if you ever need to raise a support call with Microsoft CSS: this is the terminology you can expect to be used.

    • crash: this refers to a process which for some reason (usually an unhandled exception) is terminated by the operating system. How to be sure? Check Task Manager when the problem occurs: if the process gets a new PID, it has been recycled. And check your event log: usually you'll have a message like "process xyz terminated unexpectedly"
    • hang: the application reaches a state where it's unable to continue serving incoming requests (and maybe the users are getting a "server too busy" error) but the process does not crash. In such a situation the target process could simply sit there in memory doing nothing, and you have to restart it manually to restore normal application activity. Note that IIS has a mechanism to automatically detect the status of its worker processes, and if one of them for some reason does not respond to regular pings, after a certain timeout elapses IIS assumes the process is hanging and recycles it. In this latter case the symptom may look like a crash, but it really isn't, and you can tell because you'll not have the "process xyz terminated unexpectedly" message, but rather something like "a process serving application pool xyz has failed to respond to a ping"
    • deadlock: imagine thread 1 in your application has acquired a lock on resource A (a handle, a socket etc...) but to complete its work must at the same time also access resource B; now imagine resource B is locked by thread 2, which in turn is waiting for resource A (remember it's locked by thread 1?)... we have a deadlock (it's the same concept as a circular reference in an Excel sheet) because this situation does not have a solution, unless one of the two threads finally times out and releases the resource it was locking
    • leak: we have a memory leak when a process keeps growing over time and never releases memory back to the operating system, until it eventually throws an OutOfMemoryException and finally crashes. In this case capturing a dump when the process is being terminated is almost certainly too late, so it's better to capture a manual dump when the process is approaching its size limit, but before the actual crash. By the way, a leak can take only 5 minutes to cause the process to crash, or it might take some days; but the pattern is always the same, ending with the OOM exception and the crash. The smaller (and slower) the leak, the more difficult it will be to find the culprit(s) of the problem...
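
    To make the deadlock definition above more concrete, here is a minimal sketch in Python (not the language of the applications discussed here, just a compact way to show the idea). The names run_deadlock_demo, resource_a and resource_b are made up for illustration. Two threads take two locks in opposite order, and the only way out is that one of them times out and releases what it was holding:

```python
import threading

def run_deadlock_demo(short_timeout=0.2, long_timeout=5.0):
    """Two threads lock two resources in opposite order; a timeout breaks the cycle."""
    resource_a = threading.Lock()
    resource_b = threading.Lock()
    # The barrier guarantees both threads hold their first lock
    # before either one asks for the second: the circular wait really forms.
    both_hold_first = threading.Barrier(2)
    events = []

    def worker(name, first, second, timeout):
        with first:
            both_hold_first.wait()
            if second.acquire(timeout=timeout):
                events.append(f"{name}: got both resources")
                second.release()
            else:
                # Giving up is what lets the other thread make progress.
                events.append(f"{name}: timed out, releasing its lock")

    t1 = threading.Thread(target=worker,
                          args=("thread 1", resource_a, resource_b, short_timeout))
    t2 = threading.Thread(target=worker,
                          args=("thread 2", resource_b, resource_a, long_timeout))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return events

if __name__ == "__main__":
    for line in run_deadlock_demo():
        print(line)
```

    Without the timeouts (a plain blocking acquire on both sides) the two threads would wait for each other forever, which from the outside looks exactly like the hang described above.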

    Crash or Hang dump?

    So, now that we gathered all those details about the problem, which is the right approach to capture the dump we need? It depends on the problem, of course. There may be some variations depending on the circumstances, but basically we can capture either a crash or hang dump. What's the difference?

    We'll need a crash dump when we can't determine when the problem (typically a crash like the name implies, but that's not the only case) will happen, so we can configure the debugger in advance to monitor our target process and capture a dump when the process is terminated, or when we need to capture a dump on a specific exception (as I'll discuss in another post). On the other hand, we can capture a hang dump after the problem has occurred but the process is still in memory, for example in a memory leak scenario but also when a process is burning your CPU.
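
    As a rough sketch of the decision just described, the small Python helper below maps a symptom to the adplus mode and builds the argument list. The helper name adplus_arguments and the symptom labels are my own invention for this example; -crash, -hang, -pn (process name) and -o (output folder) are the usual adplus switches, but check the documentation of the version you have installed:

```python
def adplus_arguments(symptom, process_name, output_dir):
    """Return the adplus command line (as a list) suited to the symptom.

    The symptom labels are hypothetical, chosen only for this sketch.
    """
    # We cannot predict when these happen: attach in advance and wait.
    crash_like = {"crash", "exception"}
    # The problem is already visible and the process is still in memory.
    hang_like = {"hang", "leak", "high_cpu"}

    if symptom in crash_like:
        mode = "-crash"
    elif symptom in hang_like:
        mode = "-hang"
    else:
        raise ValueError(f"unknown symptom: {symptom!r}")

    return ["cscript", "adplus.vbs", mode, "-pn", process_name, "-o", output_dir]

if __name__ == "__main__":
    print(" ".join(adplus_arguments("leak", "w3wp.exe", r"c:\dumps")))
```

    The point is only that the symptom, once named correctly, drives the mode: a misnamed "crash" leads straight to the wrong switch, as in the story above.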

    First or second chance?

    As the name suggests, exceptions should be the exception rather than the rule; so for example it's always a good idea to check if an object is valid before trying to use it, rather than let the runtime throw an exception and trap it in a try...catch block. Anyway, what happens when you have a debugger attached to a process? The debugger gets the first chance to handle the exception; if it allows the execution to continue and does not handle the exception, the application will see the exception as usual. If the application does not handle the exception, the debugger gets a second chance to see the exception; in this case the application would normally crash if the debugger was not present.
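
    The order of events can be modelled with a few lines of Python. This is only a toy model of the sequence described above, not how the operating system or the debugger implement it, and the function name dispatch_exception is made up for this sketch:

```python
def dispatch_exception(debugger_handles_first_chance, application_handles,
                       debugger_attached=True):
    """Return the sequence of parties that get to see an exception."""
    trace = []
    if debugger_attached:
        trace.append("debugger: first chance")
        if debugger_handles_first_chance:
            trace.append("debugger: handled, execution continues")
            return trace
    if application_handles:
        trace.append("application: handled in try...catch")
        return trace
    if debugger_attached:
        # Nobody handled it: the debugger sees it once more before the end.
        trace.append("debugger: second chance (process would otherwise crash)")
    else:
        trace.append("process: crash")
    return trace

if __name__ == "__main__":
    for step in dispatch_exception(False, False):
        print(step)
```

    A first chance exception followed by "application: handled" is normal life for many applications; it's the second chance one that tells you the process was about to die.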

    This is clearer when you use adplus in crash mode: by default the debugger attaches to the target process and logs every exception thrown; if you try it you'll very likely end up with quite a few minidumps (a few megabytes each) corresponding to every exception thrown and trapped in try...catch blocks in your code, and a second chance full dump (the same size as the process, the private bytes value you can see in Task Manager) when the process crashes.

    How much does this cost?

    I mean in terms of performance for your server, which could potentially be a highly stressed production environment? Of course there is a cost, especially for a crash dump because you'll have a debugger attached to your worker process for the time needed to reproduce the problem, but it's hard to tell exactly how much; in my experience I had just one server where the debugger was really affecting the site and forced us to stop it. But it's worth mentioning that the server was already beyond its capacity limit and was already performing badly; the debugger was just the last straw...

    Having a repro in a test environment is the ideal situation, since we'll be able to capture dumps, run tests and do whatever is needed to resolve the problem without causing additional pain to your poor users.

     

    Carlo


    Quote of the Day:
    keep it real/keep it clean/keep it simple
    --rdude
  • Never doubt thy debugger

    New to debugging? How it all began (and how it could begin for you, too...)


    When I joined Microsoft and the EMEA Internet Dev Support Team in late 2004, I soon realized that I had to build a new skillset to have a future in my new role; before that I was a kind of "self made" developer, in the sense that almost everything I learnt I learnt "the hard way", buying and reading my own books, building a lot of samples, testing, making a lot of mistakes etc... I never had a real mentor, I simply tried to learn from my own experiences, testing things "in real life" and watching the results. And for me in that period the word "debugging" meant only the Visual Studio debugger: run the application and inspect my code hunting for bugs; I never thought of WinDbg, memory dumps and deep system internals (well, to be honest I knew something about memory dumps, but I thought it was too hard for me and never really tried to get my hands dirty with that stuff).

    Then I joined Microsoft and the iDev team, and my horizons expanded. Not much time had to pass before I realized the kind of cases we were managing required some different (and deeper) skills/knowledge, and more and more often I had to seek assistance from my colleagues with this new stuff. That's how things go at the beginning in a new role, but I like to "walk on my own legs" (if you understand what I mean) so I started searching for manuals, how-tos etc..., and guess what? I found a lot of information but not really organized so that a newbie like me could start learning the subject... and the more I asked for advice about training and books, the more I got replies like "that's a matter of experience, you need to jump on it and start digging!".

    To make a long story short, I finally accepted the advice and made some kind of cut and paste of the information I needed to start. Here and in upcoming posts I'll try to write down what I've learnt so far; of course I don't have the nerve to believe this will be a definitive guide to debugging (later in this post I'll mention some docs and blogs which deal with advanced topics), but I'll be happy if I save you some time, especially at the beginning when you're trying to figure out how this new thing works, and if someone finds this useful and intriguing enough to start digging into this interesting (and sometimes surprising) world.

    The tool case

    We have to start somewhere, so like an apprentice carpenter the first thing to do is to know our tool kit and which tool is the right one for the task at hand.

    Debugging tools for Windows

    (download) This is the foundation of managed and unmanaged debugging; of course you can find a lot of different debuggers out there on the Internet, but WinDbg is one of the most powerful you can find, flexible and extensible (through a sort of plugins/extensions you can develop with the SDK), and it allows you to debug your managed and native code, and also the Windows kernel. Or you can attach it to a process to debug live. The setup package will install a number of useful tools like adplus (a .vbs script which takes some arguments to capture a memory dump you can then analyze offline with WinDbg), gflags and many others.

    DebugDiag

    (download) This was originally developed by the IIS Escalation Engineers team and is markedly oriented toward IIS debugging (even if it can also be used to dump other processes); it also has a couple of built-in scripts useful to automatically analyze crash or memory leak dumps and produce a nicely formatted report which can give you an idea of what's happening. To be honest I prefer to use adplus, but I find DebugDiag quite useful in some specific scenarios, particularly when we have an unmanaged memory leak: this is because DebugDiag uses a dll called LeakTrack which gets injected into the target process (usually w3wp.exe) to track memory allocations, and it can greatly help the automatic analysis to show where the memory is going, who is allocating it etc... This is not enough to resolve the problem, but it's a quick way to find some good clues and steer your next steps in the right direction.

    Reflector

    (download) This is the tool to disassemble managed code, with an interesting list of add-ins to expand its capabilities; very handy to have a look at the source code of managed components extracted from a dump.

    LogParser

    (download) I discovered this tool only recently and I'm still playing with it to explore all of its capabilities (see my previous posts here and here); anyway it's very powerful and, as the name implies, can be used to analyze log files and other inputs (File System, Event Logs, IIS logs, CSV and many more) with a SQL-like query syntax, and it can output the result in just as many formats.

    Sysinternals tools

    There are a lot of them, but what I use most are Process Explorer, Process Monitor and the PSTools.

    Of course there are many more tools I use in my daily job, but the above list should be enough for most occasions (and for at least half of them you'll just need WinDbg with a good extension).

    Some good docs

    As you can guess, the right tools are like a gun without bullets if you don't know how to use them and, most important, how to understand and interpret the output they give; moreover, the better you know the platform (and by platform here I mean both the .NET world and the underlying OS where it runs) the more you'll be able to understand the values you get, spot possible incongruities and understand why things are going wrong in the given scenario.

    Unfortunately I can't point you to the internal documentation I'm lucky enough to have access to, but you must definitely have a look at the following:

    I think that's enough for this first post, just a short introduction to what I'll try to explain in the upcoming weeks; by the way, if you have any topic about managed debugging you'd like me to discuss, feel free to drop me an email or leave a comment. If we can add some interactivity to this thing, we'll hopefully have something useful for everyone interested.

    Carlo


    Quote of the Day:
    Imagination is everything. It is the preview of life's coming attractions.
    --Albert Einstein
  • Never doubt thy debugger

    Visual Studio as the definitive text editor?


    That's a clever question and I have to admit I've never thought of Visual Studio in this way, even if I tried different text editors and I still have more than one on my machine (Notepad2, Notepad++, PSPad).

    Anyway, while you think about it, take a few moments to leave a comment on Noha's post to express your opinion and request the features you are missing to turn Visual Studio into the editor for you.

     

    Carlo


    Quote of the Day:
    What single ability do we all have? The ability to change.
    --George Leonard Andrews
  • Never doubt thy debugger

    Quickly arrange icons on your toolbars


    This is a trick I learnt a few years ago in Office (I don't remember exactly if it was 97 or 2000) since I like to customize my working environment "in my way" (like everyone else, I guess), but I don't like to waste time digging into menus, config files etc. if possible...

    It's easy: hold the ALT key and simply...:

    • Drag the button to move it to the position you like, even on a different toolbar
    • Slightly drag the button to the right to insert a vertical separator on the left of the icon you are dragging
    • Slightly drag the button to the left to insert a vertical separator on the right of the icon you are dragging
    • Drag the button down (outside the toolbar) to remove it

    arrange icons

    Of course this works only for icons and buttons you already have on the toolbar; if you need to add a new one you must do it the usual way (right click on the toolbar and choose the "Customize" command). Oh, this does not work with the Ribbon in Office 2007.

     

    Carlo
