Welcome to MSDN Blogs Sign in | Join | Help

Just wanted to post a quick note to apologize for not responding to any comments over the last 4 weeks as I have been on an extended vacation with very limited internet access.  Next week I will be attending an event so the updating will probably be slow then too, but I'll go through all the comments and emails and answer the ones I can the week after and I shoud have some new posts comming out then too.

Hope y'alls summer has been great so far

Tess

Today I got a question from a reader (Chris) about a memory leak they are seeing in their application

When we do a '!dumpheap -min 85000 -type Byte[]' we can see 100s of byte array objects using up ~545MB of memory. A majority of the objects are the same size (either 4.5 or 9MB in size). Looking at the memory addresses, they all appear to be different copies of our assemblies. And when we do !GcRoot on those addresses they all either have no results returned, or show a rooted System.Security.Policy.Evidence object:

DOMAIN(000FDC18):HANDLE(Strong):-:Root:-(System.Security.Policy.Evidence)->
(System.Collections.ArrayList+SyncArrayList)->
(System.Collections.ArrayList)->
(System.Object[])->
(System.Security.Policy.Hash)->
(System.Byte[])

Through PerfMon we can see our Large Object Heap size grow and grow, and through !dumpheap it looks like it is getting fragmented.

We have not figured out how/why so many copies of our main assembly’s would be loaded by different rooted System.Security.Policy.Evidence object over and over again and never released. Our code never creates and new AppDomains and never manually loads any assembly. It receives a lot of web service requests, makes some database queries and/or makes its own web service requests and then returns the results.

Chris did a lot of ground work here and did everything exactly right

1. Looked at the .net memory (!eeheap -gc) to find out that that is where his memory issue is, since most of the memory is managed (.net)

2. Looked at the large objects (!dumpheap -min 85000) and identified that the byte[]s on the LOH were taking up most of the .net memory

3. Compared the byte[]s to find a pattern (they were all 4,5 MB byte arrays and 9 MB byte arrays)

4. Looked at the contents of the byte arrays (dc <addr of byte array>) and noticed that the byte arrays contained text that resembled some of his assemblies, and not only that, he noticed that they were the same assemblies so it looks like the same data is loaded over and over

5. Looked at the roots for the byte[]s and found that they were rooted in System.Security.Policy.Evidence

From here it gets a bit tricky to figure out the cause of the issue.  But it turns out that when you load an assembly that reference another assembly, a security.policy.hash is generated.  The hash it generates is about the same size as the referenced dlls, so that is where the 4.5 and 9 MB come from.  This is then cached in case another dll is referencing the same dll so the hash wont be regenerated.

That is all well and good, the question is why are we then generating so many of these policy hashes?

There was an issue in .net 2.0 RTM where, if memory pressure is high (i.e. if system memory is low) we would flush some items from cache which meant that we need to regenerate these policy.hashes.  Of course this in turn may cause more memory pressure so we flush again and create again etc. etc.

Solution

This issue is fixed in http://support.microsoft.com/default.aspx/kb/929963 FIX: The ASP.NET cache may flush some assemblies if the system memory is low, which is included in SP1 for .Net framework 2.0, so preferably one should install SP1 to fix this issue.

Have a good one,

Tess

It is not often that I stray too far from debugging and ASP.NET on my blog, but this time I am going to make an exception because I am just really excited about this:) so I'll put this post in the "bag of chips" category...

I finally installed the first games on my Zune, and in reality it wasn't even a game, but a clock, a feature that I had really been missing.

I am sure that it will get a lot easier as everything goes public but for now, here is what I did to install the games with the XNA Game Studio 3.0 CTP

1. Installed Visual Studio (C#) 2008 (well actually I had it installed already).  Any version of VS 2008 will do including express versions.

2. Installed Microsoft XNA Game Studio 3.0 CTP

3. Downloaded some sample applications from www.zuneboards.com

4. Connected my Zune to the computer and updated it with the latest Zune software (2.5 1614)

5. Opened one of the projects in Visual Studio, this gives you a new toolbox item below the regular buttons

image

6. Clicked on the XNA Screen (the monitor with the X on it) to open the XNA Game Studio Device Center

image

7. Clicked on Add Device to open XNA Game Studio Devices

image

7. Chose Zune to bring up the Select your Zune window

image

8. Selected Tess Zune and clicked next.  This test the connection and connects the Zune, and from this point on the Zune will sit in a screen saying "XNA Game Studio Connect,  waiting for computer" and the Zune is now acting as the target when you deploy until you exit the wait state on the Zune.

9. Chose Build/Deploy Solution in Visual Studio which changes the Zune to say "deploying..." and sparkle really nice.

Once all this was done I exited on the Zune and now I have a new menu item on the Zune called Games where all my applications are stored, and with that, I have yet another device that I can play solitaire on, YAY!!!:)

Laters,

Tess

We just released a hotfix for an issue where you get 100% CPU in vbc.exe when compiling large applications in Microsoft Visual Studio 2008.

Apart from the 100% CPU situation you may also notice symptoms like visual studio taking a very long time to return or error messages like

Error    246    Timed out waiting for a program to execute.
The command being executed was "C:\Windows\Microsoft.NET\Framework\v3.5\vbc.exe" /noconfig @"C:\TempFiles\MyApp\8ec39ce7\e41b4e32\gpznqwbu.cmdline".

or

C:\MyProjects\MyPage.aspx(4): Build (web):
Object '/32eba493_7a69_40af_bfe6_52887ab07396/jdkqqpvt_pufe5n+ogossdnv_258.rem' has been disconnected or does not exist at the server.

The issue will not appear if you set debug="false" in web.config.

The reason for this issue is that vbc.exe (the vb.net compiler) is spending a lot of time compiling and copying pdb files (symbol files) and to confirm that you are running into this issue you can take a hang dump with adplus (adplus -hang -pn vbc.exe) while vbc.exe is running at high CPU.  If you open it up in windbg.exe and set up the symbols correctly (.symfix c:\websym) and then run ~* kb to list all stacks, you should see a stack similar to the following:

0:000> kb 2000
ChildEBP RetAddr  Args to Child             
0023ed84 00cc1b80 08a10cdc 0023f228 08a11bfc vbc!BCITER_CHILD::GetNext+0x4a
0023edbc 00cc1ad8 08a11bfc 08a10ec0 0023f228 vbc!PDBExtensionTypeIterator::FindExtensionsFromNamespace+0x237
0023edf8 00cc1ad8 08a10cdc 088a10a4 0023f228 vbc!PDBExtensionTypeIterator::FindExtensionsFromNamespace+0x22f
0023ee34 00cc1ad8 088a0ec0 087e1288 0023f228 vbc!PDBExtensionTypeIterator::FindExtensionsFromNamespace+0x22f
0023ee70 00cc1ad8 087e10a4 088a146c 0023f228 vbc!PDBExtensionTypeIterator::FindExtensionsFromNamespace+0x22f
0023eeac 00cc1ad8 088a1288 087611ac 0023f228 vbc!PDBExtensionTypeIterator::FindExtensionsFromNamespace+0x22f
0023eee8 00cc1ad8 08760fc8 088310c0 0023f228 vbc!PDBExtensionTypeIterator::FindExtensionsFromNamespace+0x22f
0023ef24 00cc1ad8 08830edc 07c21488 0023f228 vbc!PDBExtensionTypeIterator::FindExtensionsFromNamespace+0x22f
0023ef60 00cc1ad8 07c212a4 099717d4 0043a7b0 vbc!PDBExtensionTypeIterator::FindExtensionsFromNamespace+0x22f
0023ef9c 00d0c588 08cb0144 009353ec 00d0c68d vbc!PDBExtensionTypeIterator::FindExtensionsFromNamespace+0x22f
0023efa8 00d0c68d 8c83e39e 099717d4 00469a40 vbc!PDBExtensionTypeIterator::PopulateExtensionTypes+0x1c
0023f388 00d0c327 009353ec 099717d4 8c83e3ee vbc!PEBuilder::WritePDB2+0x10c
0023f3f8 00d0c554 009353ec 00000000 0023f686 vbc!PEBuilder::WritePE+0x380
0023f4f8 00d1eeb2 00000000 0023f686 0023f51c vbc!PEBuilder::Compile+0x3ea
0023f688 00d1ecce 000000e8 07643fe8 009385d4 vbc!CompilerProject::_PromoteToCompiled+0x7e
0023f6a0 00d1f0c4 8c83e716 00000001 001247f0 vbc!CompilerProject::CompileFromBoundToCompiled+0xd3
0023f700 00d41a9e 00430268 0023f750 0023f74c vbc!Compiler::Compile+0x139
0023f760 00d41cd6 00000003 00469ad0 00123e00 vbc!wmain+0x23f
0023f7a4 7587e3f3 7efde000 0023f7f0 773acfed vbc!`vector constructor iterator'+0x184
0023f7b0 773acfed 7efde000 54d0fe52 00000000 kernel32!BaseThreadInitThunk+0xe
0023f7f0 773ad1ff 00d54b2e 7efde000 ffffffff ntdll!__RtlUserThreadStart+0x23
0023f808 00000000 00d54b2e 7efde000 00000000 ntdll!_RtlUserThreadStart+0x1b

The fix involves two hotfixes (KB 951708 for Visual Studio 2008, and KB 953595 for vbc.exe). The KB articles will be published soon but even though the KB articles are not yet published you can still get these hotfixes by contacting support and supplying the kb article numbers 951708 and 953595. 

Laters,

Tess

Save the date!  TechEd Developers North America 2008 just finished but there is plenty more to come... I know it's a bit early still, but November 10th through 14th TechEd Developer 2008 is coming to Barcelona, Spain and the session list is already starting to build.

This will be my first time at TechEd so I am really excited, especially with it being in Spain (since I'm half Spanish).  

If you want to check out the sessions that are already posted you can do so here and give feedback on which ones you think you will attend.

I hope that most of you will come and see my session on "Common ASP.NET production issues and how to troubleshoot them with windbg" where I'll be talking about the same types of issues I am discussing on this blog and work through some debugging demos so you can see debugging with windbg live and learn some cool tips and tricks. (It's listed as a 300 level session but I think most people would probably consider it level 400.) 

If you read this blog, and you're going there, come by and say hi... it would be really nice to see the faces of the people I am talking to on the blog:)

If you haven't done so already I would recommend that you check out Ingo Rammers presentation on "Debugging .NET - Hardcore production debugging of .NET applications" from the last EMEA Developers TechEd. It was listed as one of the top 4 sessions, and you can watch it here

He talked, among other things, about SOSAssist, a very early version of a tool that helps you debug by getting a more visual studio like GUI for your dump debugging sessions. 

Laters,

Tess

0 Comments
Filed under: ,

This is an issue I come across once in a while and since I am not a COM expert by any means I actually had to check in with our COM guys to see what was going on in this case, and as it turned out (this is embarrassing:)) I had already written about the root cause of this issue before (here).  So what I am learning from this is that perhaps I should at least check my blog before asking questions:)

Problem description

Intermittently the application will stall, so all requests will hang until they run an IIS Reset. At this point the CPU usage for the process is very low.

Gathering memory dumps

A memory dump was gathered using Debug Diagnostics 1.1, by right clicking on the w3wp.exe process and choosing "Create Full Userdump" while the application was hanging.

Debugging the issue

I opened up the memory dump in windbg and ran the basic commands to set up the environment

.symfix c:\websymbols  (to set up the symbol path)
.reload (to reload the symbols)
.loadby sos mscorwks  (to load up the sos dll in order to view .net objects and stacks)

As always with hangs, the most interesting thing to look at is what the threads are doing, so the 3 standard commands I run are

~* kb 2000 (to see the native stacks)
~* e !clrstack (to see the .net stacks)
!threads (to get information about the managed/.net threads)

When I browse through the native call stacks I noticed that plenty of them were sitting in this type of callstack


  71  Id: 91d0.9740 Suspend: 0 Teb: 7ff0c000 Unfrozen
ChildEBP RetAddr  Args to Child             
10f2f0f0 7c822114 77e6711b 00000001 10f2f140 ntdll!KiFastSystemCallRet
10f2f0f4 77e6711b 00000001 10f2f140 00000001 ntdll!NtWaitForMultipleObjects+0xc
10f2f19c 79f8ead4 00000001 04b0e580 00000000 kernel32!WaitForMultipleObjectsEx+0x11a
10f2f204 79f17522 00000001 04b0e580 00000000 mscorwks!WaitForMultipleObjectsEx_SO_TOLERANT+0x6f
10f2f224 79f17493 00000001 04b0e580 00000000 mscorwks!Thread::DoAppropriateAptStateWait+0x3c
10f2f2a8 79f1732f 00000001 04b0e580 00000000 mscorwks!Thread::DoAppropriateWaitWorker+0x144
10f2f2f8 79f8ea4d 00000001 04b0e580 00000000 mscorwks!Thread::DoAppropriateWait+0x40
10f2f354 79e77f50 ffffffff 00000001 00000000 mscorwks!CLREvent::WaitEx+0xf7
10f2f364 7a0fd9c3 ffffffff 00000001 00000000 mscorwks!CLREvent::Wait+0x17
10f2f3f0 7a0fdbbf 0c9dbe58 ffffffff 04b0e56c mscorwks!AwareLock::EnterEpilog+0x94
10f2f40c 7a0fdd2a 83f96676 0c9dbe58 0c9dbe58 mscorwks!AwareLock::Enter+0x61
10f2f470 7a094352 ffffffff 83f96716 151d4058 mscorwks!AwareLock::Contention+0x16c
10f2f510 0be7cbbb 1d107334 00000001 00000000 mscorwks!JITutil_MonContention+0xa3
WARNING: Frame IP not in any known module. Following frames may be wrong.
10f2f540 0be7c7d2 10f2f580 00000000 15142c80 0xbe7cbbb
10f2f64c 6866b5d5 00000000 10f2f6a0 151a6e0c 0xbe7c7d2
10f2f664 6865a071 689a26e1 00000000 00000000 System_Web_ni+0xab5d5
10f2f668 689a26e1 00000000 00000000 15142724 System_Web_ni+0x9a071
10f2f688 6865a00b 4190ab00 151a6e0c 151a7980 System_Web_ni+0x3e26e1
00000000 00000000 00000000 00000000 00000000 System_Web_ni+0x9a00b 

If you have followed the blog you will probably know that the fact that we are sitting in mscorwks!AwareLock::Enter means that we are waiting for a .net lock.  i.e. on a line like the bolded line below, or alternatively in a Monitor.Enter()

lock(obj){
  // do something
}

If we switch to this thread (71) and run !clrstack on we get the .net stack so that we know where we are waiting for this lock/monitor

0:030> ~71s
eax=00001c5a ebx=10f2f140 ecx=04b0e56c edx=26064dec esi=10f2f140 edi=7ffdf000
eip=7c82ed54 esp=10f2f0f4 ebp=10f2f19c iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!KiFastSystemCallRet:
7c82ed54 c3              ret

 

0:071> !clrstack
OS Thread Id: 0x9740 (71)
ESP       EIP    
10f2f38c 7c82ed54 [GCFrame: 10f2f38c]
10f2f4c4 7c82ed54 [HelperMethodFrame: 10f2f4c4] System.Threading.Monitor.Enter(System.Object)
10f2f518 0be7cbbb MyFramework.User.VerifyUser()
10f2f548 0be7c7d2 MyFramework.Http.HttpModules.HttpAuthenticationModule.Authenticate(System.Object, System.EventArgs)
10f2f654 6866b5d5 System.Web.HttpApplication+SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
10f2f66c 6865a071 System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
10f2f6ac 6865a39b System.Web.HttpApplication.ResumeSteps(System.Exception)
10f2f6f4 6865912d System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)
10f2f710 6865e0bd System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
10f2f744 6865dd72 System.Web.HttpRuntime.ProcessRequestNoDemand(System.Web.HttpWorkerRequest)
10f2f750 6865c447 System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
10f2f900 79f1ef33 [ContextTransitionFrame: 10f2f900]
10f2f950 79f1ef33 [GCFrame: 10f2f950]
10f2faa8 79f1ef33 [ComMethodFrame: 10f2faa8]

and in this case it is in the MyFramework.User.VerifyUser() method that we are waiting for the lock.

Another interesting point about this stack is that we can see that this request hasn't started executing yet, it is still in the Authenticate module which explains why the requests don't simply time out.

The next step is to figure out who is holding this lock to determine why we are stuck here, and we find the owner by running !syncblk

0:071> !syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info  SyncBlock Owner
   42 04b0e4dc            3         1 04ba8bc8  9c88  41   1d107964 System.Object
   45 04b0e56c           61         1 04c987c0  9da0  40   1d107334 System.Object
-----------------------------
Total           74
CCW             2
RCW             0
ComClassFactory 0
Free            43

 

We have two different syncblocks (locks) in this process and it is a bit hard to guess which one is which.  The first one locking on object 1d107964 is held by thread 41 and it has 3 monitors which means that it has one waiter.  The way you calculate how many waiters a monitor has is by taking (MonitorHeld-1)/2.  This is because the owner will take one monitor, and each waiter takes two monitors.

The second lock (1d107334) is owned by thread 40 and has 30 waiters ((61-1)/2).

Oh, btw, when I say owner, I mean the thread that has entered into the lock statement and is now doing something while holding the lock so that thread is essentially keeping the "waiters" waiting.

If we look at our native stack we had

10f2f510 0be7cbbb 1d107334 00000001 00000000 mscorwks!JITutil_MonContention+0xa3

The 1st parameter to JITUtil_MonContention (the 3rd column here) identifies the object we are waiting on, so with this information we now know that our thread (thread 71) is waiting for the lock held by thread 40.

Next we check what this thread is doing and we find that this thread is also waiting for a lock, but this time in ConfigManager.get_Instance(), and lower down in the call chain we can see our VerifyUser method which is why we are holding the lock that thread 71 is waiting for.

0:071> ~40s
eax=00004a4d ebx=0c82edbc ecx=04b0e4dc edx=1ccfb4d6 esi=0c82edbc edi=7ffdf000
eip=7c82ed54 esp=0c82ed70 ebp=0c82ee18 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!KiFastSystemCallRet:
7c82ed54 c3              ret

 

0:040> kb
ChildEBP RetAddr  Args to Child             
0c82ed6c 7c822114 77e6711b 00000001 0c82edbc ntdll!KiFastSystemCallRet
0c82ed70 77e6711b 00000001 0c82edbc 00000001 ntdll!NtWaitForMultipleObjects+0xc
0c82ee18 79f8ead4 00000001 04b0e4f0 00000000 kernel32!WaitForMultipleObjectsEx+0x11a
0c82ee80 79f17522 00000001 04b0e4f0 00000000 mscorwks!WaitForMultipleObjectsEx_SO_TOLERANT+0x6f
0c82eea0 79f17493 00000001 04b0e4f0 00000000 mscorwks!Thread::DoAppropriateAptStateWait+0x3c
0c82ef24 79f1732f 00000001 04b0e4f0 00000000 mscorwks!Thread::DoAppropriateWaitWorker+0x144
0c82ef74 79f8ea4d 00000001 04b0e4f0 00000000 mscorwks!Thread::DoAppropriateWait+0x40
0c82efd0 79e77f50 ffffffff 00000001 00000000 mscorwks!CLREvent::WaitEx+0xf7
0c82efe0 7a0fd9c3 ffffffff 00000001 00000000 mscorwks!CLREvent::Wait+0x17
0c82f06c 7a0fdbbf 04c987c0 ffffffff 04b0e4dc mscorwks!AwareLock::EnterEpilog+0x94
0c82f088 7a0fdd2a 9f8962ea 04c987c0 04c987c0 mscorwks!AwareLock::Enter+0x61
0c82f0ec 7a094352 ffffffff 9f89638a 00000000 mscorwks!AwareLock::Contention+0x16c
0c82f18c 0be7d4b6 1d107964 00000000 00000000 mscorwks!JITutil_MonContention+0xa3
WARNING: Frame IP not in any known module. Following frames may be wrong.
0c82f1d0 79e74ebd 00000000 00000000 191240e4 0xbe7d4b6
0c82f1d8 00000000 191240e4 00000000 00000000 mscorwks!HelperMethodFrame::LazyInit+0x17

0:040> !clrstack
OS Thread Id: 0x9da0 (40)
ESP       EIP    
0c82f008 7c82ed54 [GCFrame: 0c82f008]
0c82f140 7c82ed54 [HelperMethodFrame: 0c82f140] System.Threading.Monitor.Enter(System.Object)
0c82f194 0be7d4b6 MyFramework.Configuration.ConfigManager.get_Instance()
0c82f2e4 0be7cd8a MyFramework.User..ctor()
0c82f398 0be7cbd1 MyFramework.User.VerifyUser()
0c82f3c8 0be7c7d2 MyFramework.Http.HttpModules.HttpAuthenticationModule.Authenticate(System.Object, System.EventArgs)
0c82f4d4 6866b5d5 System.Web.HttpApplication+SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
0c82f4ec 6865a071 System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
0c82f52c 6865a39b System.Web.HttpApplication.ResumeSteps(System.Exception)
0c82f574 6865912d System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)
0c82f590 6865e0bd System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
0c82f5c4 6865dd72 System.Web.HttpRuntime.ProcessRequestNoDemand(System.Web.HttpWorkerRequest)
0c82f5d0 6865c447 System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
0c82f780 79f1ef33 [ContextTransitionFrame: 0c82f780]
0c82f7d0 79f1ef33 [GCFrame: 0c82f7d0]
0c82f928 79f1ef33 [ComMethodFrame: 0c82f928]

We go through the same drill again, finding the owner in the !syncbkl output and this time our first parameter to JITUtil_MonContention is 1d107964 which was the one held by thread 41.

Thread 41 is a bit different than the other threads,  we can see from the .net callstack that we are doing Marshal.ReleaseComObject from the OnConfigRemoveCallback method, and going into the native callstack we can see that this in turn has to make a transition to an STA thread to clean up (GetToSTA).  The reason for this is because the component we are trying to release is an STA component.

0:040> ~41s
eax=00002be2 ebx=049d75c0 ecx=00000000 edx=00000003 esi=00001464 edi=00000000
eip=7c82ed54 esp=0a25ea30 ebp=0a25eaa0 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!KiFastSystemCallRet:
7c82ed54 c3              ret
0:041> kb
ChildEBP RetAddr  Args to Child             
0a25ea2c 7c822124 77e6baa8 00001464 00000000 ntdll!KiFastSystemCallRet
0a25ea30 77e6baa8 00001464 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
0a25eaa0 77e6ba12 00001464 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac
0a25eab4 776c54ef 00001464 ffffffff 0a25ebb8 kernel32!WaitForSingleObject+0x12
0a25ead0 77789905 049d75c0 0ca91600 00000000 ole32!GetToSTA+0x6f
0a25eaf0 77787ed7 0a25ebb8 0a25ecb8 049d2850 ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0xcb
0a25ebd0 77695349 049d2850 0a25ecc8 0a25ecb8 ole32!CRpcChannelBuffer::SendReceive2+0xc1
0a25ec3c 776c4eee 049d2850 0a25ecc8 0a25ecb8 ole32!CAptRpcChnl::SendReceive+0xab
0a25ec90 77ce127e 00000001 0a25ecc8 0a25ecb8 ole32!CCtxComChnl::SendReceive+0x91
0a25ecac 77ce13ca 04be013c 0a25ecf4 0600016e rpcrt4!NdrProxySendReceive+0x43
0a25f094 77ce11bd 77676298 77679608 0a25f0cc rpcrt4!NdrClientCall2+0x206
0a25f0b4 77c53a12 0000000c 00000008 0a25f14c rpcrt4!ObjectStublessClient+0x8b
0a25f0c4 776c4c23 04be013c 0a25f100 0a25f200 rpcrt4!ObjectStubless+0xf
0a25f14c 777128f3 0008b420 79f277a5 0a25f1f4 ole32!CObjectContext::InternalContextCallback+0x126
0a25f19c 79f2340f 0008b430 79f277a5 0a25f1f4 ole32!CObjectContext::ContextCallback+0x85
0a25f1e8 79f23362 0008b430 00000000 00000000 mscorwks!CtxEntry::EnterContextOle32BugAware+0x2b
0a25f308 79f278ec 79f27898 04ae3eb8 992e6146 mscorwks!CtxEntry::EnterContext+0x2db
0a25f340 79f279dc 04ae3eb8 992e6176 00000001 mscorwks!RCW::ReleaseAllInterfacesCallBack+0x5c
0a25f370 7a042312 992e61b2 04ba8bc8 04ba8bc8 mscorwks!RCW::Cleanup+0x22
0a25f3b4 7a07484b 04ae3eb8 992e665a 1946c038 mscorwks!RCW::ExternalRelease+0x158
0:041> !clrstack
OS Thread Id: 0x9c88 (41)
ESP       EIP    
0a25f408 7c82ed54 [HelperMethodFrame_1OBJ: 0a25f408] System.Runtime.InteropServices.Marshal.InternalReleaseComObject(System.Object)
0a25f464 793b69cd System.__ComObject.ReleaseSelf()
0a25f468 79461890 System.Runtime.InteropServices.Marshal.ReleaseComObject(System.Object)
0a25f498 0c88715b MyFramework.Configuration.ConfigManager.OnConfigRemoveCallback(System.String, System.Object, System.Web.Caching.CacheItemRemovedReason)
0a25f4d8 686bd802 System.Web.Caching.CacheEntry.CallCacheItemRemovedCallback(System.Web.Caching.CacheItemRemovedCallback, System.Web.Caching.CacheItemRemovedReason)
0a25f514 686bdaad System.Web.Caching.CacheEntry.Close(System.Web.Caching.CacheItemRemovedReason)
0a25f550 689a024c System.Web.Caching.CacheSingle.UpdateCache(System.Web.Caching.CacheKey, System.Web.Caching.CacheEntry, Boolean, System.Web.Caching.CacheItemRemovedReason, System.Object ByRef)
0a25f620 686bb838 System.Web.Caching.CacheInternal.DoRemove(System.Web.Caching.CacheKey, System.Web.Caching.CacheItemRemovedReason)
0a25f62c 686c2f72 System.Web.Caching.ExpiresBucket.FlushExpiredItems(System.DateTime, Boolean)
0a25f6a0 686c310f System.Web.Caching.CacheExpires.FlushExpiredItems(Boolean, Boolean)
0a25f6e8 686c3205 System.Web.Caching.CacheExpires.TimerCallback(System.Object)
0a25f6ec 793d9c1a System.Threading._TimerCallback.TimerCallback_Context(System.Object)
0a25f6f0 793683dd System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
0a25f708 793d9c7f System.Threading._TimerCallback.PerformTimerCallback(System.Object)
0a25f898 79e88f63 [GCFrame: 0a25f898]
0a25f9dc 79e88f63 [ContextTransitionFrame: 0a25f9dc]

Preferably we shouldn't be using STA components in ASP.NET because of the need for transitions between threads and potential waits which of course cause performance degradations.  If this would have been an MTA component we could have simply released it on the same thread.

However, still it should not be sitting like this, waiting indefinitely.  The appropriate STA thread should handle the request and this thread should return, releasing the lock and thus releasing the other locks as well.

We can grab the 1st parameter to GetToSTA to figure out what thread it is trying to get to

0a25ead0 77789905 049d75c0 0ca91600 00000000 ole32!GetToSTA+0x6f

If we dump it out with dc, the 2nd and 3rd dword will tell us what process it is trying to get to as well as the thread id  (the reason we have to go in a little roundabout fashion to find these is because public symbols don't have type information, otherwise we could have dumped out the parameter with dt or looked at it in windbg, but now you will just have to rely on the fact that it is dword 2 and 3 (at least in this version of the dll)

0:041> dc 0x049d75c0
049d75c0  049d7640 049d7540 000091d0 00009a04  @v..@u..........
049d75d0  db94d897 0b8e692b e9cda17b 88696fd7  ....+i..{....oi.
049d75e0  e9cda17b 88696fd7 00001822 9a0491d0  {....oi.".......
049d75f0  80605d98 1cdb921b 00000103 03b80260  .]`.........`...
049d7600  049e86c0 00000000 00000000 00000000  ................
049d7610  00000001 ffffffff 00097ec0 04be013c  .........~..<...
049d7620  00000003 00000000 00000000 00000000  ................
049d7630  00000000 00070005 00000000 00000000  ................

The PID (process ID) here is 91d0 and the thread id is 9a04

Just to verify that we are going to a thread in this process we can run "|" to get basic process information to compare agains

0:041> |
.  0    id: 91d0    examine    name: c:\WINDOWS\system32\inetsrv\w3wp.exe

And then we can dump out the threads using !threads to identify the thread with the OS thread ID 9a04

0:041> !threads
ThreadCount: 106
UnstartedThread: 0
BackgroundThread: 44
PendingThread: 0
DeadThread: 62
Hosted Runtime: no
                                      PreEmptive   GC Alloc           Lock
       ID OSID ThreadOBJ    State     GC       Context       Domain   Count APT Exception
  30    1 9a04 049a6c28   3808220 Enabled  00000000:00000000 04b0f008     1 STA (Threadpool Worker)
  36    2 8ce8 04b41ac8      b220 Enabled  00000000:00000000 04b5a728     0 MTA (Finalizer)
  37    3 9cb0 04991518    80a220 Enabled  00000000:00000000 04b5a728     0 MTA (Threadpool Completion Port)
  38    4 984c 04a07458      1220 Enabled  00000000:00000000 04b5a728     0 Ukn
  28    6 9a18 04ba8530   880a220 Enabled  1d1c14e8:1d1c24b4 04b5a728     0 MTA (Threadpool Completion Port)
  26    8 9c04 04be0b80       220 Enabled  00000000:00000000 04b5a728     0 STA
  40    9 9da0 04c987c0   380b220 Enabled  00000000:00000000 04b0f008     2 MTA (Threadpool Worker)
  41    7 9c88 04ba8bc8   180b220 Enabled  00000000:00000000 04b0f008     1 MTA (Threadpool Worker)
  42    a 9898 04bc6030   380b220 Enabled  00000000:00000000 04b0f008     1 MTA (Threadpool Worker)
  48    d 9bb0 0ca3a3a0       220 Enabled  00000000:00000000 04b5a728     0 STA
  53    f 9044 0c99b7d8       220 Enabled  00000000:00000000 04b5a728     0 STA
  47    c 9924 04ba7f28       220 Enabled  00000000:00000000 04b5a728     0 STA
...

Looking at the 1st thread in the list (Thread 30), we can see that the OSID is 9a04 so this is our guy. 

  30    1 9a04 049a6c28   3808220 Enabled  00000000:00000000 04b0f008     1 STA (Threadpool Worker)

There are a few other things that are interesting about it as well.

1. It is an STA thread (as it should be since we are doing GetToSTA to get here) and 
2. It is a ThreadPool Worker thread, meaning that it is a regular .net thread, so not really a thread that would be pumping messages like an STA thread should

If we would look at the stack for it we would see that this thread is actually also waiting for the lock held by thread 40 so we are stuck in a deadlock where

thread 30 is waiting for a lock held by thread 40 
thread 40 is waiting for a lock held by thread 41
thread 41 needs to get to thread 30 in order to release an STA component

What about the solution?

The problem we need to solve here is that we need to make sure that thread 41 is not trying to get to a managed thread (or another thread not pumping messages) to release the STA component.

As it turns out, the reason that thread 30 is marked as an STA thread (it should really be MTA) is because the first thread created by the threadpool is created without specifying the appartment model.  If this thread happens to handle a request where it uses an STA component this thread will then be marked as STA for the rest of it's life, but unfortunately, as opposed to proper STA threads, it will not be pumping messages so you may block while trying to get to it.

To resolve the issue, install the hotfix discussed in this article http://support.microsoft.com/default.aspx?scid=kb;EN-US;928569

Laters,

Tess

In many of my posts I have discussed performance issues and memory issues caused by excessive use of the Large Object Heap, here are a few examples...

Maoni just wrote a nice article in MSDN Magazine called CLR Inside Out: Large Object Heap Uncovered, it is definitely worth a read if you are interested in how the GC works and specifically why the GC heaps are split into small object heap segments and large object heap segments.

I recently received a question from a reader who wanted to know how to set breakpoints in code for applications that crash on startup.

The email went like this:

I have a .NET app which is crashing upon startup (it even causes DW20.exe to make an appearance). When I open the .NET app in Windbg, I can see exactly what the exception is and where it happens.

What I would like to do is put a breakpoint on the method where the exception is being raised so that I can step through it. This is where I'm stuck.

I'm familiar with commands such as !name2ee and !bpmd., however, when I load the .NET app, I don't get a chance to set the breakpoint since the exception happens so early in startup code of the app. It's like as soon as Windbg finishes loading modules, the exception strikes. It would be different if this was code running after a button click, but since this runs at startup, I'm having difficulties.

I've tried Ctrl-Break during the module loading and tried to set the breakpoint, but this doesn't work - it's too early to try to set a breakpoint at that moment.

How do you load a .NET app in Windbg and halt it so that you can set a breakpoint to a method that will be called later?

The answer is:

1. Open up the application with windbg through File/Open Executable.  This will start the application under the windbg debugger (just as if you would do a debug in Visual Studio)

2. In windbg run

sxe ld:mscorlib
g

This will break in when mscorlib.dll is loaded, and at that point you can start setting breakpoints in code

3. Load up sos by running .loadby sos mscorwks which which will load sos from the directory where mscorwks.dll is loaded, i.e. the framework directory.

4. Set your breakpoint with !bpmd, for example !bpmd BadApp.exe BadApp.BadClass.CrashApp will set a breakpoint on the method CrashApp in the class BadApp.BadClass defined in the BadApp.exe module.  If your class was defined in a dll you would use the dllname.dll instead of the application name.

So all in all it looks something like this:

Microsoft (R) Windows Debugger  Version 6.8.0000.0
Copyright (c) Microsoft Corporation. All rights reserved.

ModLoad: 00400000 00408000   BadApp.exe
ModLoad: 7d4c0000 7d5f0000   NOT_AN_IMAGE
ModLoad: 7d600000 7d6f0000   C:\WINDOWS\SysWOW64\ntdll32.dll
ModLoad: 79000000 79046000   C:\WINDOWS\SysWOW64\mscoree.dll
ModLoad: 7d4c0000 7d5f0000   C:\WINDOWS\syswow64\KERNEL32.dll
(1bc4.11a4): Break instruction exception - code 80000003 (first chance)
eax=7d600000 ebx=7efde000 ecx=00000005 edx=00000020 esi=7d6a01f4 edi=00221f38
eip=7d61002d esp=0012fb4c ebp=0012fcac iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
ntdll32!DbgBreakPoint:
7d61002d cc              int     3

0:000> sxe ld:mscorlib
0:000> g

ModLoad: 790c0000 79bf6000   C:\WINDOWS\assembly\NativeImages_v2.0.50727_32\mscorlib\32e6f703c114f3a971cbe706586e3655\mscorlib.ni.dll
eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=7efdd000 edi=20000000
eip=7d61cb85 esp=0012e9a0 ebp=0012e9e0 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
ntdll32!ZwMapViewOfSection+0x12:
7d61cb85 c22800          ret     28h

0:000> .loadby sos mscorwks

0:000> !bpmd BadApp.exe BadApp.BadClass.CrashApp
Adding pending breakpoints...

0:000> g
CLR notification: module 'mscorlib' loaded
(1bc4.11a4): CLR notification exception - code e0444143 (first chance)
CLR notification: module 'BadApp' loaded
(1bc4.11a4): CLR notification exception - code e0444143 (first chance)
CLR notification: module 'System' loaded
(1bc4.11a4): CLR notification exception - code e0444143 (first chance)
CLR notification: module 'System.Drawing' loaded
(1bc4.11a4): CLR notification exception - code e0444143 (first chance)
CLR notification: module 'System.Windows.Forms' loaded
(1bc4.11a4): CLR notification exception - code e0444143 (first chance)
CLR notification: module 'System.Configuration' loaded
(1bc4.11a4): CLR notification exception - code e0444143 (first chance)
CLR notification: module 'System.Xml' loaded
(1bc4.11a4): CLR notification exception - code e0444143 (first chance)
CLR notification: module 'sorttbls.nlp' loaded
(1bc4.11a4): CLR notification exception - code e0444143 (first chance)
CLR notification: module 'sortkey.nlp' loaded
(1bc4.11a4): CLR notification exception - code e0444143 (first chance)
CLR notification: method 'BadApp.BadClass.CrashApp()' code generated
(1bc4.11a4): CLR notification exception - code e0444143 (first chance)
JITTED BadApp!BadApp.BadClass.CrashApp()
Setting breakpoint: bp 02400270 [BadApp.BadClass.CrashApp()]
Breakpoint 0 hit
eax=01d76970 ebx=02703558 ecx=02725518 edx=00262498 esi=02725518 edi=02725518
eip=02400270 esp=0012ef2c ebp=0012ef78 iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000206
BadApp!BadApp.BadClass.CrashApp():
02400270 83ec08          sub     esp,8

 

I would say though that this particular scenario would be much easier to go through in Visual Studio, dbgclr or mdbg so if you have the option of using any of those I would recommend them over using windbg in this case.

 

Happy National day tomorrow to all the Swedes,  it's promising to be a nice sunny day off:)

ASP.NET Memory: Thou shalt not store UI objects in cache or session scope

Surprisingly the issue I wrote about in "the eventhandlers that made the memory baloon" (Jan 2006) is something that still happens very frequently, I reference it in cases at least a few times a month.  Just this last week I had different variations of it crop up in different cases so in this post I will show a different variation, what to look out for and how to identify it.

Problem description:

The issue here is a quick growth in memory, eventually causing OutOfMemory exceptions.

The problem:

The problem in this case was that the application stored UI controls in session scope, i.e. Labels, DataGrids etc. in order to keep relevant information about the user throughout the session.  The data they want to store is not that big, a subset of a dataset, some user related strings etc. and since it was already nicely contained in the controls there was no reason to extract them into a separate object to store in session scope.

WRONG!

If we look at a dropdownlist for example in memory, this is what a dropdownlist contains  (the Name column contains the member variable names, and the Type column shows the type of the member variable)

0:032> !do 06f4fb14
Name: System.Web.UI.WebControls.DropDownList
MethodTable: 663b95bc
EEClass: 663b952c
Size: 124(0x7c) bytes
(C:\WINDOWS\assembly\GAC_32\System.Web\2.0.0.0__b03f5f7f11d50a3a\System.Web.dll)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
790fd8c4  4002144        4        System.String  0 instance 02f1fd64 _id
790fd8c4  4002145        8        System.String  0 instance 02f1fd64 _cachedUniqueID
663a66b8  4002146        c ...em.Web.UI.Control  0 instance 06f4f47c _parent
6641d56c  4002147       2c         System.Int32  1 instance        6 _controlState
6641194c  4002148       10 ...m.Web.UI.StateBag  0 instance 06f51c5c _viewState
663a66b8  4002149       14 ...em.Web.UI.Control  0 instance 06f4dba0 _namingContainer
663a40c4  400214a       18   System.Web.UI.Page  0 instance 06f4dba0 _page
6641f33c  400214b       1c ...+OccasionalFields  0 instance 06f53958 _occasionalFields
663b63cc  400214c       20 ...I.TemplateControl  0 instance 00000000 _templateControl
6639b220  400214d       24 ...m.Web.VirtualPath  0 instance 00000000 _templateSourceVirtualDirectory
66417cf0  400214e       28 ...rs.ControlAdapter  0 instance 00000000 _adapter
663bfea4  400214f       30 ...SimpleBitVector32  1 instance 06f4fb44 flags
790fd0f0  400213e      d08        System.Object  0   shared   static EventDataBinding
    >> Domain:Value  00198418:NotInit  001c6fd0:02f06fd0 <<
790fd0f0  400213f      d0c        System.Object  0   shared   static EventInit
    >> Domain:Value  00198418:NotInit  001c6fd0:02f06fdc <<
790fd0f0  4002140      d10        System.Object  0   shared   static EventLoad
    >> Domain:Value  00198418:NotInit  001c6fd0:02f06fe8 <<
790fd0f0  4002141      d14        System.Object  0   shared   static EventUnload
    >> Domain:Value  00198418:NotInit  001c6fd0:02f06ff4 <<
790fd0f0  4002142      d18        System.Object  0   shared   static EventPreRender
    >> Domain:Value  00198418:NotInit  001c6fd0:02f07000 <<
790fd0f0  4002143      d1c        System.Object  0   shared   static EventDisposed
    >> Domain:Value  00198418:NotInit  001c6fd0:02f0700c <<
7912d8f8  4002150      d20      System.Object[]  0   shared   static automaticIDs
    >> Domain:Value  00198418:NotInit  001c6fd0:02f07018 <<
790fd8c4  40025c5       34        System.String  0 instance 00000000 tagName
663e95f0  40025c6       44         System.Int32  1 instance       76 tagKey
663eb1d0  40025c7       38 ...tributeCollection  0 instance 00000000 attrColl
6641194c  40025c8       3c ...m.Web.UI.StateBag  0 instance 00000000 attrState
66401a78  40025c9       40 ...WebControls.Style  0 instance 00000000 controlStyle
663bfea4  40025ca       48 ...SimpleBitVector32  1 instance 06f4fb5c _webControlFlags
790fd0f0  40025cc       4c        System.Object  0 instance 00000000 _dataSource
7910be50  40025cd       50       System.Boolean  1 instance        1 _requiresDataBinding
7910be50  40025ce       51       System.Boolean  1 instance        1 _inited
7910be50  40025cf       52       System.Boolean  1 instance        1 _preRendered
7910be50  40025d0       53       System.Boolean  1 instance        0 _requiresBindToNull
7910be50  40025d1       54       System.Boolean  1 instance        0 _throwOnDataPropertyChange
790fd0f0  40025cb      ec0        System.Object  0   shared   static EventDataBound
    >> Domain:Value  00198418:NotInit  001c6fd0:02f209c8 <<
66404954  40025d3       58 ...UI.DataSourceView  0 instance 06f51c90 _currentView
7910be50  40025d4       55       System.Boolean  1 instance        0 _currentViewIsFromDataSourceID
7910be50  40025d5       56       System.Boolean  1 instance        1 _currentViewValid
663c876c  40025d6       5c ...eb.UI.IDataSource  0 instance 06f51c80 _currentDataSource
7910be50  40025d7       57       System.Boolean  1 instance        1 _currentDataSourceValid
663ba948  40025d8       60 ...ceSelectArguments  0 instance 06f539fc _arguments
7910be50  40025d9       64       System.Boolean  1 instance        1 _pagePreLoadFired
7910be50  40025da       65       System.Boolean  1 instance        0 _ignoreDataSourceViewChanged
6641a730  4002653       68 ...istItemCollection  0 instance 06f4fb90 items
79102290  4002654       74         System.Int32  1 instance       -1 cachedSelectedIndex
790fd8c4  4002655       6c        System.String  0 instance 00000000 cachedSelectedValue
79104368  4002656       70 ...ections.ArrayList  0 instance 06f539b0 cachedSelectedIndices
7910be50  4002657       66       System.Boolean  1 instance        0 _stateLoaded
790fd0f0  4002651      ee0        System.Object  0   shared   static EventSelectedIndexChanged
    >> Domain:Value  00198418:NotInit  001c6fd0:02f20bfc <<
790fd0f0  4002652      ee4        System.Object  0   shared   static EventTextChanged
    >> Domain:Value  00198418:NotInit  001c6fd0:02f20c08 <<

So if I have a dropdownlist that contain all the US states and their abbreviations, and I want to store this in cache to avoid having to rebuild the list, a logical step to take might be to cache the dropdownlist.   

However, if I do so, anything that it references will be referenced as long as the dropdownlist is in cache, so it can't be garbage collected.  In the case of web controls, they all have a _parent member variable which links back to the parent control, and a _page that links back to the page where the control was originally instantiated.  In other words, this means that as long as your control is in cache, you will also be caching the page, any other controls that it uses, any data bindings to data grids etc. etc. 

As you can see above you will also be caching the statebag (viewstate), the IDataSource and anything it references, etc. etc.  all in all it builds up to quite a bit more data than we originally bargained for, when in reality, what I wanted to store was just the list of states and abbreviations and maybe the selected item (if storing it in session scope).

The resolution:

Whenever you store anything in cache or session scope, make sure that you store just what you need and nothing more.  If you store an object that is not a simple type like a string or an int, make sure that you know all about that object and it's member variables so that you know exactly what it is that you will be holding on to.  UI objects are never good to store in cache or session scope since they contain quite a bit of extra data about their appearance and location, instead just extracting the ItemsColection and the selected item would be appropriate in the case above.

How you can identify the problem in a memory dump: 

You can get a memory dump when memory is high either with Debug Diagnostics 1.1, or by running adplus -hang -pn w3wp.exe (from the debugging tools for windows).

If you open it up in windbg, and load sos (.loadby sos mscorwks) these issues are fairly easy to identify.

Running !dumpheap -stat you will see a lot of System.Web.UI items at the bottom of the output in this case. 

65412bb4    2,461    3,004,252 System.Data.RBTree`1+Node[[System.Data.DataRow, System.Data]][]
7a75a878  189,495    3,031,920 System.Collections.Specialized.NameObjectCollectionBase+NameObjectEntry
65408b8c   47,914    3,066,496 System.Data.DataRow
663c1f78   80,093    3,524,092 System.Web.UI.WebControls.TableStyle
6640776c  226,668    3,626,688 System.Web.UI.Pair
66412f04  140,180    3,925,040 System.Web.UI.WebControls.ListItem
663eb1d0  278,998    4,463,968 System.Web.UI.AttributeCollection
654359c8    3,203    4,682,276 System.Data.RBTree`1+Node[[System.Int32, mscorlib]][]
663c3f04  397,066    4,764,792 System.Web.UI.WebControls.FontInfo
663bb498  133,543    4,807,548 System.Web.UI.WebControls.TableRow+CellControlCollection
663d5cb0   64,087    5,126,960 System.Web.UI.WebControls.HyperLink
1c1165bc   51,295    5,129,500 SomeControlLibrary.WebControls.SomeControl
79102290  551,137    6,613,644 System.Int32
663d6ad4   78,635    6,919,880 System.Web.UI.WebControls.Table
79104368  312,558    7,501,392 System.Collections.ArrayList
66411ac8  390,338    9,368,112 System.Web.UI.WebControls.Unit
663f2408  112,314    9,434,376 System.Web.UI.WebControls.Image
66401a78  251,127   11,049,588 System.Web.UI.WebControls.Style
663ec8e8  133,546   11,217,864 System.Web.UI.WebControls.TableRow
663dd2b0  259,638   11,424,072 System.Web.UI.WebControls.TableItemStyle
663bd0a0  149,928   11,994,240 System.Web.UI.WebControls.Label
664140a4  263,577   15,814,620 System.Web.UI.LiteralControl
7912d9bc   42,506   16,614,336 System.Collections.Hashtable+bucket[]
663c7308  462,037   16,633,332 System.Web.UI.ControlCollection
6641194c 1,269,095  20,305,520 System.Web.UI.StateBag
663d7328  341,364   28,674,576 System.Web.UI.WebControls.TableCell
7a7580d0 1,494,283  29,885,660 System.Collections.Specialized.HybridDictionary
6641f33c  751,897   33,083,468 System.Web.UI.Control+OccasionalFields
7a75820c 1,309,784  36,673,952 System.Collections.Specialized.ListDictionary
663c1de8 2,427,266  38,836,256 System.Web.UI.StateItem
7912d8f8 1,016,027  47,522,388 System.Object[]
7a7582d8 2,671,736  53,434,720 System.Collections.Specialized.ListDictionary+DictionaryNode
000e1b50    6,519   69,179,268      Free
790fd8c4 1,674,890  213,206,880 System.String

And if you run !dumpheap -type aspx or !dumpheap -type ascx to list all the aspx/ascx pages on the heap you will likely find many of them

0:034> !dumpheap -type aspx
...
Statistics:
      MT    Count    TotalSize Class Name
1d32425c        8        5,088 ASP.register_aspx
1b39666c      175       86,800 ASP.login_aspx
1ceab19c      199       93,928 ASP.products_aspx
1c15b6cc      280      222,880 ASP.home_aspx
1cea17ac      532      261,744 ASP.default_aspx
Total 1,194 objects, Total size: 670,440

The next step is to take any of these pages, preferably the ones that are most plentiful like ASP.default_aspx in this case, and !gcroot some of them to figure out why they are still around

0:034> !dumpheap -mt 1cea17ac     
Using our cache to search the heap.
   Address         MT     Size  Gen
...
040d0eb4 1cea17ac      492    2 ASP.default_aspx
04110d38 1cea17ac      492    2 ASP.default_aspx
0412b918 1cea17ac      492    2 ASP.default_aspx
04201758 1cea17ac      492    2 ASP.default_aspx
04213ea8 1cea17ac      492    2 ASP.default_aspx
0421943c 1cea17ac      492    2 ASP.default_aspx
042496bc 1cea17ac      492    2 ASP.default_aspx
042ae71c 1cea17ac      492    2 ASP.default_aspx
042ec8f4 1cea17ac      492    2 ASP.default_aspx
043310b4 1cea17ac      492    2 ASP.default_aspx
043c7f74 1cea17ac      492    2 ASP.default_aspx
0442d27c 1cea17ac      492    2 ASP.default_aspx
0453cb08 1cea17ac      492    2 ASP.default_aspx
04617168 1cea17ac      492    2 ASP.default_aspx
...
 

0:034> !gcroot 042ae71c
...
ESP:1fbf894:Root:  109420b0(System.Threading._TimerCallback)->
  10942068(System.Threading.TimerCallback)->
  10940004(System.Web.Caching.CacheExpires)->
  10940024(System.Object[])->
  109407b4(System.Web.Caching.ExpiresBucket)->
  06f05928(System.Web.Caching.ExpiresPage[])->
  06f059ac(System.Web.Caching.ExpiresEntry[])->
  04616834(System.Web.Caching.CacheEntry)->
  04616804(System.Web.SessionState.InProcSessionState)->
  0e474dd4(System.Web.SessionState.SessionStateItemCollection)->
  0e474e34(System.Collections.Hashtable)->
  0e481110(System.Collections.Hashtable+bucket[])->
  04616698(System.Collections.Specialized.NameObjectCollectionBase+NameObjectEntry)->
  045ea5d8(System.Web.UI.WebControls.DropDownList)->
  042ae71c(ASP.default_aspx)->

If we follow the root chain from the bottom up we see that ASP.default_aspx is a member variable of the DropDownList (the _page member variable) and that this in turn is a member variable of a NameObjectEntry in a hash table in session state.  The NameObjectEntry is a key-value pair with the key and the value of a session variable, so through this we can gather that we are storing the DropDownlist in session scope.

If we look at the NameObjectEntry we can also see exactly which session variable it is (LstStates in this case)

0:034> !do 04616698
Name: System.Collections.Specialized.NameObjectCollectionBase+NameObjectEntry
MethodTable: 7a75a878
EEClass: 7a7c51c4
Size: 16(0x10) bytes
GC Generation: 2
(C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
790fd8c4  400117c        4        System.String  0 instance 04968898 Key
790fd0f0  400117d        8        System.Object  0 instance 045ea5d8 Value
0:034> !do 04968898
Name: System.String
MethodTable: 790fd8c4
EEClass: 790fd824
Size: 38(0x26) bytes
GC Generation: 2
(C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: LstStates
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
79102290  4000096        4         System.Int32  1 instance       11 m_arrayLength
79102290  4000097        8         System.Int32  1 instance       10 m_stringLength
790ff328  4000098        c          System.Char  1 instance       4c m_firstChar
790fd8c4  4000099       10        System.String  0   shared   static Empty
    >> Domain:Value  000db310:790d884c 00109200:790d884c <<
7912dd40  400009a       14        System.Char[]  0   shared   static WhitespaceChars
    >> Domain:Value  000db310:109203f4 00109200:109242f8 <<
 

In summary, be careful with what you store away,

Laters,

Tess

The guys at Oredev are in the midst of preparing for the next Oredev conference in November.  Sadly I won't be presenting there this year because of other comittments. I say sadly because it was an awesome event last year.

If you want to check out my presentation from last year you can do that here.  The page says "Due to technical problems with our video equipment, this session was never recorded" but that is incorrect. The video is there...   Disclaimer:  I had some issues with the mike not fitting properly so it starts off a little bit shaky:) but hopefully you can oversee the presentation skills and look at the content instead

There are also plenty of other videos from the other presentations here.

After having seen a lot of really good presentations, like Scott Hanselmans MVC presentation, I realize I have a lot to learn as far as presentation skills go, and next time I present I will for sure use Mark Russinovich zoomit tool to make the commands I use more readable on the screen.

Speaking of Mark, if you haven't seen it already, I have to recommend his presenation on troubleshooting with process explorer,  it's called "The case of the Unexplained..."  I just watched it the other day and it's an eye opener.      

If you are curious about his title "Technical Fellow", it's a very prestigious title among Microsoft employees, it's awarded to a selected few and he is in the good company of other people like Patrick Dussud (among other things he is the GC architect).    

Laters,

Tess

I started this blog 2.5 years ago today, mostly because I felt that the same types of issues came up over and over and over in our support cases. I figured that if I started writing about them, a lot of people would be able to resolve them on their own, or even better avoid them in the first place.

A lot of water passed under the bridge since then, but looking back at some of those earlier posts they are still very applicable today, and they still seem to continue to get a lot of hits.  Here is a list of the 21 most popular ones...

  1. ASP.NET 2.0 Crash case study: Unhandled exceptions
  2. ASP.NET Case Study: Lost session variables and appdomain recycles
  3. ASP.NET Memory: If your application is in production… then why is debug=true
  4. .NET Memory Leak Case Study: The Event Handlers That Made The Memory Baloon
  5. ASP.NET Performance Case Study: Web Service calls taking forever
  6. .NET Memory usage - A restaurant analogy
  7. .NET Garbage Collector PopQuiz - Followup
  8. .Net memory leak: Unblock my finalizer
  9. .NET Hang Debugging Walkthrough
  10. ASP.NET Case Study: Bad perf, high memory usage and high CPU in GC - Death By ViewState
  11. ASP.NET Crash: Bad CacheItemRemovedCallback - Part II
  12. ASP.NET 2.0 - Investigating .NET Exceptions with WinDbg (Compilation and Load Exceptions)
  13. .NET Memory Leak: XmlSerializing your way to a Memory Leak
  14. .NET Garbage Collection PopQuiz
  15. A .NET Crash: How not to write a global exception handler
  16. Things to ignore when debugging an ASP.NET hang
  17. Are you aware that you have thrown over 40,000 exceptions in the last 3 hours?
  18. Short note on some debugging related stuff...
  19. ASP.NET Memory Leak Case Study: Sessions Sessions Sessions…
  20. ASP.NET Case Study: High CPU in GC - Large objects and high allocation rates
  21. A Case of Invalid Viewstate

In 2.5 more years I'll post the answer to the meaning of life, the universe and everything:)

and..., while I'm on the topic... what would you like to see more of in the future?

Tess

If you have read any of my posts you have probably noticed that I am very partial to windbg and the debugging tools for windows.  I often get friendly nudges from the developers of debugdiag when I suggest using adplus and windbg on internal discussion lists, and to be fair I have to beat on the drum a bit for debug diag as well.

My answer to the question "Should I use DebugDiag 1.1 or Windbg" is both... it just depends on the scenario.  I often lean towards windbg but to a large extent that is because that is what I use all the time, so in many cases where both fit the bill equally well I simlply haven't invested the time to see how it can be done with debug diag and therefore I suggest what I know works.

Before I start the comparison, I just want to mention that dumps created with debug diag can be analysed in windbg and vice versa.  They use the same APIs and create the same types of memory dumps.  Where they differ largely is how you configure them to gather these dumps and logs, how you analyse them in the different tools and by the fact that DebugDiag has a nice feature that allows you to monitor memory leaks in the process.

I personally use debug diag to gather dumps for native memory leaks, and I use it to analyse asp and other native issues in conjunction with windbg.  For everything else I use windbg because a) I like the logs that adplus creates and b) I like the fact that I have full control and can execute any commands I want.

Having said this I know that it is much easier to gather dumps in debug diag, and it has excellent automated analysis features which is great if you don't do post-mortem debugging on a daily bases, so I would strongly recommend trying both and making up your own mind about which one fits your style best.

 

Where do you get the tools?

Debug diag can be downloaded and installed here.

Windbg comes with the debugging tools for windows which you can download here.

Installation on production machines

Windbg

Windbg and the debugging tools for windows can also be installed on the production machine through the msi downloaded above.  If this is not allowed you can install the package on a developer machine and copy the "debugging tools for windows" folder to the production machine so no installation neccessary.  The installation package creates a few registry keys etc. that are used for versioning control, saving workspaces etc. but they are not neccesary to run the tools.

Windbg comes with a vbscript file called adplus.vbs that you use to automate taking memory dumps.

Debug diag

Debug diag needs to be installed on the production machine through the msi downloaded above.  This can sometimes prove to be a problem in cases where you are not allowed to install external tools on the server. 

Symbol setup

Both windbg and debugdiag will use the _NT_SYMBOL_PATH environment variable. 

Windbg

If you don't have this set you can set the symbol path in windbg through the userinterface file/symbol file path...  or using the command .sympath.  In adplus config files you would set the symbol path by adding .sympath "path" to the <PreCommands> section.

Note: you only need to set up symbols if you need to show callstacks or set breakpoints on certain functions so for most cases you will not need to set up a symbol path for adplus config files.

Debug diag

In debug diag you have two options for symbol paths (you set this up in the user interface through tools/options and settings)

The "Symbol Search Path" for Analysis is used when you run a dump through an analysis script.  The "Symbol Search Path for Debugging" is used when you gather dumps, set up breakpoints and for leaktracking.  It is very important that you set up a proper symbol path if you do leaktracking in order to get correct data.

I am not going to go into what symbols are and how they are used in any more depth as this is not what this post is about, but if you are interested, have a look at this post 

Gathering memory dumps for hangs

Windbg

For windbg/adplus you would open a command prompt, browse to the debugging tools for windows folder and run

adplus -hang -pn w3wp.exe   to get a memory dump of all w3wp.exe processes

adplus -hang -p 1244           to get a memory dump of the process with PID 1244

adplus -hang -IIS                to get a dump of all IIS related processes like w3wp.exe, inetinfo.exe, aspnet_wp.exe and dllhost.exe

You can also get a memory dump directly from windbg if you attach to the process and run .dump /ma "path to store dump"

For more info see this post on gathering dumps with adplus

Debug diag (copied from the debug diag instructions)

To debug a process hang, use one of the following:

  1. Create a hang rule. (this feature is available only for IIS processes) . The hang monitoring feature for IIS processes is designed to troubleshoot performance issues when users browse to pages in a web application and the pages take a long time to respond, or do not respond at all. During the hang rule creation wizard, specify the target processes to dump and a URL to be monitored along with a “Ping” interval and a response “timeout” The tools will send a request to the server at the configured intervals, and if the server does not respond before the configured timeout, a memory dump will be generated for the processes specified process(es).
  2. Create a manual memory dump during the slow or hang state by right-clicking the process name in the processes view and choosing the “Create Full Userdump” option.

Gathering memory dumps for crashes

Windbg

For windbg/adplus you would open a command prompt, browse to the debugging tools for windows folder and run

adplus -crash -pn w3wp.exe   to get a memory dump of all w3wp.exe processes when a crash occurs

adplus -crash -p 1244           to get a memory dump of the process with PID 1244

adplus -crash -IIS                to get a dump of all IIS related processes like w3wp.exe, inetinfo.exe, aspnet_wp.exe and dllhost.exe

Debug diag (copied from the debug diag instructions)

A process crash is usually indicative of an unhandled exception occurring in a process or code running in a process that actively terminates the process. To debug a process crash, start by creating a crash rule against the process(s) in question. Similar to previous debuggers, DebugDiag will attach to a specific process (s) and will monitor the process for multiple types of exceptions or any custom breakpoints that cause the process(s) to terminate unexpectedly. When the crash occurs, a full memory dump file will be created, in the directory specified when setting up the crash rule.

Analysing hangs and crashes

Debugdiag has some nice automated scripts (under the advanced analysis tab) that go through the dump and identify the most common causes for hangs and crashes. 

The analysis produces an HTML page with all the callstacks for the threads, info about http requests etc. and for ASP it also produces the ASP callstack.  In addition it gives you information about what it thinks is the problem.

Windbg on the other hand does not have any automated analysis features so you have to manually look at the threads, stacks and other data to figure out what happened.

Why would anyone in their right mind use windbg for hang analysis then???? 

1. Sometimes you need more information than just the callstack so you need to deepdive in the stacks, parameters, local variables etc.

2. Debug diag will not produce managed stacks.  It is in the works, and sometimes it will produce managed stacks but not always.

3. It takes a while for debugdiag to produce the report, and although this is not an issue for most people (waiting 10 mins for dump analysis), i use windbg in some cases because I can identify most common causes by looking at the stacks faster than running debug diag, and I know that I will need to dive deeper anyways, so it is a matter of effectivity.  If you don't work with post-mortem debugging on a daily bases though debug diag is unbeatable since it analyses the dump for you.

Gathering memory dumps for memory leaks and analysing the data

If you have a native memory leak you should definitely use leaktracking with debugdiag.

From the debug diag instructions

To debug memory and handle usage, use one of the following:

  1. Create a leak rule against the process in question. The leak monitoring feature will track memory allocations inside the process. Tracking is implemented by injecting a DLL (leaktrack.dll) into the specified process and monitoring memory allocations over time. When configuring a memory and handle leak rule, you can specify memory dump generation based on time or memory usage.
  2. Using the “processes” view, right-click the process in question and select the “monitor for leaks” option. When the process has grown to the suspected problem size, manually dump the process by right-clicking on the same process in the processes view and choosing the “Create Full Userdump” option.

Note that leaktracking needs to be enabled while you are leaking memory, preferably from right after the start of the process in order to get the most out of the data.  When you analyze the memory leak you will get information about how much you have allocated and not released while leaktracking was turned on.  You will also get sample stacks showing you who allocated this data.

In order to avoid getting info about startup data, leaktracker does not track during the first 15 minutes.  You can change this behavior in tools/options and settings by checking the checkbox "Record call stacks immediately when monitoring for leaks"

 

If you have a managed (.net) memory leak, leaktracking will not be very useful.  If you use leaktracking for a .net leak you will simply get a message that mscorwks.dll has allocated most of the memory and that you should follow up with the author of mscorwks.dll which happens to be Microsoft.

The reason you get this is because mscorwks, the core of the CLR will allocate memory in chunks for you to allocate .net objects from.  If you are "leaking" .net objects, the GC allocates more heap segments so the in the native world it looks like mscorwks is leaking.  In reality you need to go in and look at what is on the heaps to find out why there is so much, preferably with windbg.exe

 

If you have a memory dump that did not have leaktracking enabled, you can still analyse it using the memory analysis script in debug diag, it will tell you some data like how much is allocated, what heaps they are on etc. but of course it can not give you the call stacks since they were not recorded in the process.

Gathering exception information (.net)

I will write a post shortly on how to gather .net exception information in debug diag.  In short, you create a crash rule and set up exception breakpoints.

In windbg you would set up an adplus configuration file, to either log call stacks for all .net exceptions or a specific .net exceptions, or alternatively create dumps for a specific exception depending on your needs.   The details on how to set this up can be found here.

 

Laters y'all

Tess

Issues caused by frequent application restarts are pretty common. If you are interested in what causes application restarts in general, and how you can monitor application restarts and what causes them you should read this post:

ASP.NET Case Study: Lost session variables and appdomain recycles

The most recent one was one where the ASP.NET site would respond slowly at regular intervals (read: it grinded to a halt every few hours). 

First shot at gathering data:

We started off by getting a memory dump with debug diag when the process was responding slowly.  Unfortunately at the time that the dump was taken no requests were executing (i.e. ~* e !clrstack showed no managed stacks and ~* kb showed all the threads in their idle states), so we didn't get much of a clue as to why the process was responding so slowly. 

If you are debugging a performance issue it is key to try to get dumps while there are requests executing since memory dumps are just snapshots of the process so other than what is on the heap you really don't get any history in a memory dump.

Luckily one of my colleagues (who was looking at the issue before me) had noticed that there were an extreme number of exceptions in the dump.  Perhaps I shouldn't call it luck really...  if we get memory dumps that at first sight seem to not be all that useful we will still try to poke around in them to see if we can gather any clues at all as to what is going on, in order to make a more targeted approach in the next round of collecting data.

Either case, running !dumpheap -type *Exception revealed that there had recently been a lot (~1000) of TypeLoadExceptions

Exception MethodTable: 791241c0
Exception object: 02d6c4bc
Exception type: System.TypeLoadException
Message: Could not load type 'System.Web.UI.WebControls.WebParts.Content' from assembly 'System.Web, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a'.
InnerException: 
StackTrace (generated):
    SP       IP       Function
    00000000 00000001 System.Reflection.Assembly._GetType(System.String, Boolean, Boolean)
    1D0AE8D8 661680FB System.Web.UI.NamespaceTagNameToTypeMapper.System.Web.UI.ITagNameToTypeMapper.GetControlType(System.String, System.Collections.IDictionary)
	 

So the first working theory was that the exceptions themselves caused the poor performance, or at least had something to do with the poor performance.

Figuring out the cause of the TypeLoadExceptions

This exception tells us that someone is trying to create a System.Web.UI.WebControls.WebParts.Content object, but it can't because that type does not exist in System.Web.dll (the assembly that implements the System.Web.UI.WebControls.WebParts namespace). 

I looked up the System.Web.UI.WebControls.WebParts namespace on msdn and sure enough it didn't contain any Content class.  The only content class I could think of that was even remotely related was the one used for <asp:content> </asp:content> that you use with master pages.

I started playing around a bit with a simple application with master pages and found that if I browse to my application, and then add a new page with a master page and asp:content tags I will get this exception when this page compiles.

I know this is a bit of a mental leap, but I figured it probably had something to do with this tag and its namespace mappings so I ran !dumpheap -stat I found a type called System.Web.UI.TagNamespaceRegisterEntry.  I had 3 of them on the heap for my simple sample each with a tag prefix, namespace and assembly name

0:005> !dumpheap -type System.Web.UI.TagNamespaceRegisterEntry
Loading the heap objects into our cache.
------------------------------
Heap 0
   Address         MT     Size
02f04df8 663b6148       28    1 System.Web.UI.TagNamespaceRegisterEntry 
02f04e14 663b6148       28    1 System.Web.UI.TagNamespaceRegisterEntry 
02f04f3c 663b6148       28    1 System.Web.UI.TagNamespaceRegisterEntry 
total 3 objects
------------------------------
Heap 1
   Address         MT     Size
total 0 objects
------------------------------
total 3 objects
Statistics:
      MT    Count    TotalSize Class Name
663b6148        3           84 System.Web.UI.TagNamespaceRegisterEntry
Total 3 objects, Total size: 84
0:005> !do 02f04df8 Name: System.Web.UI.TagNamespaceRegisterEntry MethodTable: 663b6148 EEClass: 663b60c8 Size: 28(0x1c) bytes GC Generation: 1 (C:\WINDOWS\assembly\GAC_32\System.Web\2.0.0.0__b03f5f7f11d50a3a\System.Web.dll) Fields: MT Field Offset Type VT Attr Value Name 790fd8c4 4002029 4 System.String 0 instance 00000000 _virtualPath 79102290 400202a 8 System.Int32 1 instance 0 _line 790fd8c4 400202b c System.String 0 instance 02f04bc8 _tagPrefix 790fd8c4 400202c 10 System.String 0 instance 02f04be0 _ns 790fd8c4 400202d 14 System.String 0 instance 02f04c24 _assemblyName

I dumped out all 3 of them and found the following mappings

asp -> System.Web.UI.WebControls.WebParts
asp -> System.Web.UI.WebControls
mobile -> System.Web.UI.MobileControls

So, the tagprefix asp: maps to both WebControls.WebParts which explains why it tries to look in the WebParts namespace for Content.  Since the process doesn't fail because of this, but rather just continues happily delivering the right Content object it also means that this exception is benign, i.e. it'll just try this first, if it fails it continues looking in the WebControls namespace.

However... we still haven't answered the most important questions

1. Why are there so many of them?  In my little sample I only get a new one if i modify a page so that it has to be recompiled.  If I compile normally I will only get one independently of how many pages I have.

2. Why would it be compiling these pages at regular intervals, especially a few hours into the life of the process, all pages should be compiled by then

The only thing that seemed to fit was if perhaps the appdomain was recycling as this would answer both these questions...     

Figuring out if the appdomain is recycling

I dumped out the System.Web.HttpRuntimes (!dumpheap -type System.Web.HttpRuntime).  There is one per application loaded in the process.

The HttpRuntime object has a member variable called _firstRequestStartTime and the expected start time should be some time around the start of the process.  If the _firstRequestStartTime is within the last few minutes, that means