If broken it is, fix it you should

Using the powers of the debugger to solve the problems of the world - and a bag of chips    by Tess Ferrandez, ASP.NET Escalation Engineer (Microsoft)

ASP.NET Performance issue: Large number of application restarts due to virus scanning

ASP.NET Performance issue: Large number of application restarts due to virus scanning

Rate This
  • Comments 12

Issues caused by frequent application restarts are pretty common. If you are interested in what causes application restarts in general, and how you can monitor application restarts and what causes them you should read this post:

ASP.NET Case Study: Lost session variables and appdomain recycles

The most recent one was one where the ASP.NET site would respond slowly at regular intervals (read: it grinded to a halt every few hours). 

First shot at gathering data:

We started off by getting a memory dump with debug diag when the process was responding slowly.  Unfortunately at the time that the dump was taken no requests were executing (i.e. ~* e !clrstack showed no managed stacks and ~* kb showed all the threads in their idle states), so we didn't get much of a clue as to why the process was responding so slowly. 

If you are debugging a performance issue it is key to try to get dumps while there are requests executing since memory dumps are just snapshots of the process so other than what is on the heap you really don't get any history in a memory dump.

Luckily one of my colleagues (who was looking at the issue before me) had noticed that there were an extreme number of exceptions in the dump.  Perhaps I shouldn't call it luck really...  if we get memory dumps that at first sight seem to not be all that useful we will still try to poke around in them to see if we can gather any clues at all as to what is going on, in order to make a more targeted approach in the next round of collecting data.

Either case, running !dumpheap -type *Exception revealed that there had recently been a lot (~1000) of TypeLoadExceptions

Exception MethodTable: 791241c0
Exception object: 02d6c4bc
Exception type: System.TypeLoadException
Message: Could not load type 'System.Web.UI.WebControls.WebParts.Content' from assembly 'System.Web, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a'.
InnerException: 
StackTrace (generated):
    SP       IP       Function
    00000000 00000001 System.Reflection.Assembly._GetType(System.String, Boolean, Boolean)
    1D0AE8D8 661680FB System.Web.UI.NamespaceTagNameToTypeMapper.System.Web.UI.ITagNameToTypeMapper.GetControlType(System.String, System.Collections.IDictionary)
	 

So the first working theory was that the exceptions themselves caused the poor performance, or at least had something to do with the poor performance.

Figuring out the cause of the TypeLoadExceptions

This exception tells us that someone is trying to create a System.Web.UI.WebControls.WebParts.Content object, but it can't because that type does not exist in System.Web.dll (the assembly that implements the System.Web.UI.WebControls.WebParts namespace). 

I looked up the System.Web.UI.WebControls.WebParts namespace on msdn and sure enough it didn't contain any Content class.  The only content class I could think of that was even remotely related was the one used for <asp:content> </asp:content> that you use with master pages.

I started playing around a bit with a simple application with master pages and found that if I browse to my application, and then add a new page with a master page and asp:content tags I will get this exception when this page compiles.

I know this is a bit of a mental leap, but I figured it probably had something to do with this tag and its namespace mappings so I ran !dumpheap -stat I found a type called System.Web.UI.TagNamespaceRegisterEntry.  I had 3 of them on the heap for my simple sample each with a tag prefix, namespace and assembly name

0:005> !dumpheap -type System.Web.UI.TagNamespaceRegisterEntry
Loading the heap objects into our cache.
------------------------------
Heap 0
   Address         MT     Size
02f04df8 663b6148       28    1 System.Web.UI.TagNamespaceRegisterEntry 
02f04e14 663b6148       28    1 System.Web.UI.TagNamespaceRegisterEntry 
02f04f3c 663b6148       28    1 System.Web.UI.TagNamespaceRegisterEntry 
total 3 objects
------------------------------
Heap 1
   Address         MT     Size
total 0 objects
------------------------------
total 3 objects
Statistics:
      MT    Count    TotalSize Class Name
663b6148        3           84 System.Web.UI.TagNamespaceRegisterEntry
Total 3 objects, Total size: 84
0:005> !do 02f04df8 Name: System.Web.UI.TagNamespaceRegisterEntry MethodTable: 663b6148 EEClass: 663b60c8 Size: 28(0x1c) bytes GC Generation: 1 (C:\WINDOWS\assembly\GAC_32\System.Web\2.0.0.0__b03f5f7f11d50a3a\System.Web.dll) Fields: MT Field Offset Type VT Attr Value Name 790fd8c4 4002029 4 System.String 0 instance 00000000 _virtualPath 79102290 400202a 8 System.Int32 1 instance 0 _line 790fd8c4 400202b c System.String 0 instance 02f04bc8 _tagPrefix 790fd8c4 400202c 10 System.String 0 instance 02f04be0 _ns 790fd8c4 400202d 14 System.String 0 instance 02f04c24 _assemblyName

I dumped out all 3 of them and found the following mappings

asp -> System.Web.UI.WebControls.WebParts
asp -> System.Web.UI.WebControls
mobile -> System.Web.UI.MobileControls

So, the tagprefix asp: maps to both WebControls.WebParts which explains why it tries to look in the WebParts namespace for Content.  Since the process doesn't fail because of this, but rather just continues happily delivering the right Content object it also means that this exception is benign, i.e. it'll just try this first, if it fails it continues looking in the WebControls namespace.

However... we still haven't answered the most important questions

1. Why are there so many of them?  In my little sample I only get a new one if i modify a page so that it has to be recompiled.  If I compile normally I will only get one independently of how many pages I have.

2. Why would it be compiling these pages at regular intervals, especially a few hours into the life of the process, all pages should be compiled by then

The only thing that seemed to fit was if perhaps the appdomain was recycling as this would answer both these questions...     

Figuring out if the appdomain is recycling

I dumped out the System.Web.HttpRuntimes (!dumpheap -type System.Web.HttpRuntime).  There is one per application loaded in the process.

The HttpRuntime object has a member variable called _firstRequestStartTime and the expected start time should be some time around the start of the process.  If the _firstRequestStartTime is within the last few minutes, that means that the application has restarted very recently.

0:005> !do 0x6eb2d14
Name: System.Web.HttpRuntime
MethodTable: 6639a67c
EEClass: 6639a60c
Size: 152(0x98) bytes
GC Generation: 2
 (C:\WINDOWS\assembly\GAC_32\System.Web\2.0.0.0__b03f5f7f11d50a3a\System.Web.dll)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
791132b4  40010f8        4 ...amedPermissionSet  0 instance 00000000 _namedPermissionSet
6639b12c  40010f9        8 ...ileChangesMonitor  0 instance 06eb3218 _fcm
6639e2c8  40010fa        c ...ing.CacheInternal  0 instance 06eb4d74 _cacheInternal
6639d878  40010fb       10 ...Web.Caching.Cache  0 instance 06eb4c8c _cachePublic
7910be50  40010fc       74       System.Boolean  1 instance        0 _isOnUNCShare
6639ad78  40010fd       14 ...Web.Util.Profiler  0 instance 06eb2dac _profiler
6639ae78  40010fe       18 ...estTimeoutManager  0 instance 06eb2dc8 _timeoutManager
663aebe4  40010ff       1c ....Web.RequestQueue  0 instance 02ed438c _requestQueue
7910be50  4001100       75       System.Boolean  1 instance        0 _apartmentThreading
7910be50  4001101       76       System.Boolean  1 instance        1 _processRequestInApplicationTrust
7910be50  4001102       77       System.Boolean  1 instance        0 _beforeFirstRequest
7910c878  4001103       84      System.DateTime  1 instance 06eb2d98 _firstRequestStartTime
7910be50  4001104       78       System.Boolean  1 instance        1 _firstRequestCompleted
7910be50  4001105       79       System.Boolean  1 instance        0 _userForcedShutdown
...

It's a bit tricky to convert the time (a value type) to an actual time stamp, but it goes like this...

dq <address of firstrequeststarttime>  (dump the content of the start time as a quad word to get the number of ticks.)
.formats <quad word>&&0x3FFFFFFFFFFFFFFF  (run .formats on the number of ticks with the 2 highest bits filtered out)

0:005> dq 06eb2d98
06eb2d98  48ca8304`65439a9d 00000000`00000000
06eb2da8  6639ad78`00000000 0000000a`02ed4c5c
06eb2db8  01000000`00000000 00000000`00000000
06eb2dc8  06eb2dec`6639ae78 00000000`06eb2f54
06eb2dd8  00000000`00000005 00000000`08f0d180
06eb2de8  7912d8f8`00000000 6639af40`0000000d
06eb2df8  06eb2e44`06eb2e30 06eb2e6c`06eb2e58
06eb2e08  06eb2e94`06eb2e80 06eb2ebc`06eb2ea8

0:005> .formats 48ca8304`65439a9d&0x3FFFFFFFFFFFFFFF 
Evaluate expression:
  Hex:     08ca8304`65439a9d
  Decimal: 633462752501013149
  Octal:   0043124060214520715235
  Binary:  00001000 11001010 10000011 00000100 01100101 01000011 10011010 10011101
  Chars:   ....eC..
  Time:    Tue May 13 13:34:10.101 3608 (GMT+2)
  Float:   low 5.77321e+022 high 1.21882e-033
  Double:  2.56941e-266

The first request for this appdomain came in at 13:34:10, now we can compare this to the current time

0:005> .time
Debug session time: Tue May 13 13:35:47.210 2008 (GMT+2)
System Uptime: 33 days 11:04:15.423
Process Uptime: 0 days 5:42:08.328
  ...

And we find that the appdomain restarted less than a minute ago, so it is definitely a case of process restarts.

Why did the appdomain restart?

After adding logging per the "ASP.NET Case Study: Lost session variables and appdomain recycles" article the following showed up in the eventlog

_shutDownStack:    at System.Environment.GetStackTrace(Exception e, Boolean needFileInfo)
   at System.Environment.get_StackTrace()
   at System.Web.Hosting.HostingEnvironment.InitiateShutdownInternal()
   at System.Web.Hosting.HostingEnvironment.InitiateShutdown()
   at System.Web.HttpRuntime.ShutdownAppDomain(String stackTrace)
   at System.Web.HttpRuntime.OnCriticalDirectoryChange(Object sender, FileChangeEvent e)
   at System.Web.FileChangesMonitor.OnCriticaldirChange(Object sender, FileChangeEvent e)
   at System.Web.DirectoryMonitor.FireNotifications()
   at System.Web.Util.WorkItem.CallCallbackWithAssert(WorkItemCallback callback)
   at System.Web.Util.WorkItem.OnQueueUserWorkItemCompletion(Object state)
   at System.Threading._ThreadPoolWaitCallback.WaitCallback_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallbackInternal(_ThreadPoolWaitCallback tpWaitCallBack)
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback(Object state)
_shutDownMessage: Change Notification for critical directories.
bin dir change or directory rename
HostingEnvironment initiated shutdown
HostingEnvironment caused shutdown
Change in C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\Temporary ASP.NET Files\root\5fe0fce6\53087c60\hash\hash.web

The next step was to figure out who was touching the hash.web by monitoring the file with process monitor, and It turned out, it was antivirus scanning the file, which in turn triggered the filechangesmonitor to recycle the appdomain.

Solutions

Normally I would suggest looking into if it is possible to exclude the Temporary ASP.NET Files directory and subdirectories from automatic virus scanning as well as the web content directories. 

In 2.0 it is also possible to disable filechangesmonitoring through the HKLM\Software\Microsoft\ASP.NET\FCNMode key.  This KB article lists the different values that this can be set to,  a value of 1 will disable file changes monitoring.  The drawback of disabling this is that if you disable it, the application will not restart if you change web.config etc. so you need to carefully consider what you want to set this to.

  • Couldn't you also just turn off Last Access updates in the NTFS settings accomplish the same thing? That is the only thing the virus scanner is changing on the file, correct? (And who needs Last Access anyway?)

  • Thanks Mufasa,  I think you're absolutely correct

  • Wow!  I hope you realize that there are probably not 2 developers in a thousand that could do this level of debugging and analysis.  Thanks for sharing your knowledge and techniques.

  • And this is one of many reasons why you shouldn't be running virus scanners on production boxes.  Why wasn't the solution to uninstall the scanner?

    Very interesting and great debugging regardless.

  • I am not sure that I would recommend anyone to not run a virus scanner on a production box, but there as Mufasa mentioned you can turn off Last Access updates, and work with the virus scanner vendor to find out what you need to do to avoid updating attributes on files in the web content directories and the temporary asp.net files.

  • 【原文地址】 May 20th Links: ASP.NET, ASP.NET AJAX, .NET, Visual Studio, Silverlight, WPF 【原文发表日期】 Tuesday

  • Mijn verontschuldigingen voor de weinige posts de laatste weken. Mijn werk en leven zijn enorm de druk

  • I just read a post on If broken it is, fix it you should , where he&#39;s debugging a performance problem

  • Mes excuses pour le peu de publications au cours des dernières semaines – le travail et la vie en générale

  • Can you explain the difference between FCNMode=0 and FCNMode=2 in terms of functionality? We have hit the following problem: http://www.dotnetnuke.com/Community/Forums/tabid/795/forumid/-1/threadid/182071/scope/posts/Default.aspx

    and it seems like setting it to 2 solves it. But we are not quite sure what's the difference between them.

  • The difference between 0 and 2 is just how many objects are monitoring file change notifications. i.e. one per sub directory (0) or one object monitoring all directories (2)

    Without knowing the exact details I am assuming that the reason 2 solves the problem is probably because something in the setup you have causes the objects to lock up if you have multiple objects working as they would probably have to synchronize then...  

  • But the difference between 0 and 2, when set to 2 doesn't mean that the appdomain will restart whenever ANY file in all the subdirectories gets changed (does it)?

Page 1 of 1 (12 items)
Leave a Comment
  • Please add 6 and 2 and type the answer here:
  • Post