If broken it is, fix it you should

Using the powers of the debugger to solve the problems of the world - and a bag of chips    by Tess Ferrandez, ASP.NET Escalation Engineer (Microsoft)

ASP.NET Case Study: Lost session variables and appdomain recycles


Last night I got a question from one of the readers of the blog that went like this:

 

 

“We are facing a problem that i cannot understand, every now and than i see that my app domain is recycled (i have a log in the application_end), I check the IIS logs and i don't see a restart of IIS and i know that no one is changing configuration (web.config).

 

I wanted to know if you know of any way that i can pinpoint the reason for that app domain to die?

 

The application pool that i am using only have the recycle every 20 minutes of idle time enabled”

 

 

 

…and I thought that since I haven’t written for a while (due to a really nice and long vacation :) ) and this is a pretty common scenario, I would write a post on it…

 

Before we go into the details of how to figure out why it is recycling, I want to bring up two things:

 

  1. What happens when an application domain is recycled
  2. What are the reasons an application domain recycles

 

What happens when an application domain is recycled?

 

In ASP.NET each individual asp.net application resides in its own application domain, so for example if you have the following website structure

 

WebSite root
          /HrWeb
          /EmployeeServices
          /FinanceWeb
          /SalesWeb

 

…where each of the subwebs HrWeb, EmployeeServices etc. are set up as an application in the internet service manager, you will have the following application domains (appdomains) in your asp.net process

 

System Domain
Shared Domain
Default Domain
Root
HrWeb
EmployeeServices
FinanceWeb
SalesWeb

 

Apart from the first three domains (System Domain, Shared Domain and Default Domain), which are a bit special, each of the others contains the data pertinent to its application (note: this is a bit simplified for readability). Specifically, they contain these things worth noting…

 

  1. All the assemblies specific to that particular application
  2. A HttpRuntime object
  3. A Cache object

         

When the application domain is unloaded all of this goes away, which means that on the next request all assemblies need to be reloaded, the code has to be re-jitted, and the cache, including any in-proc session variables, is empty.  This can be a pretty big perf hit for the application, so as you can imagine it is important not to have the application domain recycle too often.

 

Why does an application domain recycle?

 

An application domain will unload when any one of the following occurs:

 

  • Machine.Config, Web.Config or Global.asax are modified
  • The bin directory or its contents are modified
  • The number of re-compilations (aspx, ascx or asax) exceeds the limit specified by the <compilation numRecompilesBeforeAppRestart=/> setting in machine.config or web.config  (by default this is set to 15)
  • The physical path of the virtual directory is modified
  • The CAS policy is modified
  • The web service is restarted
  • (2.0 only) Application Sub-Directories are deleted (see Todd’s blog http://blogs.msdn.com/toddca/archive/2006/07/17/668412.aspx for more info)

 

There may be some reasons in 2.0 that I have missed but hopefully this should cover most scenarios.

 

Specific issues

 

I want to pay a bit more attention to a few of these, which seem to be especially popular :)

 

Unexpected config or bin directory changes

 

You swear on all that is holy that no-one is touching these, but still when we start logging (as I’ll show later) the reason for the app domain recycle is a config change… how the heck can that be?

 

Elementary, Dr. Watson… something else is touching them… and that something else is usually virus scanning software, backup software or an indexing service.  These don’t actually modify the contents of the files, but many virus scanners etc. will modify file attributes, which is enough for the file change monitor to jump in and say “aha! something changed, better recycle the appdomain to pick up the changes”.

 

If you have a virus scanner that does this, you should probably consider removing the content directories from the real-time scan, of course after carefully making sure that no-one can access and add any virus software to these directories.

 

Web site updates while the web server is under moderate to heavy load

 

Picture this scenario:  You have an application with 10 assemblies in the bin directory  a.dll, b.dll, c.dll etc. (all with the version number 1.00.00). Now you need to update some of the assemblies to your new and improved version 1.00.12, and you do so while the application is still under heavy load because we have this great feature allowing you to update assemblies on the go…  well, think again...

 

Say you update 7 of the 10 assemblies and for simplicity let’s say this takes about 7 seconds, and in those 7 seconds you have 3 requests come in… then you may have a situation that looks something like this…

 

Sec 1.           a.dll and b.dll are updated to v 1.00.12    - appdomain unload started (any pending requests will finish before it is completely unloaded)
Sec 2.           Request1 comes in and loads a new appdomain with 2 out of 7 of the dlls updated
Sec 3.           c.dll is updated                            - appdomain unload started (any pending requests will finish before it is completely unloaded)
Sec 4.           d.dll is updated
Sec 5.           Request2 comes in and loads a new appdomain, now with 4 out of 7 dlls updated
Sec 6.           e.dll and f.dll are updated                 - appdomain unload started (any pending requests will finish before it is completely unloaded)
Sec 7.           g.dll is updated
Sec 8.           Request3 comes in and loads a new appdomain with all 7 dlls updated

 

So, many bad things happened here…

 

First off you had 3 application domain restarts while you probably thought you would only have one, because asp.net has no way of knowing when you are done.  Secondly we got a situation where Request1 and Request2 were executing with partially updated dlls, which may generate a whole new set of exceptions if the dlls depend on updates in the other new dlls, I think you get the picture…  And thirdly you may get exceptions like “Cannot access file AssemblyName because it is being used by another process” because the dlls are locked during shadow copying.  http://support.microsoft.com/kb/810281

 

In other words, don’t batch update during load…

 

So, is this feature completely worthless?  No… if you want to update one dll, none of the problems above occur… and if you update under low or no load you are not likely to run into any of the above issues, so in that case you save yourself an IIS restart… but if you want to update in bulk you should first take the application offline.

 

There is a way to get around it, if you absolutely, positively need to update under load, and it is outlined in the kb article mentioned above…

 

In 1.1 we introduced two new config settings called <httpRuntime waitChangeNotification= /> and <httpRuntime maxWaitChangeNotification= />. 

 

The waitChangeNotification indicates how many seconds we should wait for a new change notification before the next request triggers an appdomain restart. I.e. if we have a dll updated at second 1, and then a new one at second 3, and our waitChangeNotification is set to 5… we would wait until second 8 (first 1+5, and then changed to 3+5) before a new request would get a new domain, so a request at second 2 would simply continue using the old domain. (The time is sliding so it is always 5 seconds from the last change)

 

The maxWaitChangeNotification indicates the maximum number of seconds to wait from the first change notification. If we set this to 10 in the case where we update at second 1 and 3, we would still get a new domain if a request came in at second 8, since the waitChangeNotification had expired by then. If we set this to 6, however, we would get a new domain already if a request came in at second 7, since the maxWaitChangeNotification had then expired.  So this is an absolute expiration rather than a sliding one… and we will recycle at the earliest of the maxWaitChangeNotification and waitChangeNotification.
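To make the sliding and absolute deadlines concrete, here is a minimal sketch that just reproduces the arithmetic described above. It is not how ASP.NET implements this internally; the class and method names are made up for illustration, and it assumes all the changes belong to one batch and that the absolute deadline is measured from the first change, which is what the worked numbers above imply.

```csharp
using System;

// Hypothetical helper, for illustration only (not part of ASP.NET).
public static class ChangeNotificationSketch
{
    // Returns the second from which a new request would get a new appdomain,
    // given the seconds at which file changes were detected.
    public static int RestartAllowedAt(int[] changeSeconds, int waitChangeNotification, int maxWaitChangeNotification)
    {
        if (changeSeconds == null || changeSeconds.Length == 0)
        {
            throw new ArgumentException("At least one change is required.", "changeSeconds");
        }

        int firstChange = changeSeconds[0];
        int lastChange = changeSeconds[0];
        foreach (int second in changeSeconds)
        {
            firstChange = Math.Min(firstChange, second);
            lastChange = Math.Max(lastChange, second);
        }

        // Sliding: always counted from the most recent change.
        int slidingDeadline = lastChange + waitChangeNotification;

        // Absolute: counted from the first change, never pushed forward.
        int absoluteDeadline = firstChange + maxWaitChangeNotification;

        // Whichever expires first wins.
        return Math.Min(slidingDeadline, absoluteDeadline);
    }
}
```

With changes at seconds 1 and 3, RestartAllowedAt(new int[] { 1, 3 }, 5, 10) gives 8 and RestartAllowedAt(new int[] { 1, 3 }, 5, 6) gives 7, matching the two cases above.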

 

In the scenario at the beginning of this section we could have set the waitChangeNotification to 3 seconds and the maxWaitChangeNotification to 10 seconds for example to avoid the problems.
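As a config fragment, the values suggested above might look something like this. Treat it as a sketch: the numbers are just the example values from the text, and you would merge the attributes into whatever <httpRuntime> element your web.config already has.

```xml
<configuration>
  <system.web>
    <!-- Wait 3 seconds after the last change, but never more than
         10 seconds after the first change (example values only). -->
    <httpRuntime waitChangeNotification="3" maxWaitChangeNotification="10" />
  </system.web>
</configuration>
```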

 

(I know this explanation might have been a bit confusing but I hope you catch the drift)

 

A few things are important if you fiddle with these settings

 

  1. They default to 0 if not set
  2. maxWaitChangeNotification should always be >= waitChangeNotification
  3. If these settings are higher than 0 you will not see any changes until the changeNotifications expire. i.e. web.config changes and dll changes etc. will appear cached.

 

Re-compilations

 

A common scenario here is that you have a set of aspx pages (containing some news items and what not) and you have a content editor that goes in periodically and updates the news with some new articles or other new content.   Every time you update an aspx page it has to be recompiled, because again, asp.net has no way of knowing if it was a code update or just an update of some static text… all it knows is that someone updated the files.

 

If you have followed some of my previous posts you know that assemblies can not be unloaded unless the application domain is unloaded, and since each recompile would generate a new assembly there is a limit to how many recompiles you can do, to avoid generation of too many assemblies (and thus limiting the memory usage for these).  By default this limit is 15.
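For reference, the limit is set on the <compilation> element. A sketch of the fragment, shown here with the default value of 15 mentioned above (merge the attribute into your existing <compilation> element rather than adding a second one):

```xml
<configuration>
  <system.web>
    <!-- 15 is the default; a higher value trades fewer restarts
         for more stale assemblies kept in memory. -->
    <compilation numRecompilesBeforeAppRestart="15" />
  </system.web>
</configuration>
```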

 

If the contents of the page are constantly updated, I would recommend getting the content dynamically from a database or file rather than actually modifying the aspx pages, or alternatively using frames with HTML pages for this content.

 

How do you determine that you have application recycles?

 

If you experience cache or session loss, it is probably a good bet, but to make sure you can look at the perfmon counter ASP.NET v…/Application Restarts.

 

How do you determine what caused an appdomain restart?

 

In ASP.NET 2.0 you can use the built in Health Monitoring Events to log application restarts along with the reason for the restart.  To do this you change the master web.config file in the C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\CONFIG directory and add the following to the <healthMonitoring><rules> section

 

                <add name="Application Lifetime Events Default" eventName="Application Lifetime Events"
                    provider="EventLogProvider" profile="Default" minInstances="1"
                    maxLimit="Infinite" minInterval="00:01:00" custom="" />
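For orientation, this is roughly where that rule sits in the config file. It is only a sketch of the nesting: the comments stand in for the entries that are already there, which should be left in place.

```xml
<configuration>
  <system.web>
    <healthMonitoring>
      <!-- existing bufferModes, providers, profiles, eventMappings ... -->
      <rules>
        <!-- existing rules ... -->
        <add name="Application Lifetime Events Default" eventName="Application Lifetime Events"
             provider="EventLogProvider" profile="Default" minInstances="1"
             maxLimit="Infinite" minInterval="00:01:00" custom="" />
      </rules>
    </healthMonitoring>
  </system.web>
</configuration>
```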

 

For a web.config change this generates an event like so:

 

Event Type:  Information
Event Source:         ASP.NET 2.0.50727.0
Event Category:      Web Event
Event ID:      1305
Date:            2006-08-02
Time:           13:33:19
User:            N/A
Computer:    PRATHER
Description:
Event code: 1002
Event message: Application is shutting down. Reason: Configuration changed.
Event time: 2006-08-02 13:33:19
Event time (UTC): 2006-08-02 11:33:19
Event ID: 6fc2b84de5b74b5ba65b21804d18b7bf
Event sequence: 8
Event occurrence: 1
Event detail code: 50004

Application information:
    Application domain: /LM/w3svc/1/ROOT/DebuggerSamples-9-127989919076505325
    Trust level: Full
    Application Virtual Path: /DebuggerSamples
    Application Path: c:\inetpub\wwwroot\DebuggerSamples\
    Machine name: PRATHER

Process information:
    Process ID: 4876
    Process name: w3wp.exe
    Account name: NT AUTHORITY\NETWORK SERVICE

Custom event details:

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

 

 

There are a lot of nice events to capture and you can even write your own providers and events. To get more info about this and other events you can enable, you can check out this article: http://msdn2.microsoft.com/en-us/library/ms228103.aspx

 

For ASP.NET 1.1 you can make use of private reflection to get a hold of the shutdown message (this works in 2.0 as well btw, but I wanted to show you both ways).

 

If you are not interested in the details and just want to cut to the chase and log it, check out ScottGu’s blog http://weblogs.asp.net/scottgu/archive/2005/12/14/433194.aspx on how to do this (super nice with ready-to-go code samples that you just plug in to your app).

 

If you are like me and need to know every little detail of every little thing… here is how it’s done :)

 

As I mentioned before, each domain has a HttpRuntime object… 

 

0:014> !do 0x04f6f324
Name: System.Web.HttpRuntime
MethodTable 0x00e39df4
EEClass 0x0b608028
Size 116(0x74) bytes
GC Generation: 2
mdToken: 0x02000078  (c:\winnt\assembly\gac\system.web\1.0.5000.0__b03f5f7f11d50a3a\system.web.dll)
FieldDesc*: 0x00e3955c
        MT      Field     Offset                 Type       Attr      Value Name
0x00e39df4 0x4000680      0x4                CLASS   instance 0x00000000 _namedPermissionSet
0x00e39df4 0x4000681      0x8                CLASS   instance 0x01031904 _fcm
0x00e39df4 0x4000682      0xc                CLASS   instance 0x01031b64 _cache
0x00e39df4 0x4000683     0x54       System.Boolean   instance 0 _isOnUNCShare
0x00e39df4 0x4000684     0x10                CLASS   instance 0x01033c88 _profiler
0x00e39df4 0x4000685     0x14                CLASS   instance 0x01033ca4 _timeoutManager
0x00e39df4 0x4000686     0x18                CLASS   instance 0x0104ded4 _requestQueue
0x00e39df4 0x4000687     0x55       System.Boolean   instance 0 _apartmentThreading
0x00e39df4 0x4000688     0x56       System.Boolean   instance 0 _beforeFirstRequest
0x00e39df4 0x4000689     0x60            VALUETYPE   instance start at 0x010318c8 _firstRequestStartTime
0x00e39df4 0x400068a     0x57       System.Boolean   instance 1 _firstRequestCompleted
0x00e39df4 0x400068b     0x58       System.Boolean   instance 0 _userForcedShutdown
0x00e39df4 0x400068c     0x59       System.Boolean   instance 1 _configInited
0x00e39df4 0x400068d     0x50         System.Int32   instance 0 _activeRequestCount
0x00e39df4 0x400068e     0x5a       System.Boolean   instance 0 _someBatchCompilationStarted
0x00e39df4 0x400068f     0x5b       System.Boolean   instance 0 _shutdownInProgress
0x00e39df4 0x4000690     0x1c                CLASS   instance 0x00000000 _shutDownStack
0x00e39df4 0x4000691     0x20                CLASS   instance 0x00000000 _shutDownMessage
0x00e39df4 0x4000692     0x68            VALUETYPE   instance start at 0x010318d0 _lastShutdownAttemptTime
0x00e39df4 0x4000693     0x5c       System.Boolean   instance 1 _enableHeaderChecking
0x00e39df4 0x4000694     0x24                CLASS   instance 0x01033e44 _handlerCompletionCallback
0x00e39df4 0x4000695     0x28                CLASS   instance 0x01033e60 _asyncEndOfSendCallback
0x00e39df4 0x4000696     0x2c                CLASS   instance 0x01033e7c _appDomainUnloadallback
0x00e39df4 0x4000697     0x30                CLASS   instance 0x00000000 _initializationError
0x00e39df4 0x4000698     0x34                CLASS   instance 0x00000000 _appDomainShutdownTimer
0x00e39df4 0x4000699     0x38                CLASS   instance 0x0104dc60 _codegenDir
0x00e39df4 0x400069a     0x3c                CLASS   instance 0x00fc3c48 _appDomainAppId
0x00e39df4 0x400069b     0x40                CLASS   instance 0x00fc3ca4 _appDomainAppPath
0x00e39df4 0x400069c     0x44                CLASS   instance 0x00fc3d8c _appDomainAppVPath
0x00e39df4 0x400069d     0x48                CLASS   instance 0x00fc3d04 _appDomainId
0x00e39df4 0x400069e     0x4c                CLASS   instance 0x00000000 _resourceManager
0x00e39df4 0x400069f     0x5d       System.Boolean   instance 0 _debuggingEnabled
0x00e39df4 0x40006a0     0x5e       System.Boolean   instance 0 _vsDebugAttach
0x00e39df4 0x400067b        0                CLASS     shared   static _theRuntime
    >> Domain:Value 0x0014af68:NotInit  0x0017cd60:0x04f6f324 0x002165d0:0x04fcb660 <<
0x00e39df4 0x400067c      0x4                CLASS     shared   static s_autogenKeys
    >> Domain:Value 0x0014af68:NotInit  0x0017cd60:0x04f6ef28 0x002165d0:0x04fcb474 <<
0x00e39df4 0x400067d      0xc       System.Boolean     shared   static s_initialized
    >> Domain:Value 0x0014af68:NotInit  0x0017cd60:1 0x002165d0:1 <<
0x00e39df4 0x400067e      0x8                CLASS     shared   static s_installDirectory
    >> Domain:Value 0x0014af68:NotInit  0x0017cd60:0x04f6f19c 0x002165d0:0x04fcb4d8 <<
0x00e39df4 0x400067f     0x10       System.Boolean     shared   static s_isapiLoaded
    >> Domain:Value 0x0014af68:NotInit  0x0017cd60:1 0x002165d0:1 <<

 

 

The HttpRuntime object is a static, and to get to the particular HttpRuntime object for our domain we can dump out any HttpRuntime object and look at the _theRuntime static member…  static member variables look a little bit special when you dump with !do… instead of getting the address straight away you get a list like this:

 

0x00e39df4 0x400067b        0                CLASS     shared   static _theRuntime
    >> Domain:Value 0x0014af68:NotInit  0x0017cd60:0x04f6f324 0x002165d0:0x04fcb660 <<

 

This means that in domain 0x0014af68 we haven’t initialized this object yet,  in domain 0x0017cd60 it is located at address 0x04f6f324, and in domain 0x002165d0 it is located at address 0x04fcb660.

 

You can get the address of your domain from !dumpdomain, for example this one is for Domain 3, which we can see is HrWeb

 

Domain 3: 0x2165d0
LowFrequencyHeap: 0x00216634
HighFrequencyHeap: 0x0021668c
StubHeap: 0x002166e4
Name: /LM/W3SVC/1/HrWeb-127976921852307107

 

 

Now why am I bothering with this HttpRuntime?  Well… there are a lot of goodies found in the HttpRuntime object, things like _debuggingEnabled to see if debug=true, and the address of the cache object, but particularly interesting for this case, it also contains two member variables named _shutDownStack and _shutDownMessage, which we make use of when logging the event.

 

So in our Application_End in global.asax we can put code like this to first get the _theRuntime object for our domain,

 

     // Needs "using System.Reflection;" and "using System.Web;" at the top of the file.
     HttpRuntime runtime = (HttpRuntime) typeof(System.Web.HttpRuntime).InvokeMember("_theRuntime", BindingFlags.NonPublic | BindingFlags.Static | BindingFlags.GetField, null, null, null);

 

…and then get the contents of the shutDownMessage and shutDownStack like this…

 

    string shutDownMessage = (string) runtime.GetType().InvokeMember("_shutDownMessage", BindingFlags.NonPublic | BindingFlags.Instance | BindingFlags.GetField, null, runtime, null);

    string shutDownStack = (string) runtime.GetType().InvokeMember("_shutDownStack", BindingFlags.NonPublic | BindingFlags.Instance | BindingFlags.GetField, null, runtime, null);

 

And then it is just a matter of logging them to the event log, a log file or similar, and waiting for the recycle to happen.
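If you want the pieces in one place, here is a minimal sketch of a helper that reads the same private fields and returns them as one string. The ShutdownInfo class and its method names are made up for this example. It takes the runtime type as a parameter and checks for missing fields, on the assumption that private field names are not a contract and could differ between framework versions.

```csharp
using System;
using System.Reflection;

// Hypothetical helper, for illustration only.
public static class ShutdownInfo
{
    // Reads a non-public field by name. Pass target == null for a static field.
    // Returns null when the field does not exist on the given type.
    private static object ReadField(Type type, object target, string name)
    {
        BindingFlags scope = target == null ? BindingFlags.Static : BindingFlags.Instance;
        FieldInfo field = type.GetField(name, BindingFlags.NonPublic | scope);
        return field == null ? null : field.GetValue(target);
    }

    // runtimeType is expected to be typeof(System.Web.HttpRuntime).
    public static string Describe(Type runtimeType)
    {
        // The static _theRuntime field holds the HttpRuntime for the current appdomain.
        object runtime = ReadField(runtimeType, null, "_theRuntime");
        if (runtime == null)
        {
            return "Shutdown reason not available (_theRuntime not found).";
        }

        string message = ReadField(runtimeType, runtime, "_shutDownMessage") as string;
        string stack = ReadField(runtimeType, runtime, "_shutDownStack") as string;

        return "_shutDownMessage=" + message + Environment.NewLine +
               "_shutDownStack=" + stack;
    }
}
```

In Application_End you would then call ShutdownInfo.Describe(typeof(System.Web.HttpRuntime)) and write the returned string to your log of choice. Keeping the reflection in one small helper means a missing field ends up as a harmless log line rather than an exception during shutdown.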

 

 

Oh, just one more detail before I sign off… if you see config change notifications and you are not sure who is touching your files, I would recommend running filemon from http://www.sysinternals.com/ to log file access.  Great tool for a lot of security and file related issues.

 

Laters y’all.





  • Nice article Tess. Got a really good understanding about application domain recycling.

    I have been working on a content management website since the past few weeks and my part involves the file handling functionalities.

    Our website actually has to create files, folders etc. in a 3-level hierarchy and also manipulate them based on what the user does.

    While everything was working fine, I kept getting puzzled as to why, everytime a directory was renamed (moved) successfully, I used to lose everything in the session... Debugging only revealed that the session was getting invalidated.

    Only after I read this and Scott Forsyth's article, did the problem become clear to me.

    But I have to admit that this is a sticky problem. And there is no easy way to work around this one.

    Keep writing!

  • Thanks for sharing knowledge.

    It's really good article.

  • Hi!

    I have a question, maybe you could help me. For example, if I try to submit a job (request to the server) after a recycle, my job is lost - no trace of it, no errors are logged, nothing but the information in event viewer that the application has recycled. I understand why this happens, but I would like to know if there's by any chance a way of showing a message to the user that something happened and he will be logged out when submitting the job? Or maybe after he logged back in?

    I cannot see a way of doing this, since all the dlls need to be reloaded, but I thought it's worth a shot to ask :)

    Thanks for the article, it's really good and revealing :)

  • i dont really see a way that you could do that... unless perhaps you keep some session variables and detect that they are lost for example

  • Great article Tess!  I have been troubleshooting recycle events for a few days.  A combination of virus scanning and mis-used app pools in IIS6 were the trouble.  Thanks again.

  • Great blog Tess.  I have learnt so much from it.  I have a few questions.  Is there any disadvantage of setting the value for maxWaitChangeNotification and waitChangeNotification to be really high (like 10 hours)?  We are seeing a lot of Application restarts in our DotNetNuke environment and we are not making any changes.  Is it because the different user controls on different pages are getting compiled and causing the application to be reloaded.  

    What is the disadvantage (other than the memory footprint) of increasing numRecompilesBeforeAppRestart to a really high number like 500 or 1000?  

    Thanks !

  • csaxena,

    There are basically two disadvantages.  

    1. Setting numRecompilesBeforeAppRestart to something really high, as you pointed out will mean that you store more copies of stale dlls in the process leading to higher memory usage.  That is pretty much the only drawback you will see from that...

    2. setting maxWaitChangeNotification and waitChangeNotification to something really high like 10 hrs means that changes made during that time that require appdomain restarts wont take effect.  For example if you modify the web.config, the old settings will be "cached" in that case for 10 hrs and new ones wont take effect until the appdomain is restarted.   Same with dll updates.  As long as you are aware of that and restart the appdomain manually (by restarting the process) when changes that you need to see go through immediately are made, you'll be ok.  In general I really wouldn't recommend setting this higher than 5-10 minutes though...

  • Great article Tess...

    I have an ASP.NET application (in IIS7) where content editors make regular updates to .resx files.

    The problem is that this eventually causes the app domain to recycle.

    With changes to .aspx pages, I can set compilationMode="never" in web.config and I avoid the problem of excess recompiles.

    Is there a solution for .resx files??

    I would prefer not to set numRecompilesBeforeAppRestart to a high value (or 0) because I don't want to use up RAM -- I need RAM for ASP.NET cache.

  • unfortunately i dont think there is, but even if there was, that would mean that the updates wouldnt go through until you manually recycle.  So given that, perhaps it is better to have the content editors edit to a staging location and copy them all in one go at some point.

  • ''Sec 1.           a.dll and b.dll are update to v 1.00.12    - appdomain unload started (any pending requests will finish before it is completely unloaded)

    Sec 2.           Request1 comes in and loads a new appdomain with 2 out of 7 of the dlls updated

    Sec 3.           c.dll is updated                                    - appdomain unload started         (any pending requests will finish before it is completely unloaded)

    Sec 4.           d.dll is updated

    Sec 5.           Request2 comes in and loads a new appdomain, now with 4 out of 7 dlls updated

    Sec 6.           e.dll and f.dll is updated                       - appdomain unload started         (any pending requests will finish before it is completely unloaded)

    Sec 7.           f.dll is updated

    Sec 8.           Request3 comes in and loads a new appdomain with all 7 dlls updated

    So, many bad things happened here…

    First off you had 3 application domain restarts while you probably thought you would only have one, because asp.net has no way of knowing when you are done.''

    -----------------------------------

    In the above scenario, don't you feel there are more than 3 appdomain restarts, but you've stated there are 3. Am I getting this wrong ?

  • Then how about the web.config <system.web><hostingEnvironment idleTimeout="Infinite" approach? The Microsoft docs suggest that this can be set in the application level web.config and will prevent application shutdown. Problem is, I tried this on a Server2003/IIS 6.0 hosting service setup and it had no effect. The result is that every few minutes the website page requests take 20 seconds or so for a page to load up as the whole site recompiles or whatever it needs to do. The config setting seems to have no effect. Would this be something that would be overridden higher up the chain?

  • Hi Tess

    Came across your blog by chance and found it amazing! Was hoping you could help us with a weird problem we're encountering on our production server which has IIS6 installed.

    What we deal with is a simple WCF application which is hosted on default app pool in IIS. This is the only application hosted btw.

    Once in a while the server requests just hang up for a while, a blank period of total inactivity and at the end of the lull period activity suddenly shoots up and request completes successfully. But why this delay? There's nothing in the logs in event viewer or in the logs of the application which suggests anything. The time in between is just LOST. Nothing in the dump suggested deadlocks. Can you please help me figure out the root cause? Thanks a bunch!

  • Hi,

    I am updating style.css dynamically which is in App_Themes, whenever style.css is updated user gets logged out frequently.

    I think application restarts on style.css changes since it is in App_Themes folder.

    Is there any workaround for this issue?

  • Very Bad to the user.It can't give clear view to the user.

  • Thank you, Tess!

    The explanation of when an application domain is unloaded helped me solve a performance problem for a customer.
