Recycling (AppDomains, not cans)

There are two core concepts when it comes to understanding the shared application hosting model of ASP.NET and IIS: you’ve got Applications, which correspond to individual AppDomains, and Application Pools, which correspond to the worker process instances in which those applications live (forget about Web Gardens for a moment, which let you have n worker process instances servicing a particular App Pool – it’s easier to talk about this stuff when you assume that AppPool == worker process).

 

One of the nice things about the shared hosting infrastructure is that (unlike, say, a console app) the lifetime of the application is decoupled from the lifetime of the process in which it runs. When WAS spins up an instance of w3wp.exe in response to a request that’s destined for a currently dormant App Pool, the hosting environment only loads an AppDomain for the application that will eventually process that request. All the other apps living in that pool will remain dormant until needed. Applications that don’t receive messages for a while will eventually get unloaded to free up process resources for other apps that are actually doing useful work at the time. The goal here is to only keep applications around when they’re needed and make them go away quickly when they’re not.

 

As a consequence of this pooling architecture, applications hosted in ASP.NET start up and shut down fairly frequently. There are many reasons why a hosted app might suddenly find its AppDomain being unloaded:

 

  • Someone touched the web.config file
  • The contents of a critical directory (e.g. \App_Code or \bin) have changed
  • A ‘critical file’ has changed (if you’re writing your own build provider, you can force the app to recycle when your file changes by overriding BuildProvider.GetResultFlags and returning BuildProviderResultFlags.ShutdownAppDomainOnChange – the build provider for .svc files does this, for example)
  • The AppDomain’s IdleTimeout period has expired (this is configurable too – under <system.web.hostingEnvironment>)
  • The worker process is being recycled. WAS keeps track of several factors to figure out when this should happen – request count, memory consumption, how long the process has been alive, etc. All of these things are configurable just like IIS6.
  • The admin stops the WAS service
  • Your app code calls Environment.Exit(), etc.
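
As a rough illustration of the build provider case in the list above, a custom provider might look something like this. The class name is hypothetical; only BuildProvider.GetResultFlags and BuildProviderResultFlags.ShutdownAppDomainOnChange come from the text, and a real provider would also need to be registered in config and would normally generate code as well.

```csharp
using System.CodeDom.Compiler;
using System.Web.Compilation;

// Hypothetical build provider for some custom file type.
public class MyFileBuildProvider : BuildProvider
{
    // Ask ASP.NET to shut down the AppDomain when a file handled
    // by this provider changes.
    public override BuildProviderResultFlags GetResultFlags(CompilerResults results)
    {
        return BuildProviderResultFlags.ShutdownAppDomainOnChange;
    }
}
```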

The point is that AppDomain recycling is a fact of life for hosted apps, and any application that runs in the shared hosting environment must be cognizant of the fact that, sooner or later, its AppDomain will get unloaded for no readily apparent reason (from the app’s perspective, at least).

 

For stateless apps or apps that can completely externalize their instance state, recycling presents no significant issues. After all, if you’re not storing any local state then requests aren’t affinitized to the currently running AppDomain instance and recycling is basically transparent to the application code.

 

Recycling can bite you in the ass if you make the (mistaken) assumption that there will only ever be one running instance of your app at a time, because AppDomain recycles can and do overlap. Fortunately, not many apps make this assumption. When a recycle occurs, the AppDomain doesn’t go away immediately. It can potentially hang around for a while (by default this is capped at 30 seconds, but it is configurable through the shutdownTimeout value on the system.web.hostingEnvironment config element). If a request for that application comes in while the shutdownTimeout is elapsing, a new AppDomain will be created to service that request and will live alongside the old AppDomain until the old one unloads or the shutdownTimeout expires. During that brief period of time, you have two copies of your app running. If you depend on exclusive access to an external resource, overlapped recycling can potentially cause that resource acquisition to fail.
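
For reference, the two timeouts mentioned so far live on the same config element. The fragment below is only a sketch: the values are illustrative, shutdownTimeout is in seconds (30 being the default noted above), and idleTimeout is assumed here to be a number of minutes, which is worth checking against the config schema documentation for your framework version.

```xml
<configuration>
  <system.web>
    <!-- Illustrative values only.
         shutdownTimeout: seconds an old AppDomain may linger during a recycle.
         idleTimeout: assumed to be minutes of inactivity before unload. -->
    <hostingEnvironment idleTimeout="20" shutdownTimeout="30" />
  </system.web>
</configuration>
```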

 

Applications that maintain state in memory tend to have more significant issues with recycling, even in the non-overlapped case. Unless that state can be easily recreated (e.g., it’s just a cache – in which case recycling will cause cache misses in the new AppDomain but not failure), conversations (long-running sequences of correlated requests) that depend on access to that state are hosed unless they can complete before the AppDomain gets unloaded. In order to make that happen, the hosting environment must be able to notify application code that a shutdown is imminent so that the app can execute the code required to shut down outstanding conversations gracefully.

 

Traditional ASP.NET applications can hook application lifecycle events (application startup/shutdown) by implementing methods like Application_Start and Application_End in global.asax. However, global.asax is for application code. Infrastructure pieces (of which the WCF hosting system is one) need a mechanism for hooking AppDomain lifecycle events that does not involve dumping infrastructure code in your global.asax file. That space is reserved for you, the user, and it would be rude of us to pollute that with a bunch of hosting goo we need to make the whole thing work. Instead, the ASP.NET folks did some great work during the Whidbey release to open up the hosting APIs and make it easy for people like WCF to come along and hook these lifecycle events in a way that’s invisible to application code.
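
For comparison, here is roughly what the application-level hooks look like in a global.asax code-behind (ASP.NET names the shutdown handler Application_End). This is a minimal sketch: the class name and the choice to write to System.Diagnostics.Trace are just for illustration. HostingEnvironment.ShutdownReason is handy here because it tells you which of the recycle triggers actually fired.

```csharp
using System;
using System.Web;
using System.Web.Hosting;

// Code-behind for global.asax (illustrative class name).
public class Global : HttpApplication
{
    protected void Application_Start(object sender, EventArgs e)
    {
        // Runs once per AppDomain, when the application starts.
        System.Diagnostics.Trace.WriteLine(
            "AppDomain started: " + AppDomain.CurrentDomain.FriendlyName);
    }

    protected void Application_End(object sender, EventArgs e)
    {
        // Runs when the AppDomain is being shut down; log the reason
        // the hosting environment reports for the shutdown.
        System.Diagnostics.Trace.WriteLine(
            "AppDomain shutting down: " + HostingEnvironment.ShutdownReason);
    }
}
```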

 

There are two possible events in the lifecycle of an AppDomain that platform infrastructure like WCF could potentially care about: startup and shutdown (duh). It turns out that we actually don’t care about startup at all because we lazy-init all of our infrastructure on receipt of the first WCF message by the app. There’s a common method that all of our transports call to activate a hosted service (ServiceHostingEnvironment.EnsureServiceAvailable() if you must know), and we use the first call to that method as the cue to execute whatever startup we need.

 

Shutdown notifications are handled by means of a callback mechanism. The important interface here is System.Web.Hosting.IRegisteredObject. This is the callback on which hosted AppDomains receive notifications of their imminent demise. This interface exposes one method – void Stop( bool immediate ). Applications can call System.Web.Hosting.HostingEnvironment.RegisterObject() in their startup path to provide their implementation. When it’s time for the AppDomain to go away, the hosting environment will enumerate all the registered objects in the doomed AppDomain and call IRegisteredObject.Stop( false ) to notify them the app is about to die. Once all objects have been notified, the hosting environment starts a timer that waits a configurable amount of time for the app to clean itself up. Finally, the hosting environment enumerates the registered objects again – this time calling IRegisteredObject.Stop( true ) – and then ultimately calls AppDomain.Unload().
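
To make the two-pass protocol concrete, here is a minimal sketch of an application-side implementation. The class name and the two cleanup helpers are hypothetical placeholders; IRegisteredObject, HostingEnvironment.RegisterObject() and HostingEnvironment.UnregisterObject() are the real hosting APIs.

```csharp
using System.Web.Hosting;

// Hypothetical listener; the name and helper methods are illustrative.
public sealed class ShutdownListener : IRegisteredObject
{
    public ShutdownListener()
    {
        // Ask the hosting environment to notify this object
        // before the AppDomain is unloaded.
        HostingEnvironment.RegisterObject(this);
    }

    public void Stop(bool immediate)
    {
        if (!immediate)
        {
            // First pass: kick off graceful cleanup and return promptly.
            StartGracefulCleanup();
            return;
        }

        // Second pass: the timeout has expired, so drop whatever is left
        // and tell the hosting environment this object is done.
        ForceCleanup();
        HostingEnvironment.UnregisterObject(this);
    }

    private void StartGracefulCleanup() { /* application-specific */ }

    private void ForceCleanup() { /* application-specific */ }
}
```

You would create one of these somewhere in your startup path (for example, new ShutdownListener() on first use). An object that finishes its cleanup early can also call UnregisterObject() itself so the hosting environment doesn’t have to wait on it.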

 

WCF has an implementation of IRegisteredObject that we store as a singleton instance inside of the ServiceHostingEnvironment static. We lazy-init this instance during the first call to EnsureServiceAvailable() and call HostingEnvironment.RegisterObject() during its construction – the net effect is that we register exactly one IRegisteredObject per hosted AppDomain, and only once we know that WCF is actually going to be active in that AppDomain. This is also where we keep references to every instance of ServiceHost we spin up during the lifetime of the application.

 

When ASP.NET decides it’s time for our AppDomain to go away, it notifies us of this fact by calling IRegisteredObject.Stop( false ). In response, we enumerate all the running ServiceHosts we know about and call BeginClose() on them. If there are active sessions in a given ServiceHost, the close will not complete until each session either completes or times out. If all goes well, this whole process takes place within the time span of the ASP.NET shutdown timeout and every ServiceHost in the app gets closed gracefully. The response to the eventual IRegisteredObject.Stop( true ) is a no-op in this case. If we take too long and the ASP.NET timeout expires, then we respond to the “shutdown now” request by calling Abort() on as many services as we can prior to ASP.NET unloading our AppDomain. There’s an inherent race here between Abort and Unload which we don’t try to fix. Our implementation needs to be robust in the face of AppDomain unloads regardless, so this is not a problem for us from a reliability perspective – everything important gets cleaned up via critical finalizers when the AppDomain unloads.
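
The overall shape of that logic can be sketched as below. To be clear, this is not WCF’s actual internal code: HostTracker, Track() and the callback are made-up names, and the real implementation handles far more edge cases. It just shows a lazily created singleton that registers itself once per AppDomain, closes its hosts asynchronously on Stop( false ), and aborts them on Stop( true ).

```csharp
using System;
using System.Collections.Generic;
using System.ServiceModel;
using System.Web.Hosting;

// Hypothetical sketch; not WCF's real implementation.
public sealed class HostTracker : IRegisteredObject
{
    private static readonly object SyncRoot = new object();
    private static HostTracker instance;

    private readonly List<ServiceHostBase> hosts = new List<ServiceHostBase>();

    private HostTracker()
    {
        // Registering in the constructor means exactly one registration
        // per AppDomain, and only once something asks for the tracker.
        HostingEnvironment.RegisterObject(this);
    }

    // Lazily creates the single tracker on first use.
    public static HostTracker Instance
    {
        get
        {
            lock (SyncRoot)
            {
                if (instance == null) instance = new HostTracker();
                return instance;
            }
        }
    }

    // Remember a host so it can be shut down with the AppDomain.
    public void Track(ServiceHostBase host)
    {
        lock (hosts) { hosts.Add(host); }
    }

    public void Stop(bool immediate)
    {
        ServiceHostBase[] snapshot;
        lock (hosts) { snapshot = hosts.ToArray(); }

        foreach (ServiceHostBase host in snapshot)
        {
            if (immediate)
            {
                // Out of time: tear the host down without waiting.
                host.Abort();
            }
            else if (host.State == CommunicationState.Opened)
            {
                // Graceful path: start an asynchronous close and return.
                host.BeginClose(OnCloseCompleted, host);
            }
        }

        if (immediate) HostingEnvironment.UnregisterObject(this);
    }

    private static void OnCloseCompleted(IAsyncResult result)
    {
        ServiceHostBase host = (ServiceHostBase)result.AsyncState;
        try { host.EndClose(result); }
        catch (CommunicationException) { host.Abort(); }
        catch (TimeoutException) { host.Abort(); }
    }
}
```

Calling Abort() on a host that has already closed is harmless, which is why the immediate pass doesn’t bother checking state first.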

I think that's about enough on the whole recycling thing for now...sometime in a subsequent blog post I'll spend some time talking about how our port sharing and non-HTTP activation story plays into this whole hosting thing...

#1 James on 3.08.2006 at 5:00 PM

First of all, great article. In my web application, I have made long running threads relatively resilient to restarts. But the web application comes with a few services running as long threads in the background, and after a period of inactivity for the web app, the application seems to shut down until a request is made. I had to have these long running threads because one of the deployment requirements is that this will need to be made installable on any given server and the same server should be able to have multiple copies of the application running for multiple clients. For this reason I've decided to implement the long running threads as part of the website. Do you have a recommended strategy for keeping the web application alive? Off the top of my head, I guess I could create a separate Windows service to poll the different installations of my web application. Any suggestion is appreciated. Cheers, James

#2 Steve Maine on 3.08.2006 at 10:43 PM

Hi James, thanks for the compliment. What are these long running threads doing? How are you starting them up? Send me mail if you want to discuss this offline.