Disclaimer: The article below discusses pre-release features. Some of these features could change in the RTM version.
The sample scenario we consider here is a hypothetical page that needs to aggregate the results of five remote feeds served from a second tier. Requesting each feed has a latency, which we set to 100ms in our test. Requesting the feeds asynchronously has a clear advantage, since the five requests run concurrently and take around 100ms in total. If we requested them synchronously, in sequence, they would take 500ms. The figure below shows the scenario:
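The latency arithmetic above can be reproduced with a small simulation. This is only a sketch in Python, not the ASP.NET page used in the test: fetch_feed, aggregate_sequentially, aggregate_concurrently and timed are hypothetical names, and the remote request is replaced by a sleep of the same duration.

```python
import asyncio
import time

FEED_LATENCY_S = 0.1  # 100 ms, the latency used in the article's test
FEED_COUNT = 5        # five remote feeds, as in the sample scenario


async def fetch_feed(feed_id, latency_s=FEED_LATENCY_S):
    """Hypothetical stand-in for a remote feed request; only the wait is simulated."""
    await asyncio.sleep(latency_s)
    return f"feed-{feed_id}"


async def aggregate_sequentially(count, latency_s):
    # Each request starts only after the previous one has completed.
    return [await fetch_feed(i, latency_s) for i in range(count)]


async def aggregate_concurrently(count, latency_s):
    # All requests are in flight at once; gather returns results in input order.
    return list(await asyncio.gather(*(fetch_feed(i, latency_s) for i in range(count))))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, seq_s = timed(aggregate_sequentially(FEED_COUNT, FEED_LATENCY_S))
    _, con_s = timed(aggregate_concurrently(FEED_COUNT, FEED_LATENCY_S))
    print(f"sequential: {seq_s * 1000:.0f} ms, concurrent: {con_s * 1000:.0f} ms")
```

The sequential version pays the latency once per feed, while the concurrent version pays roughly the latency of the slowest single feed.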
Asynchronous requests in ASP.NET 4.0 have some scalability issues when a huge load (beyond the hardware capabilities) is put on such a scenario. The problem is due to the nature of allocation in asynchronous scenarios. In these conditions, objects are allocated when the asynchronous operation starts and consumed when it completes. By that time, it’s very possible that the garbage collector has already moved the objects to generation 1 or 2.
When this happens, increasing the load will increase the requests per second (rps) only up to a point. Once we pass that point, the time spent in the garbage collector starts to become a problem and the rps starts to dip, giving a negative scaling effect. The figure below shows the effect of increasing the load in our particular scenario. In the sample data, the peak is reached with 770 clients.
To try to fix this problem, we implemented a throttling mechanism in 4.5. Simply put, when the CPU usage goes beyond a certain point, requests are sent to the ASP.NET native queue. This queue already existed, for IIS integrated pipeline mode only, and is also used to limit concurrency based on the parameters maxConcurrentRequestsPerCPU and maxConcurrentThreadsPerCPU in aspnet.config (or through registry keys).
We added a setting in aspnet.config for the percentage of CPU usage at which throttling starts. The new setting is called percentCpuLimit. Whenever the CPU usage goes beyond that point, we start sending some of the requests to the queue, effectively reducing the concurrency and the amount of managed buffers that get allocated and collected by the GC. In our tests, this considerably improved the scalability of this type of scenario.
The default value of the setting is 99%. To disable the throttling, set percentCpuLimit to 0:
<!-- ... -->
To know whether throttling is happening, the existing “ASP.NET\Requests Queued” counter can’t be used, because it includes both the requests waiting for a CLR thread pool thread and the requests in the ASP.NET queue. So we added the counter below:
“ASP.NET\Requests In Native Queue”
Requests queued in ASP.NET because the concurrency limits have been exceeded.
Note that the new counter also includes requests queued because of any other concurrency-limiting setting, that is, maxConcurrentRequestsPerCPU and maxConcurrentThreadsPerCPU. Those two checks are made first, in that order, and then percentCpuLimit is checked.
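The order of the checks can be sketched as follows. This is an illustrative model in Python, not the actual ASP.NET implementation: ConcurrencyLimits and first_exceeded_limit are hypothetical names, the exact comparisons are assumptions, and treating 0 as "disabled" is stated in this article only for percentCpuLimit (it is assumed here for the two per-CPU settings). The sketch also queues every request once the CPU limit is exceeded, whereas the article says only some of the requests are queued.

```python
from dataclasses import dataclass


@dataclass
class ConcurrencyLimits:
    # Hypothetical container; field names mirror the aspnet.config settings.
    max_concurrent_requests_per_cpu: int
    max_concurrent_threads_per_cpu: int
    percent_cpu_limit: int = 99  # default stated in the article; 0 disables


def first_exceeded_limit(limits, cpu_count, executing_requests, active_threads, cpu_percent):
    """Return the name of the first limit that would queue the request, or None.

    The checks run in the order described in the article:
    maxConcurrentRequestsPerCPU, maxConcurrentThreadsPerCPU, percentCpuLimit.
    """
    max_requests = limits.max_concurrent_requests_per_cpu * cpu_count
    if max_requests > 0 and executing_requests >= max_requests:  # assumed comparison
        return "maxConcurrentRequestsPerCPU"

    max_threads = limits.max_concurrent_threads_per_cpu * cpu_count
    if max_threads > 0 and active_threads >= max_threads:  # assumed comparison
        return "maxConcurrentThreadsPerCPU"

    if limits.percent_cpu_limit > 0 and cpu_percent > limits.percent_cpu_limit:
        return "percentCpuLimit"

    return None  # request continues to execute managed code
```

Because the checks short-circuit, a request counted in “ASP.NET\Requests In Native Queue” is not necessarily there because of CPU throttling; one of the two earlier limits may have queued it first.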
There are a few settings that affect this type of scenario. Some of them are in the framework, and some are in IIS/http.sys. We reviewed them and changed the default values of some of them, so they work better out of the box.
System.Net.ServicePointManager.DefaultConnectionLimit: This setting controls how many concurrent outbound connections you can open with System.Net classes, such as HttpWebRequest. It can also be set in machine.config through <system.net/connectionManagement/add[@address='*'] maxconnection />, as long as <processModel autoConfig=false />. In .NET 4.0, when autoConfig=true, we set this parameter to 12 times the number of logical CPUs. In 4.5 we changed it to infinite (actually Int32.MaxValue).
requestQueueLimit: This setting can be set in <processModel requestQueueLimit /> in machine.config or in <system.web/applicationPool requestQueueLimit /> in aspnet.config. In 4.0 and below it worked as a fixed limit on the total number of requests: once we reached that many total requests (5000 by default), we started to return a 503 status code. We fixed this in ASP.NET 4.5 to return 503 only when the number of requests in the ASP.NET queue is bigger than this value, rather than using the total number of requests.
Http.sys queueLength: The <system.applicationHost/applicationPools/applicationPoolDefaults.queueLength /> parameter can be used in IIS to set this. The default in IIS 7.5 is 1000, which is sometimes not sufficient for a managed application with heavy garbage collection. This is caused by some GC threads running at a higher priority, which can starve the IIS threads that process the queue of CPU. However, the setting affects non-paged memory, so the default value was not changed. If your web site returns 503 status codes, you can try increasing this value.
System.Data MaxPoolSize: The number of database connections in the pool can also affect the scalability of these types of scenarios. The value can be set inside the connection string. The right value depends primarily on the SQL Server, rather than on the client machine, so the default was not changed in 4.5.
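Two of the default changes described above, for DefaultConnectionLimit and requestQueueLimit, can be summarized in a short sketch. This is a Python model of the behavior as stated in this article, not framework code: default_connection_limit and returns_503 are hypothetical names, and the exact boundary comparisons are assumptions based on the wording ("once we reach" for 4.0, "bigger than" for 4.5).

```python
INT32_MAX = 2**31 - 1  # the value the article calls "infinite"


def default_connection_limit(framework_version, logical_cpus):
    """Default outbound connection limit when autoConfig=true, per the article."""
    if framework_version == "4.0":
        return 12 * logical_cpus
    if framework_version == "4.5":
        return INT32_MAX
    raise ValueError(f"version not covered by this sketch: {framework_version}")


def returns_503(framework_version, total_requests, queued_requests, request_queue_limit=5000):
    """Model when requestQueueLimit causes a 503 response.

    4.0 and below: the limit applies to the total number of requests.
    4.5: the limit applies only to requests waiting in the ASP.NET queue.
    """
    if framework_version == "4.0":
        return total_requests >= request_queue_limit  # assumed boundary
    if framework_version == "4.5":
        return queued_requests > request_queue_limit  # assumed boundary
    raise ValueError(f"version not covered by this sketch: {framework_version}")
```

For example, with 5000 requests in flight and none of them queued, the 4.0 rule in this model rejects new requests while the 4.5 rule does not.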
To review, a new concurrency setting named percentCpuLimit was added to aspnet.config in 4.5, to improve the scalability of async scenarios. A new performance counter named “ASP.NET\Requests In Native Queue” was also added, to show when throttling is happening. And the default values of some .NET settings were changed. All this is only valid for IIS integrated pipeline mode.
Does this mean that if we set the CPU throttle somewhere around 80-90%, we can finally use Tasks/threads without the risk of starving our IIS server of worker threads, as the overflow will just queue requests instead of crashing?
The CPU check happens on the CLR worker thread, so for the check to happen your request needs to have a worker thread allocated. The thread will be released quickly if the request is queued (if the request is not queued, it will be allowed to continue executing managed code). But you might still have a backlog of incoming requests being queued, and each of those needs a thread before it is queued. So use the feature with caution. It hasn't been tested for that scenario.
@josere2 that's very disappointing to hear it has such a large caveat. When is IIS EVER going to have the appropriate level of support to do parallelization inside http requests?
I don't understand why IIS makes it nearly impossible (sanely, that is) to do anything other than serve serial synchronous requests or call asynchronous web services.