Throttling, generally speaking, is tricky. Set the limits too low and you become prone to DoS, with clients timing out while trying in vain to connect to your service; set them too high and you may end up with an overloaded service that eats up machine resources until it crashes. There’s a sweet spot in between that gives you optimum throughput and high availability at the same time.
The ServiceThrottlingBehavior in WCF lets you modify three important settings that you should consider tweaking to suit your application and resources: MaxConcurrentCalls, MaxConcurrentInstances, and MaxConcurrentSessions. Since many considerations go into choosing these values, and since they may vary from one machine to another in a production environment, it’s recommended that you set these limits in your application’s configuration file (be it app.config or web.config). Here’s an example:
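A minimal sketch of such a configuration follows. The service name, the behavior name, and the limit values are placeholders chosen for illustration, not recommendations:

```xml
<configuration>
  <system.serviceModel>
    <services>
      <!-- "MyNamespace.MyService" is a hypothetical service name -->
      <service name="MyNamespace.MyService"
               behaviorConfiguration="ThrottledBehavior">
        <!-- endpoints omitted -->
      </service>
    </services>
    <behaviors>
      <serviceBehaviors>
        <behavior name="ThrottledBehavior">
          <!-- Illustrative values only; maxConcurrentInstances keeps its default -->
          <serviceThrottling maxConcurrentCalls="100"
                             maxConcurrentSessions="100" />
        </behavior>
      </serviceBehaviors>
    </behaviors>
  </system.serviceModel>
</configuration>
```

Because the limits live in the configuration file, they can be tuned per machine without recompiling the service.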
As you may have noticed, I didn’t change the value of MaxConcurrentInstances and accepted the default, which is 26. That’s because I set InstanceContextMode to Single, which means that there will be only one service instance. All calls are handled by this single instance, and this can be a problem if ConcurrencyMode is set to Single (the default value for this property): the service is then single-threaded and can’t handle more than one call at a time, so other calls have to wait until they get their turn or time out. To avoid this problem, I set it to Multiple, which allows the service to be multithreaded. Multithreading comes with the usual responsibilities at design time (maintaining state consistency and avoiding synchronization problems) and at run time (throttling the service correctly so that it doesn’t create so many threads that they eat up machine resources).
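As a rough sketch of the singleton, multithreaded setup described above, the service class can be decorated as follows. The contract ICounterService and the class CounterService are hypothetical names used only for illustration; the attribute properties are the standard ServiceBehaviorAttribute ones:

```csharp
using System.ServiceModel;
using System.Threading;

[ServiceContract]
public interface ICounterService
{
    [OperationContract]
    int Increment();
}

// One shared instance serves every call, and calls may run on several threads at once.
[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single,
                 ConcurrencyMode = ConcurrencyMode.Multiple)]
public class CounterService : ICounterService
{
    private int _count;

    public int Increment()
    {
        // Concurrent callers share _count, so it is updated atomically.
        return Interlocked.Increment(ref _count);
    }
}
```

With ConcurrencyMode.Multiple, WCF no longer serializes calls for you, so any shared state needs its own protection, as the Interlocked call does here.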
Back to MaxConcurrentInstances: Its value depends on InstanceContextMode:
On the other hand, MaxConcurrentSessions and MaxConcurrentCalls depend on the SessionMode of the service contract, which has three possible values:
By default, every operation can initiate a session (subject to the SessionMode of the service contract) but none terminates one; hence, the first call always initiates a session. If MaxConcurrentSessions is 100 and your clients don’t terminate their sessions, your service will handle only 100 sessions at a time, and subsequent messages will time out until a session is released. A client can close the session by calling one of the following methods:
The client should be a good citizen and always close the connection, even if the operation is terminating. The advantage of the third option here (a terminating operation) is that the service contract enforces the termination: even if the client doesn’t behave as expected, the service sets a timer and the channel is aborted after a certain period. Setting the IsTerminating property to true in an operation contract requires the SessionMode of the service contract to be set to Required.
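The following sketch shows one way these pieces might fit together: a sessionful contract with a terminating operation, and a client that closes its channel regardless. The contract IOrderService, its operations, and the endpoint name "OrderEndpoint" are assumptions made up for this example:

```csharp
using System;
using System.ServiceModel;

// IsTerminating = true is only valid when sessions are required.
[ServiceContract(SessionMode = SessionMode.Required)]
public interface IOrderService
{
    [OperationContract]                 // IsInitiating defaults to true
    void AddItem(string item);

    [OperationContract(IsInitiating = false, IsTerminating = true)]
    void Checkout();                    // the session ends once this call completes
}

public static class OrderClient
{
    public static void Run()
    {
        // "OrderEndpoint" is a placeholder endpoint name from the client config.
        var factory = new ChannelFactory<IOrderService>("OrderEndpoint");
        IOrderService proxy = factory.CreateChannel();
        var channel = (ICommunicationObject)proxy;
        try
        {
            proxy.AddItem("book");
            proxy.Checkout();
            channel.Close();            // release the session on the service side
            factory.Close();
        }
        catch (CommunicationException)
        {
            channel.Abort();            // Close() is unsafe on a faulted channel
            factory.Abort();
        }
        catch (TimeoutException)
        {
            channel.Abort();
            factory.Abort();
        }
    }
}
```

Calling Abort in the catch blocks matters because Close can itself throw on a faulted channel, which would otherwise leave the session counted against MaxConcurrentSessions until it times out.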
The default values of MaxConcurrentSessions and MaxConcurrentCalls are 10 and 16 respectively. Raising these values generally raises throughput, up to the point where the machine’s resources saturate. You will need to understand how resource consumption grows at higher throughput rates and how the two correlate (for example, exponential growth means that you have a problem). The nature of the operations that your service executes plays a big role too (whether they are I/O intensive or CPU intensive). On the other hand, leaving the values low makes your service prone to DoS attacks and to mistakes such as clients not closing their sessions. IMO, the following would help you find that sweet spot: