We’ve had a few users complain that the throughput of the adapters (the Adapter Pack ones, and/or custom ones) comes to a standstill during normal operation within BizTalk. In this post, I’ll explain why that happens and describe some workarounds.
Let’s say BizTalk has received 100 messages which it needs to send to the adapter on a Send Port. Assume that it queues 50 work items on 50 thread pool threads, with each work item containing 2 messages to be processed.
Now, on each of the above thread pool threads, the logic used is something similar to:
1. for (int i=0; i < numberOfMessages; i++)
2. {
3. IRequestChannel channel = CreateNewIRequestChannel();
4. channel.Open();
5. channel.BeginRequest(message[i], callbackFunction);
6. }
IRequestChannel above maps to an adapter channel – IOutboundHandler in the Adapter SDK.
Now, many of the LOB systems for which the adapters are written do not have an asynchronous version of the LOB API or SDK; that is, the LOB API is synchronous and blocks the current thread while the invocation occurs. However, as can be seen above, a call to BeginRequest() is made, which means that the caller wants the adapter channel implementation to return immediately. Therefore, in the Adapter SDK, the implementation is:
7. BeginRequest(Message message, Callback callbackFunction)
8. {
9. ThreadPool.QueueUserWorkItem(channel.Request, message, callbackFunction);
10. return immediately;
11. }
Since it needs to return immediately, the Adapter SDK queues an additional work item in the thread pool queue. When that work item runs, it calls Request(), a synchronous call that maps to IOutboundHandler::Execute() in the adapter, with the message as the parameter. Once the adapter finishes processing the message (synchronously) on the thread pool thread, the Adapter SDK takes care of invoking the callback function to notify BizTalk that the message has completed processing, and to hand the response message to BizTalk.
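To make the "return immediately, do the synchronous work later" pattern concrete, here is a minimal sketch in Python rather than .NET. It is not the Adapter SDK's code: `begin_request`, `request` and the four-worker pool are stand-ins I made up to mirror lines 7 to 11.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the shared thread pool (the size is an arbitrary assumption).
pool = ThreadPoolExecutor(max_workers=4)

def request(message):
    """Synchronous call; plays the role of IOutboundHandler::Execute()."""
    return f"response to {message}"

def begin_request(message, callback):
    """Queue the synchronous call on a pool thread and return right away."""
    def work_item():
        # Runs later on a pool thread, then notifies the caller.
        callback(request(message))
    return pool.submit(work_item)

responses = []
future = begin_request("msg-1", responses.append)
future.result()   # wait only so the example can show the response
print(responses)
pool.shutdown()
```

The point to notice is that `begin_request` costs the caller almost nothing, but the real work still needs a second pool thread later on.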
Also, one more piece of information: the channel.Open() call in line number 4 above is handled by the Adapter SDK. When the Open() call reaches the Adapter SDK, it tries to obtain a free connection from the connection pool, or to create a new connection to the LOB provided the maximum number of connections hasn’t been reached. If both attempts fail, it blocks until a connection becomes available. Note: the adapters in the Adapter Pack expose a setting (typically named MaxConnectionsPerSystem, MaxPoolSize, or similar) whose value is passed on to the Adapter SDK via a setting it exposes; custom adapters might expose something similar which the end user can tweak.
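The Open() behavior described above can be modelled roughly as follows. This is a Python sketch under my own assumptions, not the SDK's implementation: the `ConnectionPool` class, its method names and the `timeout` parameter are hypothetical, and the timeout exists only so the example can show the blocking case without hanging.

```python
import threading

class ConnectionPool:
    """Toy pool: reuse an idle connection, else create one, else block."""

    def __init__(self, max_connections):
        # One permit per connection the LOB is allowed to have open.
        self._permits = threading.Semaphore(max_connections)
        self._idle = []
        self._lock = threading.Lock()
        self.created = 0

    def open(self, timeout=None):
        # Blocks while every allowed connection is in use.
        if not self._permits.acquire(timeout=timeout):
            raise TimeoutError("no connection available")
        with self._lock:
            if self._idle:
                return self._idle.pop()      # reuse a pooled connection
            self.created += 1
            return f"conn-{self.created}"    # create a new one

    def close(self, conn):
        with self._lock:
            self._idle.append(conn)
        self._permits.release()              # wakes one blocked open()

pool = ConnectionPool(max_connections=2)
a = pool.open()
b = pool.open()
try:
    pool.open(timeout=0.1)   # third caller: the limit is reached
    stalled = False
except TimeoutError:
    stalled = True
pool.close(a)
c = pool.open()              # succeeds now, reusing the released connection
```

Without the timeout, that third `open()` would simply sit there until someone called `close()`, which is exactly the state the blocked threads end up in below.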
Given the two blocks of code above, and the behavior of the Open() call, it is now easy to come up with a situation where you can see a throughput stall.
Assume that the maximum number of threads in the thread pool is 100. Also, let’s say that the maximum number of connections allowed to the LOB system is 10. You receive a large number of messages from your Receive Location (say, 500), and BizTalk, in all its enthusiasm, queues up all the messages (one message per work item; let’s refer to these work items as W1) on threads from the thread pool. Therefore, you have 100 threads (suppose, since it’s the maximum number of allowable threads in the thread pool), all of them beginning to execute lines 1 to 6 above. Only 10 threads are able to proceed past line number 4, since they were able to open a connection to the LOB (your maximum number of connections is configured to be 10, remember?). The other 90 threads are blocked at line number 4, waiting for a connection to become available.
The 10 threads which passed line 4 proceed to line 5, which means they now enter lines 7 to 11. They all queue up work items (let’s call these work items W2) on the thread pool (courtesy of line number 9), and return immediately. These 10 threads are now freed up, which enables them to go and pick more items to work on from the thread pool queue.
Now, the actual processing against the LOB (and the real usage of the connection which was successfully opened in line 4) only happens when W2 gets a chance to do its work. However, the 10 threads which were freed up won’t process W2. Why? Because ahead of W2, there are still 400 more W1 work items (BizTalk had queued 500 items, remember) in the queue. Hence, the 10 threads will pick up 10 more W1 work items. These work items of course can’t progress, since they are in the same state as the other 90: there is no connection available. A connection won’t become available until a W2 work item relinquishes it, but W2 can only do its work once a thread becomes available, and that will only happen when one of the current 100 threads (which are all stuck on line 4) gets freed up. The cycle never breaks, and you have a thread pool starvation problem.
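If you want to convince yourself of the stall, a small model helps. The sketch below is a deliberately simplified simulation in Python, not BizTalk or Adapter SDK code: it assumes a strict first-in, first-out queue, handles one work item at a time, and the `simulate` function and its parameters are names I invented for illustration.

```python
from collections import deque

def simulate(messages, threads, connections):
    """Return (completed messages, threads left blocked in Open())."""
    queue = deque(["W1"] * messages)   # BizTalk queues every W1 up front
    free_threads = threads
    free_connections = connections
    blocked = 0                        # threads parked at line 4
    completed = 0
    while queue and free_threads > 0:
        item = queue.popleft()
        if item == "W1":
            if free_connections > 0:
                free_connections -= 1
                queue.append("W2")     # line 9: W2 lands behind pending W1s
            else:
                free_threads -= 1      # this thread now waits in Open()
                blocked += 1
        else:
            completed += 1             # W2 does the real LOB work
            if blocked > 0:
                blocked -= 1           # hand the connection to a waiter,
                free_threads += 1      # which queues its own W2 and is freed
                queue.append("W2")
            else:
                free_connections += 1
    return completed, blocked

print(simulate(500, 100, 10))   # (0, 100)
print(simulate(10, 100, 10))    # (10, 0)
```

With 500 messages, all 100 threads end up blocked before a single W2 runs. With only as many messages as connections, nothing ever blocks and everything completes.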
How do you work around this? There are a number of ways, and you may need to combine some of them (at least the ones in your control):
The safest way would be to ensure that the number of messages BizTalk hands to the adapter (lines 1 to 6) is the same as the maximum number of connections allowed to the LOB. I’m not 100% sure, but I think the In-Process Messages Per CPU setting can be used for this purpose; I’ll have to go through all the documentation for the throttling settings a little more thoroughly – meanwhile, you can do the same (the link available in point number 3 above seems to be the place to start).
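The principle behind this workaround can be shown with a gate in front of the thread pool. To be clear, this is a Python illustration of the idea only; it is not how BizTalk's throttling settings work, and `MAX_CONNECTIONS`, `in_flight`, `w1` and `w2` are names I made up. The gate lets in only as many messages as there are connections, so the wait happens in the submitter and never on a pool thread inside Open().

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_CONNECTIONS = 3
in_flight = threading.BoundedSemaphore(MAX_CONNECTIONS)  # the throttle
connections = threading.Semaphore(MAX_CONNECTIONS)       # LOB connection limit
pool = ThreadPoolExecutor(max_workers=4)
results = []
results_lock = threading.Lock()

def w2(message):
    # Stands in for the synchronous LOB call made by Request().
    with results_lock:
        results.append(message)
    connections.release()   # give the connection back
    in_flight.release()     # let the submitter admit another message

def w1(message):
    # Never waits here: at most MAX_CONNECTIONS messages are in flight.
    connections.acquire()
    pool.submit(w2, message)

for i in range(20):
    in_flight.acquire()     # the submitter waits, not a pool thread
    pool.submit(w1, i)

# Take every permit back so we know all messages have finished.
for _ in range(MAX_CONNECTIONS):
    in_flight.acquire()
pool.shutdown()
print(sorted(results) == list(range(20)))
```

Remove the `in_flight` gate and the same four workers can all end up waiting on `connections.acquire()` while the `w2` items that would release a connection sit behind them in the queue.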