Customers reported the following WCF performance issue recently:

· A WCF service over TCP or Named Pipe idles for more than 15 seconds, and the first request after that becomes very slow. Why?

This is easy to reproduce. If you send two consecutive requests to a WCF service over the TCP or Named Pipe transport, you see something like the following:

First message invocation:

Latency for thread 0: 0.0758616 seconds

Sleep for 16 ms

Latency for thread 0: 0.0030432 seconds
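A measurement like the one above can be taken with a small timing harness. The following is only a minimal sketch and not the attached sample: the names LatencyProbe, MeasureSeconds and Run are made up for illustration, and the `call` delegate stands in for whatever proxy call your client actually makes.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class LatencyProbe
{
    // Times a single call and returns the elapsed time in seconds.
    public static double MeasureSeconds(Action call)
    {
        Stopwatch watch = Stopwatch.StartNew();
        call();
        watch.Stop();
        return watch.Elapsed.TotalSeconds;
    }

    // Sends two requests separated by an idle period and prints both latencies.
    // 'call' is a placeholder for the real WCF proxy call (an assumption).
    public static double[] Run(Action call, int sleepMilliseconds)
    {
        Console.WriteLine("First message invocation:");
        double first = MeasureSeconds(call);
        Console.WriteLine("Latency for thread 0: {0} seconds", first);

        Console.WriteLine("Sleep for {0} ms", sleepMilliseconds);
        Thread.Sleep(sleepMilliseconds);

        double second = MeasureSeconds(call);
        Console.WriteLine("Latency for thread 0: {0} seconds", second);
        return new[] { first, second };
    }
}
```

Calling LatencyProbe.Run with a delegate that invokes the service, once with a 16 ms sleep and once with a 15000 ms sleep, should be enough to compare the two cases.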

However, if the client sleeps for 15 seconds between the two requests, you get:

First message invocation:

Latency for thread 0: 0.0774229 seconds

Sleep for 15000 ms

Latency for thread 0: 0.4179955 seconds

Here it takes about 418 milliseconds to complete the second request. Sometimes this delay can be even longer. Why does this happen?

The main reason is that the CLR ThreadPool has a 15-second timeout for idle threads. After the timeout it releases all of the I/O threads except one, so that the remaining thread can process I/O requests immediately. If that last I/O thread is already in use, however, the ThreadPool takes time to create new threads for incoming WCF requests, and thread creation is subject to random delays even when the number of active threads is smaller than the MinIOThreads setting (which can be set through ThreadPool.SetMinThreads). This is a known CLR issue, and unfortunately it is still not fixed in 4.0. It affects WCF services over TCP or Named Pipe because WCF holds one I/O thread internally to manage timers for the sessionful channels. If you attach a debugger to the service, you see the following active I/O thread:

0:009> !clrstack
OS Thread Id: 0x1b90 (9)
Child-SP         RetAddr          Call Site
000000001c3aedc0 000007fef4e0c29c System.Threading.WaitHandle.WaitAny(System.Threading.WaitHandle[], Int32, Boolean)
000000001c3aee20 000007fef4d9713a System.ServiceModel.Channels.IOThreadTimer+TimerManager.OnWaitCallback(System.Object)
000000001c3aee80 000007fef73395a9 System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper+WorkItem.Invoke2()
000000001c3aef00 000007fef4d9708f System.Security.SecurityContext.Run(System.Security.SecurityContext, System.Threading.ContextCallback, System.Object)
000000001c3aef40 000007fef4d96fb0 System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper+WorkItem.Invoke()
000000001c3aef90 000007fef4d96e32 System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper.ProcessCallbacks()
000000001c3af000 000007fef4d96dc1 System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper.CompletionCallback(System.Object)
000000001c3af050 000007fef8028815 System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper+ScheduledOverlapped.IOCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
000000001c3af080 000007fef730a71e System.ServiceModel.Diagnostics.Utility+IOCompletionThunk.UnhandledExceptionFrame(UInt32, UInt32, System.Threading.NativeOverlapped*)
000000001c3af0e0 000007fef8e1d502 System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
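The minimum I/O thread setting mentioned above can be inspected and raised through the ThreadPool API. The sketch below uses hypothetical helper names (ThreadPoolInfo, DescribeIoThreads, RaiseMinIoThreads). As described above, raising the minimum does not by itself avoid the delay, so treat this as a diagnostic aid rather than a fix.

```csharp
using System;
using System.Threading;

static class ThreadPoolInfo
{
    // Returns a one-line summary of the pool's I/O completion thread limits.
    // "available" is the maximum minus the number of currently active threads.
    public static string DescribeIoThreads()
    {
        int minWorker, minIo, maxWorker, maxIo, freeWorker, freeIo;
        ThreadPool.GetMinThreads(out minWorker, out minIo);
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorker, out freeIo);
        return string.Format(
            "I/O threads: min={0}, max={1}, available={2}", minIo, maxIo, freeIo);
    }

    // Raises the minimum I/O thread count and leaves the worker minimum unchanged.
    // Returns false if the ThreadPool rejects the new values.
    public static bool RaiseMinIoThreads(int minIo)
    {
        int currentWorker, currentIo;
        ThreadPool.GetMinThreads(out currentWorker, out currentIo);
        return ThreadPool.SetMinThreads(currentWorker, Math.Max(currentIo, minIo));
    }
}
```

Logging DescribeIoThreads() before and after the idle period is one way to see what the pool reports while you reproduce the problem.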

 

Fortunately there is a workaround for this issue, provided by Eric Eilebrecht. The idea is to keep another I/O thread alive so that it never hits the 15-second idle timeout. It is not ideal, but it solves the problem. You can add the following code to the WCF service side:

    using System.Threading;

    static class ThreadPoolTimeoutWorkaround
    {
        // Static fields keep both objects reachable after DoWorkaround returns
        static ManualResetEvent s_dummyEvent;
        static RegisteredWaitHandle s_registeredWait;

        public static void DoWorkaround()
        {
            // Create an event that is never set
            s_dummyEvent = new ManualResetEvent(false);

            // Register a wait for the event, with a periodic timeout. This causes callbacks
            // to be queued to an IOCP thread, keeping it alive
            s_registeredWait = ThreadPool.RegisterWaitForSingleObject(
                s_dummyEvent,
                (state, timedOut) => {
                    // Do nothing
                },
                null,   // no state object
                1000,   // timeout in milliseconds
                false); // executeOnlyOnce = false: wait again after every timeout
        }
    }

Now you can invoke the workaround method from your service code, for example once at startup:

            ThreadPoolTimeoutWorkaround.DoWorkaround();
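If the call site might run more than once, or if you want to cancel the periodic wait at shutdown, a guarded variant could look like the sketch below. It is not part of the original workaround: the names ThreadPoolKeepAlive, Start, Stop and IsRunning are made up for illustration. The mechanism is the same RegisterWaitForSingleObject call shown above.

```csharp
using System;
using System.Threading;

static class ThreadPoolKeepAlive
{
    static readonly object s_lock = new object();
    static ManualResetEvent s_dummyEvent;
    static RegisteredWaitHandle s_registeredWait;

    public static bool IsRunning
    {
        get { lock (s_lock) { return s_registeredWait != null; } }
    }

    // Safe to call more than once; only the first call registers the wait.
    public static void Start()
    {
        lock (s_lock)
        {
            if (s_registeredWait != null) return;
            s_dummyEvent = new ManualResetEvent(false);   // never set
            s_registeredWait = ThreadPool.RegisterWaitForSingleObject(
                s_dummyEvent, (state, timedOut) => { }, null, 1000, false);
        }
    }

    // Cancels the periodic wait, for example after the service host is closed.
    public static void Stop()
    {
        lock (s_lock)
        {
            if (s_registeredWait == null) return;
            s_registeredWait.Unregister(null);
            s_registeredWait = null;
            s_dummyEvent.Close();
            s_dummyEvent = null;
        }
    }
}
```

A self-hosted service could call ThreadPoolKeepAlive.Start() before opening its host and ThreadPoolKeepAlive.Stop() after closing it.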

 

This workaround uses a wait thread to wait for an event that is never set. When the wait times out (after one second), it queues a packet to the I/O completion port (IOCP). The packet does nothing except wake up the IOCP, which keeps the I/O thread from timing out (the 15-second timeout mentioned above). Since this is a repeating wait that always times out, packets keep being queued to the IOCP and at least one thread is kept alive forever. As long as a single I/O thread stays “free” in this manner, a new I/O thread can be created quickly. Note that for .NET 3.5 you also need to install the QFE that I mentioned in my previous blog entry.
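To see the repeating timeout in action, you can count how often the callback fires. The following is a minimal sketch with made-up names (RepeatingWaitDemo, CountTimeouts). It only shows that a wait on a never-set event keeps timing out and invoking its callback. Which kind of pool thread runs the callback depends on the runtime, so the sketch makes no claim about that.

```csharp
using System;
using System.Threading;

static class RepeatingWaitDemo
{
    // Registers a repeating wait on an event that is never set and returns how many
    // timeout callbacks fired while the caller slept for 'observeMilliseconds'.
    public static int CountTimeouts(int periodMilliseconds, int observeMilliseconds)
    {
        int timeouts = 0;
        using (var neverSet = new ManualResetEvent(false))
        {
            RegisteredWaitHandle wait = ThreadPool.RegisterWaitForSingleObject(
                neverSet,
                (state, timedOut) =>
                {
                    // timedOut is true because the event is never signaled
                    if (timedOut) Interlocked.Increment(ref timeouts);
                },
                null,
                periodMilliseconds,
                false);   // repeat after every timeout

            Thread.Sleep(observeMilliseconds);
            wait.Unregister(null);   // stop further callbacks
        }
        // Read the counter with a full memory barrier
        return Interlocked.CompareExchange(ref timeouts, 0, 0);
    }
}
```

For example, RepeatingWaitDemo.CountTimeouts(1000, 5500) observes the one-second period used by the workaround for a few seconds. The exact count varies with scheduling.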

With this workaround in place, the results look like this:

First message invocation:

Latency for thread 0: 0.0300781 seconds

Sleep for 15000 ms

Latency for thread 0: 0.0032901 seconds

The sample code is attached.