Background:

 

A customer has a WCF service that calls a COM component developed in VB6. The customer reports that performance degrades after the service has been running for some time, and that WCF response times grow longer and longer.

We captured a hang dump and found that most of the WCF worker threads were blocked on a call to the STA thread of the VB6 COM component.

 

Root Cause:

 

A WCF worker thread is an MTA thread by default. When multiple WCF MTA threads call an STA COM object, every call is marshaled through a proxy/stub pair to the single STA thread that hosts the object, so the calls are serialized on that thread.

 

In plain terms:

 

Suppose 100 WCF worker threads call the STA COM object concurrently. 99 of them are queued while only one call at a time is serviced by the STA COM thread. Under load this becomes a performance bottleneck.
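The queuing effect described above can be illustrated without any COM at all. The following is a minimal simulation, not the COM runtime's actual mechanism: a single thread drains a queue of calls to stand in for the STA thread, and each caller blocks until its call has run. The class name, the caller count, and the call duration are hypothetical values chosen for illustration, and the sketch assumes .NET 4.0 or later for BlockingCollection.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;

// Simulation only: no COM is involved. One "apartment" thread drains a queue
// of calls, which models how calls from MTA threads are funneled to an STA.
public static class StaSerializationDemo
{
    const int CallerCount = 10;   // hypothetical number of concurrent callers
    const int CallMillis = 50;    // hypothetical duration of one call

    // Every caller posts its call to one shared thread and waits for it.
    public static long RunThroughSingleThread()
    {
        using (var queue = new BlockingCollection<Action>())
        {
            var apartment = new Thread(() =>
            {
                foreach (Action call in queue.GetConsumingEnumerable())
                    call();   // calls run strictly one at a time
            });
            apartment.Start();

            var watch = Stopwatch.StartNew();
            var callers = new Thread[CallerCount];
            for (int i = 0; i < CallerCount; i++)
            {
                callers[i] = new Thread(() =>
                {
                    var done = new ManualResetEventSlim(false);
                    queue.Add(() => { Thread.Sleep(CallMillis); done.Set(); });
                    done.Wait();   // the caller blocks until its call has run
                });
                callers[i].Start();
            }
            foreach (Thread t in callers) t.Join();
            watch.Stop();

            queue.CompleteAdding();   // lets the apartment thread exit
            apartment.Join();
            return watch.ElapsedMilliseconds;
        }
    }

    // Every caller does the same work on its own thread, with no shared queue.
    public static long RunOnOwnThreads()
    {
        var watch = Stopwatch.StartNew();
        var callers = new Thread[CallerCount];
        for (int i = 0; i < CallerCount; i++)
        {
            callers[i] = new Thread(() => Thread.Sleep(CallMillis));
            callers[i].Start();
        }
        foreach (Thread t in callers) t.Join();
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        Console.WriteLine("Single apartment thread: {0} ms", RunThroughSingleThread());
        Console.WriteLine("One thread per caller:   {0} ms", RunOnOwnThreads());
    }
}
```

With these values, the first figure should be roughly the caller count times the call duration, while the second should stay close to a single call duration. The same ratio applies to the 100-thread case above.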

We can reproduce this issue with a WCF service that calls an STA COM component developed in VB6.

A typical call stack of a WCF MTA thread that is waiting on the target STA thread looks like this:

 

0447e9c8 7c827b79 ntdll!KiFastSystemCallRet
0447e9cc 77e61d1e ntdll!ZwWaitForSingleObject+0xc
0447ea3c 77e61c8d KERNEL32!WaitForSingleObjectEx+0xac
0447ea50 7769c7a5 KERNEL32!WaitForSingleObject+0x12
0447ea6c 7778b5cb ole32!GetToSTA+0x7c
0447ea8c 7778c38b ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0xcb
0447eb6c 776c0575 ole32!CRpcChannelBuffer::SendReceive2+0xd3
0447ebd8 776c050a ole32!CAptRpcChnl::SendReceive+0xab
0447ec2c 77ce347f ole32!CCtxComChnl::SendReceive+0x1a9
0447ec48 77ce352f RPCRT4!NdrProxySendReceive+0x43
0447f038 77ce35a6 RPCRT4!NdrClientCall2+0x206
0447f058 77c65037 RPCRT4!ObjectStublessClient+0x8b
0447f068 79f18cf0 RPCRT4!ObjectStubless+0xf
0447f150 009cb982 mscorwks!CLRToCOMWorker+0x19a
WARNING: Frame IP not in any known module. Following frames may be wrong.
0447f224 5136ae48 <Unloaded_ure.dll>+0x9cb981
0447f2d8 5136a662 System_ServiceModel_ni+0xb6ae48
0447f334 50af00c3 System_ServiceModel_ni+0xb6a662

 

Solution:

 

We need to make the WCF operation run on an STA thread when it calls the STA COM object. No apartment switch is then required, and the call is processed on the same STA thread that created the COM object. Multiple WCF requests can then call the STA COM component concurrently, each within its own thread context.

The sample code looks like this:

 

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;
using System.Threading;

 

// Operation behavior that makes WCF run the decorated operation on an STA thread.
public class STAOperationBehaviorAttribute : Attribute, IOperationBehavior
{
    public void AddBindingParameters(OperationDescription operationDescription,
        System.ServiceModel.Channels.BindingParameterCollection bindingParameters)
    {
        // No binding parameters are needed.
    }

    public void ApplyClientBehavior(OperationDescription operationDescription,
        System.ServiceModel.Dispatcher.ClientOperation clientOperation)
    {
        // Nothing to do on the client side.
    }

    public void ApplyDispatchBehavior(OperationDescription operationDescription,
        System.ServiceModel.Dispatcher.DispatchOperation dispatchOperation)
    {
        // Replace the IOperationInvoker for this operation with one that
        // runs the original invoker on an STA thread.
        dispatchOperation.Invoker = new STAOperationInvoker(dispatchOperation.Invoker);
    }

    public void Validate(OperationDescription operationDescription)
    {
        if (operationDescription.SyncMethod == null)
        {
            throw new InvalidOperationException("The STAOperationBehaviorAttribute " +
                "only works for synchronous method invocations.");
        }
    }
}

 

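// Invoker that wraps the default invoker and runs each call on a new STA thread.
public class STAOperationInvoker : IOperationInvoker
{
    IOperationInvoker _innerInvoker;

    public STAOperationInvoker(IOperationInvoker invoker)
    {
        _innerInvoker = invoker;
    }

    public object[] AllocateInputs()
    {
        return _innerInvoker.AllocateInputs();
    }

    public object Invoke(object instance, object[] inputs, out object[] outputs)
    {
        // An out parameter cannot be captured by the anonymous method,
        // so the results are collected in locals first.
        object[] staOutputs = null;
        object retval = null;
        Exception error = null;

        // Create a new STA thread and run the operation on it.
        Thread thread = new Thread(
            delegate()
            {
                try
                {
                    retval = _innerInvoker.Invoke(instance, inputs, out staOutputs);
                }
                catch (Exception ex)
                {
                    // An unhandled exception on this thread would terminate
                    // the process, so hand it back to the calling thread.
                    error = ex;
                }
            });
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();

        // The WCF worker thread waits until the STA thread has finished.
        thread.Join();

        outputs = staOutputs;
        if (error != null)
        {
            // Rethrow on the WCF thread so that normal fault handling applies.
            throw error;
        }
        return retval;
    }

    public IAsyncResult InvokeBegin(object instance, object[] inputs,
        AsyncCallback callback, object state)
    {
        // We don't handle async...
        throw new NotImplementedException();
    }

    public object InvokeEnd(object instance, out object[] outputs, IAsyncResult result)
    {
        // We don't handle async...
        throw new NotImplementedException();
    }

    public bool IsSynchronous
    {
        get { return true; }
    }
}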

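// In the service implementation class, mark the operation that calls the COM object:
[STAOperationBehavior]
public string SleepSTA()
{
    // The COM object is created on the STA thread started by STAOperationInvoker,
    // so calls on it need no apartment switch.
    VBCOMSleep.HelloWorldClass obj = new VBCOMSleep.HelloWorldClass();
    string result = null;
    // ... call the COM method on obj here and assign its return value to result ...
    return result;
}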
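To confirm that an operation really runs on an STA thread after the behavior is applied, the apartment state of the current thread can be checked at run time. The following is a minimal standalone sketch of that check, outside WCF. The ApartmentCheck class and its Probe method are hypothetical names introduced for illustration. The sketch assumes Windows, because on other platforms the runtime does not support STA and the observed state is expected to be Unknown.

```csharp
using System;
using System.Threading;

public static class ApartmentCheck
{
    // Starts a thread with the requested apartment state and returns the
    // state that the thread itself reports while it runs.
    public static ApartmentState Probe(ApartmentState requested)
    {
        ApartmentState observed = ApartmentState.Unknown;
        var thread = new Thread(() =>
        {
            observed = Thread.CurrentThread.GetApartmentState();
        });

        // TrySetApartmentState returns false instead of throwing when the
        // requested state cannot be applied (for example, STA off Windows).
        thread.TrySetApartmentState(requested);
        thread.Start();
        thread.Join();
        return observed;
    }

    public static void Main()
    {
        Console.WriteLine("Requested STA, observed {0}", Probe(ApartmentState.STA));
        Console.WriteLine("Requested MTA, observed {0}", Probe(ApartmentState.MTA));
    }
}
```

The same Thread.CurrentThread.GetApartmentState() call can be logged temporarily inside an operation such as SleepSTA to verify which apartment the operation body runs in.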

 

References:

Running ASMX Web Services on STA Threads
http://msdn.microsoft.com/en-us/magazine/cc163544.aspx

Calling an STA COM Object from a WCF Operation
http://www.scottseely.com/Blog/09-07-17/Calling_an_STA_COM_Object_from_a_WCF_Operation.aspx

FIX: Severe Performance Issues When You Bind Session State to Threads in ASPCompat Mode
http://support.microsoft.com/default.aspx?scid=kb;EN-US;817005

 

 Regards,

RenHe from APGC DSI Team