How fast is interop code? If you’re in one kind of code and you’re calling another, what is the cost of the interop?
For example, .Net code can call native C++ code (like Windows APIs) and vice versa. Similarly with Foxpro and C++ code. .Net code is often referred to as Managed code because much is managed for the programmer, such as memory allocation. That leaves C++ code to be called “Unmanaged”. An easy way to interop with C++ code is to use COM (Component Object Model, or sometimes ActiveX) as glue. Whether it’s COM calling .Net or vice versa, the managed boundary is traversed twice: there and back. Similarly with Fox code calling COM code.
Fox code calling .Net code (e.g. A Visual Basic COM object is simple to create, call and debug from Excel) will have both Fox to COM and COM to .Net interop.
I want to measure raw interop performance, so I want to remove memory allocation and Unicode/String marshalling issues from the tests. I want to have a loop on one side call a very fast method on the other, so that most of the execution time is in the interop, not the loop or the method call. I want to use in-process, same thread calls, so remote procedure calls/marshalling are not being measured.
We’ll create a native C++ method that just returns consecutive integers. A simple loop in the .Net or Fox client that calls this method and keeps a running total would be a good perf test.
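The shape of that test can be sketched in plain C++ before any COM is involved. This is only a minimal sketch under stated assumptions: `NextInt` and `TimeLoop` are hypothetical names that are not part of the control, and `std::chrono` stands in for the Fox `SECONDS()` and .Net `Stopwatch` timers used later.

```cpp
#include <chrono>

// Hypothetical stand-in for the method under test: returns 1, 2, 3, ...
static long g_counter = 0;
long NextInt() { return ++g_counter; }

// Calls fn nTimes, keeps a running total, and reports the elapsed seconds.
long long TimeLoop(long (*fn)(), long nTimes, double* elapsedSeconds)
{
    auto start = std::chrono::steady_clock::now();
    long long sum = 0;                       // 64 bit so the total can't overflow
    for (long i = 0; i < nTimes; i++)
        sum += fn();                         // the call whose cost is being measured
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    *elapsedSeconds = elapsed.count();
    return sum;
}
```

Calling `TimeLoop(NextInt, 1000000, &secs)` gives the running total and the time for a million calls. Because the method body is tiny, most of the measured time is call overhead, which is the point of the test.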
Start with this sample ActiveX control code: Create an ActiveX control using ATL that you can use from Fox, Excel, VB6, VB.Net. You don’t need the events and methods from that sample, just the control itself.
(If you’re using VS2008, in the ATL Project wizard, select DLL and just choose “Finish”. When adding a method in Class View, make sure to choose the ITestCtrl Interface (defined in MyCtrl.IDL), not the ITestCtrl VCCodeStruct defined in MyCtrl_i.h. Similarly, if you’re adding an event, make sure to choose the _ITestCtrlEvents interface under MyCtrlLib in Class View. Also, you need to run the “Implement ConnectionPoint Wizard” and change the call to “Fire_MyEvent”, see http://msdn.microsoft.com/en-us/library/9h7xedd1.aspx)
When COM code is called from VB.Net or FoxPro, the calls are not quite direct: COM is used for creating and initializing the object, and some parameter/return value massaging is required per call. Then it’s a straight virtual function (vTable) call, either to IDispatch (late bound) or to the IUnknown-derived interface (early bound). IOW, the performance would be slower than a direct PInvoke or DECLARE DLL call.
Let’s add a simple method RetInt with no parameters that just returns an int. Add a method to our COM Control by right clicking on the ITestCtrl interface in Class View and choosing Add->Method to start the “Add Method Wizard”
Since all COM interface method calls return HRESULTS, to return a value an additional parameter is added and marked with the RetVal attribute and is passed by ref. So make the Method Name “RetInt”, the Parameter type “LONG *”, and the Parameter Name “RetVal”. Choose the Retval checkbox. Then choose “Add” to add the param to the method.
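The HRESULT-plus-retval convention can be sketched without the Windows headers. This is a sketch under assumptions, not what the wizard generates: `HResultSketch`, `kOk`, `kBadPointer`, `RetIntSketch` and `CallAsFunction` are hypothetical names, and real code would use `HRESULT`, `S_OK` and `E_POINTER`.

```cpp
#include <cstdint>

// Stand-ins so the sketch compiles anywhere (assumption: real code uses
// HRESULT, S_OK and E_POINTER from the Windows headers).
typedef std::int32_t HResultSketch;
const HResultSketch kOk = 0;                                             // like S_OK
const HResultSketch kBadPointer = static_cast<HResultSketch>(0x80004003u); // like E_POINTER

static long g_value = 0;

// COM-style shape: the status code is the return value, and the real
// result goes out through the pointer parameter marked retval.
HResultSketch RetIntSketch(long* retVal)
{
    if (retVal == nullptr)
        return kBadPointer;
    *retVal = ++g_value;
    return kOk;
}

// Roughly what a client language does for you: hide the status code and
// hand the retval back as if it were the function result.
long CallAsFunction()
{
    long result = 0;
    HResultSketch hr = RetIntSketch(&result);
    // Failure codes are negative. A real client would raise an error;
    // returning -1 is just for this sketch.
    return hr < 0 ? -1 : result;
}
```

This is why the Fox and VB code later can write `zObj.RetInt()` as an expression even though the C++ method returns an HRESULT.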
Add another method DoSum similarly. This method will run with no interop, so we have a baseline for comparison. (It runs the loop multiple times because it goes so much faster, but the timing measurement divides out the multiple runs.)
The resulting code is added to TestCtrl.CPP. Add the implementation:
static LONG g_Int = 0;
STDMETHODIMP CTestCtrl::RetInt(LONG* RetVal)
{
    *RetVal = ++g_Int; // just return consecutive integers
    return S_OK;
}
// DoSum will calculate the value with no interop whatsoever
STDMETHODIMP CTestCtrl::DoSum(LONG nTimes, LONG nInternalLoopCount, DOUBLE* Retval)
{
    LONGLONG nSum = 0; // 64 bit running sum of 32 bit values
    for (LONG j = 0 ; j < nInternalLoopCount ; j++) // this code runs so fast we have to do it multiple times
    {
        nSum = 0;
        for (LONG i = 1 ; i <= nTimes ; i++)
            nSum += i;
    }
    *Retval = (DOUBLE)nSum;
    return S_OK;
}
// RetIntStatic can be called directly via PInvoke or Declare Dll
extern "C" HRESULT __declspec(dllexport) WINAPI RetIntStatic(LONG *RetVal)
{
    *RetVal = ++g_Int;
    return S_OK;
}
You can add more methods, like a way to reset g_Int to get more accurate results, but I don’t really care about the results, just how long it takes to get them.
Of course, you’ll want to run perf tests using optimized Release builds, so you’re not including debug asserts, etc. A really smart optimizing compiler would remove the loops in DoSum altogether!
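To see what “removing the loops” could mean, compare the inner loop with its closed form n(n+1)/2. This is just an illustrative sketch: `SumByLoop` and `SumClosedForm` are hypothetical names, and whether a particular compiler actually makes this substitution is not something this post measures.

```cpp
// Sum 1..nTimes the way DoSum's inner loop does.
long long SumByLoop(long nTimes)
{
    long long nSum = 0;
    for (long i = 1; i <= nTimes; i++)
        nSum += i;
    return nSum;
}

// The closed form that could replace the whole loop: n(n+1)/2.
long long SumClosedForm(long nTimes)
{
    if (nTimes < 1)
        return 0;                  // the loop body never runs in this case
    long long n = nTimes;          // widen first so n * (n + 1) is 64 bit
    return n * (n + 1) / 2;
}
```

Both functions return the same value for the same input, so a timing of the loop version is only meaningful if the loop really executes.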
If you have Foxpro, try running this Fox code. Notice that DoLoop can take either the Form or the Control as a parameter. There’s a RetInt method on each.
MODIFY COMMAND PROGRAM() NOWAIT
_screen.FontName="Courier New" && Make font monospace, not proportional
SET DECIMALS TO 6
zObj=ox.oc && use temp var so we don't deref ox.oc in loop
r = zObj.DoSum(nLoops,nInternalLoopCnt)
?"Internal DoSum ",r,(SECONDS()-ns)/nInternalLoopCnt
?DoLoop(ox,nLoops, "With No Interop" )
?DoLoop(ox.oc,nLoops,"With COM Interop")
*Use early binding:
?DoLoop(oy,nLoops,"With COM Interop Early Bound")
*Try Declare DLL: like PInvoke
DECLARE integer _RetIntStatic@4 IN "d:\dev\vc\myctrl\release\myctrl.dll" as RetIntStatic integer @ Retval
?DoLoopStatic(nLoops,"With DeclareDLL interop")
FUNCTION DoLoop(zObj as object,nTimes as Integer, sDesc as String) as String
	nSum = 0
	ns = SECONDS()
	FOR i = 1 TO nTimes
		nSum = nSum + zObj.RetInt()
	ENDFOR
	RETURN sDesc+" Sum= "+TRANSFORM(nSum) + " "+ TRANSFORM(SECONDS()-ns)
ENDFUNC
FUNCTION DoLoopStatic(nTimes as Integer, sDesc as String) as String
nSum = nSum + nRetval
DEFINE CLASS MyForm as Form
ADD OBJECT OC as olecontrol WITH ;
PROCEDURE RetInt as Integer
g_Int = g_Int+1
The DoSum call (Fox and VB) was consistent as expected: they both execute in about the same time because there is only one interop call.
I consistently saw the COM Interop loop taking about 50% longer than the non interop loop. This makes sense. The code that calls the COM object has to deal with all sorts of parameter types, marshalling, etc. The non interop did the entire calculation within Fox code.
The DoSum method, which has its own internal loop to do the calculation and does NO interop of any kind in the loop, runs roughly 2000 times faster. That implies about 2000 times more instructions are executed per iteration in the interop loop.
Now I want to run a similar test using VB.Net. Let’s add a new project to the ActiveX control project from above.
Choose the Solution Explorer, right click on the solution, choose Add New Project, VB->Windows Forms Application. I put my VB Project within the folder of the TestCtrl project.
Right click on the project, and choose “Set As Startup Project” so hitting F5 will start this project.
If you’re on a 64 bit OS, then make sure you target x86, because the control is a 32 bit DLL (Project->Properties->Compile->Advanced Compile Options->Target CPU->x86).
Add the ActiveX control to your toolbox: Right click on the toolbox, choose items\COM Components…TestCtrl class.
Now drag the control from the toolbox onto the form. Double-click on the form and paste in this code:
Public Class Form1
'Note the path: "..\..\..\Release\MyCtrl.dll"
Friend Shared Function RetIntStatic(ByRef RetVal As Integer) As Integer
Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
Dim nLoops = 1000000
Dim nInternalLoopCnt = 1000
Dim sStopWatch = Stopwatch.StartNew
Dim r = Me.AxTestCtrl1.DoSum(nLoops, nInternalLoopCnt)
Console.WriteLine("Internal DoSum Native=" + r.ToString + " " + (sStopWatch.ElapsedMilliseconds / 1000 / nInternalLoopCnt).ToString)
sStopWatch = Stopwatch.StartNew
r = Me.DoSum(nLoops, nInternalLoopCnt)
Console.WriteLine("Internal DoSum .Net =" + r.ToString + " " + (sStopWatch.ElapsedMilliseconds / 1000 / nInternalLoopCnt).ToString)
Console.WriteLine(DoLoop(Me, nLoops, "With Late bound No interop, calling local VB.Net method"))
Console.WriteLine(DoLoop(Me.AxTestCtrl1, nLoops, "With Late bound COM interop"))
Console.WriteLine(DoLoopEarlyForm(Me, nLoops, "With Early bound No interop, calling local VB.Net method"))
Console.WriteLine(DoLoopEarlyCtrl(Me.AxTestCtrl1, nLoops, "With Early bound COM interop"))
Console.WriteLine(DoLoopPInvoke(nLoops, "With PInvoke interop"))
Function DoLoop(ByVal zObj As Object, ByVal nTimes As Integer, ByVal sDesc As String) As String
    Dim nSum = 0L
    Dim sStopWatch = Stopwatch.StartNew
    For i = 1 To nTimes
        nSum += zObj.RetInt
    Next
    Return sDesc + " Sum = " + nSum.ToString + " " + (sStopWatch.ElapsedMilliseconds / 1000).ToString()
End Function
Function DoLoopEarlyForm(ByVal zObj As Form1, ByVal nTimes As Integer, ByVal sDesc As String) As String
Function DoLoopEarlyCtrl(ByVal zObj As AxMyCtrlLib.AxTestCtrl, ByVal nTimes As Integer, ByVal sDesc As String) As String
Dim nSum = 0L 'L for Long so doesn't overflow 32 bits
Function DoLoopPInvoke(ByVal nTimes As Integer, ByVal sDesc As String) As String
Dim RetVal = 0
nSum += RetVal
    Private Shared g_Int As Long
    Public Function RetInt() As Long
        g_Int += 1
        Return g_Int
    End Function
    Function DoSum(ByVal nTimes As Long, ByVal nInternalLoopCount As Long) As Double
        Dim nSum As Long
        For j = 1 To nInternalLoopCount ' calculated multiple times because it's fast
            nSum = 0
            For i = 1 To nTimes
                nSum += i
            Next
        Next
        Return nSum
    End Function
End Class
Run the code with the Output Window visible. Here, the VB code with interop ran maybe 40% slower, and several times slower than the Fox code. I realized that this was because of the late binding calls the VB code does. The VB DoLoop method takes zObj as an Object, and I invoke the RetInt method on it. That means the VB runtime late binder code is called to reflect on the object and see if it has a RetInt method that can be called. Both the Form and the control have a method with this name. The late binding was code that I didn’t want to measure, so I added some strongly typed calls that forced the calls to be early bound direct calls, which were much faster. For the non-interop code doing the entire calculation within VB, the late bound call was around 1000 times slower than the early bound one, due to the late binder code. For the interop case, the late bound call was about 30 times slower than the early bound one.
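The difference between the two kinds of call can be sketched in C++. This is a deliberately simplified model with hypothetical names (`Counter`, `InvokeByName`): the real VB late binder uses reflection, and IDispatch uses GetIDsOfNames and Invoke, but the extra step is the same idea, a lookup by name at run time before the actual call.

```cpp
#include <map>
#include <string>

struct Counter
{
    long value = 0;
    virtual ~Counter() {}
    virtual long RetInt() { return ++value; }    // early bound: one vtable call
};

typedef long (*Thunk)(Counter&);
static long CallRetInt(Counter& c) { return c.RetInt(); }

// Hypothetical late binder: resolves the method by name on every call.
long InvokeByName(Counter& c, const std::string& name)
{
    static const std::map<std::string, Thunk> methods = { { "RetInt", CallRetInt } };
    std::map<std::string, Thunk>::const_iterator it = methods.find(name);
    if (it == methods.end())
        return -1;                 // a real late binder would raise an error
    return it->second(c);          // found it: now make the actual call
}
```

`c.RetInt()` is resolved at compile time, while `InvokeByName(c, "RetInt")` pays for a string lookup on every iteration of the loop. A real late binder does far more work than this map lookup, which fits the large ratios reported above.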
(Comparing .Net speed with native, the DoSum call (with no interop at all) in .Net was almost 3 times slower than Native, but that’s expected too: native code runs faster than managed.)
These early bound calls are several times faster than the Fox code too: they don’t have to do any parameter packing/checking.
However, even when the Fox code uses early binding, Fox still has to do a lot of parameter translation between Fox types and COM types.
The Fox and VB calls via PInvoke/Declare DLL were the fastest of all. They have to do the least parameter translation/packing/checking. This makes sense: the method call is declared to have N parameters of certain types, so less work needs to be done.
Using ILDasm to see the IL for the RetInt, you can see that there isn’t much code. The Fox code for RetInt, however, causes much more code to run.
.method public instance int64 RetInt() cil managed
// Code size 24 (0x18)
.locals init ( int64 RetInt)
IL_0001: ldsfld int64 WindowsApplication1.Form1::g_Int
IL_0009: stsfld int64 WindowsApplication1.Form1::g_Int
IL_000e: ldsfld int64 WindowsApplication1.Form1::g_Int
IL_0014: br.s IL_0016
} // end of method Form1::RetInt
Or use the debugger to see the native code in DoSum: (cdq is Convert Doubleword to Quadword: it sign-extends EAX into the EDX:EAX pair)
for (LONG i = 1 ; i <= nTimes ; i++)
692B1DF1 8B C1 mov eax,ecx
692B1DF3 99 cdq
692B1DF4 03 D8 add ebx,eax
692B1DF6 13 EA adc ebp,edx
692B1DF8 8D 41 01 lea eax,[ecx+1]
692B1DFB 99 cdq
692B1DFC 03 F0 add esi,eax
692B1DFE 8B 44 24 20 mov eax,dword ptr [esp+20h]
692B1E02 13 FA adc edi,edx
692B1E04 83 C1 02 add ecx,2
692B1E07 48 dec eax
692B1E08 3B C8 cmp ecx,eax
692B1E0A 7E E5 jle CTestCtrl::DoSum+21h (692B1DF1h)
692B1E0C 3B 4C 24 20 cmp ecx,dword ptr [esp+20h]
692B1E10 7F 0B jg CTestCtrl::DoSum+4Dh (692B1E1Dh)
nSum += i;
692B1E12 8B C1 mov eax,ecx
692B1E14 99 cdq
692B1E15 89 44 24 10 mov dword ptr [esp+10h],eax
692B1E19 89 54 24 14 mov dword ptr [esp+14h],edx
This (optimized) code sums 32 bit values to a 64 bit running sum, so you can see instructions like “ADC”, which is AddWithCarry.
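The cdq/add/adc sequence can be mimicked in C++ by holding the 64 bit sum as two 32 bit halves. This is an illustrative sketch of the arithmetic only, with hypothetical names (`Sum64`, `AddInt32`, `ToInt64`); it is not what the compiler generated.

```cpp
#include <cstdint>

// A 64 bit sum held as two 32 bit halves, like the register pairs above.
struct Sum64 { std::uint32_t lo; std::uint32_t hi; };

// Add a signed 32 bit value: sign-extend it (cdq), add the low halves (add),
// then add the high halves plus the carry out of the low add (adc).
void AddInt32(Sum64& sum, std::int32_t value)
{
    std::uint32_t valueLo = static_cast<std::uint32_t>(value);
    std::uint32_t valueHi = value < 0 ? 0xFFFFFFFFu : 0u;   // cdq
    std::uint32_t newLo = sum.lo + valueLo;                  // add
    std::uint32_t carry = newLo < valueLo ? 1u : 0u;         // carry flag
    sum.hi = sum.hi + valueHi + carry;                       // adc
    sum.lo = newLo;
}

// Reassemble the two halves into one signed 64 bit value.
std::int64_t ToInt64(const Sum64& sum)
{
    return static_cast<std::int64_t>(
        (static_cast<std::uint64_t>(sum.hi) << 32) | sum.lo);
}
```

Adding 1 to a sum whose low half is 0xFFFFFFFF wraps the low half to 0 and carries 1 into the high half, which is exactly the job ADC does in the listing.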
As an exercise, on 64 bit, create code like DoSum that natively handles 64 bit ints (or modify this code to use just 32 bits). You’ll see that the loop is trivial.
Hint: make sure you have the 64 bit tools installed.