Introduction: What does ‘foreach’ actually do?

 

It is not uncommon for a group that is new to managed code to pepper the CLR team with performance questions.  They want to know how expensive ‘foreach’ is, or whether certain methods get inlined, or a variety of other things about the quality of the generated code.

 

Sometimes people look at the IL in an attempt to answer questions about the code quality of managed code, but this approach is misguided.  The JIT compiler transforms the code tremendously in going from IL to native code, and thus for most performance questions the quality of the IL does not correlate strongly with the ultimate performance of the code.  For codegen performance questions, what matters most is the quality of the JIT compiler.

 

Often internal partners will simply ask us questions about what code gets generated for a particular construct, and every time they do this I think ‘Why should I have to do this?  Run it under a debugger and see for yourself!’  This blog entry gives step-by-step instructions for doing just this in Visual Studio.

 

Conceptually, the process of determining what native code is generated for a construct is very straightforward:

  1. Find or generate code that has the construct you wish to understand
  2. Compile the code with optimizations turned on
  3. Run it under a debugger, step to the interesting code, and view it in the disassembly window.

 

Unfortunately, while all of these are possible to do in Visual Studio, the defaults work against you and you have to know how to change them. 

 

To make things concrete, I will go through the steps for an example.  I happen to be interested in looking at the overhead of a foreach() over a List<int>.

 

[System.Runtime.CompilerServices.MethodImpl(System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]

static void F(int arg) {}

 

static void ForEachTest(List<int> list) {

    foreach (int elem in list) {

        F(elem);

    }

}
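If you want to try this yourself, the two methods above need a class and an entry point around them.  The following is a minimal sketch of such a console program; the class name ‘Program’, the ‘Main’ method and the ten-element list are my own assumptions for illustration and are not part of the original example.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public static class Program
{
    // NoInlining keeps the call to F visible as a real call in the disassembly.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void F(int arg) { }

    // The construct under study: foreach over a List<int>.
    public static void ForEachTest(List<int> list)
    {
        foreach (int elem in list)
        {
            F(elem);
        }
    }

    // Hypothetical driver: builds a small list and runs the loop once.
    public static void Main()
    {
        List<int> list = new List<int>();
        for (int i = 0; i < 10; i++)
        {
            list.Add(i);
        }

        ForEachTest(list);   // set the breakpoint inside ForEachTest
        Console.WriteLine("done");
    }
}
```

The size and contents of the list do not matter here; the only goal is to make sure ForEachTest actually runs, so that the JIT compiles it and a breakpoint inside it is hit.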

 

Compiling the code with optimizations:

This part is almost straightforward in Visual Studio.  You would like it to be the case that simply changing ‘Debug’ to ‘Release’ in the Solution Configuration window at the top-center of the VS IDE is sufficient.  Unfortunately this is simply not true for C# projects (I have not checked VB or C++ projects).

 

The issue is that by default release builds do NOT build the program database (PDB) files that hold the line number and local variable information.  Without them the debugger cannot correlate the assembly code back to source lines, which makes the assembly code for large methods much more difficult to read.  To fix this, go to the properties for the project (right click on the project in the Solution Explorer), select the ‘Build’ tab and click the ‘Advanced’ button at the bottom.  In the dialog box that comes up there will be a line ‘Debug Info’.  It should be set to ‘pdb-only’.  This tells the compiler to compile with optimizations but also create a PDB file.

 

With that done, make certain your Solution Configuration  window says ‘Release’ and compile your solution (F6). 
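One way to double-check that the binary you just built really is an optimized build is to look at the DebuggableAttribute the C# compiler puts on the assembly.  The sketch below is an assumption-laden helper of my own (the names ‘BuildCheck’ and ‘IsJitOptimizerDisabled’ are hypothetical).  It only reports what the compiler requested at build time; it cannot tell you whether the debugger is suppressing JIT optimizations at run time.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class BuildCheck
{
    // Returns true if the assembly carries a DebuggableAttribute that asks
    // the JIT to disable optimizations (what a typical 'Debug' build does).
    public static bool IsJitOptimizerDisabled(Assembly assembly)
    {
        object[] attributes =
            assembly.GetCustomAttributes(typeof(DebuggableAttribute), false);

        foreach (DebuggableAttribute attribute in attributes)
        {
            if (attribute.IsJITOptimizerDisabled)
            {
                return true;
            }
        }

        // No attribute, or none that disables the optimizer.
        return false;
    }

    public static void Main()
    {
        bool disabled = IsJitOptimizerDisabled(typeof(BuildCheck).Assembly);
        Console.WriteLine(disabled
            ? "Compiled with JIT optimizations disabled (looks like a Debug build)."
            : "Compiled with JIT optimizations allowed (looks like a Release build).");
    }
}
```

If this reports that optimizations are disabled while the Solution Configuration says ‘Release’, the project settings are probably not what you think they are.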

 

Running optimized managed code:

 

Sadly, Visual Studio makes debugging optimized code harder than you would like.  First, by default, if you launch a managed program under Visual Studio, it will force the JIT compiler to create non-optimized native code.  This improves debuggability, but makes it impossible to see the code that actually runs when the app is not run under Visual Studio.  The other problem is that Visual Studio has a feature called ‘Just My Code’ in which the debugger will not step into code that it does not believe is being developed.  In particular any optimized code is not considered ‘My Code’ and is skipped.  Again this makes it impossible to actually stop and see the optimized code.

 

To fix both of these problems:

 

1)      Go to Tools -> Options -> Debugging -> General  

2)      Make certain that the box labeled ‘Suppress JIT optimization on module load’ is UNCHECKED.

3)      Make certain that the box labeled ‘Enable Just My Code’ is also UNCHECKED.

 

Note that these are global settings (they affect all solutions); however, I have found I don’t miss either of these ‘features’ (code compiled ‘Debug’ is still not optimized by the JIT, so you still get good debugging there).

 

Once you have disabled these ‘features’, set a breakpoint at an interesting spot (F9) and run the program (F5).

 

Examining the optimized code:

 

Once your breakpoint is hit, it is a simple matter of switching to the disassembly window (Ctrl+D, Ctrl+D, or Debug -> Windows -> Disassembly) to look at your code.

 

You should not see any ‘nop’ instructions in the code.  If you do, this means you are looking at a ‘Debug’ build of your code.  It is not a bad idea to run a simple test by creating a function that returns a constant integer and confirming that it gets inlined in the resulting assembly code.  This confirms that all the preceding steps were successful.
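A sanity test along those lines might look like the sketch below.  The names ‘InlineCheck’, ‘GetConstant’ and ‘Caller’ and the value 42 are my own choices for illustration.  ‘Caller’ is marked NoInlining only so that it stays a separate method you can break in; ‘GetConstant’ is left unmarked so the JIT is free to inline it.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class InlineCheck
{
    // Trivial method that an optimizing JIT would be expected to inline.
    public static int GetConstant()
    {
        return 42;
    }

    // Kept out of line so there is a stable place to set a breakpoint.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Caller()
    {
        return GetConstant() + 1;   // set the breakpoint here
    }

    public static void Main()
    {
        Console.WriteLine(Caller());   // prints 43
    }
}
```

With the settings above in place, the disassembly for ‘Caller’ should contain no ‘call’ to ‘GetConstant’; you would expect the constant to be loaded directly instead.  If you still see a call there, one of the earlier steps most likely did not take effect.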

 

You will notice that the assembly code does not have symbolic names for the call targets.  Often this is not a hardship, because you can deduce the target from the source line associated with the assembly code.  But for things like ‘foreach’ or other constructs where the IL compiler is doing significant work, it is a hardship not to know what the symbolic name of the target is.  In the case of my example above the interesting code is

 

00000077  mov         ecx,dword ptr [ebp-24h]

                F(elem);

0000007a  call        dword ptr ds:[00913028h]

            foreach (int elem in list) {

00000080  lea         ecx,[ebp-30h]

00000083  call        7878BFF4

00000088  test        eax,eax

0000008a  jne         00000077

 

So we can be pretty sure that dword ptr ds:[00913028h] is really our F() method; however, we don’t know what this 7878BFF4 is that we call a bit further down.  You also might not know what instructions like ‘lea’ do, or how the ‘elem’ parameter is passed to our ‘F’ method.

 

Luckily all of these issues (and others) will be the topic of my next blog entry. 

 

Recap:

 

In this blog entry I have given step-by-step instructions for using Visual Studio to look at optimized managed code.  Hopefully you will agree that it really is not that bad at all, and it now gives you the tools to answer a whole host of questions such as

 

1)      Is a particular method inlined or not?

2)      Does the JIT perform optimization X?

 

The example above was for X86, but the technique works just as well for X64.  Note that the X64 JIT compiler and the X86 JIT compiler are completely different, so if you care about performance on both platforms, you will need to repeat your experiment on both.