Foreach, Garbage, and the CLR Profiler

 I've been doing a lot of thinking lately about foreach, and in what situations it may or may not create garbage. There have been some posts on the forums asking about it, and I recently found out that for some types, the enumerator that is created is a value type, not a reference type. So, foreach loops over those types shouldn't make garbage at all. Why does this matter? In an XNA Framework forum thread, Stephen Strychak, an XNA Framework Developer, explains:

"...the performance of your game will degrade if you allow garbage to be generated during gameplay. The problem again is not the for or foreach. The problem is that the garbage collector on the Xbox 360 is less efficient than the one on Windows, and doing a collection can take a long time. When you have many objects in your game, it can take longer than 1/60th of a second to do a collection. That means that you will drop frames. The more garbage you generate, the more frequently the GC will have to do a collection, and the more frequently your game will drop frames.

So the bottom line is that you should avoid generating garbage, and using for instead of foreach is one way to do that."

Shawn posted this in a follow up, which is an important point:

"...bear in mind that just because foreach creates garbage, this does not neccessarily mean you should avoid using it! I think people often get too hung up on optimisation. Fast code is good, but readable code is more important. If your program is already running fast enough, it's sometimes ok to do things the easy way, even if you know that is less efficient than it could be..."

I agree with Shawn: when I write code, I want it to be maintainable, easy to read, and easy to modify. But still, it's good to know what things may have a potential speed impact. And plus, since I'm a nerd, I wanted to look into this: the whole situation seemed like a lot of special cases and conjecture: you will make garbage if you go over this collection, but you won't if you go over that collection (but only if x is true). I wanted to get the whole story. Plus, this was an ideal reason to learn to use the Windows and Xbox 360 CLR profiling tools, which I've put off doing for some time. So, I figured I'd do just that: do a foreach loop over a bunch of collections, use the profilers to see if I'm making garbage, and hopefully we'd all learn something in the process.

I started with Windows, so the first thing I needed was the Windows CLR Profiler. I needed to learn how to use it, too. It comes with a pretty hefty document explaining it, but I found this link on msdn tv, where a friendly man gave me an overview. It's about half an hour long, and was worth watching, I think.

After watching the video, I felt fairly confident that I would be able to get the information I needed. It was time to start mucking around, to see what I can find out. In C# express, I created a new console app, and filled it in with what is quite possibly the world's dumbest code ever:

class Program
{
class GameEntity
{
public void Update()
{
}
}

static GameEntity[] entities = new GameEntity[100];
static Program()
{
for (int i = 0; i < entities.Length; i++)
{
entities[i] = new GameEntity();
}
}

static void Main(string[] args)
{
byte[] byteArray = new byte[1];
for (int i = 0; i < entities.Length; i++)
{
entities[i].Update();
}
}
}

I wanted to do this for a quick sanity check. If I had any idea what I was doing, when I use the CLR profiler on this app, I should see Main making one allocation, the 1 byte array. So I built it, and ran the CLR Profiler on it. Once it was done, I pulled up the allocation graph using by clicking this button:

 

My first look at the allocation graph was fairly terrifying. It's full of colored lines and boxes for methods I'm fairly sure I didn't write.

 

After fiddling for a minute, it turned out that as always, right clicking is the answer. I did "Find routine" and looked for Main, which gives me this view. Much better.

I've got no idea why my 1 byte array actually took up 13 bytes, but I'm not particularly worried about it. (I'm curious to know, however, so if anyone out there in readerland has any ideas, send them my way.)

 

As my next sanity check, I changed my program from an array to a Collection<GameEntity>, and changed Main to do a simple foreach over the Collection:

 

static void Main(string[] args)
{
byte[] byteArray = new byte[1];
foreach (GameEntity e in entities)
{
e.Update();
}
}

Under the hood, that should create an enumerator on the managed heap, which should show up in the profiler. So, following the exact same steps in the CLR profiler, I get this display:

There's that pesky enumerator right there, just as we expected!

Now, here comes the tricky part. Rumor has it that many of the types in System.Collections.Generic will not allocate an enumerator when using foreach. List's GetEnumerator, for example, returns a struct, which will just sit on the stack. Look for yourself with .NET Reflector, if you don't believe me. To prove to myself that a foreach over a List doesn't cause any heap allocations, I changed entities to be a List, did the exact same foreach loop, and ran the profiler. No enumerator! I tested a few other types as well, LinkedList and Queue, with the same result. Foreach loops over most of the types in System.Collections.Generic do not generate garbage. The most prominent exception is Collection. Collection<T> cannot declare its enumerator type as a struct, because Collection<T> is designed to be easily overridden.

However, there is definitely a caveat to the above. Foreach loops over Lists can still generate garbage. Take this code, for example:

const int NumEntities = 100;
static List<GameEntity> list = new List<GameEntity>();

static Program()
{
for (int i = 0; i < NumEntities; i++)
{
list.Add(new GameEntity());
}
}

static void Main(string[] args)
{
UpdateIEnumerable(list);
}

private static void UpdateIEnumerable(IEnumerable<GameEntity> enumerable)
{
foreach (GameEntity e in enumerable)
{
e.Update();
}
}

this will create garbage. Even though we're still doing a foreach over a List, when the list is cast to an interface, the value type enumerator must be boxed, and placed on the heap.

So, as far as I know, here's the final word:

When doing a foreach over a Collection<T> an enumerator WILL be allocated.
When doing a foreach over most other collections, including as arrays, lists, queues, linked lists, and others:
    if the collections are used explicitly, an enumerator will NOT be allocated.
    if they are cast to interfaces, an enumerator WILL be allocated.

Hope this helps some people, and maybe you're a little less terrified of the CLR profiler. Next time, I'd like to do the same thing for the .NET CF on the Xbox 360, or maybe figure out how to use the CLR profiler to track down the methods that are doing the most allocations. I have a pretty terrible track record with "Next times", though, so don't keep your fingers crossed.

Published 17 March 07 01:20 by etayrien

Comments

# Ultrahead said on March 17, 2007 12:30 AM:

Nice read.

One question: which would be the results if you use, say, a "List<IGameEntity> entities" collection? With something like:

foreach (IGameEntity e in entities)

{        

  e.Update();    

}

# snprbob86 said on March 17, 2007 1:23 AM:

Very interesting! Don't forget to investigate the "yield" key word :-)

# snprbob86 said on March 17, 2007 1:24 AM:

Oh, and I forgot to mention: You are missing sign-in/log-out links on your blog. Since you have annonymous comments dissabled, I had to go to another MSDN blog to log in, then come back in order to post a comment.

# zygote said on March 17, 2007 1:56 AM:

Nice article. It seems foreach() isnt as bad as people have been making it sound :)

# Mykres Space said on March 17, 2007 9:45 PM:

Here we go with another Weekly Roundup from the XNA World. 05-03-2007 Code Release George has posted

# MichaelGiagnocavo said on March 18, 2007 4:00 AM:

It's been years since I look at anything on the CLR level, but every object in the CLR requires 12 bytes for the object header (syncblock and other stuff?). This is in addition to any actual data the object has.

# ShawnHargreaves said on March 19, 2007 11:03 AM:

Yeah. The one byte array will need some extra data for CLR infrastructure. This is true for any reference type or boxed value type, but not for value types that are living on the stack.

I don't know precisely what that header contains, but if I had to guess I'd say maybe a four byte vtable pointer (which identifies the type of the object, and contains the indirection pointers for any virtual methods it may contain), plus a 4 byte array size counter, plus maybe 4 bytes for the garbage collector to tag which objects it has marked, implement write barriers and so on? (I'd guess the GC would pack quite a lot of data into an integer bitfield).

# Shawn Hargreaves Blog said on March 19, 2007 5:09 PM:

Performance never ceases to be a fascinating topic. This forum thread inspired Eli to write about foreach

# Mykres Space said on March 25, 2007 9:31 PM:

Ok here we go with this weeks update… It has been a relatively quite week in the world of XNA; this must

# Cornflower Blue said on April 18, 2007 9:41 PM:

As promised, here's a bit more about foreach and garbage. Last time I wrote about the Windows Desktop

# Shawn Hargreaves Blog said on July 2, 2007 10:14 PM:

In my previous post I described how to tell if your Xbox garbage collection is taking too long. We saw

# cnblogs.com said on August 7, 2007 1:22 PM:

事实上对Array进行foreach的时候根本不会调用GetEnumerator()方法, 也就是被编译器给内联了. 而对于List&lt;&gt;, 虽然没有编译器内联GetEnumerator()

Anonymous comments are disabled

Search

Go

This Blog

Tags

No tags have been created or used yet.

Syndication

Page view tracker