Twin paths to garbage collector nirvana

Twin paths to garbage collector nirvana

Rate This
  • Comments 10

In my previous post I described how to tell if your Xbox garbage collection is taking too long. We saw how the total amount of time spent on garbage collection is the product of the number of collections with the collection latency.

This suggests two radically different approaches for improving overall garbage collector performance. You can either reduce the number of collections, or you can reduce the collection latency. As long as you make one of these values nice and small, it doesn't matter how big the other is: either one alone is enough to achieve good performance.

 

Path 1: right frequency

How often the garbage collector runs is directly proportional to how often you allocate memory. If you never allocate anything, it will never run.

To reduce the frequency of garbage collections, you should allocate everything you are ever going to need up front while loading your levels, and avoid allocations during gameplay.

There are several aspects to avoiding memory allocation:

Don't allocate memory (duh!)

This is simple: do not call new on reference types. It is ok to new value types such as Matrix, Vector3, and Color, however.

Any time you find yourself wanting to new a reference type, use an object pool to reuse existing instances instead. The Particle and Audio 3D samples on creators.xna.com use this technique, and SwampThingTom blogged about a reusable pool collection.

Don't use classes that allocate on your behalf

When you add data to a collection class such as List<T> or Dictionary<K,V>, this may have to allocate memory to expand the collection. You can avoid that by using the collection constructor overloads which have explicit capacity parameters. Specify a capacity to preallocate as much memory as you will ever need for the maximum number of objects you intend to store in the collection.

Beware of string formatting. It is hard to manipulate strings in .NET without causing allocations.

Don't make the CLR runtime allocate

The CLR runtime allocates memory when boxing occurs. Avoid this like the plague! Boxing can happen for many reasons, some obvious, others less so:

  • If you assign a value type to an object variable, it gets boxed.
  • If you store value types in one of the old non-generic collection classes, they will be boxed.
  • Accessing value types via an interface causes them to be boxed.
  • If you use an enum type as a dictionary key, internal dictionary operations will cause boxing. You can avoid this by using integer keys, and casting your enum values to ints before adding them to the dictionary.

Don't make the C# compiler allocate

It can be tricky to use delegates (especially delegates which form lexical closures) without causing the compiler to allocate memory. This is a whole topic in itself, but if in doubt you should avoid using delegates or events.

Generator methods will always require the compiler to allocate memory.

The foreach statement can allocate if used carelessly. But note, this does not mean you should avoid using foreach! It is often the fastest way to iterate over a collection. Learn the rules and use it appropriately.

Everything that does not allocate is ok

Disciples of the frequency path are allowed to have large and complex data structures. They can happily allocate hundreds of thousands of objects while their game is loading, filling the heap with a crazy maze of cross-linked object references. As long as they allocate nothing after loading has completed, the garbage collector will never run, so it makes absolutely no difference how large or complex their heap may be.

 

Path 2: right latency

How long the garbage collector takes to run is directly proportional to how complex your heap is. If the heap is empty, it will have no work to do.

Note that I said "how complex your heap is" and not "how large". Heap complexity is a combination of the number of live objects and the number of object references. It makes no difference how many bytes each object happens to take up: it is the total number of objects (because the garbage collector must examine each one) and references to objects (because the garbage collector must follow each reference to see what it is pointing to) that matter.

When measuring heap complexity, an array of 100,000 integers is very cheap. Even though that is a lot of memory, the garbage collector only has to examine the object once, and does not need to bother looking inside it.

100,000 arrays each containing a single integer would be far more expensive, because the garbage collector would have more objects to examine.

A single array of 100,000 object references is also more expensive, because although the array itself is only a single object, now the garbage collector must also scan the contents of the array to see if any of the objects inside it need to be kept alive. Even if the entire array contains nothing but null values, the garbage collector must still examine each element to make sure.

Here are some things that can reduce heap complexity:

  • A few large objects are better than many small ones.
  • Value types are better than reference types (within reason, anyway).
  • The fewer object references, the better.
  • Arrays of value types are great!
  • Consider replacing object references with integer handles. Instead of storing a direct reference to the ship that created each bullet, instead you can store "I was created by ship #23" as an integer index.
  • Prefer T[] or List<T> to LinkedList<T> or Dictionary<K,V>.

Disciples of the latency path are free to allocate memory while their game is running. They are allowed to call new, to cause boxing, and to use anonymous delegates and generator methods. Who cares if this will cause garbage collections, as long as their heap is so simple that these collections will finish quickly?

 

Compromise will get you nowhere

You can achieve good performance by avoiding allocations, so the garbage collector will never run. This works regardless of how complex your heap may be.

You can also achieve good performance by keeping your heap nice and simple, so the garbage collector will finish quickly. This works regardless of whether you allocate memory during gameplay.

But you cannot achieve good performance by mixing these two approaches! There is little to be gained by allocating only a medium amount of memory and having only a medium complex heap. That will produce a game which has a medium size glitch every medium number of seconds.

If your goal is to avoid garbage collection glitches altogether, you must pick just one path and follow it to the bitter end.

 

Give yourself a head start

Programmers new to the world of garbage collection often think they may be able to improve performance by calling GC.Collect at carefully chosen times.

They are almost always wrong. Explicitly forcing garbage collection is a recipe for confusing the collector and hurting your performance.

But...

On Xbox, the .NET Compact Framework performs a garbage collection after one megabyte has been allocated.

Let's say we are optimizing our game to avoid allocations. After careful use of the CLR Profiler we have reduced our allocations to only 50 bytes per frame, but we simply can't figure out any way to improve beyond this point.

Furthermore, let's say this game runs at 60 frames per second, and that a typical level takes 2 minutes to complete. By the end of the level we will have allocated only 352k: not enough to cause a garbage collection. In fact our gameplay can last 5 minutes without a collection, so the only time the player will see a garbage collection glitch is if they deliberately mess around and waste time.

Seems reasonable, neh? We can probably live with that.

But...

We will be allocating plenty of memory while the level is loading, and thus causing many collections. That is no problem itself: it is fine for the garbage collector to kick in at this point, because the loading screen doesn't care about the resulting framerate glitch.

But what if our loading happens to allocate slightly less than an even number of megabytes, let's say 24,452k in total? After the last collection at the 23 megabyte mark, this load operation allocates too little to trigger another collection, but enough to leave us without any headroom. Now our game can only allocate 124k before triggering a collection, so the player will see a glitch just 42 seconds into the level.

The solution is to manually call GC.Collect at the very end of your load method. That will clean up any leftover loading garbage, resetting the clock so you can last longer before the next collection threshold will be reached.

  • PingBack from http://maheshcr.wordpress.com/2007/07/03/xbox-garbage-collector/

  • "It can be tricky to use delegates (especially delegates which form lexical closures) without causing the compiler to allocate memory"

    Shawn, that needs some more explanation and a few examples. So long as I don't allocate the event parameters when the event is raised (i.e. they are pooled somewhere ready for use), I can't seem to make the event model generate garbage! What am I missing here?

  • Thanks for the excellent summary of GC issues.  For anyone who is looking at my Generic Pool collection class, you should be aware that the current version on my blog does generate some garbage in its iterators.  Not enough to cause problems in most cases -- I use it in Snowball Fight without problems; and I've verified that it doesn't change the allocation footprint of the Creator's Club Particle Sample when it's used in there.  But it's still something to be aware of.

    I have a new version that uses structs for its iterator, similar to the List class.  Although that does generate much less garbage, a quick test showed some boxing that I haven't had time to track down.  I'll post it once I can explain the unexpected boxing.

    I also would like to understand more about the conditions under which events and delegates cause allocations.  I find the event system to be a powerful part of the language and would like to know when I can and can't rely on it without having to worry about possible gc issues.

  • Hello, ive got a quick question!

    you suggest using an identifier to cut down on the ammount of object referances, but allso stay to keep away from dictionarys.

    I would of thought dictionarys would be the best way to link referances to id numbers.

  • Another thought - what happens if you take the 'I won't produce any garbage' path, and then you integrate some lovely 3rd party game component (lets say collision detection) that assumes the other path ('I'll produce garbage but my object structure is simple'). All hell is going to break loose with the GC, isn't it?!

    For me, it seems a lot of time is going into garbage collection avoidance when coding, when all that is is really needed is a generational GC. Are you guys at least looking into the possibility of a generational GC? Right now, given all of the things I have to be careful of on the 360 CLR, and the lack of comprehensive libraries for things like 3d collision detection, 6 months of coding tells me that writing a complex XNA game for the 360 is no faster or simpler than C++/DirectX would be!

  • > you suggest using an identifier to cut down on

    > the ammount of object referances, but allso stay

    > to keep away from dictionarys.

    >

    > I would of thought dictionarys would be the best

    > way to link referances to id numbers.

    I don't think that would really help, because you'd still have the reference in the dictionary. Maybe if you had many hundreds of objects all referencing the same thing, and could change that so they all just stored an id and then the dictionary had only one mapping of that id to the actual reference, that could possibly simplify things, but in general you'd probably lose as much by adding the dictionary as you'd gain from removing the references.

    I was suggesting using actual array indices rather than just id codes. If you have an array of players, an array of enemies, an array of bullets, etc, you can reference specific objects just by storing an index into one of these arrays.

  • Some commenters on my previous post asked for more information about the situations where delegates allocate

  • Some commenters on my previous post asked for more information about the situations where delegates allocate

  • You can use enums as dictionary keys without causing boxing, just implement an IEqualityComparer for your enum type. See my post:

    http://beardseye.blogspot.com/2007/08/nuts-enum-conundrum.html

  • I ran my game with CLR Profiler since I've serious performance problems. Is an average value of 1500 "Boxed Value Types" to much?

Page 1 of 1 (10 items)
Leave a Comment
  • Please add 7 and 2 and type the answer here:
  • Post