Larry Osterman's WebLog

Confessions of an Old Fogey

Managed Code can't have memory leaks, right?

One of the axioms of working with managed code is that you don’t have to worry about memory leaks.

This couldn’t be further from the truth.  It’s totally possible to write code that leaks memory, even with managed code.

The trick is realizing that your code can leak even when the CLR hasn’t.

Here’s some code I wrote earlier today that leaks memory, for example:

            public static void AppendXxxNodes(XmlNode Node)
            {
                  if (Node.SelectSingleNode("Xxx") == null)
                  {
                        if (Node.HasChildNodes)
                        {
                              foreach (XmlNode childNode in Node.ChildNodes)
                              {
                                    XmlNode systemProtectionNode = Node.OwnerDocument.CreateElement("Xxx");
                                    Node.AppendChild(systemProtectionNode);
                              }
                        }
                  }
            }

The problem is, of course, that Node.AppendChild adds an entry to the ChildNodes collection that the foreach is iterating over, so the loop never reaches the end: every pass appends another node for it to visit.  The CLR hasn’t leaked any memory; it knows where all the objects are.  But my app leaked memory just the same.
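One way to avoid the runaway loop is to decide how many nodes to add before touching the collection. The version below is only a sketch: it assumes the intent was to append one "Xxx" element per pre-existing child, and the class name XxxNodeHelper is made up for illustration.

```csharp
using System.Xml;

public static class XxxNodeHelper
{
    public static void AppendXxxNodes(XmlNode node)
    {
        // Nothing to do if an "Xxx" child already exists or there are no children.
        if (node.SelectSingleNode("Xxx") != null || !node.HasChildNodes)
        {
            return;
        }

        // Capture the child count BEFORE appending, so the loop bound
        // cannot grow as new nodes are added to node.ChildNodes.
        int originalCount = node.ChildNodes.Count;

        for (int i = 0; i < originalCount; i++)
        {
            XmlNode xxxNode = node.OwnerDocument.CreateElement("Xxx");
            node.AppendChild(xxxNode);
        }
    }
}
```

Because the loop bound is fixed up front, the appended nodes are never revisited, and the method finishes after adding exactly as many elements as there were children to begin with.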

I had this happen to me in the Exchange NNTP server many years ago in unmanaged code.  I had a credentials cache that was used to hold onto the logon information for users connecting to the NNTP server.  But there was a bug in the cache that prevented cache hits.  As a result, every user logon caused a new record to be added to the credentials cache.

Now, the memory didn’t “leak” – when we shut down the server, it quite properly cleaned out all the entries in the credentials cache.  My code didn’t forget about the memory; it was just using too much of it.
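To make the "cache that never hits" failure mode concrete, here is a hypothetical managed-code analogue. It is not the actual Exchange bug (that code was unmanaged, and the cause isn't described here); the types LogonKey and CredentialsCache and the reference-equality mistake are assumptions chosen purely for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical key type. It does NOT override Equals/GetHashCode,
// so Dictionary compares keys by reference, not by user name.
public sealed class LogonKey
{
    public string User { get; }
    public LogonKey(string user) { User = user; }
}

public sealed class CredentialsCache
{
    private readonly Dictionary<LogonKey, string> _entries =
        new Dictionary<LogonKey, string>();

    public int Count => _entries.Count;

    public string GetOrAdd(string user)
    {
        // A fresh key object on every call never equals a stored key,
        // so the lookup below always misses.
        var key = new LogonKey(user);

        if (_entries.TryGetValue(key, out string token))
        {
            return token; // effectively unreachable with this key type
        }

        token = "token-for-" + user; // placeholder for real logon data
        _entries.Add(key, token);    // one more entry per logon, forever
        return token;
    }
}
```

Every object in this cache is still reachable, so the garbage collector is behaving correctly; the entry count simply grows with each logon until the cache is torn down.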

There are LOTS of ways that this can happen, both obvious (like my example above), and subtle.

Edit: Fixed stupid mistake :)

  • Wouldn't the foreach loop throw an exception once you add a new child node? AFAIK foreach loops (actually the enumerators used) do not let you iterate over a collection that's changing.
  • It depends on the implementation of the enumerator. Most of the BCL collections keep track of changes and will throw an InvalidOperationException if they are used after the underlying collection has been modified.

    I don't believe however that enumerators are required to be implemented this way; MSDN says that foreach "should not be used to change the contents of the collection to avoid unpredictable side effects."

    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/csref/html/vclrftheforeachstatement.asp

  • Apparently not if you're using System.Xml.

    I'd have been much happier if it HAD thrown an exception actually :)
  • Some enumerators (especially those which hide a DB connection behind them) are known to take a snapshot of the underlying data and give that to you through the enumerator without re-checking whether or not the data is current on each enumeration...

    Foreach can be dangerous for changing data; it's nice for a lot of other stuff though...
  • "It’s totally possible to write code that leaks memory, even with unmanaged code" I assume you mean managed?
  • Whoops :) I'll fix that asap.
  • We have this problem all the time in GDI+ when creating dynamic System.Drawing.Images at a high rate. Calling Dispose() on the image is not sufficient; GC.Collect() has to be called or one encounters tremendous pile-up. Again, not technically a 'leak', as mentioned above...
  • Josh you're right, enumerators don't have to reset on a collection change and XmlNodeList is one of those that don't. Cool :) I actually like features that let you shoot yourself in the foot if you want to.
  • But this isn't really a memory leak in the traditional sense, it's just a bad piece of code sucking up more memory than you might otherwise think is correct. The memory isn't being lost like a traditional memory leak, and the GC will collect it when everything is disposed, right?

    Say, if you dispose the document that all these nodes are being created on. Right? Or am I wrong in thinking that and this really is a leak in the traditional sense?
  • My app was consuming 1.5G of memory before I realized what was wrong. While you're right that it's not TECHNICALLY a memory leak, it might as well have been one.

    And some of the subtle bugs I mentioned above can result in situations that are indistinguishable from a memory leak.
  • Larry,

    thats why i'd really like to see a tool in visual studio that shows the interconnections of allocated objects....

    WM_THX
    thomas woelfer
  • This is especially true when dealing with COM Interop in managed code. I once rewrote some old vb6 code in vb.net using ado. Of course, at the time, I didn't realize that the ado interop was *not* the ado.net lib. Long story short, I created about 50,000 ado objects w/o a single COM Release.

    Even later when I rewrote that part, I had to call GC.Collect() very often to keep the app from blowing up to ~800 meg.
  • Absolutely BJ. A very good point.

    Btw, you do know about System.InteropServices.Marshal.ReleaseComObject(), right?
  • Ya, that's what I was referring to with 'COM Release'. I made sure to call it each time I was finished with an ADO object the second time around, but even doing that the GC liked to keep them around for way too long considering the memory the application was consuming. Even on the server I ran the app on (4 gig ram) it crashed well before finishing.

    Going for a 3rd try, I called GC.Collect at the end of every proc that used ADO, and even tho it still ate up quite a lot of memory, it did at least finish.
  • This is kind of funny to read... I co-wrote an article about this back in, oh, 2000. Though you might have skipped it - it was an article about Java.

    http://www.ddj.com/documents/s=888/ddj0002l/0002l.htm

    You'll have to register to get at it though.

    Anyway, this problem is fairly common in Java apps, especially J2EE apps. There are "memory profiling" tools like JProbe and OptimizeIt that can help track them down. I used to work with the JProbe team (now part of Quest Software).

    We called these kinds of problems "loitering objects" to distinguish them from traditional C++ memory leaks where memory can't be freed because the pointer has been lost.

    Managed code development is very, very similar to Java development. I highly recommend some of the advanced Java development books to people getting into managed code for the first time. Many concepts are identical.