Included in version 2.0 of the .NET Framework is a Ping class (in the System.Net.NetworkInformation namespace) that can be used to monitor the up/down status of a network connection to a server.  We recently realized that the class has a minor bug that can present itself as a major problem: a memory leak.  Fortunately, there is an easy workaround. I thought I would take this as an opportunity to demonstrate how to use the WinDbg debugger as I explain how we figured out what was going wrong.

Let's consider this simple console ping application that sends a series of ping requests to a server using the SendAsync method.

using System;
using System.Threading;
using System.Net.NetworkInformation;


class Program {
    // Number of asynchronous ping requests that have not yet completed.
    static int PingsOutstanding;

    static void Main(string[] args)
    {
        string targetServer = "www.contoso.com";
        for (int i = 0; i < 1000; i++) {
            PingServer(targetServer);
        }

        // Wait for all of the outstanding ping requests to complete.
        int currentCount = PingsOutstanding;
        while (currentCount > 0) {
            Console.WriteLine("waiting for {0} ping requests to complete", currentCount);
            Thread.Sleep(1000);
            currentCount = PingsOutstanding;
        }

        // The calls to the GC are just for debugging purposes.
        Console.WriteLine("Calling GC.Collect()");
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine("Press <enter> to exit application");
        Console.ReadLine();
    }

    static void PingServer(string hostNameOrAddress)
    {
        try {
            Ping pingSender = new Ping();
            pingSender.PingCompleted += new PingCompletedEventHandler(OnPingCompleted);
            pingSender.SendAsync(hostNameOrAddress, null);
            Interlocked.Increment(ref PingsOutstanding);
        } catch (Exception ex) {
            Console.WriteLine(ex);
        }
    }

    static void OnPingCompleted(object sender, PingCompletedEventArgs e)
    {
        Interlocked.Decrement(ref PingsOutstanding);

        try {
            // Dispose of the Ping object now that its request has completed.
            Ping pingSender = (Ping)sender;
            pingSender.Dispose();
        } catch (Exception ex) {
            Console.WriteLine(ex);
        }
    }
}

This application will not consume a huge amount of memory under normal operation, but if you increase the number of ping requests that are sent, you will see that memory consumption climbs with it.  Given that the .NET Framework uses a garbage collection approach to memory allocation and cleanup, you would expect the memory to be cleaned up properly within a reasonable amount of time.  You should also notice that I am calling pingSender.Dispose(), which we would hope would release the resources held by the object.

For demonstration purposes I have added a Console.ReadLine() call at the end of the application so that it won't exit.  I have also added calls to GC.Collect() and GC.WaitForPendingFinalizers() so that when I go to debug this I know the GC has recently run and cleaned up any objects that are eligible for collection.  It is not usually advisable to force GC collections except for debugging purposes.  See Maoni's blog for great information on how the GC works and how to use it effectively.
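
If you want a rough look at the growth without attaching a debugger, one option (not part of the original sample) is to print the size of the managed heap right before waiting for input; GC.GetTotalMemory(true) forces a collection and returns the number of bytes it believes are currently allocated:

// Hypothetical addition to the end of Main(), just before Console.ReadLine():
Console.WriteLine("Managed heap size after GC: {0} bytes", GC.GetTotalMemory(true));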

Using WinDbg.exe

I chose to use WinDbg because I have found it to be the best debugger for memory leaks.  It is a very powerful GUI debugger that is used frequently inside Microsoft for debugging everything from application bugs to OS bugs.  You can download it from here: http://www.microsoft.com/whdc/devtools/debugging/default.mspx

A couple of quick notes before I begin:

  1. Updates to these debugging tools are released regularly, so make sure to update periodically.
  2. I will color commands that are to be run inside the debugger in green.
  3. I have access to all of the debugger symbols for the .NET Framework, but you don't.  However, most of the information you need should be accessible by using ".symfix" as noted below.  The principles discussed here apply to debugging a memory leak in any managed application.
    • Public Microsoft symbols can be loaded into WinDbg by running ".symfix" followed by a ".reload" command (requires an internet connection to a Microsoft HTTP server).
  4. I may trim sample output (colored in purple) from the debugger in order to focus on the important information.

Now, WinDbg is a native Windows debugger and doesn't really know much about the .NET Framework, so you have to load an extension that understands managed applications.  SOS.dll is the debugger extension we are looking for.  Running this in the command window will load the SOS extension:

   .loadby sos mscorwks

This command tells the debugger to load SOS.dll by looking in the same directory as mscorwks.dll, a core DLL that is always loaded in a managed application.  Note that WinDbg breaks into the debugger immediately upon launching the application, so mscorwks will not be loaded at that point.  You will have to let the application run long enough for the .NET Framework code to be loaded.  You can also tell the debugger to break as soon as mscorwks is loaded by running this in the command window:

   sxe ld:mscorwks
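
Putting these pieces together, a typical start to a session against this sample might look like the commands below (g simply resumes execution; once the debugger breaks on the mscorwks load, load SOS and pull down the public Microsoft symbols as noted earlier):

   sxe ld:mscorwks
   g
   .loadby sos mscorwks
   .symfix
   .reload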

There are two ways that you can run commands from this SOS extension DLL:

   !<command> or !sos.<command>

Here are two examples of running the help function found in SOS (it prints usage information):

   !help
   !sos.help

The second one is useful when you have multiple extensions that share common function names.

When debugging this leak, I let the app run until it blocked on the call to Console.ReadLine(), so the GC should already have cleaned up everything it could.  The first thing I do when dealing with a managed memory leak is look at all the objects that have been created by the runtime and see if any of them jump out at me.  Let's look at the heap and see what it contains.

   !dumpheap -stat
total 20600 objects
Statistics:
      MT    Count    TotalSize Class Name
7a76eb14        1           12 System.Net.DnsPermission
7a755834        1           12 System.Diagnostics.PerformanceCounterCategoryType
7a753394        1           12 System.Diagnostics.TraceOptions
7a71a710        1           12 System.Net.TimeoutValidator
7912b908        1           12 System.Collections.Generic.GenericEqualityComparer`1[[System.String, mscorlib]]
79112c90        1           12 System.Collections.Comparer
7910db30        1           12 System.Threading.SynchronizationContext
7910a718        1           12 System.DefaultBinder
79107f40        1           12 System.RuntimeTypeHandle
79107ac4        1           12 System.Reflection.__Filters
79102f48        1           12 System.__Filters
79102ef8        1           12 System.Reflection.Missing
79101ca8        1           12 System.RuntimeType+TypeCacheQueue
79100800        1           12 System.Text.DecoderExceptionFallback
...(output trimmed)
0105a514       77         4004 System.Configuration.FactoryRecord
791242ec       47         8136 System.Collections.Hashtable+bucket[]
79160a3c     1000        16000 System.Threading.RegisteredWaitHandle
79124228       58        19216 System.Object[]
7a779154     1000        20000 System.Net.SafeLocalFree
7a778ec0     1000        20000 System.Net.SafeCloseHandle
7a761bb0     1000        20000 System.ComponentModel.AsyncOperation
79160b84     1000        20000 System.Threading._ThreadPoolWaitOrTimerCallback
7910cf3c     1001        20020 Microsoft.Win32.SafeHandles.SafeWaitHandle
791609c8     1000        24000 System.Threading.RegisteredWaitHandleSafe
79115d0c     1000        24000 System.Threading.ManualResetEvent
7a7811f4     1000        32000 System.Net.NetworkInformation.PingCompletedEventHandler
79160aa8     1000        32000 System.Threading.WaitOrTimerCallback
7915ff38     1000        32000 System.Threading.SendOrPostCallback
7910d2f4     1001        36036 System.Threading.ExecutionContext
79124418     1004        44820 System.Byte[]
7a7812e0     1000        88000 System.Net.NetworkInformation.Ping
00150c90      421       108132      Free
790fa3e0     5262       304552 System.String
Total 20600 objects

Looking at the columns in the output, MT stands for Method Table and is basically a pointer to the table that describes that type of object.  Count is the number of objects of the given type that exist in the heap.  TotalSize is the amount of memory being consumed by all objects of that type.  The last column is the fully qualified name of the type.

The first thing that jumps out at me in this case is that we still have a large number of Ping objects in the heap even though we aren't holding onto any references to them in our code.  We also called Dispose on these objects, so this is a huge red flag telling me that something went wrong.  The fact that we sent exactly 1000 ping requests and there are exactly 1000 Ping objects still hanging around also catches my attention.  Let's take a look at these Ping objects and see why they aren't getting cleaned up (below I show two ways of doing this; you only need to use one of them).

   !dumpheap -type Ping

This does a substring match on any type with Ping in its name.  In this case it will list both System.Net.NetworkInformation.Ping objects and System.Net.NetworkInformation.PingCompletedEventHandler objects.  I really wanted just the Ping objects, so we could instead have run this command (using the MT for the Ping type obtained above):

   !dumpheap -mt 7a7812e0    

 Address       MT     Size
01271ff0 7a7812e0       88    
012720f8 7a7812e0       88    
012721d0 7a7812e0       88    
012722a8 7a7812e0       88    
01272380 7a7812e0       88    
01272458 7a7812e0       88    
01272530 7a7812e0       88    
01272608 7a7812e0       88    
012726e0 7a7812e0       88    
012727b8 7a7812e0       88    
01272890 7a7812e0       88    
01272968 7a7812e0       88    
...(output trimmed) 
012fada0 7a7812e0       88    
012faeec 7a7812e0       88    
012fb038 7a7812e0       88    
012fb184 7a7812e0       88    
012fb2d0 7a7812e0       88    
012fb41c 7a7812e0       88    
012fb568 7a7812e0       88    
012fb6b4 7a7812e0       88    
012fb800 7a7812e0       88    
012fb94c 7a7812e0       88    
012fba98 7a7812e0       88    
012fbbe4 7a7812e0       88    
012fbd30 7a7812e0       88    
012fbe7c 7a7812e0       88    
total 1000 objects
Statistics:
      MT    Count    TotalSize Class Name
7a7812e0     1000        88000 System.Net.NetworkInformation.Ping
Total 1000 objects

Notice that the MT for all of the above objects matches the one for the Ping type.  The left-hand column is the memory address where each object resides.  We can use this address to examine an individual Ping object.  To take a look at the first object in the list above, we would use this command:

   !dumpobj 01271ff0

I am not going to go into details here, because this won't lead us to where the problem lies; I mention the dumpobj command just to point it out to you.  The problem in this case is that objects we think should be cleaned up are not being cleaned up as expected, which means there must be a reference to them somewhere.  The following command will show you which objects hold references to the object in question.  We will use the same object address we used in the dumpobj example:

   !gcroot 01271ff0

Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 2f8
Scan Thread 2 OSTHread fcc
Scan Thread 3 OSTHread de8
Scan Thread 4 OSTHread eb8
Scan Thread 8 OSTHread c8c
Scan Thread 12 OSTHread 464
Scan Thread 13 OSTHread 1e0
DOMAIN(0014BF60):HANDLE(Strong):8f15b8:Root:0128d2c0(System.Threading._ThreadPoolWaitOrTimerCallback)->
01271ff0(System.Net.NetworkInformation.Ping)

This command scans the thread stacks and handle tables for roots that keep the given object alive.  We can see that a thread pool object is holding onto our Ping.  If we go back to the objects on the heap, we can see those _ThreadPoolWaitOrTimerCallback objects there as well.  Now, I don't know a ton about the thread pool, but I would not expect those callback objects to still be on the heap either, because they are most likely specific to a single use of a thread pool thread.  We could dump all of the _ThreadPoolWaitOrTimerCallback objects and check their roots (the objects holding onto them), but why dump more objects from the heap?  The output from !gcroot already gave us a pointer to an instance that we can examine directly.

   !gcroot 0128d2c0

Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 2f8
Scan Thread 2 OSTHread fcc
Scan Thread 3 OSTHread de8
Scan Thread 4 OSTHread eb8
Scan Thread 8 OSTHread c8c
Scan Thread 12 OSTHread 464
Scan Thread 13 OSTHread 1e0
DOMAIN(0014BF60):HANDLE(Strong):8f15b8:Root:0128d2c0(System.Threading._ThreadPoolWaitOrTimerCallback)->
01271ff0(System.Net.NetworkInformation.Ping)

We would have hoped that the Dispose() call would have cleaned these objects up, but it didn't.  So what about the GC?  Why didn't the GC take care of them when Dispose didn't?  Notice that this gave us the exact same output as the gcroot of the Ping object.  This must mean that we have a circular reference (the Ping holds a reference to this thread pool callback and vice versa), and because the callback is rooted by the strong handle shown in the output, the GC never considers either object garbage.

Workaround (and reason why calling Dispose didn't clean this up)

The simple workaround to this problem is to cast the object to the IDisposable interface and then call Dispose through that cast.

   ((IDisposable)pingSender).Dispose();
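
Applied to the completion handler in the sample above, only the Dispose call changes (sender is the Ping instance that raised the event):

static void OnPingCompleted(object sender, PingCompletedEventArgs e)
{
    Interlocked.Decrement(ref PingsOutstanding);

    try {
        // Cast to IDisposable so that the explicit interface implementation runs.
        ((IDisposable)sender).Dispose();
    } catch (Exception ex) {
        Console.WriteLine(ex);
    }
}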

Why does this solve the problem?  Take a look at the declaration of the class:

   public class Ping : Component, IDisposable

Notice that it inherits from Component and also implements the IDisposable interface.  Now let's look at the declaration of Dispose on the Ping class:

   void IDisposable.Dispose()

In the implementation, Dispose was defined as part of the IDisposable interface, but the fact that the base class Component also exposes a public Dispose method was missed.  The Ping class doesn't override Component's Dispose, so if you call Dispose directly on a Ping object without casting it to the IDisposable interface, the call resolves to the base class implementation of Dispose instead of the one that was intended (the explicit IDisposable implementation).
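
This binding behavior is easy to reproduce in isolation.  The sketch below uses made-up stand-in types (BaseComponent and LeakyPing are hypothetical, not the real framework classes) to show how the two calls resolve differently:

using System;

class BaseComponent : IDisposable {
    // Public Dispose inherited by derived classes, analogous to Component.Dispose().
    public void Dispose()
    {
        Console.WriteLine("BaseComponent.Dispose() - only the base class cleanup runs");
    }
}

class LeakyPing : BaseComponent, IDisposable {
    // Explicit interface implementation: reachable only through an IDisposable reference.
    void IDisposable.Dispose()
    {
        Console.WriteLine("IDisposable.Dispose() - the cleanup the author intended");
    }
}

class Demo {
    static void Main()
    {
        LeakyPing p = new LeakyPing();

        p.Dispose();                 // binds to the inherited BaseComponent.Dispose()
        ((IDisposable)p).Dispose();  // invokes the explicit IDisposable implementation
    }
}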

Note: There is another great blog on debugging managed memory leaks here: http://blogs.msdn.com/mvstanton/archive/2005/10/11/479861.aspx