Version 2.0 of the .NET Framework includes a Ping class (System.Net.NetworkInformation namespace) that can be used to monitor the up/down status of a network connection to a server. We recently realized that the class has a minor bug that can present itself as a major problem: a memory leak. Fortunately there is an easy workaround. I thought I would take this as an opportunity to demonstrate how to use the Windbg debugger as I explain how we figured out what was going wrong.
Let's consider this simple console ping application, which sends a series of ping requests to a server using the SendAsync method.
using System;
using System.Threading;
using System.Net.NetworkInformation;

class program
{
    static int PingsOutstanding;

    static void Main(string[] args)
    {
        string targetServer = "www.contoso.com";
        for (int i = 0; i < 1000; i++)
        {
            PingServer(targetServer);
        }

        int currentcount = PingsOutstanding;
        while (currentcount > 0)
        {
            Console.WriteLine("waiting for {0} ping requests to complete", currentcount);
            Thread.Sleep(1000);
            currentcount = PingsOutstanding;
        }

        //the calls to the GC are just for debugging purposes
        Console.WriteLine("Calling GC.Collect()");
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine("Press <enter> to exit application");
        Console.ReadLine();
    }

    static void PingServer(string hostnameoraddress)
    {
        try
        {
            Ping pingsender = new Ping();
            pingsender.PingCompleted += new PingCompletedEventHandler(OnPingCompleted);
            pingsender.SendAsync(hostnameoraddress, null);
            Interlocked.Increment(ref PingsOutstanding);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex);
        }
    }

    static void OnPingCompleted(object sender, PingCompletedEventArgs e)
    {
        Interlocked.Decrement(ref PingsOutstanding);

        try
        {
            Ping pingsender = (Ping)sender;

            pingsender.Dispose();
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex);
        }
    }
}
This application will not consume a ton of memory under normal operation, but if you increase the number of ping requests that are sent, you will see that memory consumption also climbs. Given that the .NET Framework uses a garbage collection approach to memory allocation and cleanup, you would expect the memory to be cleaned up properly within a reasonable amount of time. You should also notice that I am calling pingsender.Dispose(), which we would hope would release the resources being held by this object.
For demonstration purposes I have added a Console.ReadLine() call at the end of the application so that it won't exit. I have also added a call to GC.Collect() and GC.WaitForPendingFinalizers() to make sure that when I go to debug this I am certain that the GC has recently run and cleaned up any objects that are available for cleanup. It is not usually advisable to force GC collections except for debugging purposes. See Maoni's blog for great information on how the GC works and how to use it effectively.
Using Windbg.exe
I chose to use Windbg because I have found it to be the best debugger for memory leaks. It is a very powerful GUI debugger that is used very frequently inside of Microsoft for debugging everything from application bugs to OS bugs. You can download it from here: http://www.microsoft.com/whdc/devtools/debugging/default.mspx.
A couple of quick notes before I begin:
Now, Windbg is a native Windows debugger and doesn't really know much about the .NET Framework, so you have to load a debugger extension that understands managed applications. SOS.dll is the extension we are looking for. Running this in the command window will load the sos extension:
.loadby sos mscorwks
This command tells the debugger to load the sos dll by looking in the same directory as the mscorwks dll, which is a core dll that is always loaded in a managed application. Note that Windbg breaks into the debugger immediately upon loading the application, so mscorwks will not be loaded at this point. You will have to run the application long enough for the .NET framework code to be loaded. You can actually tell the debugger to break when mscorwks is loaded by running this in the command window:
sxe ld:mscorwks
There are two ways that you can run commands from the sos extension dll:
!<command> or !sos.<command>
Here are examples for the help function found in sos (it prints usage information):
!help
!sos.help
The second one is useful when you have multiple extensions that share common function names.
When debugging this leak, I actually let the app run until it blocked on the call to Console.ReadLine(), so that the GC should already have cleaned up. The first thing I do when dealing with a managed memory leak is to look at all the objects that have been created by the runtime and see if any jump out at me. Let's look at the heap and see what it contains.
!dumpheap -stat
total 20600 objects
Statistics:
      MT    Count    TotalSize Class Name
7a76eb14        1           12 System.Net.DnsPermission
7a755834        1           12 System.Diagnostics.PerformanceCounterCategoryType
7a753394        1           12 System.Diagnostics.TraceOptions
7a71a710        1           12 System.Net.TimeoutValidator
7912b908        1           12 System.Collections.Generic.GenericEqualityComparer`1[[System.String, mscorlib]]
79112c90        1           12 System.Collections.Comparer
7910db30        1           12 System.Threading.SynchronizationContext
7910a718        1           12 System.DefaultBinder
79107f40        1           12 System.RuntimeTypeHandle
79107ac4        1           12 System.Reflection.__Filters
79102f48        1           12 System.__Filters
79102ef8        1           12 System.Reflection.Missing
79101ca8        1           12 System.RuntimeType+TypeCacheQueue
79100800        1           12 System.Text.DecoderExceptionFallback
...(output trimmed)
0105a514       77         4004 System.Configuration.FactoryRecord
791242ec       47         8136 System.Collections.Hashtable+bucket[]
79160a3c     1000        16000 System.Threading.RegisteredWaitHandle
79124228       58        19216 System.Object[]
7a779154     1000        20000 System.Net.SafeLocalFree
7a778ec0     1000        20000 System.Net.SafeCloseHandle
7a761bb0     1000        20000 System.ComponentModel.AsyncOperation
79160b84     1000        20000 System.Threading._ThreadPoolWaitOrTimerCallback
7910cf3c     1001        20020 Microsoft.Win32.SafeHandles.SafeWaitHandle
791609c8     1000        24000 System.Threading.RegisteredWaitHandleSafe
79115d0c     1000        24000 System.Threading.ManualResetEvent
7a7811f4     1000        32000 System.Net.NetworkInformation.PingCompletedEventHandler
79160aa8     1000        32000 System.Threading.WaitOrTimerCallback
7915ff38     1000        32000 System.Threading.SendOrPostCallback
7910d2f4     1001        36036 System.Threading.ExecutionContext
79124418     1004        44820 System.Byte[]
7a7812e0     1000        88000 System.Net.NetworkInformation.Ping
00150c90      421       108132 Free
790fa3e0     5262       304552 System.String
Total 20600 objects
Looking at the columns in the output, MT stands for Method Table and is basically a pointer to the table that describes that type of object. Count is the number of objects of the given type that exist in the heap. TotalSize is the total amount of memory, in bytes, consumed by all objects of that type. The last column is the fully qualified name of the type.
The first thing that jumps out at me in this case is that we still have a large number of Ping objects in the heap even though we aren't holding onto any references in our code. We also called Dispose on these objects, so this is a huge red flag telling me that something went wrong. The fact that we sent exactly 1000 ping requests and there are exactly 1000 Ping objects still hanging around also catches my attention. Let's take a look at these Ping objects and see why they aren't getting cleaned up. (Below I show two ways of doing this; you really only need one of them.)
!dumpheap -type Ping
This does a substring match on any type with Ping in its name. In this case it will list both System.Net.NetworkInformation.Ping objects and System.Net.NetworkInformation.PingCompletedEventHandler objects. I really wanted just the Ping objects, so we could instead have run this command, using the MT for the Ping type obtained above:
!dumpheap -mt 7a7812e0
 Address       MT     Size
01271ff0 7a7812e0       88
012720f8 7a7812e0       88
012721d0 7a7812e0       88
012722a8 7a7812e0       88
01272380 7a7812e0       88
01272458 7a7812e0       88
01272530 7a7812e0       88
01272608 7a7812e0       88
012726e0 7a7812e0       88
012727b8 7a7812e0       88
01272890 7a7812e0       88
01272968 7a7812e0       88
...(output trimmed)
012fada0 7a7812e0       88
012faeec 7a7812e0       88
012fb038 7a7812e0       88
012fb184 7a7812e0       88
012fb2d0 7a7812e0       88
012fb41c 7a7812e0       88
012fb568 7a7812e0       88
012fb6b4 7a7812e0       88
012fb800 7a7812e0       88
012fb94c 7a7812e0       88
012fba98 7a7812e0       88
012fbbe4 7a7812e0       88
012fbd30 7a7812e0       88
012fbe7c 7a7812e0       88
total 1000 objects
Statistics:
      MT    Count    TotalSize Class Name
7a7812e0     1000        88000 System.Net.NetworkInformation.Ping
Total 1000 objects
Notice that the MT for all of the above objects matches that for the ping object. The left hand column is the memory address where the object resides. We can use this address to examine the actual ping object. To take a look at the first object in the above list we would use this command:
!dumpobject 01271ff0
I am not going to go into details at this point because this won't lead us where the problem lies. I thought I would list the dumpobject command just to point it out to you. The problem in this case is that objects that we think should be cleaned up are not cleaned up as expected. This means that there must be a reference to the object somewhere. The following command will help you to see what objects hold references to an object in question. We will use the same object address we used in the dumpobject example:
!gcroot 01271ff0
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 2f8
Scan Thread 2 OSTHread fcc
Scan Thread 3 OSTHread de8
Scan Thread 4 OSTHread eb8
Scan Thread 8 OSTHread c8c
Scan Thread 12 OSTHread 464
Scan Thread 13 OSTHread 1e0
DOMAIN(0014BF60):HANDLE(Strong):8f15b8:Root:0128d2c0(System.Threading._ThreadPoolWaitOrTimerCallback)->
01271ff0(System.Net.NetworkInformation.Ping)
This command walks the GC roots (thread stacks and handle tables) to find anything that holds a reference, directly or indirectly, to the given memory address. We can see that some thread pool object is holding onto us. If we go back to the objects on the heap, we can see that these _ThreadPoolWaitOrTimerCallback objects are on the heap as well. Now, I don't know a ton about the thread pool, but I would not expect those callback objects to still be on the heap either, because they are most likely specific to a single usage of a thread pool thread. We could dump all of the _ThreadPoolWaitOrTimerCallback objects and check their roots (the objects holding onto them), but why dump more objects from the heap? The output from !gcroot already gave us a pointer to an instance that we can quickly examine.
!gcroot 0128d2c0
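We would have hoped that the Dispose() call would have cleaned these objects up, but it didn't. So what about the GC? Why didn't the GC take care of them when Dispose didn't? Notice that this gives exactly the same output as the gcroot of the Ping object: both are kept alive by the same root, the strong handle (the HANDLE(Strong) entry) to the thread pool callback. The Ping object and the thread pool object appear to reference each other (Ping has a reference to this thread pool object and vice versa), but a circular reference alone would not stop the GC, since it collects cycles that are unreachable. The objects survive because that strong handle is a root and nothing ever released it.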
Workaround (and reason why calling Dispose didn't clean this up)
The simple workaround to this problem is to cast the object to the IDisposable interface and then call Dispose on the result.
((IDisposable)pingsender).Dispose();
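Applied to the sample application, the workaround might look like the sketch below. It is only an illustration under assumptions: PingCleanup and DisposeViaInterface are hypothetical names that do not exist in the framework, and the handler mirrors OnPingCompleted from the sample rather than any official fix.

```csharp
using System;
using System.Net.NetworkInformation;
using System.Threading;

static class PingCleanup
{
    // Counter mirroring PingsOutstanding from the sample application.
    public static int PingsOutstanding;

    // Hypothetical helper (not part of the framework): disposes an object
    // through its IDisposable interface instead of through a public
    // Dispose() method that may belong to a base class.
    public static bool DisposeViaInterface(object candidate)
    {
        IDisposable disposable = candidate as IDisposable;
        if (disposable == null)
        {
            return false; // nothing to dispose
        }

        disposable.Dispose();
        return true;
    }

    // Possible replacement for the sample's OnPingCompleted handler.
    public static void OnPingCompleted(object sender, PingCompletedEventArgs e)
    {
        Interlocked.Decrement(ref PingsOutstanding);

        // sender is the Ping that raised the event; going through the
        // interface reaches the IDisposable implementation on Ping.
        DisposeViaInterface(sender);
    }
}
```

To try it, the sample's PingServer method would subscribe PingCleanup.OnPingCompleted instead of the original handler and increment PingCleanup.PingsOutstanding.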
Why does this solve the problem? Take a look at the declaration of the class:
public class Ping:Component,IDisposable
Notice that it inherits from Component and also implements the IDisposable interface. Now let's look at the declaration of Dispose on the Ping class:
void IDisposable.Dispose ()
In the implementation, Dispose was defined as an explicit implementation of the IDisposable interface, but the fact that the base class Component also exposes a public Dispose method was missed. The Ping class doesn't override the Component.Dispose method, so if you call Dispose directly on a Ping reference without casting it to the IDisposable interface, the call binds to the base class implementation of Dispose instead of the one that was intended (the one for IDisposable).
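The binding behavior described above can be reproduced without Ping at all. The following is a minimal sketch using stand-in types: BaseComponent and PingLike are hypothetical classes that only imitate the shape of Component and Ping (a public Dispose on the base class, an explicit IDisposable.Dispose on the derived class). They are not the framework's actual code.

```csharp
using System;

// Stand-in for Component: exposes a public, non-virtual Dispose().
class BaseComponent : IDisposable
{
    public string LastDisposeCall = "none";

    public void Dispose()
    {
        LastDisposeCall = "BaseComponent.Dispose";
    }
}

// Stand-in for Ping: re-implements IDisposable explicitly and does not
// override anything on the base class.
class PingLike : BaseComponent, IDisposable
{
    void IDisposable.Dispose()
    {
        LastDisposeCall = "PingLike.IDisposable.Dispose";
    }
}

static class DisposeBindingDemo
{
    // Calls Dispose() through a PingLike reference, as the sample does.
    public static string ViaClassReference()
    {
        PingLike p = new PingLike();
        p.Dispose();
        return p.LastDisposeCall;
    }

    // Calls Dispose() through the IDisposable interface, as the workaround does.
    public static string ViaInterfaceCast()
    {
        PingLike p = new PingLike();
        ((IDisposable)p).Dispose();
        return p.LastDisposeCall;
    }
}
```

In this sketch ViaClassReference() returns "BaseComponent.Dispose", because an explicit interface implementation is not visible as a member of the class, while ViaInterfaceCast() returns "PingLike.IDisposable.Dispose".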
Note: There is another great blog on debugging managed memory leaks here: http://blogs.msdn.com/mvstanton/archive/2005/10/11/479861.aspx