If broken it is, fix it you should

Using the powers of the debugger to solve the problems of the world - and a bag of chips    by Tess Ferrandez, ASP.NET Escalation Engineer (Microsoft)

.NET Debugging Demos Lab 3: Memory

.NET Debugging Demos Lab 3: Memory

Rate This
  • Comments 30

 

TGIF, almost time for the weekend... but before you leave, here is lab 3.

Todays debugging puzzle will be a Memory investigation.  This time we are going to stress the BuggyBits site to create a high memory usage situation and figure out what caused it.   This lab is a bit lengthier because I wanted to show various aspects of a memory investigation.  Once you know the relationship between the numbers in the dump and perfmon etc. you can often skip some parts, but if it is possible to gather I would still recommend gathering performance logs for completeness.

Previous demos and setup instructions

If you are new to the debugging labs, here you can find information on how to set up the labs as well as links to the previous labs in the series.

Information and setup instructions
Lab 1: Hang
Lab 1: Hang - review 
Lab 2: Crash
Lab 2: Crash - review

Set up a performance monitor log

1. Browse to http://localhost/BuggyBits/Links.aspx to start the w3wp.exe process

2. Open up performance monitor (perfmon.exe from start/run)

2. Right-click "performance counters and alerts/counter logs" and choose "new log settings", name it Lab3-Mem

3. Click on "Add Objects..." 

4. Add the objects ".NET CLR Memory" and "Process" and click ok.

5. Set the sampling interval to 1 second (because our repro is really short, normally 15 seconds is ok)

6. Change the "Run As:" from "<Default>" to your domain\username and click on the "Set Password..." button to set the password.  This is done because you have to run the counter log as an administrator or as someone with debug priviledges to the w3wp.exe process, otherwise you will not get any .net counters.

7. Click ok, this will start monitoring the process.

Reproduce the problem

1. Start a command prompt and move to c:\program files\IIS Resources\Tinyget or wherever you installed tinyget

2. Run "tinyget -srv:localhost -uri:/BuggyBits/Links.aspx -loop:4000"

Note: You can lower the amount of loops slightly if you find that your process crashes because of OutOfMemory exceptions, I set it this high so that we would get some kind of output in perfmon.

3. Stop the performance monitor counterlog (but let it run for at least about 20-30 seconds after the tinyget run)

Examine the performance logs

1. In performance monitor, move to the "System Monitor" node

2. Press Ctrl+L or click on the "View Log Data" button (looks like a database icon) and open the log file, make sure to expand the time range as much as possible 

3. Remove any counters that it already added at the bottom of the screen

4. Add the following counters (for the w3wp.exe process) using the + button or Ctrl+I

  • .NET CLR Memory/# Bytes in all Heaps
  • .NET CLR Memory/Large Object Heap Size
  • .NET CLR Memory/Gen 2 heap size
  • .NET CLR Memory/Gen 1 heap size
  • .NET CLR Memory/Gen 0 heap size
  • Process/Private Bytes
  • Process/Virtual Bytes

The counters should now show up at the bottom of the screen...

5. Right-click any of them and choose properties and on the Data tab set the scale for all of them to 0,0000001 to make sure you can view the whole graph on the screen

Note: If you click the lightbulb Ctrl+H the selected counter will be highlighted so that it is easier to distinguish which line corresponds to which counter

Q: What are the latest values for these counters?

Q: Compare the curves for Virtual Bytes, Private Bytes and #Bytes in all Heaps, do they follow eachother or do they diverge?

Q: Can you tell from this if it appears to be a native "leak", a .net memory "leak" or a leak on the loader heap (dynamic assemblies)?

Q: At the end the log (after you finish your tinyget run) does memory stay flat or does it go down?

6. Add the counters for .NET CLR Memory/# Gen 0 Collections, .NET CLR Memory/# Gen 1 Collections, .NET CLR Memory/# Gen 2 Collections

Q: Do any collections occurr after the tinyget run is done? if not, why not?

7. Open task manager and under "View/Select Columns..." add the column "Virtual Memory Size". Compare the values for Mem Usage and VM Size to the values for Private Bytes and Virtual Bytes in perfmon. (Note: since they are given in K in task manager you need to multiply these values by 1024).

Q: What does Mem Usage show?

Q: What does VM Size show?

Get a memory dump

1. Open a command prompt and move to the debuggers directory and type in "adplus -hang -pn w3wp.exe -quiet" and hit enter

Open the dump in windbg

1. Open the dump file in windbg (the .dmp file should be located in your debuggers directory under a folder labeled hang_mode and todays date and time.

2. Set up the symbol path and load sos (see the setup instructions for more info)

Q: How big is the dump? Look at the size in the file explorer.

Q: How does this compare to Private Bytes, Virtual Bytes and # Bytes in all Heaps.

Examine the memory to get an idea of where the memory is going

1. Run !address -summary (this will give you an overview of the memory usage) and familiarize yourself with the output. Hint: check the windbg help files for !address

Q: Which values corresponds best to the following

    • Private Bytes
    • Virtual Bytes

Q: Where is most of the memory going (which RegionType)?

Q: What does Busy, Pct(Busy) and Pct(Tots) mean?

Q: What does MEM_IMAGE mean?

Q: Under what region does .net memory fit in and why?

From perfmon we could see that #Bytes in all Heaps follows Private bytes very closely meaning that the increase in memory is mostly an increase in .NET memory.  In other words we are looking for an answer to why our .net GC heaps keep growing.

Examine the .NET GC Heaps

1. Run !eeheap -gc to examine the size of the .NET GC heaps

Q: How many heaps do you have? Why?

Q: How much memory is stored on the .net GC heaps?  Compare to #Bytes in all Heaps

Q: How much memory is on the large object heap? Hint: add up the sizes for all the Large Object heap segments. Compare to Large Object Heap Size in perfmon. 

2. Run !dumpheap -stat to dump out all .net objects in a statistical fashion (Note: you can run !help DumpHeap to get help for the !dumpheap command or check out this post, this post or this post)

Q: Looking at the 5-10 object types that use most memory, what do you think is leaking?

Q: What "size" does the size column show? I.e. what is included in this "size"?  Hint: see the post (!dumpheap -stat explained) for more info.

 

Normally I wouldn't recommend looking at strings immediately since they will be at the bottom of the !dumpheap -stat output in pretty much every single dump you take, because of the following 

    • the "size" for strings is the actual size of the contents of the string.  If you compare this to the size of a dataset for example, the size will only contain pointers to the rows and columns arrays but not include the memory for the rows and columns, so the size for a datasetobject will always be very small (and constant) while the size of a string varies in size.
    • strings are leaf nodes in most objects, i.e. datasets contain strings, aspx pages contain strings, session vars contain strings, labels contain strings etc. etc. so you are bound to have mostly strings in your app.

However in this case the strings are so many and occupy so much memory that if we don't find anything else that sticks out, we might just want to follow the string lead...

3. Dump out stats for various size strings to find out if there is a pattern (this is a bit of trial and error so you have to try a few different sizes to figure out where the bulk of the strings are.

Get the method table (MT) for System.String (first column in !dumpheap -stat)

!dumpheap -mt <string MT> -min 85000 -stat

!dumpheap -mt <string MT> -min 10000 -stat

!dumpheap -mt <string MT> -min 20000 -stat

!dumpheap -mt <string MT> -min 30000 -stat

!dumpheap -mt <string MT> -min 25000 -stat

 

Q: In what range (between what sizes) do most of the strings exist?

 

4. Dump out the strings in that range

!dumpheap -mt <string MT> -min 20000 -max 25000 

In this case most of them will even be the exact same size which is a clue to what is going on...

5. Dump out a few of them to find out what they contain

!do <address of string - first column in the !dumpheap -mt output>

Q: What do these strings contain?

6. Pick a few and find out where they are rooted (i.e. why they can't be collected)  Note: You may want to try a couple different ones.

!gcroot <address of string>

Q: Where are they rooted?  Why?

 

Examine the finalizer queue and the finalizer thread

1. Look at the finalizer queue

!finalizequeue

Q: What objects are listed in the finalizequeue output? Hint: run !help finalizequeue

Q: How many objects are "ready for finalization"?  What does this mean?

2. Find the finalizerthread to determine what it is doing. Run !threads and look for the thread listed with (Finalizer)

3. Move to the finalizer thread and examine the managed and native callstack

~5s   (5 should be substituted with the id of the finalizer thread)
kb 2000
!clrstack

Q: What object is it finalizing?

Q: What is it doing?  Why is this causing high memory usage?

 

Examine the code for verification

1. Open Link.cs to find the destructor/finalizer for the Link class to see the problematic code

 

Related posts

!dumpheap -stat explained 

ASP.NET Memory Investigation

.NET Memory usage - A restaurant analogy

.NET Memory leak: Unblock my finalizer

.NET Memory: My object is not rooted, why wasn't it garbage collected?

I have a memory leak!!! What do i do? (defining the "where")

Who is this OutOfMemory guy and why does he make my process crash when I have plenty of memory left?

Have fun debugging,

Tess

  • MSBuild The custom MSBuild task cookbook [Via: bart ] ASP.NET Simplifying ASP.NET ListView Control...

  • Link Listing - February 15, 2008

  • Say you want to find the contents of a string, so you !do. What if you get an error saying the string it too large to display ? How can you determine the contents of the string ?

    Cheers

    Aaron

  • if sos deems it too large to display you can do

    dc <address>+0xc  

    (which will dump out the characters at an offset 0xc from the string address, this is where the contents of the string start)

    or you can try da <address>+0xc or du <address>+0xc

    da = dump ascii, du = dump unicode

    You can also do

    dc <address>+0xc <address>+<size of string>

    if you want to dump out the whole string, you will get the size from the !do output

  • on my server, there are 10 w3wp.exe processes, when i add counters,  "select instance from list box",in the list box there are  w3wp.exe#1, w3wp.exe#2...., i can't figure out which is i want.

  • That's quite a few w3wp.exe processes:)  you can run

    tlist -c from the debuggers directory to get a list of all processes and their command lines to try to identify the pid and then check which w3wp corresponds to which pid under the process counter...

    but you might as well just add the counters for all w3wp.exe processes and then based on the counters (like memory usage or cpu activity) you can probably figure out which one it is...

  • add the counters for all w3wp.exe processes

    ----there are too many info got, i can't analyze them,it is too dificult for me. your method can't work.

    here i found a article: http://support.microsoft.com/kb/922775/en-us ,in last ,it write "Multiple CLR performance counters appear that have names that resemble "W3wp#1" " i follow this way, but i can't understand which represent the PID of w3wp.exe.

    identify the pid of w3wp.exe between the .net counters and native counters are different.  for native counters , do some modification on key:  HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\PerfProc\Performance,

    this is what i found on msdn

  • then try tlist and if that doesn't help then then next thing i would suggest is to either shut down the processes and just start one or do something like hit a page repetedly while watching task manager to see which one of them the CPU usage/memory usage moves for to find the pid.

  • Hope you have had a chance to go through the high CPU lab... Some people already commented with the solution

  • Lab 1: HangLab 2: CrashLab 3: MemoryLab 4: High CPU Hang

  • " tlist -c  " is  useless for me!   this command output  is the same as the tab of processes in  Debug Diagnostic tool. i installed  Debug Diagnostic tool. I guess my server has native leak. on my server, there are a few w3wp.exe running. I can't stop the process and hit a page repetadly. the serve is production server on internet, many people will acess the web page.

  • i got private bytes, virtual bytes  for  w3wp_6824  and and #Bytes in all heaps for w3wp#5, w3wp#5 is the no. 6824 process.(i do as  http://support.microsoft.com/kb/922775/en-us 's say.)

    but there are curves for private bytes, virtual bytes. but there is no curve for #Bytes in all heaps of w3wp#5. is it correct? ( the max, min, average value of  #Bytes in all heaps of w3wp#5 are the same. no value for last ). the #Bytes in all heaps for _global_ has curve( the max, min, average,last value of  #Bytes in all heaps of _global_ are exist and different)

  • The _global_ counters are not reliable. If you don't get values for #Bytes in all heaps for certain w3wp.exe processes it either means that you are not running the log as an admin (or someone with debug priviledges for the process) or the process is not serving .net content (most likely cause is probably a perms issue)

  • yes,it is the priviledges problem, I set Change the "Run As:" from "<Default>" to administrator ,but no data collect, I set to "<Default>"  is ok.

Page 1 of 2 (30 items) 12
Leave a Comment
  • Please add 5 and 5 and type the answer here:
  • Post