|
|
CLR internals, Rotor code explanation, CLR debugging tips, trivial debugging notes, .NET programming pitfalls, and blah, blah, blah...
-
Some time ago a partner team at Microsoft hit a problem where an InvalidOperationException was thrown from WeakReference.IsAlive. WeakReference wraps a weak GC handle implemented in the CLR's Execution Engine (GC handles are also exposed through System.Runtime.InteropServices.GCHandle, which supports not only weak handles but other types too). A weak GC handle is allocated and assigned to the WeakReference object when that object is created. As described by Jeffrey Richter, the weak GC handle contains a pointer to an object; if the object is collected by the GC, the handle is cleared to NULL. Most of the time WeakReference.IsAlive returns true or false to indicate whether the tracked object is alive, based on whether the underlying GC handle contains a non-NULL pointer. Similarly, WeakReference.get_Target returns either a valid object reference or null. But after the WeakReference object itself becomes unrooted and is finalized, the underlying GC handle is destroyed, and any call to IsAlive or get_Target on the WeakReference object throws InvalidOperationException in V1.X.
How could any method be called on a finalized object? Well, if an object O has a WeakReference field WR, then when O becomes unrooted and there are no other roots for WR, both objects are considered dead and are put into the F-reachable queue for finalization (see also Jeffrey's article). Since there is no guarantee about the order of finalizers (things are a little different for critical finalizers), WR may already be finalized when O's finalizer runs. Thus O's finalizer, or code that runs after O is resurrected, could call methods on the finalized WR. In the example I mentioned at the beginning, the problem was that an object's finalizer called IsAlive on its WeakReference field.
The guideline for finalization says not to use any finalizable field in a finalizer, so it's fair for WeakReference's properties to throw an exception if they are called in a finalizer. But it might be hard for people to understand why IsAlive needs to throw. After all, it's only used to check status and doesn't need to access the tracked object if that object is already collected. I think the reasoning is that IsAlive is meant to check whether the underlying GC handle tracks a live object, and if the GC handle is already gone because the WeakReference was finalized, we can't answer the question.
The most interesting part is the history of this design decision. In V1.X, both IsAlive and get_Target throw an exception after the WeakReference object is finalized. In Beta2 of V2.0, IsAlive still throws, but get_Target doesn't; it returns null after finalization. After Beta2, we made a change so that IsAlive doesn't throw either; it returns false after finalization. This was partly because too many people call WeakReference.IsAlive in finalizers, and it is not a really dangerous thing to do. Note that WeakReference.set_Target always throws after finalization.
So on the officially released build of CLR V2.0, you can safely read WeakReference.IsAlive and Target in a finalizer. But it is still good practice not to use finalizable objects in a finalizer, including WeakReference.
|
-
I got an email asking me to explain the !Threads output in detail. I think this is a good question and a good topic for another installment in the series.
Here is the example I'll use for this post:
0:055> !threads
ThreadCount: 202
UnstartedThread: 95
BackgroundThread: 1
PendingThread: 0
DeadThread: 47
                                  PreEmptive  GC Alloc                          Lock
     ID      ThreadOBJ   State       GC       Context                Domain     Count APT Exception
  0  0xed0   0x0014f260  0x2000020   Enabled  0x00000000:0x00000000  0x00149aa0  1    Ukn
  1  0xa3c   0x00157d28  0x2001220   Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn (Finalizer)
XXX  0       0x00166378  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
  4  0x12cc  0x00166540  0x2001020   Enabled  0x00000000:0x00000000  0x00149aa0  0    STA
  5  0x12dc  0x00166708  0x2001020   Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
  3  0xe7c   0x00175b70  0x2001020   Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x00175d38  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x00175f00  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x001760c8  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x00176290  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn System.InvalidOperationException
XXX  0       0x00176458  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x00176620  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x001767e8  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
 12  0x7e0   0x001769b0  0x2001020   Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
 13  0x15e8  0x00178008  0x2001020   Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
 14  0x4d0   0x001781d0  0x2001020   Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x00178398  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x00178560  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x00178728  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x001788f0  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x00178ab8  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x00178c80  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x00178e48  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
 21  0x14f0  0x00179010  0x2001020   Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
 22  0x1708  0x001791d8  0x2001020   Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
 23  0x11f8  0x001793a0  0x2001020   Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
 24  0x224   0x00179568  0x2001020   Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x00179730  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x001798f8  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
XXX  0       0x00179ac0  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn System.InvalidOperationException
XXX  0       0x00179c88  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn System.InvalidOperationException
XXX  0       0x00179e50  0x1820      Enabled  0x00000000:0x00000000  0x00149aa0  0    Ukn
...
First, !Threads gives some statistics about the Thread Store.
ThreadCount: total number of C++ Thread objects in the Thread Store.
UnstartedThread: number of C++ Thread objects marked as unstarted. Recall from my previous post that when a user creates a C# Thread object, the CLR creates an "unstarted" C++ Thread object. When Thread.Start is called on the C# object, the CLR creates an OS thread and removes the "unstarted" flag from the C++ Thread object.
BackgroundThread: number of C++ Threads (and their corresponding OS threads) considered to be background threads. Being a background thread simply means the CLR won't wait for the thread when shutting down. Threads created explicitly with System.Threading.Thread.Start are foreground threads by default, whereas threads wandering into the CLR from the unmanaged world are background threads by default (Rotor: SetupThread in vm/threads.cpp calls SetBackground(TRUE)). However, whether a thread is a background thread can be changed through the IsBackground property of the C# Thread object.
PendingThread: if an OS thread has been created but its ThreadProc hasn't yet reached the point where it decrements the unstarted counter in the Thread Store, the thread is considered pending. The number of such threads should be quite low.
DeadThread: number of C++ Thread objects whose OS threads are already dead but which have not been deleted yet.
In Rotor, all five numbers are stored as fields of the ThreadStore object (vm/threads.h).
Then comes a table of all C++ Thread objects in the Thread Store. Let me explain each field.
The first column doesn't have a header. It is the thread ID assigned by the debugger, just for readability while debugging. Because these numbers exist only in the debugger process, not in the debuggee, you may see different numbers in a live session than in a dump taken from that same session. For a "dead" or "unstarted" thread, this column is "XXX".
ID: the thread ID assigned by the OS. It remains consistent across debugger sessions, but the OS could recycle it.
ThreadOBJ: address of C++ Thread object. You could see contents of the object by "dt mscorwks!Thread <address>" if you have symbols for mscorwks.dll.
State: one of the most important fields in the table. In Rotor, it is the C++ Thread's m_State field. It is a combination of bit masks indicating the current status of the Thread. All possible states (bit masks) are defined in enum ThreadState in vm/Threads. We have already covered several states, such as TS_Background, TS_Unstarted, and TS_Dead. Other states include TS_AbortRequested (an abort has been requested for this thread), TS_AbortInitiated (the abort process has already started for this thread), TS_GCSuspendPending (the GC is trying to suspend this thread), and so on.
Preemptive GC: also very important. In Rotor, this is the m_fPreemptiveGCDisabled field of the C++ Thread class. It indicates which GC mode the thread is in: "Enabled" in the table means the thread is in preemptive mode, where the GC can preempt this thread at any time; "Disabled" means the thread is in cooperative mode, where the GC has to wait for the thread to give up its current work (the work involves GC objects, so the thread can't allow the GC to move the objects around). When the thread is executing managed code (the current IP is in managed code), it is always in cooperative mode; when the thread is in the Execution Engine (unmanaged code), EE code can choose to stay in either mode and can switch modes at any time; when a thread is outside the CLR (e.g., calling into native code through interop), it is always in preemptive mode.
GC Alloc context: the allocation context the GC may use when it allocates objects for this thread. In Rotor, it is m_alloc_context in the C++ Thread object.
Domain: which AppDomain the thread is currently in (Rotor: m_pDomain field of the C++ Thread class). You can use !DumpDomain or "dt mscorwks!AppDomain" to dump details of the domain. A thread can only be in one domain at a time, but it can switch between domains. Special marks are put on the thread's stack when it transitions to another domain.
Lock count: how many locks this thread has taken (Rotor: m_dwLockCount field of C++ Thread class). The locks it tracks include the managed monitors (taken by lock(obj) in C#), BCL's ReaderWriterLock, and certain locks inside CLR's unmanaged code.
APT: the COM apartment of the thread: single-threaded apartment (STA), multithreaded apartment (MTA), or unknown (Ukn).
Exception: the last managed exception thrown from this thread. It is saved in a GC handle in the C++ Thread object (Rotor: m_LastThrownObjectHandle).
The last column also indicates whether this thread is a special thread. However, !Threads only recognizes a limited set of special thread types for this field: the Finalizer thread, GC threads, Threadpool Worker threads, and Threadpool Completion Port threads. Special threads that don't have a C++ Thread object (those that never need to run managed code, such as the debugger helper thread and server GC threads) cannot be displayed here. In Whidbey, a "-special" option was added to the !Threads command, which shows all special threads in the process as a separate list. Here is a sample output:
0:007> !threads -special
ThreadCount: 4
UnstartedThread: 0
BackgroundThread: 3
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
                                PreEmptive  GC Alloc                     Lock
     ID  OSID  ThreadOBJ  State      GC     Context            Domain    Count APT Exception
  0   1   828  0029a030     a020  Disabled  06907c38:069081d4  0f59e038  2     MTA
  4   2  16fc  0029e980     b220  Enabled   0690424c:069061d4  0021f4a8  0     MTA (Finalizer)
  5   3  1c1c  002e71e8     1220  Enabled   028f20f8:028f3f94  0021f4a8  0     Ukn
  6   4  1244  0f6fa778   80a220  Enabled   00000000:00000000  0021f4a8  0     MTA (Threadpool Completion Port)

     OSID  Special thread type
  1   e20  DbgHelper
  2  1e1c  GC
  3  1ed4  GC
  4  16fc  Finalizer
  5  1c1c  ADUnloadHelper
  6  1244  Timer
This posting is provided "AS IS" with no warranties, and confers no rights.
|
-
With the knowledge from my previous post, we can avoid some mistakes in .NET programming.
A C++ Thread is very resource heavy. It is associated with a lot of dynamically allocated memory and some OS handles, so it had better be cleaned up as soon as possible after its corresponding OS thread dies. The C++ Thread class has a reference count. For an object to be deleted, its ref count has to drop to 0 (Rotor: Thread::DecExternalCount in vm\threads.cpp). One interesting point is that the C# Thread object actually keeps a reference to its associated C++ Thread, so a live C# Thread object can keep its C++ Thread from being deleted even if the OS thread is already dead. (On the other hand, the C++ Thread also has a reference to the C# Thread, but it breaks the cycle when its own ref count drops to 1.) Because the C# Thread is a managed object, its lifetime is mostly determined by users. In addition, the C# Thread class has a finalizer, so its lifetime is extended by at least one GC. So if user code caches C# Thread objects or has some ill-behaved finalizers (in another blog entry, I mentioned that a misbehaving finalizer on one object can prevent all other objects' finalizers from running), "dead" C++ Thread objects may accumulate over time and a "memory leak" will be observed.
Here is an example that demonstrates the problem and how to debug it using windbg + SOS. In this process there are 202 C++ Thread objects, 160 of which are "dead", meaning their associated OS threads are dead. The total number of threads in the Thread Store and the numbers of dead and unstarted threads are shown in the "!threads" output. For a "live" thread, the OS and debugger thread IDs are printed for the entry; for a "dead" thread, "XXX" is shown at the beginning of the line:
0:043> !threads
ThreadCount: 202
UnstartedThread: 0
BackgroundThread: 1
PendingThread: 0
DeadThread: 160
                                  PreEmptive  GC Alloc                          Lock
     ID      ThreadOBJ   State       GC       Context                Domain     Count APT Exception
  0  0x1138  0x0015a298  0x20        Enabled  0x00000000:0x00000000  0x00149ac8  1    Ukn
  1  0x1148  0x00152530  0x1220      Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn (Finalizer)
  3  0x114c  0x00177548  0x2001020   Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn
  4  0x1150  0x00177878  0x2001020   Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn
  5  0x1154  0x00177c08  0x2001020   Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn
…
 42  0x11e4  0x00180460  0x2001020   Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn
 22  0x11e8  0x00180838  0x2001020   Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn
XXX  0       0x00180c10  0x1820      Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn
XXX  0       0x00180fe8  0x1820      Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn
XXX  0       0x001813c0  0x1820      Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn
XXX  0       0x00181750  0x1820      Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn
XXX  0       0x00181b28  0x1820      Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn
XXX  0       0x00181f00  0x1820      Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn
XXX  0       0x001822d8  0x1820      Enabled  0x00000000:0x00000000  0x00149ac8  0    Ukn
… //continue with a huge list
Now I want to find out why all the "dead" C++ Thread objects are still around. First I could check its ref count if I have symbols for mscorwks.
//0x00180c10 is a dead ThreadOBJ I picked from !Threads output
0:043> dt mscorwks!Thread 0x00180c10 m_ExternalRefCount
   +0x0cc m_ExternalRefCount : 1
Since the ref count is 1, if this C++ Thread object has a C# Thread object associated with it, the C# object must hold the last reference. I can verify whether that is the case by checking the C++ object's m_ExposedObject field. It is a weak GC handle (an unmovable pointer to a GC reference that isn't counted as a root of the GC object), so dereferencing it gives the managed object. As mentioned before, the C++ Thread object also has a strong handle (the m_StrongHndToExposedObject field) to the C# object, but it already cleared the strong handle when the ref count dropped to 1, to avoid a circular reference.
0:043> dt mscorwks!Thread 0x00180c10 m_ExposedObject
   +0x0c0 m_ExposedObject : 0x00a71054
0:043> dp 0x00a71054 l1
00a71054  00c5c714
0:043> !do 00c5c714
Name: System.Threading.Thread
MethodTable 0x79bb8384
EEClass 0x79bb85b0
Size 60(0x3c) bytes
GC Generation: 0
mdToken: 0x020000eb (c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll)
FieldDesc*: 0x79bb8614
MT          Field      Offset  Type          Attr      Value       Name
0x79bb8384  0x4000330  0x4     CLASS         instance  0x00000000  m_Context
0x79bb8384  0x4000331  0x8     CLASS         instance  0x00000000  m_LogicalCallContext
0x79bb8384  0x4000332  0xc     CLASS         instance  0x00000000  m_IllogicalCallContext
0x79bb8384  0x4000333  0x10    CLASS         instance  0x00000000  m_Name
0x79bb8384  0x4000334  0x14    CLASS         instance  0x00000000  m_ExceptionStateInfo
0x79bb8384  0x4000335  0x18    CLASS         instance  0x00000000  m_Delegate
0x79bb8384  0x4000336  0x1c    CLASS         instance  0x00000000  m_PrincipalSlot
0x79bb8384  0x4000337  0x20    CLASS         instance  0x00000000  m_ThreadStatics
0x79bb8384  0x4000338  0x24    CLASS         instance  0x00000000  m_ThreadStaticsBits
0x79bb8384  0x4000339  0x28    CLASS         instance  0x00000000  m_CurrentCulture
0x79bb8384  0x400033a  0x2c    CLASS         instance  0x00000000  m_CurrentUICulture
0x79bb8384  0x400033b  0x30    System.Int32  instance  2           m_Priority
0x79bb8384  0x400033c  0x34    System.Int32  instance  1575952     DONT_USE_InternalThread
0x79bb8384  0x400033d  0       CLASS         shared    static      m_LocalDataStoreMgr
    >> Domain:Value 0x00149ac8:0x00c05338 <<
Then I want to check root of the C# Thread object to see who keeps it alive:
0:043> !gcroot 00c5c714
Scan Thread 0 (0x1138)
ESP:12f69c:Root:0xc5b3f4(System.Object[])->0xc5c714(System.Threading.Thread)
…
So there is an array that keeps a reference to a "dead" C# Thread. This looks interesting. I can check all the other C# Thread objects in the process using the !DumpHeap command. !DumpHeap can dump the objects in the GC heap of a particular type, specified with the "-type" option:
0:043> !DumpHeap -type System.Threading.Thread
Address     MT          Size  Gen
0x00c054c8  0x79bb8384    60    1  System.Threading.Thread
0x00c5b730  0x79bc81d4    28    2  System.Threading.ThreadStart
0x00c5b74c  0x79bb8384    60    2  System.Threading.Thread
0x00c5b7bc  0x79bc81d4    28    2  System.Threading.ThreadStart
0x00c5b7d8  0x79bb8384    60    2  System.Threading.Thread
0x00c5b820  0x79bc81d4    28    2  System.Threading.ThreadStart
0x00c5b83c  0x79bb8384    60    2  System.Threading.Thread
0x00c5b884  0x79bc81d4    28    2  System.Threading.ThreadStart
0x00c5b8a0  0x79bb8384    60    2  System.Threading.Thread
0x00c5b8e8  0x79bc81d4    28    2  System.Threading.ThreadStart
0x00c5b904  0x79bb8384    60    2  System.Threading.Thread
0x00c5b94c  0x79bc81d4    28    2  System.Threading.ThreadStart
0x00c5b968  0x79bb8384    60    2  System.Threading.Thread
0x00c5b9b0  0x79bc81d4    28    2  System.Threading.ThreadStart
…//long list
total 401 objects
Because !DumpHeap matches types by string, it also dumps ThreadStart objects. Every C# Thread object created by user code has a ThreadStart object (a C# Thread created through System.Threading.Thread.CurrentThread may not have one), so they show up in pairs. Among the 401 objects, 200 are C# Thread objects, which roughly matches the number of C++ Thread objects (the numbers don't have to be the same because not every C++ Thread object has a C# counterpart). The generation of most of the C# Thread objects is 2, meaning they have already survived at least 2 GCs. When I track the roots of those C# Thread objects, they all point to the array. In this case, we need to look closely at the source to see whether it is really necessary to cache all the C# Thread objects in an array.
Another related topic is that the CLR relies on the DLL_THREAD_DETACH notification to mscorwks.dll's DllMain (Rotor: EEDllMain in vm\ceemain.cpp) to know that an OS thread is dead and to detach the related C++ Thread. Using the TerminateThread API is already notoriously bad in unmanaged programming, and here we see another reason not to call it in managed code: if TerminateThread is called on a managed OS thread, then, among other bad effects (e.g., back-out code not being executed), the CLR will not get the thread detach notification. Because the C++ Thread object holds references to the OS thread's stack addresses, failing to detach it from the OS thread will cause crashes at random places.
|
-
If you use SOS’s !Threads command during debugging a lot, you should be familiar with such output:
0:003> !threads
PDB symbol for mscorwks.dll not loaded
Loaded Son of Strike data table version 5 from "C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorwks.dll"
ThreadCount: 12
UnstartedThread: 5
BackgroundThread: 1
PendingThread: 0
DeadThread: 5
                               PreEmptive  GC Alloc                         Lock
     ID     ThreadOBJ   State     GC       Context                Domain    Count APT Exception
  0  0xb74  0x0014f230  0x20      Enabled  0x00000000:0x00000000  x00149aa8  1    Ukn
  2  0xb58  0x00157cf8  0x1220    Enabled  0x00000000:0x00000000  x00149aa8  0    Ukn (Finalizer)
XXX  0      0x001665f0  0x1820    Enabled  0x00000000:0x00000000  x00149aa8  0    Ukn
XXX  0      0x0016d348  0x1400    Enabled  0x00000000:0x00000000  x00149aa8  0    Ukn
XXX  0      0x0016d510  0x1820    Enabled  0x00000000:0x00000000  x00149aa8  0    Ukn
XXX  0      0x0016d9d0  0x1400    Enabled  0x00000000:0x00000000  x00149aa8  0    Ukn
XXX  0      0x0016db98  0x1820    Enabled  0x00000000:0x00000000  x00149aa8  0    Ukn
XXX  0      0x0016e248  0x1400    Enabled  0x00000000:0x00000000  x00149aa8  0    Ukn
XXX  0      0x0016e410  0x1820    Enabled  0x00000000:0x00000000  x00149aa8  0    Ukn
XXX  0      0x0016e740  0x1400    Enabled  0x00000000:0x00000000  x00149aa8  0    Ukn
XXX  0      0x0016e908  0x1820    Enabled  0x00000000:0x00000000  x00149aa8  0    Ukn
XXX  0      0x0016ec98  0x1400    Enabled  0x00000000:0x00000000  x00149aa8  0    Ukn
Have you ever wondered what exactly is in the list? Why doesn't the number of threads listed here match the number of "real" threads in the process (in the example above, I only have 4 threads in the process, but !Threads shows 12)? Why do some "real" threads have entries here while others don't? Maybe the entries are managed System.Threading.Thread objects (but the number in the list might not match the number of Thread objects either)? The answers to those questions are tied to how the CLR implements threads and manages thread information. The CLR provides the classes in the System.Threading namespace as its threading APIs. As you know, threading is implemented on top of native threads in the underlying OS (Windows, in the CLR's case); the CLR just piggybacks managed threads on native OS threads. Users use BCL System.Threading.Thread objects (I'll call them C# Threads below) to control managed threads, just as a thread HANDLE is used to control native Windows threads. If you have ever checked the contents of a C# Thread object, you will have found that it is quite small (here I use the SOS !DumpObj command; you could also see it using Visual Studio or other managed debuggers):
0:003> !DumpObj 0x00c03248
Name: System.Threading.Thread
MethodTable 0x79bb8384
EEClass 0x79bb85b0
Size 60(0x3c) bytes
GC Generation: 0
mdToken: 0x020000eb (c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll)
FieldDesc*: 0x79bb8614
MT          Field      Offset  Type          Attr      Value       Name
0x79bb8384  0x4000330  0x4     CLASS         instance  0x00000000  m_Context
0x79bb8384  0x4000331  0x8     CLASS         instance  0x00000000  m_LogicalCallContext
0x79bb8384  0x4000332  0xc     CLASS         instance  0x00000000  m_IllogicalCallContext
0x79bb8384  0x4000333  0x10    CLASS         instance  0x00000000  m_Name
0x79bb8384  0x4000334  0x14    CLASS         instance  0x00000000  m_ExceptionStateInfo
0x79bb8384  0x4000335  0x18    CLASS         instance  0x00000000  m_Delegate
0x79bb8384  0x4000336  0x1c    CLASS         instance  0x00000000  m_PrincipalSlot
0x79bb8384  0x4000337  0x20    CLASS         instance  0x00000000  m_ThreadStatics
0x79bb8384  0x4000338  0x24    CLASS         instance  0x00000000  m_ThreadStaticsBits
0x79bb8384  0x4000339  0x28    CLASS         instance  0x00000000  m_CurrentCulture
0x79bb8384  0x400033a  0x2c    CLASS         instance  0x00000000  m_CurrentUICulture
0x79bb8384  0x400033b  0x30    System.Int32  instance  2           m_Priority
0x79bb8384  0x400033c  0x34    System.Int32  instance  1498008     DONT_USE_InternalThread
0x79bb8384  0x400033d  0       CLASS         shared    static      m_LocalDataStoreMgr
The CLR actually needs much more information about a thread than the fields of the C# Thread class hold, and that information is needed in the CLR's unmanaged part, where it's not easy to access managed objects. So inside the Execution Engine (so far the Execution Engine is still written in unmanaged code), there is an unmanaged C++ class, also called Thread (let me call it the C++ Thread), that keeps all the information for a native OS thread (OS thread). In Rotor, this class is defined in vm\threads.h. The CLR needs to create a C++ Thread object for every OS thread the EE knows of, that is, every thread that has ever run managed code. Such a thread can be (a) created explicitly by users with C# Thread.Start, (b) an unmanaged OS thread that has visited the managed world, e.g., through interop, or (c) a special OS thread in the CLR that might run managed code.
For case a, when a user creates a C# Thread object, the CLR creates a C++ Thread object, links it to the C# Thread object, and marks it as unstarted (Rotor: SetupUnstartedThread in vm\threads.cpp). Once the Start method is called on the C# Thread object, the CLR creates an OS thread; in the OS thread's ThreadProc (Rotor: ThreadNative::KickOffThread in vm\ComSynchronizable.cpp), the CLR saves the C++ Thread object's address in the OS thread's TLS (Thread Local Storage) and marks the Thread as started (Rotor: ThreadStore::TransferStartedThread in vm\threads.cpp). For case b, at the entry point where an unmanaged OS thread enters the managed world, the CLR creates a C++ Thread object and also uses TLS to associate it with the OS thread (Rotor: SetupThread in vm\threads.cpp). Case c is similar to case b, except that the CLR might set up the C++ Thread earlier.
In every case, the C++ Thread object is the primary place where the CLR stores information about a managed thread. When the CLR needs to access the information, it just fetches the object from the OS thread's TLS. In that sense, the C# Thread is more like a managed proxy for the C++ Thread. In fact, for cases b and c, no C# Thread object is created unless users call System.Threading.Thread.CurrentThread. The C++ Thread tracks its corresponding C# Thread using a GCHandle (the m_ExposedObject field in the C++ Thread object); the C# Thread tracks its C++ Thread using a native pointer (the DONT_USE_InternalThread field in the C# Thread). You can verify this circular reference in the debugger if you have symbols for mscorwks.dll:
0:003> !do 0x00c03248
Name: System.Threading.Thread
…
MT          Field      Offset  Type          Attr      Value
…
0x79bb8384  0x400033c  0x34    System.Int32  instance  1498008  DONT_USE_InternalThread
…
0:013> dt mscorwks!Thread 0n1498008
…
   +0x0c0 m_ExposedObject : 0x00a710e4
…
0:013> dp 0x00a710e4 l1
00a710e4  00c03248
The CLR uses a data structure called the Thread Store (Rotor: ThreadStore in vm\threads.h) to keep all C++ Threads. What you see in the output of the !Threads command is actually the list of C++ Threads in the Thread Store. The "ThreadOBJ" field is the address of a C++ Thread object. The other fields hold important information about the C++ Thread, including the OS thread ID of the corresponding OS thread. Here you can see that not every live OS thread has a C++ Thread, which is explained by the fact that the CLR only creates a C++ Thread for an OS thread that has run managed code. Meanwhile, you may also find that some C++ Threads don't have an OS thread associated with them (the first column shows "XXX" and the ID is 0). They are either (a) unstarted, (b) threads that failed to start, or (c) threads that represented a live OS thread which is now dead. For case (a), the CLR waits until C# Thread.Start is called and creates an OS thread for it then; for (b) and (c), the C++ Thread objects will be deleted some time later.
|
-
Question: How many threads does a typical managed process have when it just starts to run?
Answer: regardless of how many threads the user creates, there are at least 3 threads in a typical managed process after the CLR starts up: the main thread, which starts the CLR and runs the user's Main method; the CLR debugger helper thread, which provides debugging services for interop debuggers like Visual Studio; and the finalizer thread, which runs finalizers for unreachable objects. Depending on what the program does, the CLR might create more threads to perform special tasks.
Sometimes it is important to know which "special" threads the CLR creates, so that we can better understand the implicit impact of our managed programs. Here is a list of the most common special threads:
1. Finalizer thread. This thread runs finalizers for "dead" objects. It is created when the GC heap is initialized during EE startup. In Rotor, the thread proc for the thread is GCHeap::FinalizerThreadStart in vm\gcee.cpp. Because GC is nondeterministic and finalizers are executed on a separate thread, you can't predict exactly when an object will be finalized. Because there is only one thread to run all finalizers, if one finalizer is blocked, no other finalizer can run. So taking any lock in a finalizer is discouraged. Also see Maoni Stephens's blog for details about the finalizer thread.
2. Debugger helper thread. As its name suggests, this thread helps debuggers (mixed-mode and managed debuggers, but not purely unmanaged debuggers like windbg) get information about the managed process and execute certain debugging operations. The thread is created when the EE initializes the debugger during startup. In Rotor, the thread proc for this thread is DebuggerRCThread::ThreadProcStatic (debug\ee\Rcthread.cpp). Also see Mike Stall's blog about the impact of this helper thread.
3. Concurrent GC thread (doesn't exist in Rotor). As explained in Maoni's and Chris Lyon's blogs, concurrent GC is a special GC mode that allows garbage to be collected while managed threads are running. To achieve this, the CLR creates a thread to perform GC concurrently with user threads. The thread is only created when the CLR decides to do a concurrent GC (even when concurrent GC mode is on, not every GC is concurrent; read Maoni's blog for details) and is recycled when there is no concurrent GC work to do.
4. Server GC threads (don't exist in Rotor). Maoni and Chris also explained Server GC mode, where on a multiprocessor machine the CLR creates one GC heap for each CPU and one thread to do GC for each heap. When Server GC mode is enabled, the server GC threads are created at EE startup, when the GC heaps are initialized.
5. App Domain unload helper thread. In CLR V1.X, when a thread requests to unload an App Domain and the thread is in that App Domain itself, it needs to create a worker thread to do the unloading work. The worker thread dies once the target AD is unloaded. In Rotor, the thread starts with UnloadThreadWorker.ThreadStart (bcl\system\Appdomain.cs). In Whidbey, all AD unload work is performed on a special thread regardless of whether the requesting thread is in the domain being unloaded. The helper thread is created when the first non-default App Domain is created (the default domain is never unloaded) and stays alive from then on. Also see Chris Brumme's blog for details about AD unload.
6. Threadpool threads. Depending on how a program uses the CLR threadpool, the CLR might create threads of a variety of types. For some types there is only one thread. For other types, the number of threads depends on the number of CPUs, the workload, and some user-configurable settings. The thread types include wait threads (threads that perform asynchronous waits; there can be more than one); worker threads (threads that execute user work items; there can be more than one); completion port threads (threads that wait for completion port IO in Windows; there can be more than one; they don't exist in Rotor); the gate thread (a thread that helps monitor the status of completion port threads and worker threads; only one); and the timer thread (a thread that manages the timer queue; only one).
|
-
I changed the program in my previous post to use the new Whidbey syntax.
using namespace System;

ref class RefT
{
public:
    RefT ()  {Console::WriteLine ("RefT::RefT");}
    ~RefT () {Console::WriteLine ("RefT::~RefT");}
    !RefT () {Console::WriteLine ("RefT::!RefT");}
};

value class ValueT
{
    //constructor is not allowed for value type
    //destructor is not allowed for value type
};

int main()
{
    {
        //1. finalizer will be called asynchronously
        RefT ^ rrt = gcnew RefT;
    }

    {
        //2. Dispose is called at "delete" and finalizer is suppressed
        RefT ^ rrt = gcnew RefT;
        delete rrt;
    }

    {
        //3. Dispose is called at end of the block and finalizer is suppressed
        RefT rt;
    }
    {
        ValueT vt;
    }
    {
        ValueT ^ pvt = gcnew ValueT;
    }

    {
        ValueT * pvt = new ValueT;
    }
    return 0;
}
The first thing to notice is that __gc and __value are replaced by ref and value, which is definitely clearer. Next, RefT can now have another method, !RefT. What does this new method do, and how is it related to the other ones? Having checked the IL generated by the Whidbey cl, I found RefT is translated into something like:
class RefT : IDisposable
{
    //constructor
    RefT () { Console.WriteLine ("RefT::RefT"); }

    //Dispose methods
    Dispose (bool disposing)
    {
        if (disposing)
        {
            ~RefT ();
        }
        else
        {
            try { !RefT (); }
            finally { Object.Finalize (); }
        }
    }

    Dispose ()
    {
        Dispose (true);
        GC.SuppressFinalize (this);
    }

    //finalizer
    Finalize ()
    {
        Dispose (false);
    }

    //body of Dispose and Finalize
    ~RefT () { Console.WriteLine ("RefT::~RefT"); }

    !RefT () { Console.WriteLine ("RefT::!RefT"); }
}
Basically, a reference type with a destructor (the method starting with ~) implements the IDisposable interface, and its destructor becomes the Dispose method. We can also define a finalizer for a reference type using the “!” syntax. According to my tests, the two are independent of each other. If I define only the “~” function and not the “!” one, the class has only the Dispose methods and no Finalize is generated, although Dispose still calls SuppressFinalize. If I define only the “!” method without the “~” one, I get a compiler warning and a class that has a finalizer but doesn't implement IDisposable.
Other new things include using a tracking handle (“^”) instead of a pointer to reference a GC type, and gcnew to indicate that the memory is allocated on the managed heap.
All the new designs make CLR concepts (reference/value type, Finalize, Dispose, managed heap) first-class citizens in C++. As a CLR team member, I think this is much clearer than special annotations or mapping new concepts onto existing features that have different semantics. However, from a C++ user's point of view, the changes seem to draw C++ closer to C#. I'm not sure everyone will love them.
Back to finalization: for tracking handles, things are still similar to pointers in V1.X. If delete is not called on a tracking handle (like the first block in main), the finalizer will run some time in the future and GC will collect the object; if delete is called (like the 2nd block in main), Dispose (the ~ function) is called and the finalizer is suppressed, and the object will still be GCed later.
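To make the two tracking-handle cases concrete, here is a minimal sketch in plain standard C++ (not C++/CLI) that models the control flow of the generated Dispose/Finalize pattern. Everything in it is hypothetical and for illustration only: RefTModel, collect and the finalize_suppressed flag stand in for the compiler-generated class, the GC's finalization step and GC.SuppressFinalize. It is a toy model, not a description of how the CLR works internally.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an instance of the ref class; `trace` records which of
// the user-written bodies ran (hypothetical, for illustration only).
struct RefTModel {
    std::vector<std::string>* trace;
    bool finalize_suppressed = false;

    explicit RefTModel(std::vector<std::string>* t) : trace(t) {
        assert(trace != nullptr);
    }

    void destructor_body() { trace->push_back("~RefT"); }  // body of ~RefT
    void finalizer_body() { trace->push_back("!RefT"); }   // body of !RefT

    // Mirrors the generated Dispose(bool).
    void dispose(bool disposing) {
        if (disposing) destructor_body();
        else finalizer_body();
    }

    // What "delete" on a tracking handle ends up calling.
    void dispose() {
        dispose(true);
        finalize_suppressed = true;  // stands in for GC.SuppressFinalize
    }

    // What the finalizer thread would call on an unreachable object.
    void finalize() { dispose(false); }
};

// Stand-in for the GC finalizing an unreachable object: the finalizer
// runs only if it has not been suppressed.
void collect(RefTModel& obj) {
    if (!obj.finalize_suppressed) obj.finalize();
}
```

In this model, an object that is simply abandoned and then passed to collect records "!RefT", while an object on which dispose() was called first records only "~RefT" and is skipped by collect.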
The 3rd block in main is the most interesting one: we can create a reference type object “on the stack” (of course, the object is still in the heap; we just save a reference on the stack), and this “stack object” has the traditional C++ deterministic finalization semantics: when the variable goes out of scope, its destructor (the Dispose method) is called automatically if it has one. The generated IL for block 3 looks something like this:
RefT r = new RefT ();
try
{
}
finally
{
    ((IDisposable)r).Dispose ();
}
This looks almost the same as C#'s using statement. Implementing C++'s automatic destruction with the Dispose pattern is a brilliant idea. I just worry that the new “stack object” syntax will create new confusion about where the object really lives. I tend to think of the stack object as a holder, similar to auto_ptr, which holds a reference to an object in the heap and will delete the object when it goes out of scope. At the beginning I did wonder whether this holder tracks ownership of the object (like auto_ptr) or does ref counting (like a traditional smart pointer) to make sure that, if multiple holders reference the same object, only the last one disposes it. But there seems to be no way to make two holders point to the same object. E.g., this code doesn't compile; the compiler complains that “operator=” isn't available for RefT. Even if you define operator= for RefT, it applies only to the object in the heap, not to the holder on the stack:
RefT rt1;
{
    RefT rt2;
    rt1 = rt2;
}
//I assumed rt1 would reference a disposed object here
After all, this “holder” is more syntactic sugar than a real smart pointer class, so it's easy to guarantee that two holders don't represent the same object in the heap. My worry might be unnecessary, but it's just an example of how it could be confusing.
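The “holder” mental model can be sketched in standard C++ as follows. This is only an analogy under stated assumptions: HeapObject and ScopedHolder are hypothetical names, the holder's destructor plays the role of the finally block that calls Dispose, and deleting the copy operations mimics the observation that two holders cannot be made to share one object.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy heap object with a Dispose-like method (hypothetical).
struct HeapObject {
    std::vector<std::string>* trace;
    void dispose() { trace->push_back("disposed"); }
};

// Holder with "stack object" semantics: it disposes its object when it
// goes out of scope. Copying is deleted, so two holders can never end up
// referring to the same object.
class ScopedHolder {
public:
    explicit ScopedHolder(HeapObject* obj) : obj_(obj) {
        assert(obj_ != nullptr);
    }
    ~ScopedHolder() { obj_->dispose(); }  // plays the role of the finally block
    ScopedHolder(const ScopedHolder&) = delete;
    ScopedHolder& operator=(const ScopedHolder&) = delete;
    HeapObject* operator->() const { return obj_; }

private:
    HeapObject* obj_;  // not freed here: in the CLR the GC reclaims the memory
};
```

Note that the holder only disposes the object and never frees it, which matches the point above that the object still lives in the heap and is collected later.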
With so many changes, it's inappropriate to call it Managed Extensions for C++ anymore; people now mostly refer to the new language as C++/CLI. I really feel sorry for those who have to rewrite their C++ code for the .NET platform again and again. But I think Whidbey lays down a foundation that could last for generations.
|
-
As a C++ fan, I'm a long-time admirer of deterministic finalization. I think the introduction of garbage collection to C-style languages by Java and .NET is a huge improvement. However, I found the loss of deterministic destructors almost unacceptable when I first entered the Java/.NET world. Of course I'm used to it now, but it's still quite confusing to me that C# uses the C++ destructor syntax for finalizers. And in Managed Extensions for C++, a destructor becomes something totally different for a managed data type. I bet a lot of experienced C++ developers make the mistake of using a finalizer as if it were a destructor when they first try .NET. So when I read about the new changes in the Whidbey version of C++ on Stan Lippman's blog, I was very excited and couldn't wait to give them a try.
But before we look into the new features, let's go over how the old version of managed C++ handles destructors. I wrote this simple program:
#using <mscorlib.dll>

using namespace System;

__gc class RefT
{
public:
    RefT () {Console::WriteLine ("RefT::RefT");}
    ~RefT () {Console::WriteLine ("RefT::~RefT");}
};

__value class ValueT
{
    ValueT () {Console::WriteLine ("ValueT::ValueT");}
    //destructor is not allowed for value type
};

int main()
{
    {
        //1. auto-generated finalizer will be called asynchronously
        RefT * prt = new RefT;
    }

    {
        //2. Dispose is called at "delete" and finalizer will be suppressed
        RefT * prt = new RefT;
        delete prt;
    }

    {
        ValueT vt;
    }
    {
        //value type can't be created in GC heap
        ValueT * pvt = __nogc new ValueT;
        delete pvt;
    }
    return 0;
}
I compiled it with the V1.1 C++ compiler and checked the generated IL using ildasm. RefT is compiled to something like this:
class RefT
{
    RefT () { Console.WriteLine ("RefT::RefT"); }

    void Finalize () { Console.WriteLine ("RefT::~RefT"); }

    void __dtor ()
    {
        GC.SuppressFinalize (this);
        Finalize ();
    }
}
Here we can see that the C++ destructor is mapped to the CLR finalizer, and a method “__dtor” is added that calls SuppressFinalize and then the finalizer.
In main, the first block creates a RefT object in the heap and leaves it as garbage. Some time later, the finalizer (~RefT) will run and the object will be collected. In the second block, we “delete” the object, which is translated into a call to __dtor in IL. So “delete” acts more like the Dispose method recommended by the IDisposable pattern: the object is not freed, but its contents are disposed and the finalizer won't run on the object later.
|
-
One day I was debugging a problem where a Watson dialog popped up on a process. What surprised me was that on the stack where Watson was triggered, there was an unmanaged C++ function with a try-catch(…) block. To my understanding, this block should catch any user-mode exception thrown in Windows, including exceptions from RaiseException calls (e.g., C++ exceptions), AVs, stack overflows, etc. How could an exception escape such a block and become unhandled (so that Watson showed up)? I found that the exception was a debug break. On x86, it is triggered by opcode 0xCC, or “int 3”. When I debugged into VCRT’s EH code, I found that catch (…) deliberately lets debug breaks go. It does make sense: a debug break is meant to stop execution in the debugger, so source code should never handle it. I just never realized it before.
Another interesting part is where this debug break came from, since the code of the process never calls DebugBreak. After more debugging, the problem turned out to be the bug I mentioned in my previous blog entry: a premature GC issue. Managed code passed a Delegate to unmanaged code without telling GC to extend its lifetime. When unmanaged code called the callback, the managed Delegate object had already been collected, so unmanaged code called into garbage memory. The memory happened to be filled with 0xCC, so when the process tried to execute this code, it fired int 3, and then Watson kicked in.
This posting is provided "AS IS" with no warranties, and confers no rights.
|
-
There is a bug in the program below; see if you can catch it.
Test.cs (compiled to DelegateExample.exe):
using System;
using System.Threading;
using System.Runtime.InteropServices;

class Test
{
    delegate uint ThreadProc (IntPtr arg);

    private uint m;

    public Test (uint n)
    {
        m = n;
    }

    uint Reflect (IntPtr arg)
    {
        Console.WriteLine (m);
        return m;
    }

    static void Main ()
    {
        Test t = new Test (1);
        ThreadProc tp = new ThreadProc (t.Reflect);
        NewThread (tp);
        Thread.Sleep (1000);
    }

    [DllImport("UsingCallback")]
    static extern void NewThread (ThreadProc proc);
}
UsingCallback.cpp (compiled to UsingCallback.dll):
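#include <windows.h>

// Exported so the DllImport in Test.cs can find it; starts a native thread
// that calls back into the managed delegate.
extern "C" __declspec(dllexport) void __stdcall NewThread (LPTHREAD_START_ROUTINE cb)
{
    DWORD id = 0;
    CreateThread (NULL, 0, cb, NULL, 0, &id);
}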
Yes, here is the problem: in the .cs file, the managed code passes a Delegate object to unmanaged code, which creates a new thread to call the delegate. Since unmanaged code has no way to tell the CLR how it plans to use the object, from the CLR's point of view there are no live roots for the Delegate object after the line “NewThread (tp);”. Thus the object is eligible for garbage collection (GC) after the call, even though the new thread might not have started yet. So it's possible for the Delegate to become trash before unmanaged code invokes it, causing unspecified failures. A fix is to add GC.KeepAlive before Main returns:
Test t = new Test (1);
ThreadProc tp = new ThreadProc (t.Reflect);
NewThread (tp);
Thread.Sleep (1000);
GC.KeepAlive (tp);
One annoying thing about this kind of bug is that GC is nondeterministic: the program could work just fine 99% of the time and crash only under stress. Another thing that makes the problem hard to find is that it's not very intuitive from looking at the source. You might know the theory that when a variable is not used anymore it will not be reported as a live root, but it takes quite some time to figure out which variable is dead at which point, and the source alone gives you no way to verify that the CLR agrees with your analysis.
I'll show you how to use SOS.dll to check the internal data structure the CLR uses to determine variable lifetime. When the JIT compiles code, it generates such variable liveness information (GC info) for each method and saves it along with the machine code of the method. When a GC happens, it checks the GC info for every method on the stack to find out which variables are alive and uses the live variables as object roots. GC info is highly compacted, but SOS.dll has the “!GCInfo” command to crack it and show it in a human-readable way. This approach only works at the assembly level, so stop reading if you are not interested in assembly language. :)
I compiled Test.cs above with Visual Studio as a "Debug" build and launched the program under Windbg. After the Main method is JITted, I can disassemble the generated native code of the method using the "!SOS.u" command:
0:000> !u 02f00058
Will print '>>> ' at address: 02f00058
Normal JIT generated code
[DEFAULT] Void Test.Main()
Begin 02f00058, size 60
>>> 02f00058 55 push ebp
02f00059 8bec mov ebp,esp
02f0005b 83ec08 sub esp,0x8
02f0005e 57 push edi
02f0005f 56 push esi
02f00060 53 push ebx
02f00061 33ff xor edi,edi
02f00063 33db xor ebx,ebx
02f00065 b9e850ad00 mov ecx,0xad50e8 (MT: Test)
02f0006a e8a91fbcfd call 00ac2018 (JitHelp: nc)
02f0006f 8bf0 mov esi,eax
02f00071 8bce mov ecx,esi
02f00073 ba01000000 mov edx,0x1
02f00078 ff152051ad00 call dword ptr [00ad5120] (Test..ctor)
02f0007e 8bfe mov edi,esi
02f00080 b9c451ad00 mov ecx,0xad51c4 (MT: Test/ThreadProc)
02f00085 e88e1fbcfd call 00ac2018 (JitHelp: nc)
02f0008a 8bf0 mov esi,eax
02f0008c 689350ad00 push 0xad5093
02f00091 8bd7 mov edx,edi
02f00093 8bce mov ecx,esi
02f00095 ff152452ad00 call dword ptr [00ad5224] (Test/ThreadProc..ctor)
02f0009b 8bde mov ebx,esi
02f0009d 8bcb mov ecx,ebx
02f0009f ff152c51ad00 call dword ptr [00ad512c] (Test.NewThread)
02f000a5 b9e8030000 mov ecx,0x3e8
02f000aa ff155084bb79 call dword ptr [mscorlib_79990000+0x228450 (79bb8450)] (System.Threading.Thread.Sleep)
02f000b0 90 nop
02f000b1 5b pop ebx
02f000b2 5e pop esi
02f000b3 5f pop edi
02f000b4 8be5 mov esp,ebp
02f000b6 5d pop ebp
02f000b7 c3 ret
SOS's "!u" is similar to Windbg's "u", but it shows more data because managed code is self-describing. For example, for indirect calls whose target is a managed function, "!u" can tell us the function name.
To check its GC info, we first need to get the method's MethodDesc (method descriptor, the CLR's data structure that keeps all information about one method). We can use "!ip2md" to find the MethodDesc from any instruction pointer in the method:
0:000> !ip2md 02f00058
MethodDesc: 0x00ad50a8
Jitted by normal JIT
Method Name : [DEFAULT] Void Test.Main()
MethodTable ad50e8
Module: 151ad0
mdToken: 06000003 (D:\projects\DelegateExample\bin\Debug\DelegateExample.exe)
Flags : 10
Method VA : 02f00058
This command shows some important information about the method. To check GC info, we only need to pass the MethodDesc pointer itself to “!GCInfo”:
0:000> !gcinfo 0x00ad50a8
Normal JIT generated code
Method info block:
    method size = 0060
    prolog size = 9
    epilog size = 7
    epilog count = 1
    epilog end = yes
    saved reg. mask = 000F
    ebp frame = yes
    fully interruptible=yes
    double align = no
    security check = no
    exception handlers = no
    local alloc = no
    edit & continue = yes
    varargs = no
    argument count = 0
    stack frame size = 2
    untracked count = 0
    var ptr tab count = 0
    epilog at 0059
60 E5 C0 45 |

Pointer table:
F0 7B | 000B reg EDI becoming live
5A | 000D reg EBX becoming live
F0 42 | 0017 reg EAX becoming live
72 | 0019 reg ESI becoming live
4A | 001B reg ECX becoming live
F0 03 | 0026 reg EAX becoming dead
08 | 0026 reg ECX becoming dead
F0 44 | 0032 reg EAX becoming live
30 | 0032 reg ESI becoming dead
72 | 0034 reg ESI becoming live
57 | 003B reg EDX becoming live
4A | 003D reg ECX becoming live
06 | 0043 reg EAX becoming dead
08 | 0043 reg ECX becoming dead
10 | 0043 reg EDX becoming dead
4C | 0047 reg ECX becoming live
0E | 004D reg ECX becoming dead
30 | 004D reg ESI becoming dead
F1 1B | 0060 reg EBX becoming dead
38 | 0060 reg EDI becoming dead
FF |
The output of the command has 2 sections. The first is the method info block, which contains some basic information about the JITted code, like the size of the method, the size of the prolog, etc. The second is the pointer table, on which I'll spend most of the time. The pointer table describes the lifetime of every GC reference inside the method. It has 3 columns. The first is the byte encoding, which we don't need to care about. The second is the offset in the JITted code; e.g., this method's code starts at 02f00058, so 000B means the instruction at 02f00058+B = 02f00063, “xor ebx,ebx”. The third tells us how the lifetime of a GC pointer changes at that instruction; e.g., “000B reg EDI becoming live” means that starting from 02f00063, register EDI is a live root. Similarly, we can see that EDI becomes dead at offset 60, the end of the method (0060 reg EDI becoming dead). So whatever variable EDI is used to store, its lifetime runs from the beginning to the end of the method. To understand the pointer table, it's best to interleave the table with the JITted code. In Whidbey, SOS has "!u -gcinfo" to do the job, but for Everett I have to put them together manually and show them along with the source code:
02f00058 55 push ebp
02f00059 8bec mov ebp,esp
02f0005b 83ec08 sub esp,0x8
02f0005e 57 push edi
02f0005f 56 push esi
02f00060 53 push ebx
02f00061 33ff xor edi,edi
GCInfo: 000B reg EDI becoming live
02f00063 33db xor ebx,ebx
GCInfo: 000D reg EBX becoming live

Test t = new Test (1);

02f00065 b9e850ad00 mov ecx,0xad50e8 (MT: Test)
02f0006a e8a91fbcfd call 00ac2018 (JitHelp: nc)
GCInfo: 0017 reg EAX becoming live
02f0006f 8bf0 mov esi,eax
GCInfo: 0019 reg ESI becoming live
02f00071 8bce mov ecx,esi
GCInfo: 001B reg ECX becoming live
02f00073 ba01000000 mov edx,0x1
02f00078 ff152051ad00 call dword ptr [00ad5120] (Test..ctor)
GCInfo: 0026 reg EAX becoming dead
GCInfo: 0026 reg ECX becoming dead
02f0007e 8bfe mov edi,esi

ThreadProc tp = new ThreadProc (t.Reflect);

02f00080 b9c451ad00 mov ecx,0xad51c4 (MT: Test/ThreadProc)
02f00085 e88e1fbcfd call 00ac2018 (JitHelp: nc)
GCInfo: 0032 reg EAX becoming live
GCInfo: 0032 reg ESI becoming dead
02f0008a 8bf0 mov esi,eax
GCInfo: 0034 reg ESI becoming live
02f0008c 689350ad00 push 0xad5093
02f00091 8bd7 mov edx,edi
GCInfo: 003B reg EDX becoming live
02f00093 8bce mov ecx,esi
GCInfo: 003D reg ECX becoming live
02f00095 ff152452ad00 call dword ptr [00ad5224] (Test/ThreadProc..ctor)
GCInfo: 0043 reg EAX becoming dead
GCInfo: 0043 reg ECX becoming dead
GCInfo: 0043 reg EDX becoming dead

NewThread (tp);

02f0009b 8bde mov ebx,esi
02f0009d 8bcb mov ecx,ebx
GCInfo: 0047 reg ECX becoming live
02f0009f ff152c51ad00 call dword ptr [00ad512c] (Test.NewThread)
GCInfo: 004D reg ECX becoming dead
GCInfo: 004D reg ESI becoming dead

Thread.Sleep (1000);

02f000a5 b9e8030000 mov ecx,0x3e8
02f000aa ff155084bb79 call dword ptr [mscorlib_79990000+0x228450 (79bb8450)] (System.Threading.Thread.Sleep)
02f000b0 90 nop
02f000b1 5b pop ebx
02f000b2 5e pop esi
02f000b3 5f pop edi
02f000b4 8be5 mov esp,ebp
02f000b6 5d pop ebp
02f000b7 c3 ret
GCInfo: 0060 reg EBX becoming dead
GCInfo: 0060 reg EDI becoming dead
Let's assume a thread triggers a GC while another thread is executing the code at 02f0009d. From the table, we know that at this time registers ESI, EBX, and EDI will be reported to GC as live roots. With some disassembly, we can see that at this moment both ESI and EBX contain a reference to the Delegate tp, and EDI is the variable t. It might surprise you that both EBX and EDI stay alive until the function ends. That means the variables t and tp are actually alive for the whole function, so the code has no bug?!
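The way the table is read here can be sketched in standard C++. This is only a toy model of the lookup, not how the CLR decodes GC info: the Event struct and the live_registers_at function are hypothetical names, and the rows are copied from the debug-build pointer table above.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// One row of the pointer table: at `offset`, `reg` becomes live or dead.
struct Event {
    std::uint32_t offset;
    std::string reg;
    bool live;
};

// Rows copied from the debug-build !gcinfo output for Test.Main,
// sorted by offset exactly as the command printed them.
const std::vector<Event> kDebugPointerTable = {
    {0x0B, "EDI", true},  {0x0D, "EBX", true},  {0x17, "EAX", true},
    {0x19, "ESI", true},  {0x1B, "ECX", true},  {0x26, "EAX", false},
    {0x26, "ECX", false}, {0x32, "EAX", true},  {0x32, "ESI", false},
    {0x34, "ESI", true},  {0x3B, "EDX", true},  {0x3D, "ECX", true},
    {0x43, "EAX", false}, {0x43, "ECX", false}, {0x43, "EDX", false},
    {0x47, "ECX", true},  {0x4D, "ECX", false}, {0x4D, "ESI", false},
    {0x60, "EBX", false}, {0x60, "EDI", false},
};

// Replays every event at or before `offset` and returns the registers
// that hold live GC references when a thread is stopped at that offset.
std::set<std::string> live_registers_at(const std::vector<Event>& table,
                                        std::uint32_t offset) {
    std::set<std::string> live;
    for (const Event& e : table) {
        if (e.offset > offset) break;  // relies on the table being sorted
        if (e.live) live.insert(e.reg);
        else live.erase(e.reg);
    }
    return live;
}
```

Calling live_registers_at(kDebugPointerTable, 0x02f0009d - 0x02f00058) yields EBX, EDI and ESI, which matches the reading of the table given above.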
The tricky point is that a variable is eligible to be dead once it's not used anymore, but it's up to the JIT to decide whether it really wants to report the variable as dead. In fact, for debuggable code, the JIT extends the lifetime of every variable to the end of the function.
To prove that the premature GC bug really exists in the sample, I compiled it as a "Release" build and repeated all the steps above:
0:000> !u 02df0058
Loaded Son of Strike data table version 5 from "C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorwks.dll"
Will print '>>> ' at address: 02df0058
Normal JIT generated code
[DEFAULT] Void Test.Main()
Begin 02df0058, size 46
>>> 02df0058 57 push edi
02df0059 56 push esi
02df005a b9e850ad00 mov ecx,0xad50e8 (MT: Test)
02df005f e8b41fcdfd call 00ac2018 (JitHelp: nc)
02df0064 8bf0 mov esi,eax
02df0066 c7460401000000 mov dword ptr [esi+0x4],0x1
02df006d b9bc51ad00 mov ecx,0xad51bc (MT: Test/ThreadProc)
02df0072 e8a11fcdfd call 00ac2018 (JitHelp: nc)
02df0077 8bf8 mov edi,eax
02df0079 689350ad00 push 0xad5093
02df007e 8bd6 mov edx,esi
02df0080 8bcf mov ecx,edi
02df0082 ff151c52ad00 call dword ptr [00ad521c] (Test/ThreadProc..ctor)
02df0088 8bcf mov ecx,edi
02df008a ff152c51ad00 call dword ptr [00ad512c] (Test.NewThread)
02df0090 b9e8030000 mov ecx,0x3e8
02df0095 ff155084bb79 call dword ptr [mscorlib_79990000+0x228450 (79bb8450)] (System.Threading.Thread.Sleep)
02df009b 5e pop esi
02df009c 5f pop edi
02df009d c3 ret

0:000> !ip2md 02df0058
MethodDesc: 0x00ad50a8
Jitted by normal JIT
Method Name : [DEFAULT] Void Test.Main()
MethodTable ad50e8
Module: 151ad0
mdToken: 06000003 (D:\projects\DelegateExample\bin\Release\DelegateExample.exe)
Flags : 10
Method VA : 02df0058

0:000> !gcinfo 0x00ad50a8
Normal JIT generated code
Method info block:
    method size = 0046
    prolog size = 2
    epilog size = 3
    epilog count = 1
    epilog end = yes
    saved reg. mask = 0003
    ebp frame = no
    fully interruptible=no
    double align = no
    security check = no
    exception handlers = no
    local alloc = no
    edit & continue = no
    varargs = no
    argument count = 0
    stack frame size = 0
    untracked count = 0
    var ptr tab count = 0
    epilog at 0043
46 21 |

Pointer table:
A9 | 001F call 0 [ ESI ]
07 | 0026 push
CB | 0030 call 1 [ EDI ]
FF |
Here is mixed source code, native code and pointer table:
02df0058 57 push edi
02df0059 56 push esi

Test t = new Test (1);

02df005a b9e850ad00 mov ecx,0xad50e8 (MT: Test)
02df005f e8b41fcdfd call 00ac2018 (JitHelp: nc)
02df0064 8bf0 mov esi,eax
02df0066 c7460401000000 mov dword ptr [esi+0x4],0x1

ThreadProc tp = new ThreadProc (t.Reflect);

02df006d b9bc51ad00 mov ecx,0xad51bc (MT: Test/ThreadProc)
02df0072 e8a11fcdfd call 00ac2018 (JitHelp: nc)
GCInfo: 001F call 0 [ ESI ]
02df0077 8bf8 mov edi,eax
02df0079 689350ad00 push 0xad5093
GCInfo: 0026 push
02df007e 8bd6 mov edx,esi
02df0080 8bcf mov ecx,edi
02df0082 ff151c52ad00 call dword ptr [00ad521c] (Test/ThreadProc..ctor)
GCInfo: 0030 call 1 [ EDI ]

NewThread (tp);

02df0088 8bcf mov ecx,edi
02df008a ff152c51ad00 call dword ptr [00ad512c] (Test.NewThread)

Thread.Sleep (1000);

02df0090 b9e8030000 mov ecx,0x3e8
02df0095 ff155084bb79 call dword ptr [mscorlib_79990000+0x228450 (79bb8450)] (System.Threading.Thread.Sleep)
02df009b 5e pop esi
02df009c 5f pop edi
02df009d c3 ret
In the release build, the JITted code is smaller and the pointer table in the GC info is much smaller. I need to explain some new syntax in this pointer table: “call 0 [ESI]” means the method calls a function with 0 arguments pushed on the stack, and ESI is a live variable at this point; "push" just indicates a change of the stack, which is important information for GC to unwind this frame but doesn't affect pointer liveness.
One interesting thing about this new pointer table is that it reports GC references only where the method calls into other methods. What if a GC happens somewhere else? For example, at 02df0079, EDI contains the object tp and ESI contains the object t; if a GC happens while a thread is executing 02df0079 and the GC info doesn't report those two variables, will they be collected? The answer is that GC can't happen at that place. There is an important field in the method info block called "fully interruptible". In the debug version this field is true for the method, but in the release version it is false. A fully interruptible method means GC can stop a thread (and perform a collection) at any point while the thread is executing the method. If a method is not fully interruptible, GC can't start at an arbitrary point while a thread is executing it. It has to wait until the thread returns from a call (e.g., returns from ThreadProc's constructor back to Main) or until the method calls into another method that allows GC to happen (calls into unmanaged code via PInvoke always allow GC to happen). That's why we only need to report variables at calls. For performance reasons, trivial methods like Test.Main are usually not fully interruptible in release builds.
With all this knowledge, we now know that if a GC happens when a thread is just returning from the "new ThreadProc" allocation call at instruction 02df0072, GC knows ESI (variable t) is a live root; if a GC happens when a thread is just returning from the call to ThreadProc's constructor at 02df0082, GC knows EDI (variable tp) is a live root. Variable t is not reported this time (but it's kept alive by the object tp). However, if a GC happens when a thread returns from the call to NewThread at 02df008a, neither t nor tp is reported, so they can be collected by GC. If the new thread hasn't started by then, it will have trouble when it uses the objects.
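The same reasoning can be written down as a small lookup in standard C++. It is a sketch under simplifying assumptions, not the CLR's decoder: kReleaseCallSites and roots_at_call_site are hypothetical names, the "push" row is left out because it doesn't affect liveness, and the table is keyed by the offset of the instruction that follows each call (the return address).

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Release-build pointer table for Test.Main. For a method that is not
// fully interruptible, live registers are recorded only at call sites.
const std::map<std::uint32_t, std::set<std::string>> kReleaseCallSites = {
    {0x1F, {"ESI"}},  // return from the allocation helper for ThreadProc
    {0x30, {"EDI"}},  // return from ThreadProc's constructor
};

// Returns the registers reported to GC when a thread is stopped at the
// return address `offset`; a call site with no entry reports nothing.
std::set<std::string> roots_at_call_site(std::uint32_t offset) {
    auto it = kReleaseCallSites.find(offset);
    return it == kReleaseCallSites.end() ? std::set<std::string>{} : it->second;
}
```

The call to NewThread at 02df008a returns to 02df0090, which is offset 0x38 from the start of the method. roots_at_call_site(0x38) is empty, so at that point nothing in Main keeps t or tp alive.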
So next time you suspect a premature GC bug, you can try "!SOS.GCInfo" to see exactly what the CLR thinks about a GC object's lifetime.
|
-
I didn't realize I had stopped blogging for a year. What a shame! Fortunately I didn't waste the time: we shipped Whidbey Beta1 and Beta2 in the past year! Now, with Beta2 out the door, I have more spare time for blogging. :)
Today I want to talk about some interesting facts about Timer in the CLR. There is an example of how to use a timer in MSDN: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemthreadingtimerclasstopic.asp
This sample starts a timer and does certain things after the timer has fired a certain number of times, like killing the timer. However, the sample has a bug that causes trouble in stress scenarios. To demonstrate the problem, I made a small change to the code:
using System;
using System.Threading;

class TimerExample
{
    static void Main()
    {
        AutoResetEvent autoEvent = new AutoResetEvent(false);
        StatusChecker statusChecker = new StatusChecker(100);

        // Create the delegate that invokes methods for the timer.
        TimerCallback timerDelegate = new TimerCallback(statusChecker.CheckStatus);

        Console.WriteLine("{0} Creating timer.\n", DateTime.Now.ToString("h:mm:ss.fff"));
        Timer stateTimer = new Timer(timerDelegate, autoEvent, 0, 10);

        // start another thread to post work items to thread pool
        Thread t = new Thread (new ThreadStart (PostWorkItem));
        t.Start ();

        // When autoEvent signals, dispose of
        // the timer.
        autoEvent.WaitOne();
        stateTimer.Dispose();
        Console.WriteLine("\nDestroying timer.");
    }

    // a Thread proc which keeps posting work items to thread pool
    static void PostWorkItem ()
    {
        // Post some user work items to thread pool
        for (int i = 0; i < 1000; i++)
        {
            ThreadPool.QueueUserWorkItem (new WaitCallback (WorkItem));
            Thread.Sleep (10);
        }
    }

    // A no-op work item for thread pool
    static void WorkItem (object o)
    {
        Thread.Sleep (500);
    }
}

class StatusChecker
{
    int invokeCount, maxCount;

    public StatusChecker(int count)
    {
        invokeCount = 0;
        maxCount = count;
    }

    // This method is called by the timer delegate.
    public void CheckStatus(Object stateInfo)
    {
        Console.WriteLine("Checking status " + (++invokeCount));

        if(invokeCount == maxCount)
        {
            //signal Main.
            AutoResetEvent autoEvent = (AutoResetEvent)stateInfo;
            autoEvent.Set();
        }
    }
}
Basically, I added another thread that keeps posting work items to the threadpool, but the rest is still expected to behave the same: when the timer fires for the 100th time, it should set an event so that the main thread stops the timer.
In one of 5 runs on my machine, I got output like this:
5:48:07.625 Creating timer.
Checking status 1
Checking status 2
Checking status 3
Checking status 4
…
Checking status 93
Checking status 94
Checking status 95
Checking status 96
Checking status 97
Checking status 98
Checking status 102
Checking status 99
Checking status 103
Checking status 104
Checking status 105
…
Checking status 698
Checking status 700
Checking status 701
Checking status 703
Checking status 703
Checking status 704
Checking status 705
…
^C
It seems that invokeCount never hits 100, so the program doesn't stop, and some sequences in the output appear to be out of order.
How does this happen? First we need to understand how timers are implemented in the CLR: who executes the timer callbacks?
One simple idea would be to put all timers in a queue and have a dedicated thread do something like this (pseudo code):
while (true)
{
    foreach (Timer t in timer queue)
    {
        if (t.TimeToFire ())
        {
            t.InvokeCallback ();
        }
    }
    Sleep (MinimumInterval);
}
However, with this logic one lengthy timer callback would block all the other timers. In the CLR, we do have a timer queue and a dedicated timer thread, but the only job of the timer thread is to maintain the timer queue. When a timer needs to fire, the timer thread queues a work item to the threadpool; then one of the threadpool's worker threads picks up the work item and invokes the timer callback.
In Rotor's source, the timer thread's logic is in vm/Win32threadpool.cpp; the thread proc is ThreadpoolMgr::TimerThreadStart, and ThreadpoolMgr::FireTimers does most of the interesting work. The pseudo code looks like:
while (true)
{
    foreach (Timer t in timer queue)
    {
        if (t.TimeToFire ())
        {
            // put a work item to thread pool
            // to call timer callback on t once
            WorkItem work = CallTimerCallbackOnce (t);
            ThreadPool.QueueWorkItem (work);
        }
    }
    // MinimumInterval is the minimum of the next firing interval
    // for all timers in the queue
    Sleep (MinimumInterval);
}
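One pass of that loop can be sketched as a single-threaded simulation in standard C++. This is not Rotor's FireTimers: TimerEntry, fire_timers and the pool_queue parameter are hypothetical names, time is passed in as a plain number of milliseconds, and every period is assumed to be greater than zero. The point it illustrates is that the timer thread only queues callbacks and never runs them itself.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <limits>
#include <vector>

// One periodic timer in the queue (hypothetical layout).
struct TimerEntry {
    std::uint64_t next_due;  // next firing time, in ms
    std::uint64_t period;    // firing interval, in ms (assumed > 0)
    std::function<void()> callback;
};

// One pass of the timer thread at time `now`: every due timer gets its
// callback queued for a pool worker and is re-armed. The return value is
// how long the timer thread should sleep before the next pass.
std::uint64_t fire_timers(std::vector<TimerEntry>& timers, std::uint64_t now,
                          std::deque<std::function<void()>>& pool_queue) {
    std::uint64_t sleep_for = std::numeric_limits<std::uint64_t>::max();
    for (TimerEntry& t : timers) {
        assert(t.period > 0);
        if (t.next_due <= now) {
            pool_queue.push_back(t.callback);  // queued, not invoked here
            t.next_due = now + t.period;
        }
        if (t.next_due - now < sleep_for) sleep_for = t.next_due - now;
    }
    return sleep_for;
}
```

Because the callback only sits in pool_queue after a pass, a lengthy callback delays whichever worker picks it up, but it does not delay the next pass of fire_timers.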
| |
|