<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/atom.xsl" media="screen"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-US"><title type="html">mgrier's WebLog</title><subtitle type="html" /><id>http://blogs.msdn.com/mgrier/atom.xml</id><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/default.aspx" /><link rel="self" type="application/atom+xml" href="http://blogs.msdn.com/mgrier/atom.xml" /><generator uri="http://communityserver.org" version="2.1.61025.2">Community Server</generator><updated>2005-05-17T01:04:00Z</updated><entry><title>Canonical Function Structure</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2006/10/02/Canonical-Function-Structure.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2006/10/02/Canonical-Function-Structure.aspx</id><published>2006-10-02T21:50:00Z</published><updated>2006-10-02T21:50:00Z</updated><content type="html">&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;Here is a proposed canonical structure for functions/procedures.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;Clearly in some cases some sections would not be present.&lt;SPAN style="mso-spacerun: yes"&gt;&amp;nbsp; &lt;/SPAN&gt;What’s important is to understand the basic phasing of the work to be done and the separation of tasks.&lt;/FONT&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;int foo(&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;char *StringIn, // “in” parameter&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;size_t BufferLength, // “in” parameter&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;char Buffer[], // “out” parameter, max length governed by BufferLength&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;size_t *CharsWritten // “out” parameter, must be &amp;lt;= BufferLength&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;) {&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;// Phase 1: Initialize out parameters&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;*CharsWritten = 0;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;o:p&gt;&lt;FONT size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;// Phase 2: Capture and validate parameters&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;const size_t StringInLength = strlen(StringIn);&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;if ((BufferLength != 0) &amp;amp;&amp;amp; (Buffer == NULL))&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in; TEXT-INDENT: 0.5in"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;return ERR_INVALID_PARAMETER;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;o:p&gt;&lt;FONT size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;// Phase 3: do work&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;if (BufferLength &amp;lt;= StringInLength)&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt 0.5in"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;return ERR_BUFFER_TOO_SMALL; &lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;memcpy(Buffer, StringIn, StringInLength);&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;Buffer[StringInLength] = '\0';&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;o:p&gt;&lt;FONT size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;// Phase 4: copy data out&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;*CharsWritten = StringInLength + 1; // safe because BufferLength is &amp;gt; StringInLength so “+1” won’t overflow&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;o:p&gt;&lt;FONT size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;// Phase 5: goodbye&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;&lt;SPAN style="mso-tab-count: 1"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/SPAN&gt;return ERR_SUCCEEDED;&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;SPAN style="FONT-FAMILY: 'Courier New'"&gt;&lt;FONT size=3&gt;}&lt;o:p&gt;&lt;/o:p&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;o:p&gt;&lt;FONT face=Calibri size=3&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;&lt;FONT face=Calibri size=3&gt;There may be other phases and the example’s perhaps kind of dumb but I think it works well to call out what kinds of operations should happen in which phases and even more so what doesn’t have to happen later on given work that occurred earlier.&lt;/FONT&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=781420" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>The NT DLL Loader: FreeLibrary()</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/06/28/433707.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/06/28/433707.aspx</id><published>2005-06-29T08:39:00Z</published><updated>2005-06-29T08:39:00Z</updated><content type="html">&lt;P&gt;Let's review the loader's modus operandi and derive the (once again simple!) rules for what the heck is going on.&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;When someone calls FreeLibrary(hInstance), the loader walks the closure of the dependencies for the module/instance in question and, if their refcounts are not maxed out ("pinned"), decrements them.&amp;nbsp; Any modules whose refcounts are now zero are marked for unload.&amp;nbsp; This means that the loader finds all of those modules in the in-initialization-order list and runs their DLL_PROCESS_DETACH callouts in the opposite order of initialization.&amp;nbsp; Notice that the return value from DllMain() is ignored for DLL_PROCESS_DETACH - you cannot do anything which could fail.&amp;nbsp; After shutting down all those DLLs, they are unmapped and FreeLibrary() returns.&lt;o:p&gt;&lt;/o:p&gt;&lt;/P&gt;
&lt;P&gt;Notice that the ordering dragon reared its head again but also notice that as long as your init order made sense, it's very likely that your uninit order makes sense also so you tend not to hit so many problems here w.r.t. ordering.&amp;nbsp; Most LoadLibrary()/FreeLibrary() ordering issues are found in the LoadLibrary() path, not the unload path.&amp;nbsp; (In fact I can't recall a single instance of FreeLibrary() inducing a DLL_PROCESS_DETACH [basic] ordering issue.&amp;nbsp; Not that you can't imagine it but I can't recall ever having such a problem and as you know I like to talk about problems.)&lt;/P&gt;
&lt;P&gt;Reentrancy problems tomorrow...&amp;nbsp; Just imagine, given this information, what happens if someone calls LoadLibrary() during DLL_PROCESS_DETACH...&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=433707" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>The NT DLL Loader: DLL_PROCESS_ATTACH reentrancy - wrap up</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/06/28/433649.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/06/28/433649.aspx</id><published>2005-06-29T05:45:00Z</published><updated>2005-06-29T05:45:00Z</updated><content type="html">&lt;P&gt;Hopefully the culmination of these cautionary tales is clear: you're walking a very fine line when you attempt to reenter the loader from within a loader callout.&amp;nbsp; You're (more!) subject to ordering and cycle issues, you can force initialization to proceed on to your dependencies even if you're not ready for them to call in to you and failure to propagate failures from the loader is fatal.&lt;/P&gt;
&lt;P&gt;Spot the bug that corrupted/crashed your process:&lt;/P&gt;
&lt;P&gt;if (GetFileAttributesW(pszSomePath) != 0xffffffff) {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; // process the file&lt;BR&gt;}&lt;/P&gt;
&lt;P&gt;Where was it?&amp;nbsp; Oh, right, GetFileAttributes might actually be a &lt;A href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/vcrefdelayload(delayloadimport).asp"&gt;linker delayload&lt;/A&gt;!&amp;nbsp; So it's actually an invisible wrapper around calling into the loader!&amp;nbsp; And I didn't remember to propagate loader errors!&amp;nbsp; Hmm... how do I even know that the error code &lt;EM&gt;was&lt;/EM&gt; a loader error?&lt;/P&gt;
&lt;P&gt;I consider this class of bug to be unstomachable.&amp;nbsp; Looking at the source it seems obvious that there's a bug there.&amp;nbsp; At least you should be looking for ERROR_FILE_NOT_FOUND / ERROR_PATH_NOT_FOUND, not just treating all errors as file not found.&amp;nbsp; But who says that those errors are not propagated out of the delayload stubs in the case that the load failed?&amp;nbsp; (This particular case is arguably obtuse; GetFileAttributesW is in kernel32 which will already have been loaded and initialized but the point is still valid.)&lt;/P&gt;
&lt;P&gt;I'll provide an in-depth wrap up of all the general cautions and what to do about them but this series should make you feel &lt;EM&gt;very very nervous indeed&lt;/EM&gt; if you have code in your DLL_PROCESS_ATTACH that's doing much more than calling InitializeCriticalSection().&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=433649" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>The NT DLL Loader: DLL_PROCESS_ATTACH reentrancy - step 4 - ramifications of questionable quality</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/06/28/433361.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/06/28/433361.aspx</id><published>2005-06-28T17:41:00Z</published><updated>2005-06-28T17:41:00Z</updated><content type="html">&lt;P&gt;Last time I alluded to the world of hurt you're in when the loader is reentered during DLL_PROCESS_ATTACH and the initialization of another DLL failed.&lt;/P&gt;
&lt;P&gt;The state of affairs is pretty derivable from the clues that I've left behind in the series.&lt;/P&gt;
&lt;P&gt;First, we know that the loader tracks entry into the initialization function.&amp;nbsp; Otherwise the recursive/reentrant usage of the loader would have just called the same function.&amp;nbsp; Once it sees that an init function has been called, it will not call it again for initialization.&lt;/P&gt;
&lt;P&gt;So, in the context of the example from last time, foo.dll's initializer failed.&amp;nbsp; The loader had marked it as entered.&amp;nbsp; The failure was not propagated.&amp;nbsp;&amp;nbsp; Even if the example code was better and called FreeLibrary() to release the library in the case that GetProcAddress() failed, the refcount of foo is still non-zero so it stays loaded.&lt;/P&gt;
&lt;P&gt;The next behavior is entirely predictable.&amp;nbsp; When the loader returns back to the outer LoadLibrary() (which was in the middle of initializing all the uninitialized DLLs in the initialization list), it continues on and is very nice and careful not to call any initializers which have already been called.&lt;/P&gt;
&lt;P&gt;Now, foo.dll is loaded, it is not initialized and its initializers will never be called.&amp;nbsp; Nonetheless since the failure was lost it tries to make progress.&lt;/P&gt;
&lt;P&gt;Aside:&lt;/P&gt;
&lt;BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px"&gt;
&lt;P&gt;The obvious suggestion is to track the fact that the init failed and fail the entire load sequence.&amp;nbsp; But that's not really possible; you have an arbitrary number of levels of reentrancy here; making this a failure case now when it didn't used to be will break applications.&amp;nbsp; Once again the black helicopters swoop in and accusations that Windows only &lt;EM&gt;tries&lt;/EM&gt; to be compatible fly around.&amp;nbsp; Most of all, you've caused an app compat hit by disallowing something that's always been documented not to work.&lt;/P&gt;
&lt;P&gt;Now, let's see.&amp;nbsp; I can spend (let's roughly cost it at) 320 person-hours dealing with the arguably better but incompatible change I made, or I can just let it be and continue to advise against the whole pattern.&lt;/P&gt;
&lt;P&gt;Like I said, while it was a fun ride maintaining the loader for a period of time, it was also a little slice of hades too.&lt;/P&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Now clearly this behavior needs to be called out better in the documentation.&amp;nbsp; These blog entries are an attempt at shedding light on the topic focussing mostly on directly observable behavior rather than "contractual obligations".&amp;nbsp; (app compat turns observable behavior into contractual obligations but app compat is also dialable given the intestinal fortitude and free time to do it.)&lt;/P&gt;
&lt;P&gt;Note that this problem is about the closure of the call graph during the duration of the DLL_PROCESS_ATTACH.&lt;/P&gt;
&lt;P&gt;Next time, wrapping up DLL_PROCESS_ATTACH reentrancy and then setting the stage for DLL_PROCESS_DETACH issues.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=433361" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>The NT DLL Loader: DLL_PROCESS_ATTACH reentrancy - step 3 - quality requirements</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/06/24/432455.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/06/24/432455.aspx</id><published>2005-06-25T02:23:00Z</published><updated>2005-06-25T02:23:00Z</updated><content type="html">&lt;P&gt;Now we're loaded for bear!&amp;nbsp; We understand how PEs which are either launched via CreateProcess() or loaded via LoadLibrary() are the roots of directed cyclic graphs.&amp;nbsp; Each new graph is turned into a linear initialization-order list where nodes further from the root are initialized prior to nodes closer to the root.&amp;nbsp; Cycles in the graph are resolved based on where you first enter the cycle and thus depend on the entire graph (not just local DLL-to-DLL relationships).&amp;nbsp; Dynamic loads during initialization are often handled correctly but they themselves can introduce additional cycles.&amp;nbsp; There is less opportunity to fix this up since the dynamic load results only in new additions to the initialization list.&lt;/P&gt;
&lt;P&gt;Great.&amp;nbsp; Seven impossible things already done (I guess they weren't really impossible, but then again maybe they weren't really done either, eh?) let's see where things start to get really messy.&lt;/P&gt;
&lt;P&gt;Dynamic loads during initialization lead to a couple of very interesting things.&amp;nbsp; First, recall that the initializer that was in progress isn't re-run when GetProcAddress() is called.&amp;nbsp; That's going to be important.&lt;/P&gt;
&lt;P&gt;Let's go back to my sleazy little attempt to call a function here:&amp;nbsp; (let's assume this is in bar.dll; it's not going to be very important but given all the players it's going to get confusing...)&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; case DLL_PROCESS_ATTACH:&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; SOME_FUNCTION_PTR_T pfn = (SOME_FUNCTION_PTR_T) GetProcAddress(LoadLibraryW(L"SomeOther.DLL"), "SomeFunction");&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (pfn != NULL) (*pfn)();&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; break;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;We didn't check the return status from the LoadLibrary() call.&amp;nbsp; Since LL() doesn't run initializers let's assume that that's OK in this context.&amp;nbsp; I'll assume that GetProcAddress() fails with an invalid argument and thus still returns NULL.&amp;nbsp; We do check the result of GPA() and don't call through the pointer if it's NULL so hey, nobody got hurt, right?&lt;/P&gt;
&lt;P&gt;The LL() non-check is dubious.&amp;nbsp; Do you know &lt;EM&gt;why&lt;/EM&gt; LoadLibrary() failed?&amp;nbsp; If it failed because of out-of-memory maybe it's the wrong thing to press on.&amp;nbsp; I digress; this is more of a usual topic for my blog rather than focussing on loader related issues.&amp;nbsp; But it's about to become a loader related issue.&lt;/P&gt;
&lt;P&gt;Let's say that foo.dll was already on the init list, after the current DLL.&amp;nbsp; Let's assume that its refcount was 1.&amp;nbsp; Now suppose SomeOther.DLL statically imports foo.dll as well.&amp;nbsp; Now its (foo.dll's) refcount is 2 after the LoadLibrary() call.&amp;nbsp; Let's write the overall initialization list now:&lt;/P&gt;
&lt;P&gt;ntdll.dll&lt;BR&gt;kernel32.dll&lt;BR&gt;bar.dll&lt;BR&gt;foo.dll&lt;BR&gt;somedllimportedbytheexethatusesfoo.dll&lt;BR&gt;someother.dll&lt;/P&gt;
&lt;P&gt;The GetProcAddress() call attempts to run foo.dll's initializer (since it has to be initialized before running someother.dll's initializer).&amp;nbsp; Let's assume it fails.&amp;nbsp; These things happen, it's the real world out there.&amp;nbsp; I/Os fail, memory allocations fail, duplicate file names occur, network glitches when trying to open file handles using the redirector, etc.&lt;/P&gt;
&lt;P&gt;So, foo fails initialization and properly reports FALSE back to the loader.&amp;nbsp; The loader will propagate this failure out to the GetProcAddress() call.&amp;nbsp; But wait, that darned code assumes that the only reason GetProcAddress() can fail is due to ERROR_PROC_NOT_FOUND!&lt;/P&gt;
&lt;P&gt;Care to guess what happens next?&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=432455" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>The NT DLL Loader: DLL_PROCESS_ATTACH reentrancy - step 2 - GetProcAddress()</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/06/23/431954.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/06/23/431954.aspx</id><published>2005-06-23T22:28:00Z</published><updated>2005-06-23T22:28:00Z</updated><content type="html">&lt;P&gt;Last time we pondered what does LoadLibrary() do when called inside of a DLL_PROCESS_ATTACH callout.&amp;nbsp; The answer was pretty simple and predictable - the only nuance is that the initializers are not run before returning.&lt;/P&gt;
&lt;P&gt;Now place yourself in the position of mythical developer Weve Stoods who evidently did most of the loader development in the early 80s.&amp;nbsp; You cleverly avoided the whole recursive initialization problem.&amp;nbsp; Then a bug report comes in.&amp;nbsp; GetProcAddress and calling through the pointer doesn't work.&amp;nbsp; I've never met Weve myself and I can't say for sure what was going on at the time but I can imagine two scenarios.&amp;nbsp; First, because it &lt;EM&gt;does&lt;/EM&gt; work in some cases (after all, the library may already have been initialized due to static imports), the team in question "suddenly" has a break where sometimes calling through the function pointer works and sometimes it does not.&amp;nbsp; The second scenario is that "it's so easy to make it work, why can't you just make it work?"&lt;/P&gt;
&lt;P&gt;The road to Hades is paved with good intentions...&lt;/P&gt;
&lt;P&gt;In both cases, pushing back and saying "GetProcAddress should never be called during DLL_PROCESS_ATTACH" would be a very strange response, even if we might wish now that that was the actual result.&lt;/P&gt;
&lt;P&gt;So someone made it work.&amp;nbsp; How did they make it work?&amp;nbsp; Well, we already established that there is an in-initialization-order list that's built up in the loader's internal tables.&amp;nbsp; I believe that this is not an implementation detail - you have to have the single linear/global order because uninitialization on unload &lt;STRONG&gt;&lt;EM&gt;has&lt;/EM&gt;&lt;/STRONG&gt; to occur in the reverse order of initialization.&lt;/P&gt;
&lt;P&gt;So the simple answer is that within the context of GetProcAddress(), the loader runs all the initializers that have not yet been run up to and including the target module of the GetProcAddress() function.&lt;/P&gt;
&lt;P&gt;And you know what?&amp;nbsp; This very often works.&amp;nbsp; Let's assume that we're in a.dll's DLL_PROCESS_ATTACH and it's loading b.dll and calling GetProcAddress() to get the foo() function address.&amp;nbsp; If b.dll depends &lt;U&gt;only&lt;/U&gt; on let's say kernel32.dll, well its initializer already ran and is complete before b.dll's initializer is run so b.dll's initializer runs and lo and behold, the b!foo function is ready for business.&lt;/P&gt;
&lt;P&gt;But let's say that b.dll, unbeknownst to a.dll, now has a dependency on a.dll.&amp;nbsp; The loader will not attempt to rerun A's initializer and will instead just call B's initializer.&amp;nbsp; Which may call into A.&amp;nbsp; But A didn't finish its initialization!&amp;nbsp; And A had no idea that B depended on it so it wasn't particularly careful about doing all the initialization that B&amp;nbsp;might need before loading B.&amp;nbsp; &amp;nbsp;Boom.&lt;/P&gt;
&lt;P&gt;As usual, sprinkle dependency cycles into the mix, stand back&amp;nbsp;and watch the fun.&lt;/P&gt;
&lt;P&gt;Next time, we'll look at quality problems with DLL_PROCESS_ATTACH implementations which directly or indirectly call back into the loader.&amp;nbsp; This next topic is a very big part of why I'm so stuck on quality and reliability.&amp;nbsp; Very innocuous errors which people would generally ignore can amplify into terrible problems which are unbelievably difficult to debug.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=431954" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>The NT DLL Loader: DLL_PROCESS_ATTACH reentrancy - step 1 - LoadLibrary()</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/06/22/431674.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/06/22/431674.aspx</id><published>2005-06-22T23:34:00Z</published><updated>2005-06-22T23:34:00Z</updated><content type="html">&lt;P&gt;So what happens if you call back into the loader when you're inside a loader callout (DllMain) for DLL_PROCESS_ATTACH?&lt;/P&gt;
&lt;P&gt;I'll be addressing teardown (DLL_PROCESS_DETACH) after completing the DLL_PROCESS_ATTACH series.&lt;/P&gt;
&lt;P&gt;The first issue is: what about LoadLibrary()?&amp;nbsp; I'll address GetProcAddress() and FreeLibrary() later.&lt;/P&gt;
&lt;P&gt;We already know how LoadLibrary() works.&amp;nbsp; It walks the dependency graph starting from the DLL that is being loaded and trims away any portions already in the loader's tables.&amp;nbsp; It then effectively adds that to the initialization order list and (this is what varies) if you're not using the loader reentrantly, calls the initializers.&lt;/P&gt;
&lt;P&gt;If you are calling into the loader reentrantly, the list of DLLs to initialize is extended but (I should double check the code to make sure I'm not lying; it's been a couple of years...) only the outer-most invocation of the loader will complete the initialization sequence.&lt;/P&gt;
&lt;P&gt;Thus, you call LoadLibrary("a.dll").&amp;nbsp; In a.dll's DLL_PROCESS_ATTACH, it calls LoadLibrary("b.dll").&amp;nbsp; b.dll's initializer is added to the init list but it isn't actually initialized at that time.&amp;nbsp; Instead, the inner LoadLibrary() just returns and then the DLL_PROCESS_ATTACH returns (successfully for the sake of argument) and then the original stack frame for the original LoadLibrary() initializes b.dll.&lt;/P&gt;
&lt;P&gt;Clearly things are a little more complicated since it might not be a.dll that actually calls LoadLibrary(); it could be a static import of A which does so and thus the init list &lt;STRONG&gt;&lt;EM&gt;may still have uninitialized entries on it at the time that the entries for b.dll (and its static imports!) are appended&lt;/EM&gt;&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;So far so good; assuming that the cycles don't break you, this was a fine solution to the problem which also avoids arbitrary recursion due to nested dynamic DLL loads.&amp;nbsp; This was mostly harmless!&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=431674" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>The NT DLL Loader: reentrancy - play along at home!</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/06/21/431383.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/06/21/431383.aspx</id><published>2005-06-22T08:06:00Z</published><updated>2005-06-22T08:06:00Z</updated><content type="html">&lt;P&gt;Anyone care to hazard a guess about what happens if you have the following code in your DllMain()?&amp;nbsp; Ignore the leak and the lack of error checking; focus on what the loader's behavior &lt;EM&gt;has&lt;/EM&gt; to be...&lt;/P&gt;
&lt;P&gt;BOOL WINAPI DllMain(HINSTANCE hInstance, DWORD fdwReason, LPVOID lpvReserved) {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; switch (fdwReason) {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; case DLL_PROCESS_ATTACH:&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; SOME_FUNCTION_PTR_T pfn = (SOME_FUNCTION_PTR_T) GetProcAddress(LoadLibraryW(L"SomeOther.DLL"), "SomeFunction");&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if (pfn != NULL) (*pfn)();&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; break;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; }&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; return TRUE;&lt;BR&gt;}&lt;/P&gt;
&lt;P&gt;The answer is actually relatively straightforward (the tricky bit is &lt;EM&gt;what is GetProcAddress()'s implied contract&lt;/EM&gt;).&amp;nbsp; I'll do exposition tomorrow.&amp;nbsp; For extra credit, consider what the effect of the leak and lack of error checking is on the loader's internal data tables...&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=431383" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>The NT DLL Loader: DLL callouts (DllMain) - DLL_PROCESS_ATTACH deadlocks</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/06/21/431378.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/06/21/431378.aspx</id><published>2005-06-22T06:48:00Z</published><updated>2005-06-22T06:48:00Z</updated><content type="html">&lt;P&gt;The Windows DLL loader (I wasn't around then but I assume some of this even comes from the days of 16-bit Windows) has a feature where a DLL&amp;nbsp;may have an "entry point".&lt;/P&gt;
&lt;P&gt;If a DLL has an entry point, the loader calls into it on certain significant events.&amp;nbsp; These events have identifiers associated with them:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;DLL_PROCESS_ATTACH 
&lt;LI&gt;DLL_PROCESS_DETACH 
&lt;LI&gt;DLL_THREAD_ATTACH 
&lt;LI&gt;DLL_THREAD_DETACH&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;I'm not going to talk about the thread callouts any time soon.&amp;nbsp; They probably don't do what you expect them to do, so for the most part you should call &lt;A href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/loadlibrary.asp"&gt;DisableThreadLibraryCalls()&lt;/A&gt; in your DLL_PROCESS_ATTACH handler to save extra page faults when threads come and go.&lt;/P&gt;
&lt;P&gt;The apparent contract for DLL_PROCESS_ATTACH is that it is called before any code in your DLL can be called.&amp;nbsp; Sounds like a nice place to do one-time initialization that couldn't be static for some reason.&lt;/P&gt;
&lt;P&gt;Note that I said "apparent".&amp;nbsp; As the previous articles described, if your DLL is involved in a cycle, your code can be called before your DLL_PROCESS_ATTACH callout has been issued.&amp;nbsp; Maybe you're lucky and you've never been hit by this.&amp;nbsp; There are a lot of lucky people out there.&lt;/P&gt;
&lt;P&gt;I'm going to paint a contractual picture here, assuming no cycles are involved.&lt;/P&gt;
&lt;P&gt;Presumably if one thread is in the middle of calling your DLL_PROCESS_ATTACH, any other thread that wants to access the exports of your DLL has to block waiting for&amp;nbsp;you to finish your initialization.&amp;nbsp; Let's call this the "mythical loader lock".&amp;nbsp; Maybe it's not so mythical and we can discuss/debate the scope of the synchronization (all DLLs, only the DLLs waiting to initialize, what about cycles?) but let's work out the invariants of the contract before we get too hung up on implementation details.&lt;/P&gt;
&lt;P&gt;Thus the first problem with DLL_PROCESS_ATTACH processing is deadlocks.&amp;nbsp; A great example of this is calling CreateThread() inside your own DLL_PROCESS_ATTACH to start a thread running code for your DLL.&amp;nbsp; Clearly the new thread can't start running your code until you have finished your DLL_PROCESS_ATTACH because otherwise we would be breaking the contract.&amp;nbsp; Thus the new thread has to wait for you to exit your DllMain().&amp;nbsp; Maybe that's OK: the thread fires up and immediately has to block waiting for synchronization, but if that doesn't happen too frequently, you can survive it.&lt;/P&gt;
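&lt;P&gt;But now imagine that you do something nifty and useful in that worker thread.&amp;nbsp; Someday someone comes along and, while the loader lock is held (say, from inside a DllMain()), wants to queue a work item to that thread and wait for it to complete.&amp;nbsp; The worker can't run until DllMain() returns, and DllMain() won't return until the worker has run.&amp;nbsp; &lt;A href="http://b5.cs.uwyo.edu/bab5/snds/boom.wav"&gt;Boom&lt;/A&gt;.&amp;nbsp; Insta-deadlock.&lt;/P&gt;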
&lt;P&gt;That's an easy example, but basically the rule is this:&lt;/P&gt;
&lt;P&gt;Calling any function from within your DLL_PROCESS_ATTACH which requires synchronization can deadlock.&lt;/P&gt;
&lt;P&gt;Obviously it doesn't have to deadlock; a lot of folks get away with a lot of bad stuff.&amp;nbsp; They're getting lucky for the most part.&lt;/P&gt;
&lt;P&gt;A great example is the process heap.&amp;nbsp; Did you know that you can lock it?&amp;nbsp; You can!&amp;nbsp; You can probably have a lot of fun by calling HeapLock(GetProcessHeap()).&amp;nbsp; Why would you do that?&amp;nbsp; I don't know!&amp;nbsp; Who can know?&amp;nbsp; Can we stop it?&amp;nbsp; People want to, but just wait for the black helicopter crowd to show up saying that it's really a collusion/conspiracy to get people to upgrade software on Windows.&lt;/P&gt;
&lt;P&gt;If someone locks it (or maybe calls HeapWalk on the process heap which I assume locks it for the duration of the walk) and then calls into the loader... well... boom.&amp;nbsp; You're deadlocked.&lt;/P&gt;
&lt;P&gt;Those are two easy cases.&amp;nbsp; Clearly you can deadlock in additional ways (RPC calls to another process or machine which have to reenter your process on a different thread which then might need the Mythical Loader Lock) and being creative with things like the thread pool, windows messages, etc. you can come up with a million variations on the theme.&lt;/P&gt;
&lt;P&gt;Thus, DLL_PROCESS_ATTACH rule #1:&lt;/P&gt;
&lt;P&gt;Don't do anything that requires synchronization.&amp;nbsp; Currently, even heap allocation is suspect.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=431378" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>The NT DLL loader: dynamic unloads</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/06/18/430522.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/06/18/430522.aspx</id><published>2005-06-19T08:06:00Z</published><updated>2005-06-19T08:06:00Z</updated><content type="html">&lt;P&gt;To recap our story from last time:&lt;/P&gt;
&lt;P&gt;The NT DLL loader starts from some PE (either the main EXE or the DLL which is passed in to the LoadLibrary() API) and walks the graph of static imports rooted at that first PE.&amp;nbsp; You can think of the loader as then building a linear ordered list of DLLs to initialize, starting with the ones deepest in the graph (farthest from the root).&amp;nbsp; The order tends to be stable but is dependent on a number of factors which no one DLL can control.&lt;/P&gt;
&lt;P&gt;It's not uncommon to find cycles.&amp;nbsp; Any time you try to apply some kind of topological sort to a graph with cycles you basically have to break the cycles; how you break them typically depends on how you first entered the cycle.&amp;nbsp; The result of this is that even if a cyclic graph of static imports exists, a linear initialization order is selected.&amp;nbsp; As imports are added and removed at various points in the graph, the first point of entry into the cycle can change.&amp;nbsp; So you can experiment and figure out that A is always initialized before B, and as long as A's initialization does not depend on B's, things work.&amp;nbsp; However, an innocuous change elsewhere in the graph can change the order, and suddenly the initialization order is broken and your component fails.&lt;/P&gt;
&lt;P&gt;For a mental model of dynamic loads, imagine that the closure of the dependencies of the loaded DLL is found, the DLLs already loaded into the process are removed from it, and initialization is run over what remains, in that order.&lt;/P&gt;
&lt;P&gt;The global list of DLLs is maintained in multiple ordered lists, one of which is the in-initialization-order list.&amp;nbsp; When a dynamic load occurs, the new initialization list is added to the end of the init list.&lt;/P&gt;
&lt;P&gt;A few interesting notes.&amp;nbsp; The refcount of the closure of the static imports of the main executable is "maxed out".&amp;nbsp; The refcounting code knows not to decrement load counts when the count is at the maximum value.&amp;nbsp; I call this "pinned" - the DLLs which are in the static closure of the executable are permanently pinned in the process.&amp;nbsp; In theory if you loaded a DLL enough times, its refcount could reach the same value and then it could not be unloaded, but if your refcounts get that high you probably have a DLL ref count leak anyways.&lt;/P&gt;
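&lt;P&gt;The pinning rule can be sketched as a saturating counter.&amp;nbsp; This is a toy model in C, not the loader's code: add_load_ref() and release_load_ref() are names invented for illustration, and the sentinel 0xFFFF is an assumed value since the actual maximum isn't given here.&lt;/P&gt;

```c
/* Hypothetical sentinel: the maximum load count is an assumption. */
#define LOAD_COUNT_PINNED 0xFFFFu

/* Add one reference. A count that reaches the sentinel stays there,
   so it can never overflow back to zero. */
unsigned add_load_ref(unsigned count)
{
    return (count == LOAD_COUNT_PINNED) ? count : count + 1u;
}

/* Drop one reference. Pinned counts are never decremented, and a
   count of zero is left alone. */
unsigned release_load_ref(unsigned count)
{
    if (count == LOAD_COUNT_PINNED)
        return count;
    if (count == 0u)
        return count;
    return count - 1u;
}
```

&lt;P&gt;In this model a DLL in the executable's static closure would simply start out with its count at the sentinel, and a leaked count that climbs all the way up ends up pinned in the same way.&lt;/P&gt;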
&lt;P&gt;The other interesting note is that the refcount of (non-pinned) DLLs is the total number of dynamic loads that could have reached it.&amp;nbsp; That is, every DLL in the closure of the dependency graph of a dynamic load has its refcount incremented.&amp;nbsp; This doesn't really matter that much except that when I describe unloading, I'll reference it.&lt;/P&gt;
&lt;P&gt;Unloading is pretty explainable: Starting from the DLL handle that was passed in to FreeLibrary(), the refcount of the closure of its dependencies is decremented.&amp;nbsp; Any DLLs whose refcounts reach zero are marked as to-be-unloaded.&amp;nbsp; The in-init-order list is consulted for these DLLs and they are uninitialized in the opposite order of their initialization.&lt;/P&gt;
&lt;P&gt;The reason that this is important is that if a DLL managed to get initialized at the right time, it's presumably going to be uninitialized also at the right time.&amp;nbsp; If the uninitialization order were based solely on the graph being unloaded, cycles could have been entered at a different point and the uninitialization order could be wrong.&lt;/P&gt;
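&lt;P&gt;Here is a rough sketch of that unload bookkeeping in C.&amp;nbsp; It is a simplification under stated assumptions: modules are small integer ids, the closure of the freed DLL is handed in precomputed, pinned counts are not modeled, and plan_unload() and in_closure() are hypothetical names rather than anything in the loader.&lt;/P&gt;

```c
/* Toy model of FreeLibrary() bookkeeping; all names are hypothetical. */

/* Returns 1 if module id m appears in closure[0..count-1], else 0. */
static int in_closure(const int *closure, int count, int m)
{
    int i;
    for (i = 0; i != count; ++i)
        if (closure[i] == m)
            return 1;
    return 0;
}

/* closure[]    : the freed DLL plus everything reachable from it, any order
   init_order[] : the global in-initialization-order list
   refcount[]   : load counts indexed by module id (updated in place)
   uninit[]     : receives the modules to uninitialize, in teardown order
   Returns the number of entries written to uninit[]. */
int plan_unload(const int *closure, int closure_count,
                const int *init_order, int init_count,
                unsigned *refcount, int *uninit)
{
    int i, n = 0;

    /* Step 1: drop one reference from every module in the closure. */
    for (i = 0; i != closure_count; ++i)
        if (refcount[closure[i]] != 0)
            refcount[closure[i]] -= 1;

    /* Step 2: walk the init list backwards so teardown mirrors setup,
       keeping only closure members whose count is now zero. */
    for (i = init_count; i != 0; --i) {
        int m = init_order[i - 1];
        if (refcount[m] == 0)
            if (in_closure(closure, closure_count, m))
                uninit[n++] = m;
    }
    return n;
}
```

&lt;P&gt;For example, with an init list of kernel32.dll, metaheap.dll, bignum.dll, calclogic.dll and a single outstanding dynamic load of calclogic.dll, freeing it yields calclogic.dll, bignum.dll, metaheap.dll as the teardown order no matter how the closure array happens to be ordered; kernel32.dll keeps its remaining references and stays loaded.&lt;/P&gt;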
&lt;P&gt;Summary: cycles make initialization order hard to understand but the algorithm picks an order.&amp;nbsp; The order can change when cycles are involved since adding or removing static imports from any DLL in the graph may cause the cycles to be entered at different places.&amp;nbsp; The initialization order is inverted to get the uninitialization order in the case of unloads.&amp;nbsp; DLLs which are reachable from the executable are "pinned" (will never unload).&amp;nbsp; Clients forgetting to call FreeLibrary() will eventually force the DLL to be permanently pinned rather than allowing the refcount to overflow back to zero.&lt;/P&gt;
&lt;P&gt;Next time, the hazards of DllMain.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=430522" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>The NT DLL Loader: basic operation</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/06/18/430409.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/06/18/430409.aspx</id><published>2005-06-18T09:39:00Z</published><updated>2005-06-18T09:39:00Z</updated><content type="html">&lt;P&gt;Let's start simply and consider the mythical vertical application (a topic itself for another day).&amp;nbsp; I'm not going to walk through what a &lt;A href="http://msdn.microsoft.com/msdnmag/issues/02/02/PE/default.aspx"&gt;PE&lt;/A&gt; is or a DLL or an EXE; if you don't know, follow the link or take a gander around MSDN.&lt;/P&gt;
&lt;P&gt;Let's call it ccalc.exe (console calculator: no funky graphics, etc.; it just uses stdin/stdout).&amp;nbsp; To keep things simple, it only links against the old C runtime library, msvcrt.dll.&amp;nbsp; msvcrt.dll imports functions from kernel32.dll and kernel32.dll imports functions from ntdll.dll.&lt;/P&gt;
&lt;P&gt;When the process initializes, the loader figures out this imports graph from the static imports in the PE headers and, since code is still sequential, sets up a DLL initialization list which is just a list of DLLs to try to initialize, in order.&amp;nbsp; The initialization is depth first but other than that it's nothing particularly surprising.&amp;nbsp; For this example, the init order would be:&lt;/P&gt;
&lt;P&gt;ntdll.dll&lt;BR&gt;kernel32.dll&lt;BR&gt;msvcrt.dll&lt;/P&gt;
&lt;P&gt;Not exactly rocket science.&amp;nbsp; ccalc.exe doesn't have an initializer - it has an entry point where its initialization is done prior to passing control to whatever you like to call your main function.&amp;nbsp; (If you don't use the C runtimes, you can tell the linker to use your own function as the entry point but if you do use the C runtime functions, the C runtime needs to get control first so that it can set up its heap, set up the standard streams and do some other various things like set up fp control registers and whatever else it needs to do and then it calls &lt;EM&gt;your&lt;/EM&gt; main() or wmain() function depending on which C runtime init function you set as the entry point.)&lt;/P&gt;
&lt;P&gt;Now let's assume you have two teams working on the calculator - one on the "UI" (command line parsing) and another on the logic.&amp;nbsp; Maybe the logic folks want to have a separate DLL.&amp;nbsp; Let's assume that they do and it's called calclogic.dll.&amp;nbsp; This isn't necessarily a good way to do it but let's say that the &lt;A href="http://www.pioneernet.net/curtis/wile_e/"&gt;Super Genius&lt;/A&gt; command line parser folks want to use LoadLibrary()/GetProcAddress() to figure out if a function typed in can be used by the calculator logic component.&amp;nbsp; (Insert your favorite reason to use DLLs here; there are a million good ones and two million bad ones.)&lt;/P&gt;
&lt;P&gt;But the calclogic.dll &lt;A href="http://www.subgenius.com/"&gt;SubGeniuses&lt;/A&gt; cleverly didn't do all the work themselves.&amp;nbsp; They use bignum.dll to handle all that fancy pants math stuff.&amp;nbsp; Nice job!&lt;/P&gt;
&lt;P&gt;So, ccalc.exe calls LoadLibrary() on calclogic.dll.&amp;nbsp; Now what happens?&lt;/P&gt;
&lt;P&gt;Bignum.dll is presumably a &lt;A href="http://catb.org/~esr/jargon/html/B/bignum.html"&gt;bignum package &lt;/A&gt;and it needs a heap since the representations are variable length.&amp;nbsp; Let's assume that it uses metaheap.dll which they paid a lot of money for but which ends up really just being a big fat wrapper around GlobalAlloc in kernel32.dll.&lt;/P&gt;
&lt;P&gt;When the loader sees the load come in for calclogic.dll, it looks at what it imports and figures out what it is importing that isn't already loaded into the process.&amp;nbsp; So it looks at calclogic.dll and discovers bignum.dll, and then metaheap.dll and then kernel32.dll and then... hey, wait, kernel32.dll's already in the loader's database so we can stop.&amp;nbsp; After processing the static imports, it &lt;EM&gt;appends&lt;/EM&gt; the new entries to the global initialization list and keeps track of the new entries.&amp;nbsp; To wit, the global init-order list looks like:&lt;/P&gt;
&lt;P&gt;ntdll.dll&lt;BR&gt;kernel32.dll&lt;BR&gt;msvcrt.dll&lt;BR&gt;metaheap.dll&lt;BR&gt;bignum.dll&lt;BR&gt;calclogic.dll&lt;/P&gt;
&lt;P&gt;and the init list that it has to process before LoadLibrary() returns is:&lt;/P&gt;
&lt;P&gt;metaheap.dll&lt;BR&gt;bignum.dll&lt;BR&gt;calclogic.dll&lt;/P&gt;
&lt;P&gt;No surprises yet, eh?&lt;/P&gt;
&lt;P&gt;Now let's add the first twist.&amp;nbsp; There are cycles in the dependency graphs of many DLLs.&amp;nbsp; Why?&amp;nbsp; Well, because they are.&amp;nbsp; Two teams or two companies were independently working on something which maybe was &lt;EM&gt;conceptually&lt;/EM&gt; part of the same thing but they had to build, test and deliver the two things independently.&amp;nbsp; Component granularity is surprisingly hard to get right and very rarely has to do with what's best for the components.&amp;nbsp; Instead it's a combination of what's best for the teams (teams don't like to have multiple DLLs and teams want autonomy w.r.t. a single DLL so that their testers can feel comfortable about what it means to test the component) and what's best for the perf team (one DLL per function would destroy system performance; in a big system, if 10% of the teams "just add one more DLL" for each release, you end up with dozens and dozens over time for no apparent reason other than it was easier than modifying an existing DLL).&lt;/P&gt;
&lt;P&gt;What does the loader do when it finds these?&amp;nbsp; Not much.&amp;nbsp; I already described the algorithm above when the dynamic load case reached kernel32 again.&amp;nbsp; (It's a little more complicated because of the interaction of refcounting and cycles but I'll make a separate post on that.)&lt;/P&gt;
&lt;P&gt;If A depends on B and B depends on A, the initialization order depends on which one you found first.&amp;nbsp; If you loaded A first, it would be added to the tables and then B would be found and added to the loader's tables.&amp;nbsp; Then when walking B's imports, it would find A again and say that it's over and done with.&amp;nbsp; Thus if you loaded A, B would be initialized first since it would appear to be deeper in the graph than A.&amp;nbsp; Similarly loading B first will result in A initialized first.&lt;/P&gt;
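&lt;P&gt;To make the cycle-breaking concrete, here is a minimal sketch in C of a depth-first walk that appends a DLL to the init list only after its imports have been visited.&amp;nbsp; It is a toy model, not the loader's actual code: module_t, MAX_IMPORTS, visit() and compute_init_order() are names invented for illustration, and modules are identified by small integers rather than file names.&lt;/P&gt;

```c
/* Toy model of the import walk; all names here are hypothetical. */
enum { MAX_IMPORTS = 4 };

typedef struct {
    int import_count;
    int imports[MAX_IMPORTS];   /* ids of statically imported modules */
    int seen;                   /* nonzero once the walk has entered this module */
} module_t;

/* Visit module m: mark it, walk its imports in table order, then append it.
   A module that is already marked is skipped, which is what breaks cycles. */
static int visit(module_t *mods, int m, int *order, int count)
{
    int i;
    if (mods[m].seen)
        return count;
    mods[m].seen = 1;
    for (i = 0; i != mods[m].import_count; ++i)
        count = visit(mods, mods[m].imports[i], order, count);
    order[count] = m;           /* deepest modules land first */
    return count + 1;
}

/* Fills order[] with the init order for a load rooted at root and
   returns the number of entries written. */
int compute_init_order(module_t *mods, int root, int *order)
{
    return visit(mods, root, order, 0);
}
```

&lt;P&gt;With A importing B and B importing A, starting the walk at A yields B, A and starting it at B yields A, B.&amp;nbsp; Marking a module as seen before the walk models a DLL that is already in the loader's database, which is how the dynamic load of calclogic.dll stops when it reaches kernel32.dll.&lt;/P&gt;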
&lt;P&gt;The implementation is linear/sequential.&amp;nbsp; Therefore even the order of the imports in your static import tables matters.&amp;nbsp; I can't guarantee which order is used but assume you have DLL c.dll which links against both a.dll and b.dll.&amp;nbsp; If a.dll happens to be in the import tables first (assuming that the imports are walked in order), it will be added first and then b would be discovered and we'd decide that b was deeper.&amp;nbsp; If the linker for some reason reverses the order of the static imports, you'll see the opposite.&lt;/P&gt;
&lt;P&gt;Now let's say you're ccalc.exe and you link against c.dll.&amp;nbsp; The init order might go B -&amp;gt; A -&amp;gt; C.&amp;nbsp; But now let's say that ccalc decides to link against&amp;nbsp;b.dll itself.&amp;nbsp; Now maybe the init order is A -&amp;gt; C -&amp;gt; B!&lt;/P&gt;
&lt;P&gt;This leads to the first hazard in DLL initialization order: &lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;The way that cycles are resolved into an initialization order is stable for any given graph but if the graph changes for some reason, the initialization order may change.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;If A calls into B during its DLL_PROCESS_ATTACH, this may have worked for &lt;EM&gt;years and years&lt;/EM&gt;, but then the addition or removal of an import by some other DLL in the graph can suddenly break the initialization order.&lt;/P&gt;
&lt;P&gt;My team debugged so many of these during Windows XP it wasn't funny.&amp;nbsp; Sure, we were the obvious targets: who are those crazy folks over there changing the DLL loader?!?&amp;nbsp; Are they mad?&amp;nbsp; They must be the reason my code, which has worked for 10+ years, is suddenly broken!&amp;nbsp; But we got pretty good at recognizing the defective patterns.&amp;nbsp; (The changes themselves weren't defective - anyone adding or heavens forbid actually reducing/removing dependencies would cause these effects.)&lt;/P&gt;
&lt;P&gt;I even have recommendations about what (not!) to do during DLL initialization to avoid these problems yourself.&amp;nbsp; They'll be at the end of the series and I can say right now that I don't think you're going to like them one bit.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=430409" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>How the NT Loader works</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/06/18/430402.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/06/18/430402.aspx</id><published>2005-06-18T09:27:00Z</published><updated>2005-06-18T09:27:00Z</updated><content type="html">&lt;P&gt;My team maintained the NT loader (the component that loads DLLs) for about a year or so during Windows XP as we were adding the &lt;A href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/sbscs/setup/isolated_applications_and_side_by_side_assemblies_start_page.asp"&gt;isolated application&lt;/A&gt; features to it so we got quite an interesting perspective on this lovely little piece of technology.&amp;nbsp; Warning to people who find themselves wanting to innovate in technology which has basically been left dormant and untouched for over a decade: be sure you have plenty of time to deal with the anthills you knock over!&lt;/P&gt;
&lt;P&gt;We don't own it any more (not sure if it's a blessing or a curse...) but it sure was interesting and enlightening; especially in the tradeoffs of application compatibility, robustness and reliability.&lt;/P&gt;
&lt;P&gt;You might notice that the &lt;A href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/dllmain.asp"&gt;docs for DllMain&lt;/A&gt; have grown a lot over the past few years.&amp;nbsp; I like to think that my team's involvement here had a lot to do with it because DLL load order etc. was always a vaguely understood and arcane topic.&amp;nbsp; There were always vague warnings about not doing too much in DLL_PROCESS_ATTACH but nobody could really describe the situation except for a number of anecdotes they had had in the past when somehow mysteriously load orders changed and they were broken during either initialization or shutdown.&lt;/P&gt;
&lt;P&gt;I'll take a break from where I'm headed on the reliability front and walk through a summary of the issues which I recently sent to the internal win32 programming email alias.&amp;nbsp; Hopefully I'll fix the incomplete sentences and bad grammar this time.&lt;/P&gt;
&lt;P&gt;I'll make a separate post with the beginning - a basic rundown of how things work today.&amp;nbsp; As usual, do not consider this in any way shape or form a contract.&amp;nbsp; One of the reasons that this isn't documented fully is that people have wanted to change/fix it for years and years now.&amp;nbsp; On the other hand, maintaining compatibility with the current behavior is going to constrain the implementation so much that either (a) it won't change after all or (b) the change will have to be compatible with the effects of anything I'm describing here anyways.&lt;/P&gt;
&lt;P&gt;You will see aspects of my reliability/robustness series come up here.&amp;nbsp; You'll laugh, you'll cry, you'll see local innocuous bugs in DLL initialization or uninitialization affect the entire process's reliability.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=430402" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>Enough about in-memory models and transactions for now</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/05/31/423677.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/05/31/423677.aspx</id><published>2005-06-01T07:18:00Z</published><updated>2005-06-01T07:18:00Z</updated><content type="html">&lt;P&gt;I want to follow up on this topic of transactional behavior in memory but to motivate it I'm going to have to say more about my other topics: invariant restoration and (eventually) failure handling.&lt;/P&gt;
&lt;P&gt;The main point is this: there are potentially better ways to solve this overarching problem but there isn't a lot of coherent support for having a set of components which play together well.&amp;nbsp; You can yourself take on this "transactional" mantra but unless there's deep support for it (possibly integrating with a "real" transactional facility to enable interaction with (rollback/commit of) persistent side effects), you're generally stuck reinventing everything.&lt;/P&gt;
&lt;P&gt;I still think it's interesting to take a moment and compare and contrast the reliability of a system built on transaction support like we've been discussing vs. manual/explicit rollback code.&amp;nbsp; For extra points, what's the difference between the rollback proposed here and use of C++ destructors?&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=423677" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>Transactions and in-memory data (part 3 of n)</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/05/18/419058.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/05/18/419058.aspx</id><published>2005-05-18T09:24:00Z</published><updated>2005-05-18T09:24:00Z</updated><content type="html">&lt;P&gt;Last time I laid out a little framework for transactional rollback.&amp;nbsp; It clearly is not sufficient for a real honest-to-goodness persistent transaction but if you can tolerate every ESE failing (due to allocating the rollback log entry) it's pretty compelling.&amp;nbsp; Pretty much every function can be transactional if it only works with data structures/algorithms which could support this metaphor.&amp;nbsp; It might look something like...&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;int do_work(collection_t *collection, int x, int y, int z) {&lt;BR&gt;&amp;nbsp;&amp;nbsp; rollback_record_t *txn = NULL;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;int err;&lt;BR&gt;&amp;nbsp;&amp;nbsp; if ((err = collection_insert(&amp;amp;txn, collection, x))) {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; do_rollback(txn);&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return err;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;}&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;if ((err = collection_remove(&amp;amp;txn, collection, y))) {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;do_rollback(txn);&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return err;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;}&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;if ((err = collection_do_something_else(&amp;amp;txn, collection, z))) {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;do_rollback(txn);&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return err;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;}&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;free_rollback(txn);&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;return 0;&lt;BR&gt;}&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Hey, that's kinda cool.&amp;nbsp; Pretty easy to write a transactional function, eh?&amp;nbsp; But wait there's more.&amp;nbsp; With this metaphor it's also easy to be a good transactional citizen!&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;int do_work(rollback_record_t **parent_txn, collection_t *collection, int x, int y, int z) {&lt;BR&gt;&amp;nbsp;&amp;nbsp; rollback_record_t *txn = NULL;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;int err;&lt;BR&gt;&amp;nbsp;&amp;nbsp; if ((err = collection_insert(&amp;amp;txn, collection, x))) {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; do_rollback(txn);&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return err;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;}&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;if ((err = collection_remove(&amp;amp;txn, collection, y))) {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;do_rollback(txn);&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return err;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;}&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;if ((err = collection_do_something_else(&amp;amp;txn, collection, z))) {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;do_rollback(txn);&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return err;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;}&lt;BR&gt;&amp;nbsp;&amp;nbsp; merge_rollback(txn, parent_txn);&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;return 0;&lt;BR&gt;}&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Hey, whoa, maybe we're on to something!&lt;/P&gt;
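&lt;P&gt;The body of merge_rollback() isn't given.&amp;nbsp; One plausible implementation, assuming the rollback_record_t list from the previous post with the most recent change at the head, splices the child's records in front of the parent's so that a later do_rollback() on the parent undoes the child's changes first.&amp;nbsp; This is a guess at the intent, not the original code.&lt;/P&gt;

```c
/* Types repeated from the rollback-log post so the sketch stands alone. */
typedef void (*rollback_function_ptr_t)(void *context);
typedef struct _rollback_record {
    rollback_function_ptr_t rollbackFunction;
    struct _rollback_record *next;
} rollback_record_t;

/* Hypothetical merge: prepend the child's list to the parent's list.
   The child's records are newer, so they belong at the head, where a
   rollback of the parent will reach them first. */
void merge_rollback(rollback_record_t *child, rollback_record_t **parent)
{
    rollback_record_t *tail = child;

    if (!child)
        return;                 /* nothing was logged; parent is unchanged */

    while (tail->next)          /* find the oldest child record */
        tail = tail->next;

    tail->next = *parent;       /* oldest child record now precedes the parent's newest */
    *parent = child;            /* parent list now starts with the child's newest */
}
```

&lt;P&gt;Because the splice only relinks existing records it allocates nothing, so under this reading the merge step itself cannot fail.&lt;/P&gt;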
&lt;P&gt;Hmm.. it's going to be hard to use other libraries... what if I don't really need this level of guarantee?&amp;nbsp; What if people point at the code and say it's weird?&amp;nbsp; I guess social acceptability is as important as reliability and correctness...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=419058" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry><entry><title>Transactions and in-memory data (part 2 of n)</title><link rel="alternate" type="text/html" href="http://blogs.msdn.com/mgrier/archive/2005/05/17/418428.aspx" /><id>http://blogs.msdn.com/mgrier/archive/2005/05/17/418428.aspx</id><published>2005-05-17T10:04:00Z</published><updated>2005-05-17T10:04:00Z</updated><content type="html">&lt;P&gt;Buffering the operations wasn't particularly successful.&lt;/P&gt;
&lt;P&gt;Let's look at maintaining a rollback log.&lt;/P&gt;
&lt;P&gt;Seems pretty simple.&amp;nbsp; You can do something like:&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New"&gt;typedef void (*rollback_function_ptr_t)(void *context);&lt;BR&gt;typedef struct _rollback_record {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;rollback_function_ptr_t rollbackFunction;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;struct _rollback_record *next;&lt;BR&gt;} rollback_record_t;&lt;BR&gt;&lt;BR&gt;void do_rollback(rollback_record_t *head) {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;while (head != NULL) {&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;rollback_record_t *next = head-&amp;gt;next;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;(*head-&amp;gt;rollbackFunction)(head + 1);&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;free(head);&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;head = next;&lt;BR&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;}&lt;BR&gt;}&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;That's pretty simple, eh?&amp;nbsp; Since rollbacks probably have to be done in reverse order of changes, you have to maintain the list with a stack discipline (always adding to the head).&amp;nbsp; Whenever you're going to do anything which mutates global in-memory state, you &lt;EM&gt;first&lt;/EM&gt; allocate a rollback record plus space for your own data, fill it in and then put it on the head of the list.&lt;/P&gt;
&lt;P&gt;Sounds good at one level, but simultaneously it totally breaks the model of libraries like the STL, which tried to do the right thing by providing a few functions like std::map::erase() which do not fail.&amp;nbsp; Now these destructive actions have to be able to fail, since they have to allocate a rollback record and that allocation can fail.&lt;/P&gt;
&lt;P&gt;Next time I'll go into this model a little more and point out some of its problems but since I hadn't written since Thursday I wanted to get the start of this out at least.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=418428" width="1" height="1"&gt;</content><author><name>mgrier</name><uri>http://blogs.msdn.com/members/mgrier.aspx</uri></author></entry></feed>