I recently had a case that I thought I'd share with you.
A well-known ISV opened an incident with Developer Support to help solve an issue at their customer site.
This ISV has an application that encrypts certain sensitive data held in the ISV's custom table and used in their own Dexterity-based window.
The complaint: after opening the custom window and having the application decrypt the data, Dynamics GP crashes once the window closes and the Dynamics navigation shell has focus, showing the "Dynamics Runtime has encountered an error and needs to close.." dialog.
After discussing the issue with partner & customer, I was able to discover a few facts:
After launching Dynamics GP, we used Windbg to attach to the Dynamics.exe process. We then replicated the error and examined the stack trace.
The stack dump showed the cause of the error to be a stack overflow. Odd - can't say I've run across that before.
With the proper symbols loaded, the stack dump shows the area of the issue.
0:000> kn
 # ChildEBP RetAddr
00 0004385c 7c90df5a ntdll!KiFastSystemCallRet
01 00043860 7c8025cb ntdll!NtWaitForSingleObject+0xc
02 000438c4 7c802532 kernel32!WaitForSingleObjectEx+0xa8
03 000438d8 7a0454ad kernel32!WaitForSingleObject+0x12
04 00043908 7a0459f7 mscorwks!ClrWaitForSingleObject+0x24
05 00043dc4 7a048194 mscorwks!RunWatson+0x1df
06 00044508 7a04885a mscorwks!DoFaultReportWorker+0xb59
07 00044544 7a097542 mscorwks!DoFaultReport+0xc3
08 00044568 7a09618a mscorwks!WatsonLastChance+0x3f
09 000445d0 7a0962d1 mscorwks!EEPolicy::HandleFatalStackOverflow+0xdb
0a 000445f4 7a00b70f mscorwks!EEPolicy::HandleStackOverflow+0x173
0b 00044610 7c9032a8 mscorwks!COMPlusFrameHandler+0x10b
0c 00044634 7c90327a ntdll!ExecuteHandler2+0x26
0d 000446e4 7c90e48a ntdll!ExecuteHandler+0x24
0e 000446e4 7c812a6b ntdll!KiUserExceptionDispatcher+0xe
0f 00044a30 7a0364b4 kernel32!RaiseException+0x53
10 00044a4c 7a0c878a mscorwks!ReportStackOverflow+0x61 <------------here is the stack overflow
11 00044a5c 79e7cc34 mscorwks!Alloc+0x3b
12 00044a9c 79ee1e9d mscorwks!AllocateObject+0x59
13 00044cfc 79ee220e mscorwks!AssemblyNameNative::AssemblyNameInit+0x95
14 00044e40 792efdb3 mscorwks!AssemblyNameNative::Init+0x196
15 00044e64 792d36ed mscorlib_ni+0x22fdb3
16 00044e88 792d1a52 mscorlib_ni+0x2136ed
17 00044eb0 79e71b4c mscorlib_ni+0x211a52
18 00044ec0 79e821f9 mscorwks!CallDescrWorker+0x33
19 00044f40 79e96591 mscorwks!CallDescrWorkerWithHandler+0xa3
1a 00045084 79e965c4 mscorwks!MethodDesc::CallDescr+0x19c
1b 000450a0 79f27ac3 mscorwks!MethodDesc::CallTargetWorker+0x1f
1c 000453f4 79fae617 mscorwks!AppDomain::RaiseAssemblyResolveEvent+0x17e
1d 00045414 79fae6a0 mscorwks!AssemblySpec::ResolveAssemblyFile+0x24
1e 0004549c 79f27cbe mscorwks!AppDomain::TryResolveAssembly+0x76
1f 0004610c 79ea15c0 mscorwks!AppDomain::BindAssemblySpec+0x5a3
20 00046174 79eddefa mscorwks!AssemblySpec::LoadDomainAssembly+0x114
21 00046198 79ee268b mscorwks!AssemblySpec::LoadAssembly+0x1d
22 000462e4 792f11ec mscorwks!AssemblyNative::Load+0x240
23 0004630c 792efc00 mscorlib_ni+0x2311ec
24 00046340 792efdcf mscorlib_ni+0x22fc00
25 00046364 792d36ed mscorlib_ni+0x22fdcf
26 00046388 792d1a52 mscorlib_ni+0x2136ed
27 000463b0 79e71b4c mscorlib_ni+0x211a52
<Below we repeat this block for the length of the dump>
28 000463c0 79e821f9 mscorwks!CallDescrWorker+0x33
29 00046440 79e96591 mscorwks!CallDescrWorkerWithHandler+0xa3
2a 00046584 79e965c4 mscorwks!MethodDesc::CallDescr+0x19c
2b 000465a0 79f27ac3 mscorwks!MethodDesc::CallTargetWorker+0x1f
2c 000468f4 79fae617 mscorwks!AppDomain::RaiseAssemblyResolveEvent+0x17e
2d 00046914 79fae6a0 mscorwks!AssemblySpec::ResolveAssemblyFile+0x24
2e 0004699c 79f27cbe mscorwks!AppDomain::TryResolveAssembly+0x76
2f 0004760c 79ea15c0 mscorwks!AppDomain::BindAssemblySpec+0x5a3
30 00047674 79eddefa mscorwks!AssemblySpec::LoadDomainAssembly+0x114
31 00047698 79ee268b mscorwks!AssemblySpec::LoadAssembly+0x1d
32 000477e4 792f11ec mscorwks!AssemblyNative::Load+0x240
33 0004780c 792efc00 mscorlib_ni+0x2311ec
34 00047840 792efdcf mscorlib_ni+0x22fc00
35 00047864 792d36ed mscorlib_ni+0x22fdcf
36 00047888 792d1a52 mscorlib_ni+0x2136ed
37 000478b0 79e71b4c mscorlib_ni+0x211a52
At frame 10 we get the stack overflow; it bubbles up from there and Dynamics crashes.
Why is there a stack overflow? Good question. The reason is the section from frame 28 to 37, which repeats over and over again, about 200 times, until the stack overflows. Something here is trying to load an assembly. Unfortunately, we cannot determine exactly which assembly, even by looking at some of the memory locations referenced in the stack trace. Perhaps I need a bit more practice, or perhaps the name is just not there.
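Spotting a repeating block by eye gets tedious in a dump with thousands of frames. The following is a minimal Python sketch, not something we used on the case, that looks for the first run of frames repeating back-to-back. The function name find_repeating_block and the shortened frame list are made up for illustration; the symbol strings are taken from the trace above.

```python
def find_repeating_block(symbols, min_repeats=2):
    """Return (start, period) for the first block of `period` frames that
    repeats back-to-back at least `min_repeats` times, or None."""
    n = len(symbols)
    for start in range(n):
        # A block can only repeat min_repeats times if it fits that often.
        for period in range(1, (n - start) // min_repeats + 1):
            block = symbols[start:start + period]
            repeats = 1
            pos = start + period
            # Count how many consecutive copies of the block follow.
            while symbols[pos:pos + period] == block:
                repeats += 1
                pos += period
            if repeats >= min_repeats:
                return start, period
    return None


# Shortened, illustrative frame list (symbol+offset text only).
frames = [
    "mscorwks!ReportStackOverflow+0x61",
    "mscorwks!Alloc+0x3b",
    "mscorwks!CallDescrWorker+0x33",
    "mscorwks!AppDomain::RaiseAssemblyResolveEvent+0x17e",
    "mscorwks!AssemblyNative::Load+0x240",
    "mscorwks!CallDescrWorker+0x33",
    "mscorwks!AppDomain::RaiseAssemblyResolveEvent+0x17e",
    "mscorwks!AssemblyNative::Load+0x240",
]
print(find_repeating_block(frames))  # (2, 3)
```

Comparing symbol+offset text rather than bare module names keeps adjacent mscorlib_ni frames from being mistaken for a one-frame cycle.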
The next thing I did was detach from the crash, then restart Dynamics and Windbg to reproduce the issue. This time I noticed that Windbg reported "C++ EH exception - code e06d7363 (first chance)." being thrown many times. This wasn't a big surprise to me, as I had deduced that from the stack trace above. An assembly load is attempted and an error is thrown (mscorwks!AppDomain::RaiseAssemblyResolveEvent). It must then be handled in a try block and retried. Again and again and again. We used the Windbg command "sxe eh" to break on the first chance exception and see whether that would show us any further information; it did not. OK - dead end there as best I can tell.
Edit: 3/5/2010: Still trying to pinpoint the assembly being loaded, I tried the FUSLOGVW.exe application - the Assembly Binding Log Viewer. This app logs .NET assembly binds (successes or failures) made by all applications. Surely this would tell me the name of the missing assembly.
After testing this a bit with a few .NET apps and even Dynamics GP starting up, I disabled logging, brought GP to the point just before reproducing the issue, and then enabled the Assembly Binding Log Viewer again. After reproducing the issue, I checked the application and it seemed not to have logged anything! I was surprised at the time. In retrospect, that might be because the assembly was already "bound" by Dynamics GP. Or perhaps I just wasn't logging correctly, although my other tests did show the successful binds happening. In either case, this didn't give me the name of the assembly.
Running out of ideas, the next thought was to run Process Monitor, as that might pinpoint the file being loaded.
We took a Process Monitor trace, and I noticed that Dynamics.exe attempts to find Microsoft.Mshtml.dll in the GP folder with the result "Name Not found". Dynamics.exe then tries to load that assembly from the GP folder again and again, without looking in the typical search path.
Here obviously is the file that is trying to be loaded - the one that the crash dump didn't reveal the name of.
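Scrolling a large Process Monitor trace for failed lookups is slow, so the same check can be done on a CSV export. Below is a rough Python sketch, assuming the export uses Process Monitor's default columns ("Process Name", "Path", "Result") and that failed lookups carry the result text "NAME NOT FOUND". The function name count_missing_paths is made up for illustration.

```python
import csv
import io
import sys
from collections import Counter


def count_missing_paths(csv_text, process_name):
    """Count NAME NOT FOUND results per path for one process.

    Returns a list of (path, count) pairs, most frequent first.
    Assumes the default Process Monitor CSV column names.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        same_process = row["Process Name"].lower() == process_name.lower()
        if same_process and row["Result"] == "NAME NOT FOUND":
            counts[row["Path"]] += 1
    return counts.most_common()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py <exported trace .CSV>
    # utf-8-sig drops a byte order mark if the export starts with one.
    with open(sys.argv[1], encoding="utf-8-sig", newline="") as handle:
        for path, count in count_missing_paths(handle.read(), "Dynamics.exe"):
            print(count, path)
```

A path that shows up hundreds of times at the top of this list is a strong candidate for the file the process keeps retrying.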
Finding that puzzling but having a good lead on the resolution, we did a search on the machine having the issue. The Microsoft.Mshtml.dll couldn't be found anywhere on the local machine.
On the IT person's machine we did find the file Microsoft.MSHtml.dll, located in the \Primary Interop Assembly folder, which I believe was installed with Visual Studio. The assembly didn't exist in that machine's Dynamics GP folder either, yet that machine didn't crash like the other one did.
Since Dynamics.exe was insisting on looking in the GP folder for this file, we copied the file from the IT person's machine into the GP folder of the machine that was crashing and tried to repro the issue.
Happily, we could not! Everything worked fine. We then went to a 2nd machine that was crashing and searched for the Microsoft.MSHtml.dll and it also didn't have it on the machine. After copying the Microsoft.Mshtml.dll to the GP folder, we could once again no longer reproduce the issue and declared victory.
A partial victory anyway. This isn't the real solution but the customer was quite happy since we now had a way to make all the machines workable.
Several questions are still outstanding. I could have checked further, but after quite a few hours we were all happy that everything was working and called it good.
The unanswered questions are:
Edit 3/5/2010 I was thinking about this a bit more the last few days. I have a feeling that I can answer #2 above. Even though we did a search of the machine for the Microsoft.Mshtml.dll and didn't find it, I don't believe that the search looks in the GAC. I just tested that theory on my own machine and that appears true. So in #2 above, I had assumed that the assembly didn't exist on the machine at all and therefore GP wasn't using it at all. I don't believe that was a correct assumption. I tested on my own Dynamics GP 10.0 just now by attaching Windbg to my Dynamics.exe process. Looking in the loaded modules list I see:
So indeed on my machine Dynamics.exe found it in the GAC and is using it. I suspect that we'd see that as well on the customer machine before opening the custom window. After using the custom code, Dynamics.exe/.NET somehow loses track of the assembly and insists on only looking in the GP folder. So I think that answers questions #2 & #3 of mine above, but the rest remain.
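Since the machine search apparently skips the GAC, a direct walk of the directory tree is one way to double-check whether the file is really absent. This is a small Python sketch under the assumption that the GAC for this .NET version lives under %windir%\assembly. The function name find_file and the C:\GP path are hypothetical.

```python
import os


def find_file(name, roots):
    """Return full paths of files called `name` (case-insensitive)
    anywhere under the given root folders."""
    wanted = name.lower()
    hits = []
    for root in roots:
        # os.walk silently yields nothing for a root that does not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                if filename.lower() == wanted:
                    hits.append(os.path.join(dirpath, filename))
    return hits


if __name__ == "__main__":
    windir = os.environ.get("WINDIR", r"C:\Windows")
    roots = [
        os.path.join(windir, "assembly"),  # assumed GAC location
        r"C:\GP",                          # hypothetical GP install folder
    ]
    for path in find_file("Microsoft.mshtml.dll", roots):
        print(path)
```

If this prints a path under the assembly folder on a crashing machine, the assembly was present all along and the earlier search simply missed it.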
Hope this helps,
Edit 3/5/2010 - Additional thoughts added to the end. Also added that we used the Assembly Binding Log Viewer tool to troubleshoot.
Great article! It seems your ISV is trying to automate HTML within a control. However, in doing so, they are apparently trying to reference the MSHtml.dll interop assembly, which is only available with the .NET Framework SDK; unfortunately, this assembly is not present on every client machine.
Take a look at the following Microsoft article and see if you perhaps agree with me.
No, don't think that is it but it was a good thought.
The reason I think "not" is that if their app were doing this, I'd assume the encryption/decryption part wouldn't work, in that GP would crash on the call itself. However, that all works completely fine.
It is only after the call is made to their dll and then the GP navigation shell does something with IE that we get the crash. So it is GP/Runtime that is doing this. But the mystery is what the ISV is doing that somehow causes GP to try to load it. And again, it must be something in one of their calls, because if we open/close their window without making it do the decryption then everything is fine. So it surely is the call. But the call itself never crashes.
PS: Don't forget to rate the article (shameless plug)
I bet they referenced a local copy of the dll instead of the one in the global assembly cache, and I bet they also marked it as a requirement and in their install pointed it to the GP directory instead of the global assembly cache.
One thing that is confusing is that I thought .NET apps look at the GAC first, then the directory if they cannot find it.