I was helping to debug a failure in a complicated WiX deferred custom action, and we weren't getting any useful diagnostic information at all. During the deferred custom action the install would simply roll back without explanation. After my initial research I was convinced that debugging something like this was not possible. Typically you can put a call to MessageBox inside an #if DEBUG block in an unmanaged custom action, attach a debugger to the process while the MessageBox call is blocking, load symbols, and then start debugging your source code. We had that as the very first executable line of the custom action and still couldn't get the MessageBox to display. We also were not getting anything useful in the msiexec logs, the event viewer, etc. The issue occurred only on WinXP SP2, not on the other operating systems or environments that we were testing on.
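For readers who haven't used the MessageBox trick, here is a minimal sketch of it in C. The names wait_for_debugger and ConfigureProduct are hypothetical, and it assumes the debug build defines DEBUG. On a non-Windows compiler or in a release build the helper compiles to a no-op.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#include <msi.h>
#endif

/* Hypothetical helper: in a DEBUG build on Windows, block on a message box
   so a debugger can be attached to the process hosting the custom action.
   Returns 1 if it paused, 0 if the pause was compiled out. */
int wait_for_debugger(const char *where)
{
    assert(where != NULL);
#if defined(_WIN32) && defined(DEBUG)
    MessageBoxA(NULL, where, "Attach a debugger, then click OK", MB_OK);
    return 1;
#else
    (void)where;
    return 0;
#endif
}

#ifdef _WIN32
/* Hypothetical custom action entry point; the real one did far more work. */
UINT __stdcall ConfigureProduct(MSIHANDLE hInstall)
{
    wait_for_debugger("ConfigureProduct");  /* first executable line */
    (void)hInstall;
    return ERROR_SUCCESS;
}
#endif
```

Keep in mind that the box can only appear if the DLL containing the custom action actually loads and the entry point gets called.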
It turns out that debugging issues like this with MSIs can be done, but it's not exactly simple. At first we were mistakenly trying to attach windbg to the msiexec.exe process for the interactive user. I was completely confused when I turned on break on Create Process (Debug -> Event Filters -> Create Process -> Enabled -> Not Handled) and it did not break when our custom actions ran in a separate instance of msiexec.exe. After some pair programming, some creative thinking and a little luck, we found that there is an instance of msiexec.exe running in the system logon session, and that instance is the one that launches these custom action processes. So here is how you can debug this scenario:
1. Launch windbg.exe and attach to a process (F6).
2. Attach to the instance of msiexec.exe that is running in the system logon session.
3. Enable the break on Create Process event filter (Debug -> Event Filters -> Create Process -> Enabled -> Not Handled).
4. (Optional) Break on module load. This seemed to make it easier to step through this scenario, although it's a little noisy.
After discovering how to debug this, it was pretty easy to determine what was happening. LoadLibrary was failing when trying to load advapi32.dll on Windows XP. From there the problem was discoverable by walking up the call stack one frame at a time and looking for the name of the function that could not be loaded. The function name? RegGetValue. That would also explain why our MessageBox never appeared: presumably a DLL with an import that cannot be resolved never loads, so not even the first line of the custom action gets to run. I had mistakenly coded a call to RegGetValue instead of RegQueryValueEx because I thought I knew what I was doing at the time. I saw the prefix "Reg*" and assumed it was a standard registry function that I knew fairly well. If you check the docs on RegGetValue, you'll notice that it's supported only on WinXP 64-bit, Vista, and Windows 2003. We also needed to support Windows XP 32-bit. Believe it or not, I did consult the MSDN docs on this particular API, but never scrolled down to look at the supported operating systems. Oops.
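To make the fix concrete, here is a rough sketch of reading a string value with RegOpenKeyEx and RegQueryValueEx, both of which are available on Windows XP 32-bit. This is not our actual custom action code: the helper name read_reg_string is hypothetical, and the sketch assumes the value lives under HKEY_LOCAL_MACHINE and is of type REG_SZ. The non-Windows branch exists only so the file still compiles elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

#ifdef _WIN32
#include <windows.h>   /* link with advapi32.lib */
#endif

/* Hypothetical helper: copy a REG_SZ value from HKEY_LOCAL_MACHINE\subkey
   into out (out_chars wide characters). Returns 0 on success, otherwise a
   nonzero Win32 error code (or -1 when not built for Windows). */
long read_reg_string(const wchar_t *subkey, const wchar_t *name,
                     wchar_t *out, size_t out_chars)
{
    assert(subkey != NULL);
#ifdef _WIN32
    HKEY key;
    DWORD type = 0;
    DWORD bytes;
    LONG rc;

    if (out == NULL || out_chars == 0)
        return ERROR_INVALID_PARAMETER;

    /* Reserve one character so the result can always be terminated. */
    bytes = (DWORD)((out_chars - 1) * sizeof(wchar_t));

    rc = RegOpenKeyExW(HKEY_LOCAL_MACHINE, subkey, 0, KEY_READ, &key);
    if (rc != ERROR_SUCCESS)
        return rc;

    /* RegQueryValueEx does not guarantee a terminated string and does not
       check the value type, so both are handled by hand below. */
    rc = RegQueryValueExW(key, name, NULL, &type, (LPBYTE)out, &bytes);
    RegCloseKey(key);
    if (rc != ERROR_SUCCESS)
        return rc;              /* e.g. ERROR_MORE_DATA if out is too small */
    if (type != REG_SZ)
        return ERROR_INVALID_DATA;

    out[bytes / sizeof(wchar_t)] = L'\0';
    return ERROR_SUCCESS;
#else
    (void)subkey; (void)name; (void)out; (void)out_chars;
    return -1;                  /* no registry on this platform */
#endif
}
```

A caller would pass something like read_reg_string(L"SOFTWARE\\Contoso\\App", L"InstallDir", buffer, 260), where the key and value names are made up for illustration, and treat any nonzero return as a failure.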
I've learned two very valuable lessons from this experience:
1. Always, always read the documentation for the Win32 APIs. And I mean all of it. Verify the return values, the required vs. optional parameters, the supported operating system versions as well as specific platforms. I am typically really thorough when it comes to checking these docs, but when you see an API that you think you know, it's easy to get lazy. I saw Reg* and thought that Intellisense would guide me the rest of the way. You should double-check everything in the docs when you use a Win32 API.
2. Sometimes it's better to do the mandatory configuration in a first-run experience rather than during an install. And this does not necessarily have anything to do with the complexity of developing and debugging feature-rich install programs. But if you have to ask someone for 50 pieces of information during an install to get the application to work the way they want it, is it really the right approach? After all the work that was done to get our MSI to work well on multiple platforms from unmanaged code (remember that building it in a custom action meant that we couldn't assume any dependencies on .NET 2.0 or anything else productive) I realized that an end-user still has to go in and tweak a few configuration settings before successfully using the application. Had I thought of it at the time, we could have simply laid down all of our components during the install, and then asked the user for ALL of the configuration information during a simple first-run wizard.