Aaron Stebner's WebLog

Thoughts about setup and deployment issues, WiX, XNA, the .NET Framework and Visual Studio

Windows Installer, the Class table, and resiliency repair dialogs


A while back I wrote an article about Windows Installer and the resiliency feature.  Anyone who has seen a small dialog appear saying "Configuring XXX....please wait" has seen resiliency in action.  I want to talk in more detail about a specific type of problem I have seen that inadvertently triggers resiliency repairs at seemingly random times.

According to the help documentation for Windows Installer, the Class table contains COM server-related information that must be generated as a part of the product advertisement. Each row may generate a set of registry keys and values. The associated ProgId information is included in this table.

In practice, what happens during setup is that for each entry in the Class table, a set of registry data is created under HKEY_CLASSES_ROOT\CLSID\{GUID} for the COM object in question.  The {GUID} in this case is the value of the CLSID column of each row in the Class table of the MSI.  This registry data allows the COM object to be instantiated on this machine by applications or custom script code later on.

There is also some additional data that advertises this COM object and associates it with any MSI that includes it in the Class table.  If you look under the InprocServer32 subkey, you will see a REG_MULTI_SZ value that is also named InprocServer32.  This collection of strings represents an encoded list of features that are associated with this COM object.  Whenever this COM object is instantiated (with CoCreateInstance or WScript.CreateObject, for example), Windows Installer recognizes that it is advertised and performs a feature health check for each of the features listed in the REG_MULTI_SZ value named InprocServer32.  If this feature health check fails for any reason, then you will see a small Windows Installer dialog and a resiliency repair, even if everything this COM object needs is installed and working correctly.
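To see what is stored for a given COM object, the value described above can be read directly. The following is a minimal Python sketch, assuming a Windows machine and the standard winreg module; the helper names are hypothetical, and the sketch only returns the raw encoded strings without attempting to decode them.

```python
import re

# Matches a GUID written in registry form, e.g. {997FA962-E067-11D1-9396-00A0C90F27F9}
GUID_RE = re.compile(r"^\{[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}$")


def inproc_key_path(clsid):
    """Return the HKEY_CLASSES_ROOT-relative path of the InprocServer32 subkey."""
    if not GUID_RE.match(clsid):
        raise ValueError("expected a GUID of the form {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}")
    return "CLSID\\" + clsid + "\\InprocServer32"


def read_advertised_strings(clsid):
    """Return the strings of the REG_MULTI_SZ value named InprocServer32.

    Returns an empty list when the key or value is missing, or when the
    value is not a REG_MULTI_SZ. Windows only.
    """
    import winreg  # imported here so inproc_key_path() stays usable on any platform

    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, inproc_key_path(clsid)) as key:
            value, value_type = winreg.QueryValueEx(key, "InprocServer32")
    except FileNotFoundError:
        return []
    return list(value) if value_type == winreg.REG_MULTI_SZ else []
```

A COM object that returns a non-empty list here was presumably registered through the Class table and is a candidate for triggering a resiliency repair when it is instantiated.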

I have run into this problem numerous times while debugging resiliency repairs that have popped up for internal users who installed daily builds of Visual Studio 2005.  The debugging techniques in my previous blog article (and also here) have led me to components in Visual Studio that were being reported as broken.  I had been meaning to write about this for a while, but it came back to the forefront of my mind yesterday while I helped someone on the Visual Studio team figure out why they were seeing a repair dialog appear when they tried to build a setup/deployment project in the Visual Studio IDE.  In this case, building a setup/deployment project was calling CoCreateInstance on the MSM.Merge object exposed by mergemod.dll, and since this COM object was installed via the Class table of the Visual Studio MSI, it was advertised and a health check was triggered.  The health check happened to fail due to a known bug in the VS MSI that is scheduled to be fixed soon.

This isn't that big of a deal for daily development work, because there are experts within Microsoft who can look at the problem and because the resiliency repair is exposing valid bugs.  (In several previous cases they were bugs in fusion related to new assembly attributes that were being treated as invalid because they were not recognized; see this blog item for more details if you're interested.)

However, it becomes a big deal for end users who are trying to install and use shipped products and who encounter repair dialogs for unknown reasons.  Worse yet, they may be asked to insert a CD or browse to an inaccessible network location so that Windows Installer can access source files to try to repair them.  In some products, especially large products such as Visual Studio, the feature that ends up being advertised in the registry by the Class table contains many components.  A resiliency repair will be triggered even if some component that is completely unrelated to the functionality of the COM object being instantiated fails the Windows Installer health check.

Fortunately, setup authors can avoid this issue.  I recommend using the standard MSI Registry, File and Component tables to install and register COM objects instead of using the Class table (and the companion ProgId table).
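As a rough illustration of what that could look like, here is a Python sketch that produces Registry table rows for an in-process COM server. The column order (Registry, Root, Key, Name, Value, Component_) follows the Registry table schema, and Root 0 means HKEY_CLASSES_ROOT. The function name, the row-naming scheme, and the choice of keys are assumptions for illustration only; a real COM object may register more data (TypeLib, AppID and so on), so the keys should come from inspecting what the object's own registration code actually writes.

```python
def com_registry_rows(clsid, prog_id, file_key, component, threading_model="Apartment"):
    """Build Registry table rows (Registry, Root, Key, Name, Value, Component_).

    clsid      -- the COM class GUID, including braces
    prog_id    -- the ProgID to associate with the class
    file_key   -- primary key of the DLL in the File table; [#file_key] is
                  resolved by Windows Installer to the installed file path
    component  -- primary key of the owning row in the Component table
    """
    hkcr = 0  # Root column value for HKEY_CLASSES_ROOT
    clsid_key = "CLSID\\" + clsid
    entries = [
        # A Name of None means the key's default value.
        (clsid_key + "\\InprocServer32", None, "[#" + file_key + "]"),
        (clsid_key + "\\InprocServer32", "ThreadingModel", threading_model),
        (clsid_key + "\\ProgID", None, prog_id),
        (prog_id + "\\CLSID", None, clsid),
    ]
    return [
        ("reg_%s_%d" % (component, index), hkcr, key, name, value, component)
        for index, (key, name, value) in enumerate(entries, start=1)
    ]
```

Because these rows create an ordinary REG_SZ default value under InprocServer32 and no REG_MULTI_SZ advertising data, instantiating the object should not route through a Windows Installer feature health check.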

 

  • I really am disappointed that your recommendation is to avoid using the Class table - this totally negates the rich functionality that Windows Installer provides compared to the bad old days when every application installer jammed in whatever registry keys they thought might be needed.

    Your recommendation should have been to create another feature containing just the COM server and the components that are directly required by it. When the repair occurs, if the component is used by multiple features then I understand that Windows Installer uses the feature of "least cost". This new feature would be a low cost feature compared to the feature for the whole application and could be marked as hidden so that it doesn't show in the GUI.

    Creating Windows Installer packages is one of our company's core functions, so we are very keen for the technology to be used correctly for our clients' benefit.

    Everyone at Microsoft needs to get serious about using Windows Installer properly to act as a role model for ISVs. Remember that it is a requirement of the "Designed for Windows XP" logo, which I would assume you would want all your products to qualify for.
  • My recommendation to avoid the Class table does not mean that I recommend against using MSI. I just think it is cleaner to determine the registry keys that are set by your COM object's registration code and set those values in the Registry table of the MSI instead.

    COM registration is deterministic so describing the registration in the Registry table should be possible. Doing so provides a white-box view of the settings needed for the COM object that inserting an entry in the Class table does not provide.

    I don't agree that avoiding the Class table negates any of the functionality of Windows Installer. I see the Class table as being a black box similar to (but not as bad as) self-reg or custom actions. That is why I would recommend not using it. In addition, my philosophy is that we should try to avoid popping up "insert source disk" dialogs wherever possible, and avoiding the Class table helps reduce instances of those.

    That being said, you're right - you can use smart MSI authoring to minimize the possible scenarios where "insert source disk" dialogs appear. Creating a small hidden feature that only contains the component advertised in the Class table can do this. I am not sure how the behavior works if a COM object is advertised and associated with more than one feature in a given MSI though. I will have to look into that and try a couple scenarios to see for sure.

    In the case of Visual Studio, the structure of the features and the MSI and the build processes that create the MSI cause the components advertised in the Class table to be associated with the root feature or one of the major sub-features (such as Visual C#, Visual Basic, etc).

  • "A resiliency repair will be triggered even if some component that is completely unrelated to the functionality of the COM object being instantiated fails the Windows Installer health check."

    Can you confirm that the component GUID in the event log error "Detection of product {GUID}, feature 'MyFeature' failed during request for component {component GUID}"

    ...just refers to the COM server that was being invoked when the repair was triggered, and may not be the cause of the problem at all? That is, the problem may simply be somewhere in that feature?

    Could you provide more details on the health check that occurs? There seems to be very little info on the web. Is the key path checked for each component in the feature?

    Thanks. This is the best summary I've seen on the web.
  • Hi Dave - these are good questions, thank you for posting them. Generally, when a resiliency repair happens, you will see pairs of entries in the application event log. I dug back and found an example from a Class table repair we had in some builds of VS 2005 (pre-beta 1) and this is the format of the entries I saw in the event log:

    Detection of product '{Product Code}', feature 'MyFeature' failed during request for component '{Component GUID 1}'
    Detection of product '{Product Code}', feature 'MyFeature', component '{Component GUID 2}' failed. The resource 'ResourceName' does not exist.

    In this example, Component GUID 1 is the component whose request caused the health check to be triggered (in this case, it was the GUID of the component that was advertised in the Class table of the MSI), and Component GUID 2 is the component that fails the health check.
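When there are many such entries to sift through, the two message formats shown above can be pulled apart with a small script. This is a Python sketch that assumes the message text matches the two formats quoted here exactly; the function name and the "kind" labels are made up for illustration.

```python
import re

# "... feature 'X' failed during request for component '{...}'"
REQUEST_RE = re.compile(
    r"Detection of product '(?P<product>[^']*)', feature '(?P<feature>[^']*)' "
    r"failed during request for component '(?P<component>[^']*)'"
)

# "... feature 'X', component '{...}' failed. The resource 'Y' does not exist."
RESOURCE_RE = re.compile(
    r"Detection of product '(?P<product>[^']*)', feature '(?P<feature>[^']*)', "
    r"component '(?P<component>[^']*)' failed\. "
    r"The resource '(?P<resource>[^']*)' does not exist\."
)


def parse_detection_message(message):
    """Return a dict describing one event log message, or None if it matches neither format."""
    for kind, pattern in (("request", REQUEST_RE), ("resource_missing", RESOURCE_RE)):
        match = pattern.search(message)
        if match:
            fields = match.groupdict()
            fields["kind"] = kind
            return fields
    return None
```

Grouping the parsed entries by product and feature makes it easier to see which "resource_missing" entry belongs with which "request" entry.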

    My understanding of the MSI component health check is that it goes through each component and checks the keypath to verify that it still exists, and in the case of binary file keypaths it will check to make sure the file version is greater than or equal to what is listed in the MSI. If a component has a keypath that is an assembly that should be installed to the GAC, then MSI will use Fusion APIs to validate that the assembly is properly installed in the GAC.
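The file part of that check can be approximated in a few lines. The Python sketch below is a simplification of the behavior as described above, not Windows Installer's actual logic: the caller supplies the installed file version (reading it from a binary needs a Windows API that is left out here), the function names are hypothetical, and GAC assemblies are not handled at all.

```python
import os


def parse_version(text):
    """Turn a version string such as '7.15.4013' into a 4-part tuple of ints."""
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (4 - len(parts)))  # pad missing fields with zeros


def keypath_looks_healthy(keypath, installed_version=None, authored_version=None):
    """Approximate the keypath check for a file or directory keypath.

    keypath            -- full path of the component's keypath on disk
    installed_version  -- version of the file on disk, if it is versioned
    authored_version   -- version listed for the file in the MSI
    """
    if not os.path.exists(keypath):
        return False  # a missing keypath fails the check
    if installed_version is None or authored_version is None:
        return True  # unversioned keypath: existence is all this sketch checks
    return parse_version(installed_version) >= parse_version(authored_version)
```

Running a check like this over every component in a feature gives a rough idea of which component might be the one failing.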

    I think I'm going to write a new blog article specifically about this, thank you for the inspiration :-)
  • Fantastic, that's just what is needed. So many articles just state "...and MSI checks the component." Understanding this check is fundamental to finding the cause of those pesky resiliency problems. How can you find what is wrong if you don't know what MSI is looking for?

    The additional log detail was added in MSI 2.0 wasn't it?

    Other resources state that you cannot get around resiliency by just removing the keypath from a component. Why is this so if the final health check just checks the keypath? What does MSI do if a component does not have a keypath?

    Is removing key files insufficient because the MSI progress indicator is already visible by the time it gets to checking the key path? Maybe another question for your anticipated blog ;)

    It would be great if there was a tool that would perform the same health check on a specified feature.

    I'm trying to track down an intermittent resiliency problem in a large setup that occurs when the app is running. The event log indicates the problematic feature, and the MSI log states that the feature is Request: Local but all the components in that feature are Request: Null. A greater understanding of this process is my only hope. Thanks again Dave.
  • Hi Dave,

    I'm not sure when the additional event log data was added for this type of repair scenario, but 2.0 sounds about right.

    Leaving the key path blank for a component does not help here because Windows Installer treats that as a special case and uses the existence of the directory listed in the Directory_ column of the Component table of the MSI as the key path in that case (as described at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/component_table.asp).

    I will have to look and see if there is any kind of tool to simulate component health checks, but I think it wouldn't be too hard to write using some of the MSI APIs documented at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/msi/setup/installer_function_reference.asp.
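One possible starting point for such a tool is the MsiGetComponentPath function from the installer function reference, called here from Python through ctypes. This is only a sketch under assumptions: it runs on Windows only, the helper names are hypothetical, and it simply reports the install state and registered key path for one component. It does not claim to reproduce the full health check that Windows Installer performs.

```python
import ctypes

# INSTALLSTATE values as defined in msi.h
INSTALL_STATES = {
    -7: "NOTUSED", -6: "BADCONFIG", -5: "INCOMPLETE", -4: "SOURCEABSENT",
    -3: "MOREDATA", -2: "INVALIDARG", -1: "UNKNOWN", 0: "BROKEN",
    1: "ADVERTISED", 2: "ABSENT", 3: "LOCAL", 4: "SOURCE", 5: "DEFAULT",
}
INSTALLSTATE_MOREDATA = -3


def install_state_name(state):
    """Map a numeric INSTALLSTATE to a readable name."""
    return INSTALL_STATES.get(state, "UNRECOGNIZED(%d)" % state)


def get_component_keypath(product_code, component_id):
    """Return (state name, key path) for one component of one product. Windows only."""
    msi = ctypes.WinDLL("msi")  # resolved only when called, so this module imports anywhere
    func = msi.MsiGetComponentPathW
    func.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p,
                     ctypes.c_wchar_p, ctypes.POINTER(ctypes.c_ulong)]
    func.restype = ctypes.c_int

    size = ctypes.c_ulong(260)
    buffer = ctypes.create_unicode_buffer(size.value)
    state = func(product_code, component_id, buffer, ctypes.byref(size))
    if state == INSTALLSTATE_MOREDATA:
        # The buffer was too small; size now holds the required length without the terminator.
        size = ctypes.c_ulong(size.value + 1)
        buffer = ctypes.create_unicode_buffer(size.value)
        state = func(product_code, component_id, buffer, ctypes.byref(size))
    return install_state_name(state), buffer.value
```

Calling get_component_keypath with the product code and component GUID taken from the event log entries, then checking whether the returned path exists, would be a first approximation of the per-component check.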

    Regarding your specific error case that you're trying to track down - if you'd like I can try to help you with that. If you can send me the following at aaronste (at) microsoft (dot) com I'll take a look and we can see if we can figure it out together:

    1. Export of the application event log (in .evt format please so I can easily import it into eventvwr.exe on my side)

    2. Verbose MSI log file for when the resiliency repair happens. You can enable verbose logging using the steps at http://blogs.msdn.com/astebner/archive/2005/03/29/403575.aspx

    3. A copy of the MSI for the product in question (if possible)

    Normally a request of null means that the component keypath exists but is a higher version than the version listed in the MSI for that component. But there are some weird ways this can end up happening, so hopefully the event log and MSI verbose log will help narrow that down....

  • We are currently migrating from an ancient Wise installer to the latest InstallShield, which produces MSI files.

    We currently support multiple major versions of our software, because our clients are not all interested in upgrades, and our policy is to, within reason, accommodate them as much as possible.

    Though we are currently working on enabling the software to be installed to a user-selectable directory (or at least drive letter), all of the current versions install to the same directory in C:\Program Files.

    During the course of development, point updates are released internally which are installed essentially through XCOPY deployment (plus re-registering updated COM DLLs). In order to do effective QA across our various versions, which all periodically get new external releases, I keep all versions of the POS installed, and I simply rename the program's directory for each version, appending a _5.5, _6.2, _6.3 to the end. The version I am currently testing gets renamed back to the original installation path.

    With the Wise installer, this approach works fine. However, after running a test installer using the InstallShield-generated MSI file, the resiliency mechanism kicks in whenever I try to use any other version of the software. We've read that this can be disabled by not publishing the components, but we also need the automatic updates feature to work, and for that, the components *must* be published.

    Installing the software and applying a point update takes quite a long time, compared to renaming the directories, so the obvious solution, that being to uninstall the previously-tested version and then install the next version to be tested, would greatly increase the amount of time needed for switching between versions and would significantly decrease my productivity.

    At present, the only thing I have been able to do is to disable the Windows Installer service entirely. Even with this, the resiliency check still kicks in, and the "Preparing to install..." window pops up several dozen times just during the startup of the program. It pops up all over the place while trying to use program features as well. It doesn't actually prevent the software from being used, but it is extremely annoying, and it also interferes somewhat with automated testing.

    Is there any way I can, just on my development machine, kill off resiliency without preventing updates from proceeding?

    Is there a way I can disable the resiliency that will also stop the "Preparing to install..." dialogs from popping up?

    If it isn't possible to have updates and not resiliency, is there a way to disable both, perhaps by directly editing the registry?

    You described a REG_MULTI_SZ value called InprocServer32 in the HKCR\CLSID section of the registry, inside a key by the same name. I have found a number of these for components in our program, but I can't make heads or tails of them. How are the values encoded? Also, are they the *only* thing that can spontaneously activate a repair when the program is instantiating a component? Other than instantiating a published COM component, what things can trigger a repair?

    When I look in the Application Log in the Event Viewer, I do see events in pairs like you described -- "Detection of <blah blah> failed" and what have you -- but when I try to intentionally create new log items, e.g. by starting a different version of the program up than was installed (which, as I mentioned, shows the "Preparing to install..." dialog numerous times), no entries are added. This in and of itself doesn't really bother me, but could it be a symptom of something else?

    Finally, short of system virtualization (or actually having a separate PC for each version), are there any alternative means by which I might switch between versions efficiently for testing?

    I look forward to your responses with great anticipation :-)
  • Hi Jonathan - I'm sorry for the delay getting back to you on this issue.  I have to admit I have not heard of a strategy like the one you describe above for testing different versions of a product.  Without knowing the details about what your setup does and what is changed between the various versions (particularly in terms of registry settings and COM class info), it is hard to assess whether or not this strategy will produce reliable results.  In general, I would recommend trying virtualization for this type of testing.  You could use a tool like Virtual PC or something along those lines, and maintain an image with each version of your product installed.

    To narrow down the resiliency dialog, I would suggest enabling verbose MSI logging, then reproducing the "preparing to install..." dialog and letting it complete.  The data in the verbose log combined with the entries in the application event log should give a clue about which component(s) Windows Installer thinks need to be repaired.

    Advertised COM classes in the Class table of your MSI are one possible cause of this kind of resiliency dialog, but not necessarily the only possible cause.  This will also depend on exactly what your setup does.  If you can send me a zipped copy of a verbose repair log and the app event log, I can try to help you look at this and try to figure it out.

    If the repair is caused by advertised COM classes, I can help you figure out the exact registry values to remove as a workaround.  There is no way to turn off resiliency, though.  It is a core feature of Windows Installer, and disabling it is not supported because doing so would be detrimental to the overall robustness of Windows Installer.
  • Thanks for your reply. I tried what you suggested and obtained a log file. I had screwed around with so many things, though, trying to find some way to effectively turn it off at least for my present configuration, that I seriously doubt the usefulness of the results.

    One of the things I had done was to go through all of the Windows Installer keys in the registry and rename every instance of the install directory to the same name with an 'I' appended. This didn't fix the problem at the time but has possibly led to a temporary solution (at least until the next installer I have to run). When I allowed the repair to continue with verbose logging enabled, it reconstructed the installation *into that alternate directory*, without messing with any of the files I had manually placed into the actual application directory. Now, with that directory a pristine copy, I am no longer prompted for a repair at all while running our application. :-)

    For the longer term, this is clearly not a viable solution, and perhaps virtualization is the only option (though the first step, then, will be to install a bigger hard drive in my workstation! :-).

    Regarding the component which triggered the check, the following lines appear near the top of the MSI log file:

    === Verbose logging started: 5/1/2006  12:00:56  Build type: SHIP UNICODE 3.01.4000.2435  Calling process: C:\PROGRA~1\WWPos\POSAudit.exe ===
    MSI (c) (A8:24) [12:00:56:184]: Entering MsiProvideComponentFromDescriptor. Descriptor: ,(GnBmGFa=nZ7]6MJA+rGIANT_AntiSpyware_Files>M5KDYSUnf(HA*L[xeX)y, PathBuf: 1E1F784, pcchPathBuf: 1E1F780, pcchArgsOffset: 1E1F6E0
    MSI (c) (A8:24) [12:00:56:543]: MsiProvideComponentFromDescriptor called for component {997FA962-E067-11D1-9396-00A0C90F27F9}: returning harcoded oleaut32.dll value
    MSI (c) (A8:24) [12:00:56:715]: MsiProvideComponentFromDescriptor is returning: 0
    MSI (c) (A8:28) [12:00:56:918]: Incrementing counter to disable shutdown. Counter after increment: 0
    MSI (c) (A8:64) [12:00:57:012]: Entering MsiProvideComponentFromDescriptor. Descriptor: ,(GnBmGFa=nZ7]6MJA+rGIANT_AntiSpyware_Files>M5KDYSUnf(HA*L[xeX)y, PathBuf: 12E688, pcchPathBuf: 12E684, pcchArgsOffset: 12E5E4
    MSI (s) (54:90) [12:00:57:012]: Grabbed execution mutex.
    MSI (c) (A8:64) [12:00:57:090]: MsiProvideComponentFromDescriptor called for component {997FA962-E067-11D1-9396-00A0C90F27F9}: returning harcoded oleaut32.dll value

    I searched my entire registry for the GUID {997FA962-E067-11D1-9396-00A0C90F27F9} and came up with nothing, but two things stand out to me:

    * The descriptor contains the text "...A+rGIANT_AntiSpyware_Files>M5...".
    * The line: "returning harcoded oleaut32.dll value"

    It would seem that the scan was triggered by the application using some class or function in the core system OLE automation library.
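Lines like the ones quoted above can be picked out of a verbose log with a short script. The Python sketch below assumes the log lines have exactly the shape shown in this comment. It also assumes that a descriptor is laid out as a 20-character packed product code, then a feature name, then ">", then a 20-character packed component code. That layout is an assumption based on how the readable text sits inside the descriptor, and the packed codes are left undecoded. All names are hypothetical.

```python
import re

LINE_RE = re.compile(
    r"^MSI \((?P<side>[cs])\) \((?P<ids>[^)]*)\) \[(?P<time>[0-9:]+)\]: (?P<text>.*)$"
)
DESCRIPTOR_RE = re.compile(
    r"Entering MsiProvideComponentFromDescriptor\. Descriptor: (?P<descriptor>.+?), PathBuf:"
)
COMPONENT_RE = re.compile(
    r"MsiProvideComponentFromDescriptor called for component (?P<component>\{[0-9A-Fa-f-]{36}\})"
)
PACKED_GUID_LENGTH = 20  # assumed length of a packed GUID inside a descriptor


def split_descriptor(descriptor):
    """Split a descriptor into (packed product, feature name, packed component), or None."""
    product = descriptor[:PACKED_GUID_LENGTH]
    feature, separator, component = descriptor[PACKED_GUID_LENGTH:].partition(">")
    if not separator:
        return None  # not in the assumed product/feature/'>'/component layout
    return product, feature, component


def summarize_descriptor_requests(lines):
    """Return (time, kind, value) tuples for the descriptor-related log lines."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        text = match.group("text")
        found = DESCRIPTOR_RE.search(text)
        if found:
            events.append((match.group("time"), "descriptor", split_descriptor(found.group("descriptor"))))
        found = COMPONENT_RE.search(text)
        if found:
            events.append((match.group("time"), "component", found.group("component")))
    return events
```

If the layout assumption holds, the feature name extracted from the descriptor shows which product feature the request was made against, which can then be compared with the feature named in the event log.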

    Is there anything else that could be gleaned from the log file? If so, I can make it available.
  • I'm having this resiliency problem, but it seems the problem is an interop assembly in a _parent_ feature.

    Two issues:

    1. The event log doesn't say which component is the reason for the reinstall (only the first event is generated - creating a COM object using AppId).

    2. The interop assembly has been installed to GAC_MSIL, while MSI looks for it in GAC only...

    How should I fix this mismatch, where the assembly is installed to \Windows\Assembly\GAC_MSIL but MSI resiliency looks in \Windows\Assembly\GAC?

    Hope you can help.

    Regards,

    Rune.Christensen at visma dot com.

  • Hi Runec - I've never heard of a case where a repair was triggered by Windows Installer resiliency and yet there was no information in the application event log indicating which component(s) were being repaired.  Can you double-check to see if there are any warnings reported from the source named MsiInstaller that you might have missed?  It might help to clear out the log and start fresh in order to do this.  Also, you can enable Windows Installer verbose logging (using steps like the ones I posted at http://blogs.msdn.com/astebner/archive/2005/03/29/403575.aspx) and reproduce the problem; the verbose log might help you narrow this down as well.

    If you have an assembly that was installed to GAC_MSIL, but your MSI is looking for it in the GAC folder instead, this is likely caused by some missing attributes for the assembly in the MsiAssemblyName table.  In cases I've seen like this in the past, the MsiAssemblyName table did not have a processorArchitecture attribute listed for the assembly, which caused Windows Installer to look in the old .NET Framework 1.0/1.1 GAC instead of the new .NET 2.0 GAC_32, GAC_64 or GAC_MSIL.  You might need to add processorArchitecture = Neutral to your MSI for this issue.
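The relationship between the attribute and the folder can be written down as a small lookup. This Python sketch reflects my understanding of the folder names under \Windows\Assembly; the function name is hypothetical and the mapping is for illustration, not a description of how Windows Installer or Fusion resolves the location internally.

```python
def expected_gac_folder(processor_architecture=None):
    """Return the \\Windows\\Assembly subfolder implied by a processorArchitecture value.

    A missing attribute maps to the old .NET Framework 1.0/1.1 folder, GAC.
    """
    if not processor_architecture:
        return "GAC"
    folders = {
        "msil": "GAC_MSIL",  # architecture-independent assemblies
        "x86": "GAC_32",
        "amd64": "GAC_64",
        "ia64": "GAC_64",
    }
    key = processor_architecture.lower()
    if key not in folders:
        raise ValueError("unrecognized processorArchitecture: %r" % processor_architecture)
    return folders[key]
```

Comparing the result with the folder where the assembly actually lives on disk shows quickly whether the MsiAssemblyName attributes and the installed location disagree.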

  • Thanks for your reply!

    I've checked the event viewer log again, and the only entry is:

    Detection of product '{09001326-D530-40C6-98C9-B127A47498E2}', feature 'ClientTools' failed during request for component '{544FC311-F9E9-4FB7-889B-FC855AAF9BA3}'

    {544FC311-F9E9-4FB7-889B-FC855AAF9BA3} is an advertised COM server that is being instantiated using its ProgID.

    Analysing the log file, I found that the components in the ClientTools feature were reinstalled, plus Visma.Interop.Crm.dll, the interop .NET assembly installed to the GAC. The Visma.Interop.Crm.dll component belongs to ProgramFiles, the parent feature of ClientTools. This is the component that has been installed to GAC_MSIL, but that the resiliency process looks for in GAC.

    The records in the MsiAssemblyName table for Visma.Interop.Crm.dll are

    Visma.Interop.Crm.dll Name Visma.Interop.Crm

    Visma.Interop.Crm.dll Version 1.0.0.0

    Visma.Interop.Crm.dll Culture neutral

    Visma.Interop.Crm.dll FileVersion 7.15.4013

    Visma.Interop.Crm.dll PublicKeyToken D15E281808C3A4BB

    When I try to add the "Visma.Interop.Crm.dll processorArchitecture  Neutral" to the MsiAssemblyName table, I get this error in verbose log:

    MSI (s) (94:D8) [13:39:34:605]: Assembly Error:The given assembly name or codebase, '%1', was invalid.

    MSI (s) (94:D8) [13:39:34:605]: Note: 1: 1935 2: {BB52BE7F-4807-4C70-A93C-9E83420DEA28} 3: 0x80131047 4:  5: CreateAssemblyNameObject 6: Visma.Interop.Crm,Version="1.0.0.0",processorArchitecture="neutral",Culture="neutral",FileVersion="7.15.4013",PublicKeyToken="D15E281808C3A4BB"

    {BB52BE7F-4807-4C70-A93C-9E83420DEA28} is Visma.Interop.Crm.dll

    I tried "Visma.Interop.Crm.dll processorArchitecture  x86" instead. Then installation succeeds, but the re-install issue is still present.

    As a current "fix", I've moved the Visma.Interop.Crm.dll component from ProgramFiles feature into a leaf feature...

    Regards,

    Rune Christensen

  • Hi RuneC - I'm sorry, but it looks like I mistyped when I suggested using processorArchitecture = Neutral.  Can you please try to set processorArchitecture = MSIL instead and see if that will work in this scenario?

    Also, you can use the gacutil.exe utility in the .NET Framework SDK to display the full assembly identity for your assembly.  What I typically do is run gacutil.exe /i <path to assembly DLL> and then run gacutil.exe /u <assembly DLL name without the .dll extension>.  Doing that from a cmd prompt will display text like the following:

    Assembly: <assembly name>, Version=6.0.6000.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35, processorArchitecture=MSIL

    Uninstalled: <assembly name>, Version=6.0.6000.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35, processorArchitecture=MSIL

    You'll need to make sure that the MsiAssemblyName table of your MSI contains values for all parts of the assembly identity in order for Windows Installer to be able to correctly find it during this type of resiliency scenario.
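Turning the identity string that gacutil.exe prints into MsiAssemblyName rows is mechanical, as this Python sketch shows. It assumes the identity has the comma-separated "name, Key=value, ..." shape shown above and produces (Component_, Name, Value) tuples; the function name is hypothetical. Attributes that are not part of the printed identity, such as FileVersion, would still need to be added separately.

```python
def assembly_name_rows(component, display_name):
    """Convert an assembly identity string into MsiAssemblyName rows.

    component     -- primary key of the assembly's row in the Component table
    display_name  -- e.g. 'My.Assembly, Version=1.0.0.0, Culture=neutral, ...'
    """
    parts = [part.strip() for part in display_name.split(",") if part.strip()]
    rows = [(component, "Name", parts[0])]  # the first part is the simple assembly name
    for part in parts[1:]:
        key, separator, value = part.partition("=")
        if not separator:
            raise ValueError("expected Key=value but found %r" % part)
        rows.append((component, key.strip(), value.strip()))
    return rows
```

For example, passing the identity printed for an MSIL assembly yields a processorArchitecture row with the value MSIL alongside the Name, Version, Culture and PublicKeyToken rows.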

    Also, in general, I'd recommend leaving your advertised components in leaf features like you describe above.  Repairs should be non-existent or very rare after fixing the assembly attributes as described above, but authoring your features this way will ensure that any repairs that are triggered will be minimal and will not take as long for the user.

    Hope this helps....

  • Now it finds Visma.Interop.Crm.dll during resiliency repair - thanks!

    Regards,

    Rune Christensen

  • Is there a way to decode the [HKEY_CLASSES_ROOT\CLSID\{GUID}]"InprocServer32" value?
