Thoughts about setup and deployment issues, WiX, XNA, the .NET Framework and Visual Studio
All postings are provided AS IS with no warranties, and confer no rights. Additionally, views expressed herein are my own and not those of my employer, Microsoft.
I was talking last week with someone about how difficult it is to truly get progress indication UI right in setup (time remaining estimates, completion percentage, etc). In the extreme cases, you end up with something like my all-time favorite screenshot, but in mainstream cases it can be anywhere ranging from mildly difficult to impossible to get the estimated time correct (depending on the size and complexity of your setup, the number of custom actions, along with several other factors).
To illustrate how tricky this can be, let me describe how we implement the progress UI in setup for Visual Studio .NET 2002 and 2003. There are 2 distinct phases to the setup for these products - the prerequisites (or Windows Component Update) and the main Visual Studio feature installation UI.
For the main VS setup, we implemented a Windows Installer external UI handler, and we interpret the progress messages passed back to us by Windows Installer and then update the time remaining and the progress bar. That sounds nice in theory, but what really happens is that each phase of the setup (generating script, installing, cleaning up rollback files, etc) is interpreted separately, so then the progress bar will fill up and then start over, and the time remaining will decrease close to zero, then increase again as the next phase starts. In fact, we removed the estimated time remaining string from the progress UI between VS 2002 and VS 2003 because it was so flaky and very frustrating to users to have it jump forward and backwards throughout the installation process.
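To make the "fills up and then starts over" behavior concrete, here is a minimal sketch of per-phase progress tracking. It is not the actual Visual Studio handler code. It assumes a simplified model of Windows Installer progress messages with only two message types: a reset that announces the total ticks for a new phase, and a report that adds completed ticks. The names PhaseProgress and replay are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PhaseProgress:
    """Tracks the progress bar for one installer phase at a time."""
    total_ticks: int = 0
    done_ticks: int = 0

    def handle(self, fields):
        """Process one progress message given as a tuple of integers.

        Assumed (simplified) layout: fields[0] is the message type, where
        0 = reset (fields[1] is the total ticks for the new phase) and
        2 = report (fields[1] is the ticks completed since the last report).
        """
        kind = fields[0]
        if kind == 0:
            # A new phase starts, so the bar starts over from empty.
            self.total_ticks = fields[1]
            self.done_ticks = 0
        elif kind == 2:
            # Advance within the current phase, never past the total.
            self.done_ticks = min(self.total_ticks, self.done_ticks + fields[1])

    def percent(self):
        """Percentage of the current phase only, not of the whole setup."""
        if self.total_ticks <= 0:
            return 0.0
        return 100.0 * self.done_ticks / self.total_ticks


def replay(messages):
    """Return the percentage the user would see after each message."""
    tracker = PhaseProgress()
    shown = []
    for message in messages:
        tracker.handle(message)
        shown.append(tracker.percent())
    return shown
```

Replaying two phases back to back shows the bar reaching 100% and then dropping to 0% when the second reset arrives. Any time remaining estimate derived from the same per-phase numbers jumps in the same way.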
For the VS prerequisites setup, we could not use a Windows Installer external UI handler because we were chaining together several setup packages, some of which were not MSIs. Because we were simply calling CreateProcess to spawn each package, there was also not a way to receive a callback from each of the individual packages with progress messages that we could use to update setup UI for the user. Instead, what we did is find a really slow machine in our test lab (one that was below our published minimums for both RAM and processor speed), and ran an installation on that machine. Then we parsed the log files to determine the amount of time required for each prerequisite component and put that value in one of the setup data files (baseline.dat to be exact - if you look in that file in notepad you will see a value EstInstallTime for each component). The setup UI code then started a timer when it called CreateProcess, and filled in the progress bar using this estimated time value and the current elapsed time to create a completed percentage. If something bad happened and it took longer, we just left the progress bar full and put a simple message saying "setup is still installing...." above the progress bar (in my experience this rarely happens though unless one of the components hangs when we call it). Because we used a super-slow machine, the components often completed before the progress bar filled up - and it generally gives a better impression to finish sooner than predicted instead of later.
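The timer-based approach described above boils down to a couple of lines of arithmetic. Here is a minimal sketch of it. It assumes the estimated time is expressed in seconds (I am not stating the actual units of EstInstallTime in baseline.dat here), and the function names are made up for illustration.

```python
def percent_complete(elapsed_seconds, est_install_seconds):
    """Completed percentage based only on elapsed time vs. the baseline estimate.

    The value is capped at 100 so the bar simply stays full if the
    component takes longer than it did on the slow baseline machine.
    """
    if est_install_seconds <= 0:
        return 100.0
    return min(100.0, 100.0 * elapsed_seconds / est_install_seconds)


def status_text(elapsed_seconds, est_install_seconds):
    """Message shown above the progress bar once the estimate is exceeded."""
    if elapsed_seconds > est_install_seconds:
        return "setup is still installing...."
    return ""
```

For example, with a baseline estimate of 120 seconds, the bar shows 25% after 30 seconds, and after 500 seconds it stays at 100% with the "still installing" message displayed.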
Based on feedback we've heard for both the VS 2002 and VS 2003 product families, our old-fashioned method of "calculating" progress for the prerequisite components appears to be much more accurate than the technically more correct way of calling into Windows Installer.
I received an email yesterday from an individual who had just installed the latest Windows Installer Platform SDK and had read a previous blog post that I wrote about using msi.chm for Windows Installer help information, but was unable to find msi.chm on his system. I took a look on our internal products server and couldn't find msi.chm there either, so I decided to go to the Platform SDK site and figure out what was going on. What I found is that I had based the information in that blog entry on what appears to be an older version of the Platform SDK that I had downloaded a while ago, and from which I had copied msi.chm off to a separate location.
I tried out a new download of the Windows Installer part of the Platform SDK, and it appears they have re-organized the help documentation for the entire SDK. Instead of having standalone CHM files for each product in the SDK, there is now a unified Platform SDK help collection and each individual product plugs in and registers an HXS (compiled help) file.
So what I had to do was launch the Platform SDK Documentation link on the Start menu after installation of the SDK. From there I was able to use the index and search that I normally use in my old copy of msi.chm, and as an added bonus there are updated topics for Windows Installer 3.0 and some of the incorrect info in my old version of msi.chm has been fixed.
Sorry for any confusion I created in my original post....
I was exchanging email with Michael Teper last week while discussing an issue he brought up on his blog about the J# redist and why he had to install it when installing Visual Studio 2003. In the course of this discussion, he brought up a really interesting point (which I will include here with his permission) -
"By the nature of the market and Microsoft's position in it, any piece of software coming out of MS carries with it guideline authority. In fact, one of the reasons Microsoft gets criticized so much for any flaws in its software is that, quite frankly, the expectation is that anything Microsoft unleashes on the world is perfect."
When I first joined Microsoft back in 1999, I had similar ideals that everything that ships with the Microsoft name on it would be perfect. Of course, being a computer scientist, deep down I knew even then that this could not be true for any company, including Microsoft. After 5 years as a professional software engineer, I know that Microsoft (like all companies) ships software with known defects and that there are often complex business and design trade-offs that go into individual decisions about bugs (like in the J# redist installation design that I blogged about earlier).
I've also come to realize that while this lack of perfection is generally known, Microsoft does have a unique position as a de facto authority when we ship products or even publish recommendations or opinions on the web. It is kind of like that line from Spiderman that goes something like "with great power comes great responsibility." This is a point that I have been focusing more on internally when taking stands on issues and debating about whether or not we should fix a particular bug or create a new white paper to better explain a feature/workarounds/etc. This is also partially what has led me to start blogging about some of the troubleshooting strategies I've come across for setup issues and Windows Embedded.
I already know that our next versions will be so much better based on some of the input I've been able to bring to designs based on direct feedback I've heard from people feeling pain from our shipped versions. I can only hope that any of the insights that I'm able to provide will help some people ease that pain that I know is there and make it easier to do your work, make Microsoft software easier to use and understand, etc. I also hope I can show that there are people who work at Microsoft who know that some of the experiences with our software are painful and that we really want to do everything we can to eliminate (or at least minimize) that, even on an individual level when it is not possible across an entire product.
Thanks to Stefan Krueger who runs InstallSite for sending along an additional link with some more information about automatic Windows Installer repairs that I talked about in my previous blog entry. Check it out at http://www.msifaq.com/a/1037.htm. There is information about the specific entries that get created in the application event log and an additional link to a good explanation of issues caused by consuming MSMs into an MSI package. There is also a VBScript that you can download that can help diagnose root causes for some of these repairs. Hope this helps!
Hey all, I just got a mail from my "sister" team saying that updated embedded evaluation kits with Windows CE 5.0 are now available at http://msdn.microsoft.com/embedded/getstart/evaluate/. Get em while they're hot!
I was looking at some blog items that were posted about Microsoft products last night, and I stumbled upon this one on Michael Teper's blog. The question he raised is one that I've run into a lot within Microsoft, particularly from C++ developers that have to install Visual Studio .NET 2003 to do their day-to-day coding work - why do I have to install the Visual J# redistributable package 1.1 before I can start installing VS, even if I'm not going to be doing any J# coding? (of course, in the mails from internal devs I have to remove the expletives from the question also..... :-) )
Since I was on the VS setup team at the time that we added J# to the suite of tools/languages and I know why this is the case, I thought I would try to explain this design. It helps to start by looking at the background and planning of the VS 2003 product. When VS 2002 finally shipped it was nearly 4 years after VS6 had been released, and we wanted to make sure that the next release of VS was not going to require 4 more years of bake time, partially because a lot of customers will wait for Microsoft to release a service pack or 2 before they will jump in and start using a v1.0 product (which is essentially what VS 2002 was in comparison to VS6, and it is exactly what the .NET Framework 1.0 was). So, VS 2003 was originally planned to be a 6-month project that would essentially be a 1.1 release for VS 2002 and would contain only bug fixes to existing features. The only new features would be the new J# programming language and smart device project support in the IDE for C# and VB .NET. In fact, some of the early plans for VS 2003 had proposed to ship it as a service pack that would install on top of VS 2002.
Anyway, with the mandate that we would be shipping in 6 months, the setup team planned very limited work for VS 2003 - some deployment improvements for the .NET Framework redist package, the OCM package that would be used to deliver the .NET Framework 1.1 to Windows Server 2003, and the addition of the J# language tools and J# redistributable package that was required for J# applications to work correctly. We considered putting the J# redist functionality into the .NET Framework 1.1, but rejected that for several reasons. Then we considered including it in the main VS setup tree, but also rejected that because it is needed at runtime in addition to design time and we couldn't tell people to install a part of VS on their clients' machines in order to enable J# applications to work correctly. We also tried to propose that the J# redist installation package be installed if the user chooses the J# programming language in the VS tree, and then the IDE could install it on demand or provide UI to guide the user to install it themselves if they really needed it. This presented usability issues, caused an additional barrier to entry for J#, and also caused problems with administrative deployment scenarios where the end user was not an administrator when they were using the IDE and could therefore not install products themselves.
That left us with the option of including it as a distinct component in the Visual Studio Windows Component Update (renamed to be Visual Studio Prerequisites in VS 2003). The obvious drawbacks of this decision are that people who don't intend to ever develop a J# application are forced to install something they do not need, and worse yet, are blocked from installing Visual Studio if this thing they don't need fails to install. In an ideal world with infinite time and resources, we would have added logic to our setup to be able to associate a component in the VS Prerequisites part of setup with the features in the main VS feature tree. Given a 6 month milestone, that was not feasible because it is non-trivial to artificially introduce a feature-level dependency within one MSI on a component that is not also in that MSI, and we did not support nested MSIs.
As we now know, the VS 2003 product cycle extended from the original 6 month target to something like 15 months (in order to better incorporate customer feedback we started getting for VS 2002 as well as to increase the overall quality and stability of the product we wanted to ship). By the time that became evident, we had already passed the window of opportunity for coding up a new bit of dependency logic to give a better experience for installing the J# redist. Unfortunately we have heard lots of feedback since then from customers and also from Microsoft developers who only want to install VS to code in C++ and don't want anything J# related, etc. The good news is that this type of setup logic is now in place and will start being used in the VS 2005 beta2 timeframe.
In the meantime, I can offer a couple of workarounds for folks who don't want the J# redist on their machine:
Hopefully this gives some insight into how Visual Studio .NET 2003 setup behavior came to be and what we're doing about it in the future. Let me know if you have any comments, concerns or follow-up questions....
I ran across a relatively new blog that is being maintained by the release team for Visual Studio and the .NET Framework. I used to work closely with most of them in my former role on the setup team and now I meet up and play basketball with some of them before work on Tuesdays and Thursdays when I can drag myself out of bed in time. Check it out at http://blogs.msdn.com/release_team, it gives interesting insights into some of the logistics behind shipping products at Microsoft.
In response to a post from Mike Hall about funny/sad/crazy UI messages (see his original post here), I want to submit a screenshot I took back in my early days working on Visual Studio setup. Back in the day, the product was not yet .NET'ized and it was still known as Visual Studio 7.0, and our setup team was working through a lot of bugs and new features. We had one particular problem where we were calculating the time remaining based on some information we received back from some Windows Installer API calls, and the info was ridiculously overestimating the time remaining. After I logged a bug on this, the development team and I decided to have a contest to see who could get a screenshot with the longest time remaining. I ended up winning with this screenshot. When you do the math it comes out to something a little more than 67 years. I know that Visual Studio setup takes a long time, but that is a little bit excessive......My favorite part of this screenshot is the additional 2 minutes that it lists because we all know that after waiting 67 years it is important to be accurate up to the minute.....
As a side note, back then Visual Studio was one of the first Windows Installer setups within Microsoft to try to implement an external UI handler, and in fact I think it is still one of the few to do so (not counting other setups produced by our same team like the new .NET Framework UI in version 2.0). We had a lot of growing pains while implementing that setup. It was always funny to me when I would send questions to the Windows Installer internal support email alias and they would answer by saying "I think the Visual Studio setup team is doing something like that" and then I would have to reply and tell them that I am on the Visual Studio setup team......
There is a known issue with the QFE we released in December 2003 to include the .NET Framework 1.1 for XP Embedded. It creates 2 shortcuts on the start menu that are in Japanese (which may show up as square boxes if you do not have the full set of Japanese fonts and settings included in your embedded image). This will be fixed in the upcoming XP Embedded SP2 release, but in the meantime you can work around this issue by doing the following prior to building your XP Embedded image:
If you have built your image but not yet had FBA run on it, you can go to the \Windows\Inf directory and locate the file netfxocm.inf and make the same change described above.
I posted this item a while back in the newsgroups but I wanted to put it in my blog also because I think it may have gotten a little buried amongst all the posts there. We have seen many reports of bugs related to calling into performance counter APIs on XP Embedded. These have shown up most commonly when using the System.Diagnostics.Process class in the .NET Framework but can also be seen when using certain native APIs. The underlying issue is a registry key that is incorrectly populated when creating an XP Embedded image. This issue will be fixed in the upcoming XP Embedded SP2 release, but I also wanted to provide a couple of workarounds that can be applied in the meantime.
If you have not yet built your XP Embedded runtime, you can do the following:
If you have already built your runtime, you can delete the registry key named HKLM\Software\Microsoft\Windows NT\CurrentVersion\Perflib\009\ and then reboot your machine to fix this issue.
Hey all, thanks to your posted comments, I realized that the exclusion list I posted here originally was not in the correct syntax for the InCtrl5 tool to understand it.
There is not a way to exclude files in the InCtrl5 tool, so you will need to look at the exclusion list at http://www.winisp.net/astebner/bin/inctrl5_exclusion.txt and manually exclude them after running the tool for the setup you are trying to model.
For registry keys, I have created a new file in the correct format at http://www.winisp.net/astebner/bin/inctrl5_registry.txt. You can download this file, rename it to InCtrl5.ini, and copy it to \Program Files\InCtrl5 on the machine that you have installed InCtrl5. This will cause the tool to ignore this set of registry keys/values when analyzing a setup.
Hope this helps....
Hey all, I'm happy to report that we figured out the root cause of the really strange rollback behavior we saw in the .NET Framework setup (described in a post earlier this week) and I wanted to pass on what we found in case any of you hit this issue with a Windows Installer-based setup in the future. Here is a rough outline of the issue and the underlying problem:
The only thing we are not sure about is why the RBS file was orphaned in the registry and on the user's machine to begin with, so we'll have to keep digging into that if we get any additional repro machines and see if we can figure it out. We are going to look into making the next version of Windows Installer smarter about which RBS files it will run when it is removing backup files to try to help avoid this scenario in the future, but in the meantime if you encounter this behavior in any of your scenarios, start by taking a look at that registry value and see if there is any orphaned data that may be causing your setup to roll back.
Hope this helps......
I found this really cool article this morning about a project at Cornell University where they're building an unmanned aircraft with onboard embedded control systems that are running XP Embedded - check it out at http://research.microsoft.com/displayArticle.aspx?id=685
My colleague, former manager and friend Michele Coady is currently an international test lead on the Visual Studio and .NET Framework localization team. She sent me a really interesting link today that I wanted to pass on for anyone who may be writing world-ready applications. Colors have different meanings and cultural implications depending on the audience, and it is very important to keep the connotations in mind when writing applications for worldwide audiences. Anyone writing worldwide applications should take a look at this article, and even if you're not, this is an interesting read - http://www.colorconnection.xerox.com/wwwco578/html/en/tips/international.html
I saw a very odd rollback behavior in the .NET Framework setup yesterday that I thought I should post in case anyone else runs into something similar. This was the 2nd time I saw it, but the first time on a machine outside of Microsoft. When it first happened I had written it off as a one-off caused by daily builds of the OS, the .NET Framework or something like that. But apparently it is possible to get the machine into this state through released products (possibly beta bits, but still valid in my opinion).
Essentially what happened was the .NET Framework setup proceeded all of the way through installation and then initiated a full rollback. I couldn't find any evidence in the verbose MSI log file that any kind of error had triggered the rollback, and the last section of the log told me that the return code was 0 and that the .NET Framework installation completed successfully. I had to use filemon and regmon (from http://www.sysinternals.com) to diagnose that there was a file in the folder c:\config.msi named *.rbs that was being executed, and then finally I traced it back to the following registry value on the machine:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Installer\Rollback\Scripts
I wasn't able to determine how that value got set, and searching for information about this registry hive in msi.chm and on the internet didn't yield any conclusive information. The closest I came to a possible explanation is in the section named Active (Incomplete) Installations at http://www.microsoft.com/resources/documentation/WindowsServ/2003/all/techref/en-us/Default.asp?url=/resources/documentation/windowsServ/2003/all/techref/en-us/msizap_remarks.asp
What I think may have happened is some other Windows Installer setup (more than likely the .NET Framework itself) got orphaned before it completed, and then this script was left behind in the registry and inadvertently got launched by future installations. Normally when a Windows Installer setup gets orphaned and you try to run another MSI, it will say that an installation is currently in progress and is suspended and ask if you want to roll back before you continue with the new setup, but for whatever reason we didn't see that in these 2 instances. I'm still hoping to get a more thorough explanation from the Windows Installer team about what is really going on here, and I'll post my findings back here.....
Hey, I'm slowly but surely getting other folks on my team to create blogs. Neil Marlowe is a program manager who I work closely with - some of you may have met him and attended one of his talks at Embedded DevCon. Like me, his primary focus is the Embedded Enabling Features (EEFs), plus he's got some expertise on embedded security and servicing scenarios. Take a look at his blog at http://blogs.msdn.com/neilmarlowe/ if you have a chance, he's planning to post some good how-to documentation in the near future about using the ICF utility to protect embedded devices....
Hey all, as I promised a while back, I wanted to walk through some examples of how I approach reverse engineering a setup package to learn how it works, make any necessary changes, figure out command line switches, etc. I apologize for not posting anything in a while, I was spending some time with my family who was in town from Texas and then I managed to get sick right after they left. But I'm back now :-)
I'm going to start with a relatively straightforward reverse engineering example - the .NET Framework 1.1 package. I chose this first because that setup is very simple - no setup data files, no configurable setup UI options, etc. Yet it is complicated enough to be interesting and it is a setup that a lot of people want to include in their setup packages and/or have had issues trying to install. I'm going to try to approach this from the perspective of someone who is familiar with different types of setup technologies in a broad sense, but who has not yet seen this particular setup before (I may get a little off track on this part because I'm so familiar with this setup from my previous work, but I'll try.....)
Step 1 - figure out what packaging technology is being used
When I download the .NET Framework 1.1 I see that it is a single EXE named dotnetfx.exe. When I double-click on it, setup asks me if I want to install and when I say Yes it proceeds to extract files to the %temp% folder on my machine. Then a EULA appears. When I go to %temp% after the EULA appears, I see a folder named ixp000.tmp. This tells me that the package in question is an IExpress self-extracting EXE. Since I know it is IExpress, I can now use some well-known IExpress command lines to extract the package to a folder of my choosing. I will run dotnetfx.exe /t:c:\aaron /c to create a folder named c:\aaron on my machine and unpack the .NET Framework files to this folder. The /t flag specifies the folder to extract to, and the /c flag specifies that I want to extract the files only and not run the setup after extraction.
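If you want to script the extraction step, here is a minimal sketch that builds the same command line described above. The /t and /c switches are the ones from this post. The helper name iexpress_extract_command is made up for illustration, and the sketch only builds the argument list rather than running anything.

```python
def iexpress_extract_command(package, target_dir):
    """Build the command line that unpacks an IExpress package.

    /t:<folder> specifies the folder to extract to, and /c specifies
    that the files should be extracted without running setup afterwards.
    """
    return [package, f"/t:{target_dir}", "/c"]


# Same command line as in the walkthrough: dotnetfx.exe /t:c:\aaron /c
command = iexpress_extract_command("dotnetfx.exe", r"c:\aaron")
print(" ".join(command))
```

On a Windows machine that has dotnetfx.exe in the current folder, the resulting list could be handed to subprocess.run(command, check=True) to perform the extraction.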
Side note for setup developers - I found some IExpress docs on the web here, and you can also find iexpress.exe in %windir%\system32 on an XP Professional machine. It is a useful tool for packaging up multiple files into a single package but it is also a fairly old technology and has some limitations that may not make it practical to use, depending on your scenario.
Step 2 - figure out what setup/installation technology is being used
Now that I have extracted the files, I will go to the c:\aaron folder that I created above.
Step 3 - figure out the basics of how the setup works
After I look at all of this, I go ahead and step through the UI and install the .NET Framework 1.1, and then re-open the log file dotnetfx.log. I see that it lists the command line that is passed to the MSIInstallProduct function - and I know that this is an API call for an application to install a product using Windows Installer. Then I see a return code for that API call - fortunately it is 0 in this case to tell me that the .NET Framework 1.1 installed correctly. Then I see another call to StopDarwinService, so this install.exe application is stopping the Windows Installer service again after installation is complete - kind of strange.
Step 4 - figure out details about how the setup works
Now that I have a basic understanding of how the .NET Framework 1.1 setup works, I'm going to use my Windows Installer expertise to dig a little deeper. The first thing I will do is install Orca, then right-click on netfx.msi and choose Edit with Orca to view the contents of the MSI. There is a ton of data to wade through in an MSI and it is overwhelming at first, so I will only look at a few key things here:
There are many more things that we could look at in an MSI, but the above are good starting points. Often for other setups you will see items in tables that will lead you on trails through other tables. Most commonly, if there are launch conditions that depend on the existence of other applications, you will be led to the AppSearch table to figure out exactly what file or registry data the setup is looking for when deciding whether or not to block.
Step 5 - install with verbose logging and look at the MSI log file
Reading an MSI log file is more of an art than a science but can often be useful when reverse engineering a setup. For the .NET Framework 1.1, we know that because it is an MSI we can set the verbose logging flags before running setup - HKLM\Software\Policies\Microsoft\Windows\Installer, Debug (REG_DWORD = 7) and Logging (REG_SZ = voicewarmup!). These will give us a verbose log named msi*.log in %temp% where * is a randomly generated set of characters.
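Rather than typing those two policy values in by hand each time, you could generate a .reg file and import it. Here is a minimal sketch that produces the file contents using the key and values listed above. The function name and the idea of using a .reg file are my own additions for illustration, not something the .NET Framework setup provides.

```python
POLICY_KEY = r"HKEY_LOCAL_MACHINE\Software\Policies\Microsoft\Windows\Installer"


def verbose_logging_reg(debug=7, logging="voicewarmup!"):
    """Return the text of a .reg file that turns on verbose MSI logging.

    Debug is written as a REG_DWORD and Logging as a REG_SZ, matching
    the values described in the post.
    """
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{POLICY_KEY}]",
        f'"Debug"=dword:{debug:08x}',
        f'"Logging"="{logging}"',
        "",
    ]
    # .reg files conventionally use Windows line endings.
    return "\r\n".join(lines)


reg_text = verbose_logging_reg()
print(reg_text)
```

Saving the output to a file with a .reg extension and importing it as an administrator sets both values. Remember to remove them when you are done, because verbose logging then applies to every MSI-based setup you run on that machine.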
I also noticed when running install.exe /? it says that if you pass a /l flag it will generate netfx.log in %temp%. Since I also noticed that the log created by running install.exe was named dotnetfx.log and not netfx.log, I might be curious what the difference is and run install.exe /l from c:\aaron to see what it does. When I install this way I see that the resulting file %temp%\netfx.log is essentially the equivalent of a Windows Installer verbose log file without needing those registry keys to be set. So now I can take a look at netfx.log (or msi*.log if I choose) and view the step-by-step process of this setup.
With that, the process of reverse engineering the .NET Framework 1.1 setup is done for the common cases. There is of course more detail that may be needed depending on your scenarios, but much of that will require more in-depth analysis of the MSI tables and/or the verbose MSI log files. I will leave that for a later lesson.
Side note for the curious - the reason that the .NET Framework 1.1 setup stops the Windows Installer service is that Windows Installer uses pieces of the .NET Framework to install assemblies to the GAC (the MsiAssembly and MsiAssemblyName tables in the MSI) in Windows Installer 2.0 and higher. This creates a classic chicken-n-egg problem - Windows Installer is going to install the .NET Framework but Windows Installer needs the .NET Framework in order to do so. Because of that, Windows Installer has some special logic to bootstrap the part of the .NET Framework it needs to install assemblies, but since Windows Installer is run as a service, it could already be running from a previous setup and we want to stop the service to unload any older versions of the .NET Framework from memory so that the most recent version will be used.
I hope this post is at least a little bit useful to give some insight into how I approach the process of taking apart a setup and figuring out how it works. I will post more later with a more complicated example of a setup that uses information from data files, etc. Let me know via comments or emails if you have any questions about any of the above.....