Hardening takes time [unfortunately]

As time has gone on and I have worked with various KMDF users over the last few years, some of them have been helpful in forwarding information to me regarding installation failures even when they had successfully resolved them.  Nothing secret involved- setup logs, what fix they used and how well it worked, etc. 

Recently I received one from Dana Gregory of DataColor [names used with permission], which highlighted another problem with our older coinstallers, which I thought I'd discuss in today's installment.  While I had previously passed along instructions on fixing a broken installation by registering the runtime service for KMDF, and that worked for him, this problem would still break the next attempt to install another product (or to reinstall his own).  I'll add links for the service fix to this article, and also to the problem this showed.

But the logs showed me more than that.  This user had first attempted to use another vendor's product which utilized KMDF 1.5, tried dozens of times to get it to install, and ran afoul of one of our other issues with that coinstaller- thus never successfully installing.

He or she then uninstalled that product, and replaced it with a competing one that did not use KMDF.  I believe that the uninstall set up the problem Dana faced, but again, this customer made dozens of attempts to install, some of those probably at the direction of DataColor's tech support, before finally getting the product to work.

It doesn't take loads of empathy to realize how frustrating this experience must have been for that customer.  Not to mention the fact that an even closer customer [the vendor of the first product] suffered damage to their reputation because they had relied upon us to provide a robust installation experience for the end user and we had failed to do so.  Beyond that, there were potential career repercussions for the engineers and others within that company who advocated the use of WDF [which obviously we would like to think was a good thing for them to have done].  Now some of those damages may be small in the larger picture, but they aren't small to the individuals involved, and success is something that really needs to be built one customer at a time, even in a global market.  These negative images aren't things I like to think of- all I can say at this point is I don't take them lightly and it's one reason I've been pursuing coinstaller issues as much as I have been [it wasn't even part of my job to begin with- I just got concerned folks were having trouble with installation, and I've been meddling in it ever since].

The problem here is that the coinstaller will add version information about drivers being installed to a key beneath HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Wdf.  This key is normally created when the runtime is installed.  If it is removed, the coinstaller will not attempt to create it if it determines the runtime is present- it just emits an error message and fails the installation.
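If you want to see whether a machine is in this state, a read-only check is enough. The following is a minimal sketch, assuming Python is available on the machine; it uses the standard-library winreg module, and the helper name wdf_control_key_exists is my own invention for illustration. It only looks at the key and never creates or changes anything.

```python
import sys

# Path of the key discussed above, relative to HKEY_LOCAL_MACHINE.
WDF_CONTROL_KEY = r"System\CurrentControlSet\Control\Wdf"


def wdf_control_key_exists():
    """Return True or False on Windows; None where there is no registry.

    Hypothetical helper for illustration only. It opens the key read-only
    and never creates or modifies it.
    """
    if sys.platform != "win32":
        return None
    import winreg  # standard library, Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, WDF_CONTROL_KEY):
            return True
    except FileNotFoundError:
        return False
```

Calling wdf_control_key_exists() on an affected machine would be expected to return False, which matches the condition that makes the older coinstallers fail.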

When we first encountered this, Ilias had me file a bug which we would fix for KMDF 1.9 [1.7, alas, was already shipped].  In this case it was happening [we believed] because a failed boot driver caused a system to revert to last known good, which undid the registry changes made by the coinstaller.  Well, when it came time to address the bug fix, we had one of those long discussions about whether and how to fix this.  In this case, "argument" comes closer to describing the passion and heat level- but I don't give up as easily as I used to [rather, I don't bide my time as easily].  Consensus wasn't totally reached, but I got a majority of the key players and that was enough to get what I wanted, and believed was best for our customers.

The issue as I saw it:  all of the information under that key was either informational, related to debugging [and thus expected to be entered manually, initially]  or kept around for "future uses" which we didn't test and had no clear rationale for.  You can remove the key and all your installed drivers will function just fine.  WdfVerifier will create it if it doesn't exist [the switch for turning on diagnostic output from the loader lives in there].  So its nonexistence should just be ignored- creating it might be a good idea, but I was fine with even skipping it.

The primary opposing viewpoint was that it was installed as a part of the installation, and if it was missing, that should be treated as an indication that the installation had been corrupted, and the correct way to fix it was to re-install.  I was vehemently opposed to this on several grounds:

  1. Eventually we would have a greater minor version than 1.9 on the machine, and this logic would have the 1.9 coinstaller overinstall that and break any 1.10+ drivers [I didn't push it much though, so perhaps I didn't think of it at the time, and this is another instance of my muddled memory].
  2. The assumption is being made that the installation process is 100% reliable, and we know from experience that it isn't, and given that part of the issue is unexpected interference by unknown 3rd parties, we can never be certain of reaching that level.
  3. As I said above- there was nothing under that key that was absolutely required to make drivers operational.

My basic argument revolved around the last two: if we know the existing installation is working except for that key being missing, we should not assume it is corrupted to the point of requiring reinstallation just because a key (or subkey of that key) that can be removed without disturbing its operation is no longer there.  We can know the existing installation is working, because we can now tell which version of KMDF is actually loaded and running [beginning with 1.7].  I basically relied on the old maxim "If it ain't broke, don't fix it".
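To make the three positions concrete, here is a small sketch of them as decision functions. This is not the coinstaller's code, just the reasoning above written down; every name in it is made up for illustration, and the branch taken when the runtime is not usable (install it) is my assumption rather than something described in this post.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "install the driver"
    FAIL = "emit an error and fail the installation"
    INSTALL_RUNTIME = "install or reinstall the runtime first"


def fail_on_missing_key(key_present: bool, runtime_ok: bool) -> Action:
    """Older coinstaller behavior as described here: a missing key is fatal."""
    if not runtime_ok:
        return Action.INSTALL_RUNTIME  # assumption, not taken from the post
    return Action.PROCEED if key_present else Action.FAIL


def reinstall_on_missing_key(key_present: bool, runtime_ok: bool) -> Action:
    """Opposing view: a missing key means corruption, so reinstall."""
    if not runtime_ok or not key_present:
        return Action.INSTALL_RUNTIME
    return Action.PROCEED


def ignore_missing_key(key_present: bool, runtime_ok: bool) -> Action:
    """The view argued for here: a working runtime is left alone."""
    # key_present is deliberately ignored: nothing under the key is
    # required for installed drivers to operate.
    return Action.PROCEED if runtime_ok else Action.INSTALL_RUNTIME
```

For the case in question (key gone, runtime loaded and running), the three functions return FAIL, INSTALL_RUNTIME and PROCEED respectively. Only the last one leaves a working installation untouched.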

I still believe this was the correct decision- it makes it harder to break an existing installation, and we already have seen cases where this sort of thing has happened.  But let's just say there isn't universal agreement on that point [or at least there wasn't at the time].

The real good news

That would be that Wei is putting together a comprehensive set of coinstaller and versioning tests, and he has done a very thorough job of it.  It's a lot easier to get the job done right when you've got enough people to get it done, and they are capable people.  I'm making certain that this testing includes doing more "real world" testing- deliberately breaking keys, deleting and moving files, etc.  Eventually I hope we can utilize more of our fault injection technologies to assess the overall robustness of the installation itself [but there will always be holes there we can't plug- some faults you simply cannot recover from gracefully].  The problem is in good hands [in my opinion, anyway].

So the best I can say for now to those we failed in the case Dana gave me information about is that we're learning from their experiences and doing our best to not repeat old mistakes.  More to the point, we're not falling back upon quibbling about the fact that your experiences are relatively rare- we're trying to harden this as best we can.  After all- a 1% failure rate of something that occurs 1 billion times is 10 million failures- a number that had better be daunting!

I must also add that I am grateful to all those who have shared this sort of detailed information with us, rather than simply throwing up their hands in disgust and assuming we don't care about their problem.  I can find cases of people having problems using search engines, but all too often there's not enough information to tell what broke, and I see a lot of dangerous [to future users of WDF on the same machine] solutions being promoted, as well.

The links I promised: this one sets the default runtime service settings, while this one ensures the required control key settings for pre-1.9 coinstallers are present.  You may copy the text from both (or just download them with a "Save Target As", etc.).  You can rename them as .REG files and use the right-click to install method, or leave them as is and use them with the OS "reg import" command line tool.  Please do not make these part of your normal installation process- the first should never be necessary beginning with KMDF 1.7, and the second beginning with KMDF 1.9- widespread use of them could prevent us from making future improvements to the installation process and thus make life harder for everyone [including eventually yourself].  If your driver is a boot driver, then you may need to tweak the default service settings further, as the default is for non-boot.  As always, if you have problems using these or other questions, let me know!

Published Sunday, May 04, 2008 1:46 PM by BobKjelgaard