Well, yes and no...
I got involved (probably quite foolishly) in a discussion on the NTDEV mailing list that really wasn't much of my business yesterday, but I thought I'd expand a bit on my thoughts there, anyway.
It probably sounds, in one of my first posts there, as if I'm completely disagreeing with Peter Viscarola, although in fact I'm not. I just think there's a lot more to the story than that- both the reasons why things are as they are, and the broad general effect it has.
The Purina Effect
I think I'll structure the rest as a Q & A with some imaginary and perhaps occasionally hostile interviewer- we'll see where that takes me. Not that there's a need for this, but it's my blog, so I get to make these choices...
- So why do so many people at Microsoft run the latest of everything?
- Because we hire a lot of technophiles- we are people who like new technology and believe innately in its transformative power. We know such people do well in our environment, and make great contributions. It's a good trait to have if you're seeking a job here. You'll get picked on sometimes for using Office 2003 for the same reason some people will give you grief about out of date fashions, or a stodgy car, or... So the desire for the latest and greatest is inbuilt.
- Because so many of us are involved in developing new products, we understand quickly the economics of their test, and that includes the idea that finding problems earlier is fixing them cheaper. In part due to that first bullet, there's usually a ready pool of people available to help test your new product, and that's a good thing. Most of the time, the problem is keeping them away from things before you're prepared for their input. This is also good over the long run- we call it "dog food" [short for "eating the dog food"], and it is well-baked into the corporate culture. It occurs on voluntary and ad-hoc bases and in planned deployments. We always use things before you do. We always will. [And unfortunately we will still miss things- there is no perfect test process, at least, not that I know of]. I began running Windows 7 within weeks of the release of Windows Vista, and used it for most of my work (and most of my blog posts, except at times like this when I post from home). Unlike most driver developers, I am used to an OS whose code changes each and every day, and exists in multiple forms at nearly the same time.
Just looking at the last couple of years, I've run beta or earlier versions of a couple of OS, Office, Windows Live, Halo 3, and Halo Wars. I didn't have the big XBox Live update because I knew I wouldn't have time to work with it [but I know people who did]. I'm not particularly on the bleeding edge, usually, even with all that.
- Why do things change and old ways no longer work?
Sometimes, it is much as Peter said [with my demurrals above flavoring it a bit]- a lot of focus on the new. But many times these changes are made deliberately, and "cool" [in the sense of some form of hipness] may occasionally be a factor, but not one that gets much weight [sticking to the OS- different products, different markets, different factors].
Most of the cases I can think of were the result of analysis of real data- watching and observing how things are really used. Looking for the problem areas- trying to find ways to minimize the number of steps to accomplish the most important and the most common tasks. Making things work better, and more easily. I don't think that should be news- I expect several of the Windows team blogs say so directly or indirectly. Now we think of that as "cool", but that's not the sense in which the word was being used in that thread.
But we don't always know the full impact of changes, and it never hurts to check things in the real world. That's one reason we have beta tests. The TARGETPATH issue is (in my opinion) something we will take as a lesson for the future, but I also think this is proof the beta test process works. Would it have been better for it to be this way in the final version? Was the entire beta so good that the only conclusion was that we would not change anything from the feedback resulting from the beta, making it some sort of trial balloon we'd pull back, buff up, and then reissue as the final thing?
- Is your "love of the new" [I won't repeat calling it the Purina Effect, although perhaps they'd be happy I find the name mnemonic for pet food] the sole reason you miss down-level breaks?
No, we also miss them, in my opinion, because:
- We develop blind spots in our test procedures and our thinking about how our products are used. Most test scenarios have large variability and we have to narrow down the problem spaces to develop workable solutions. Sometimes this process leaves holes. More on that in a minute.
- Peculiar to this case are the mechanisms through which our build tools reach the WDK. As I'm sure has been discussed on the Engineering Windows 7 blog, we put a lot of time and effort into improving our internal processes, and that definitely included how we build Windows. The new mechanisms for binplace versus the old targetpath mechanisms have been a big win for us in terms of working with the source code. But to minimize the impact to product teams, a lot of the initial legwork for it was done by the tool developers, and pushed out to us along with the tool changes. I did some rework in our test code [things I privately had but weren't part of the main Windows source], so I was aware of some of this work, but I suspect that for most developers, this just happened [and that's good- it saves a lot of R&D money when things work that way]. This "magic elf" phenomenon may have helped develop something of a blind spot.
- Did you, personally, test the WDK at all?
Actually, yes [and it has its own test organization, as well]. I and other members of our test team have installed it, built samples, run them, read the documentation and the source code, used the tools, and filed bugs as a result of what we've found. That's happened more than once. Periodically we have held what we call "bug bashes" where everyone in the organization that includes our team goes to an out-of-office location, splits up into teams, and competes to see who can find the most bugs, or the most interesting bugs, and so on. But no, we didn't take SP1 DDK code and run it against the latest WDK [and there are multiple reasons that wouldn't work for WDF, anyway]. But I wouldn't be surprised to hear someone's considering that after all of this.
- Is it your fault this happened?
I certainly ask myself that- or at least, whether I should have done more, or paid more attention, and whether, if I had, I could have realized TARGETPATH would be an issue and flagged it before it went out the door. But probably not- it was one of those mostly deprecated things that stayed in the WDK long after its need in Windows proper had passed on. I developed a blind spot.
I bet I'm not the only SDET in Device Platform Technologies (which encompasses the WDK and WDF groups) who's doing that kind of thinking these days, although I certainly haven't conducted a survey on the subject. I said above I'd get back to test holes, and this is a good place for it.
They exist. We prefer when they're complicated enough it makes some sense to have missed them, but we're human enough it doesn't always work that way. In a way, it should be hard to get too big a sense of oneself as a computer programmer. The blasted thing always does exactly what you tell it to do. You often don't tell it the right thing, even when you're sure that you have. But being human, we get that sense sometimes anyway. Maybe I had more of it than I should have when I made my first post on that thread.
Time will tell.
Not funny. I've been laid off twice in my career, both times as an SDE, and both times in a difficult economy, and this economy is worse than it was either of those times. Nobody I know well is affected, but I am on some mailing lists where affected people made their brief farewells.
I sympathize with their situation and hope for the best for them. I'll add that as far as I'm concerned, whoever hires them gets a good deal, because I can't remember the last time I saw a Microsoft FTE I thought wasn't a good employment catch. I'll also add that I have a lot of faith in the people who made those decisions- I've seen enough to believe it had to be done, and it wasn't done lightly.
- So why did you hint at unemployment?
Someone taking that thread and blowing it up into a news story about Microsoft not trusting its own products? Not that it would (or will- I can't predict the future on something like this) get me terminated, but I wouldn't blame anyone but myself if it did. For that matter, I may have said more about internal affairs than I should have. Business procedures (and that includes security procedures) are proprietary information and covered by employment agreements. By not checking first, I may have doomed myself, and that's life [corporate at any size- the principles don't differ that much from place to place].
I don't think I was discussing anything other than recommended best practices, though, so I'm not really as worried now as I was when I first thought about it.
Next time I intend to discuss something [anything] else- that's a certainty.