Larry Osterman's WebLog

Confessions of an Old Fogey

What I did on the 4th of July...



I originally wasn't going to write this up, but Valorie thought it was a cute story.

I need to describe some terminology first.  Here on the Windows team, we call a full build of Windows a "timebuild" - normally, you don't build the entire Windows product to make a change to your component - you typically just have to rebuild the individual DLL (or EXE) and test it in isolation.

But before you check in any new feature, especially a feature that affects multiple major subsystems of Windows, you need to do a full build (among other reasons, to ensure that any changes you made to headers don't break other components in Windows).

One other thing that a timebuild provides is a baseline for the test team.  You see, the test team can run tests with privately built bits (from the developer) and they can get a reasonable degree of confidence in the fix, but for a significant change, it's better to have a complete build of the world - that ensures that the testers are testing a system that's as close to a real system as they can possibly get.

Anyway, I'm close to finishing up two of my features for Longhorn, so it was time to do the timebuild.  I started it on Friday, found some issues and fixed them, and left for the weekend, figuring I'd check up on the timebuild over the weekend (not surprisingly, it takes more than one or two hours to build all of Windows).

I was busy on Saturday, but didn't worry about it.

Sunday morning, I decided to check on the timebuild, so I RAS'ed in to check on the build.

The RAS connection worked just fine, but during the post connect security checks, I got a weird error, something I'd never seen before (fatal execution error or something like that).

So it was time to get on the horn with Helpdesk and see if they had any idea.  One of the nice things about working for a big company is that they have a helpdesk that's staffed 7x24, even on holiday weekends.  So I got on the phone and waited for about 10 seconds for the tech to pick up.

She worked me through a couple of suggestions, none of which worked, so she escalated the call up the chain.  While this was going on, I drove into work and fixed a couple of problems and restarted the timebuild.

Later that afternoon, we had a BBQ at some friends', so I didn't bother trying again.  On the other hand, I enjoyed myself immensely at the BBQ, so it was worth it.

On Monday morning, I had an email from helpdesk waiting for me.  It turns out that since my machine at home is joined to a corp domain, one of the scripts was depending on a tool that hadn't been pushed to my machine yet (classic chicken and egg problem - the tool would be pushed when I connected to the domain, but since I couldn't RAS in, the tool couldn't be pushed to my machine).  Silly, but stuff like that happens.

The suggestion from helpdesk was that I unjoin the domain and re-join the domain.  They were very careful to remind me to ensure that I knew the password for the administrator account on the machine since my domain account would no longer work for logon (this is the only one of my machines on which I routinely run as an admin, because our RAS logon process requires admin access - there are workarounds but I'm lazy :().

It turns out that I HAD forgotten the password for the local administrator account.  So, being the bright boy, I reset the password to something known and unjoined the domain.

I then logged in as the administrator account and tried to connect to rejoin the domain.  And I got an error.

That was weird, the error was the exact same error you'd expect to get when you're not running as an admin.

So I did a "NET LOCALGROUP ADMINISTRATORS" from the command line.

There was only one entry, "Sharron".

Oh crud.  Then the memories came flooding back.  Five years ago, when I set this machine up, I had just gotten DSL, and didn't have a hardware firewall, and the machine was running W2K, so it didn't have a built-in firewall.  I was running ZoneAlarm at the time, but I wanted an additional level of security (stupid, I know, but that was 5 years ago).  So I tried to set up a sort-of "honeypot" - I renamed the administrator account to be "Sharron" and created a new account in the guests group (this was on W2K) called Administrator.

Not only that, but I couldn't remember the password on the "Sharron" account.  So now I have a machine on which the only account for which I have a password is the guest account.

The phrase "Hoist by my own petard" comes to mind.  And I was SO PROUD of myself for remembering to ensure that I knew the administrator account password.

 

And now, the mistakes start piling on fast and furious.  For some reason, instead of trying to boot to the recovery console and resetting the password on the Sharron account, I decided to re-install Windows XP.

But, of course, the only copy of Windows XP I had was a Windows XP RTM disk.  This isn't a problem because I trust the hardware firewall to keep my machine safe while I install XP and download SP2 onto it.

No big deal, right?

Wrong.

About 2/3rds of the way through the installation, I get a popup about the setup failing.  The setup log doesn't have anything reasonable in it, and the setup is past the point where you can undo the setup.

The installation of Windows on that machine is toast.  I'm swinging in the wind here, folks.  The old installation is toast, the new installation didn't work, I don't know what to do.

So I hauled the disk out, and restarted the installation, this time doing a clean installation.  Any apps will have to be reinstalled, etc, but at least I didn't have to reformat the drive.

Oh yeah, the product key.  That's right.  This machine is owned by Microsoft, I got the product key from Microsoft, and I don't have original media for it (I do for all my other machines, just not this one, since it's Microsoft's machine).  The product key's sitting on a server at work, if I could RAS into work, I could get it...

Fortunately, my other machine is more than capable of running RAS (it's the "good" machine).  So I installed RAS on my other machine, dialed in, and got the product key off the server at work.

Installation continued on the newly-reinstalled machine, but for some reason XP RTM didn't recognize the NVidia TI-4400 adapter in the machine (I think the TI4400 came after XP RTM).  I installed the antivirus software and SP2, downloaded all the security patches, and I was good to go.

But even though XP recognized the NVidia adapter, it STILL didn't stick.  It wasn't until I downloaded the latest WHQL drivers from NVidia that the driver stuck.

At this point, it's about 4PM on Monday, and I'm back to where I was on Sunday morning.

So I reinstalled RAS and tried to connect, and...

I get a different error code, this one coming from the smartcard reader.  I've still not resolved that one (tonight, if I have the time).

Sometimes, it's just not worth waking up in the morning.

On the other hand, the timebuild worked great.  I installed it on my test machine when I came into work yesterday and the feature worked!  Now I'm doing a timebuild of my OTHER feature; when that one completes (in about an hour or so), I'll be installing it to ensure that it works too.  And I finished the latest Misty Lackey novel (Sanctuary) and a Judge Dee novel (The Chinese Bell Murders, by Robert van Gulik).

So the day wasn't a total loss.  And I haven't had to reformat the C drive (yet). 

But it sure was annoying.

  • When some people forget the password, they tend to use utilities like this: http://www.petri.co.il/forgot_administrator_password.htm#1

  • I don't know HOW many times I've run into similar problems while doing Windows Administration.

    I renamed my original Admin account as well, and did a similar thing by adding a "fake" Admin account to the Guests group. Unfortunately for me, it wasn't as simple as a re-install because after I was done fumbling with it, it was quite the mess.

    I had to end up formatting for mine to work.

    Glad to hear you're mostly working and in order though. Hope it didn't ruin your long weekend too much!
  • Just out of curiosity, how long does a build of windows take?
  • The Windows source is cut up into several different pieces that you can opt in to separately. The last time I built *everything* it took a day to a day-and-a-half. My normal subset is about 6 hours.
  • Just goes to show that free software is only free if your time has no value.... oh, wait... hmmm.

    It's good windows has such exemplary driver support... oh wait....

    Anyway, shame on me for being mean. I actually do hope you get things sorted out, I've had similar problems over the years with just about every OS combination. It's just interesting when a top developer of a self-proclaimed user-friendly low TCO OS can have such troubles.
  • The title of this entry should be "ms windows developer chokes on own dog food during bbq"

    regardless, I'm sorry to hear that you had to go through so much trouble to remote in. But hey, "the timebuild worked great", and in the end, that's all that matters.
  • Chris, it depends on the build machine. We've got machines that can do it in 6 hours (but they're somewhat expensive; not as much as a car, but...).

    Our timebuild machine took about 23 hours to do the build over the weekend, but yesterday, we restriped the drives and it took about 12 hours total for the timebuild that just finished this morning.

    vince, the issues I had had absolutely nothing to do with the TCO of Windows, and everything to do with the pieces that our IT department has layered on top of it. I'm quite sure that any platform can be messed up by people rolling out bad tools.

    Hasani, you're 100% right - the final result is all that counts.
  • LarryOsterman said:
    > the issues I had had absolutely nothing to do with the TCO of Windows, and everything to do with the pieces that our IT department has layered on top of it.

    I wasn't referring to your problems logging in to the MS network. I was thinking of the fact that your install crashed during boot, that you had to waste time fighting with video drivers, and that even though you had a legal copy you had to go through pains to get a license key before you could finish. These are all limitations of the platform, not necessarily due to your organization.

    I have been coerced into doing free Windows support due to various family obligations [I don't run Windows myself], and I've often had to fight with blue-screens during install, tracking down long-lost license keys, and driver issues.... I admit things like that can happen with Linux too, but then again when you pay $$$ for an MS OS you'd hope that these issues wouldn't occur. It makes me feel a bit better that even knowledgeable MS employees have similar issues [although it doesn't make me want to personally switch to using MS products].
  • Vince, the reinstall problems were because I took a Win2K machine, upgraded it to XP, XP SP2 (pre RTM), removed XP SP2 (pre RTM) (several times), upgraded to XP SP2 final, and then attempted to reinstall XP RTM on top of this munged XP SP2 machine.

    I doubt that any customer would ever see this behavior. The video stuff is the fault of NVidia (who provided the video card driver). And the license key again is not a customer situation - it's because this is a Microsoft owned machine. I have the license keys for all my machines easily available.
  • Something I've been wondering about these builds: I recently noticed that my copy of Windows 2000 has a build number of 2195, Windows 2003 is build number 3790, and some of the Longhorn preview sites are mentioning build numbers in the 5000 range. Is this indicative of an increasing number of pre-release builds in each version, or are those numbers sequential through all the versions? If they're sequential, where and when did they start?
  • Brrok Moses: Build numbers always increase, but are not sequential - they may jump whenever Windows release management feels like it (usually around milestones - beta, etc).

    For example, suppose today's build is 57, and we need to branch (=fork) the sourcebase into 2 branches - Beta (to be released soon) and Main (to be released later). In that case, Beta gets allocated some number (say 58-99), and Main jumps to 100.

    As for Windows NT, here be numbers: http://en.wikipedia.org/wiki/Windows_NT
  • Now what I want to know is: does this stuff you checked in and built have anything to do with this post of yours? At least then we will know we are getting close to knowing what your team has been working on for 3 years.
    http://blogs.msdn.com/larryosterman/archive/2005/02/01/364840.aspx

    Grumbles, while I like "the wife" and I think she definitely keeps you on your toes, she has insider information.
  • Jeff, yes, Valorie does have insider information :)

    I've actually dropped enough hints and a number of people have pointed to the winhec slide deck for what my team is doing.

    And yes, the shackles on talking about what I'm doing are about to go off. And when I RI the feature I'm working on, I'm going to be shouting it to the heavens.
  • Maybe you should have installed the timebuild instead. ;) If nothing else, it'd be interesting. (Of course, the chances of all your domain-specific stuff working aren't high.)
  • foxyshadis: what an "interesting" idea.

    But I'm not sure I'm willing to download the timebuild to try it out on my home machine... I'll put the beta on it, but not a timebuild... :)
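
The build-number example from the comments above (today's build is 57, Beta is allocated 58-99, Main jumps to 100) can be sketched in a few lines of Python. This only illustrates the arithmetic in that comment. The `Branch` class and `fork` function are hypothetical names made up for the sketch, not anything from the actual Windows build tools.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Branch:
    """Hypothetical model of a source branch that hands out build numbers."""
    name: str
    next_build: int
    last_build: Optional[int] = None  # None means the range has no upper bound

    def cut_build(self) -> int:
        """Return the next build number for this branch and advance the counter."""
        if self.last_build is not None and self.next_build > self.last_build:
            raise RuntimeError(f"{self.name} has used up its reserved build numbers")
        number = self.next_build
        self.next_build += 1
        return number


def fork(current_build: int, main_start: int) -> Tuple[Branch, Branch]:
    """Split the source base after `current_build`.

    Beta gets the reserved range current_build+1 .. main_start-1,
    and Main jumps ahead to start at main_start.
    """
    beta = Branch("Beta", current_build + 1, main_start - 1)
    main = Branch("Main", main_start)
    return beta, main


# The numbers from the comment: today's build is 57, Main jumps to 100.
beta, main = fork(current_build=57, main_start=100)
first_beta = beta.cut_build()   # 58
first_main = main.cut_build()   # 100
```

Build numbers on either branch only ever go up, but a reader looking at Main sees a jump from 57 to 100, which is why the numbers increase without being sequential.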