Tales Of A Paranoid SysAdmin (Part 2)

"Welcome back! You join us as Alex is trying to decide whether to act out his Star Wars fantasy with an R2 detour (D2-er, get it? Maybe not). With several hundreds of newly acquired gigs in the servers, will he risk upgrading from the so-last-decade Windows Server 2008 to the shiny new R2 edition? Especially now SP1 is out there."

In fact, now that I have plenty of room for new Hyper-V VMs it seemed like it was worth a try. As long as ADPrep doesn't screw up my Active Directory I can export the existing domain controller and other server VMs and then upgrade imported-alongside copies. If it all goes fruit-dimensional I can just dump the new VMs and fire up the old ones again. And if it does all work out OK I'll be less worried about upgrading the physical machine installations of 2008 that host the VMs.

So early on a Saturday morning I start the process. I've always dreaded running ADPrep since the time I tried to upgrade a box that started life on NT4 as an Exchange Server, was upgraded to Windows 2000 Server, and then upgraded again to Windows 2003 Server. The NT to 2000 upgrade required two days playing with ADSIEdit afterwards, and the 2000 to 2003 upgrade destroyed the domain altogether. However, this time the ADPrep 2008 to R2 upgrade ran fine on both forest and domain, so it was all looking peachy.

Have you ever wondered why things that go well are compared to peaches, while things that go wrong are pear-shaped? Especially as my wife can confirm that I will have absolutely nothing to do with hairy fruit (but that's another story).

And now I can expand the size of the VMs' disks in Hyper-V Manager and then extend the volumes using the Storage Management console within each VM's O/S to get the requisite 15 GB of free space. Then bung in the DVD, cross my fingers and toes, mutter a short prayer to the god of operating system upgrades, and hit Install. Except that it says I have to stop or uninstall Active Directory Federation Services (ADFS) first.

So I go and read about upgrading ADFS. This doc on MSDN for upgrading and uninstalling ADFS goes through all the things you need to do with IIS configuration, PowerShell scripts, and editing the Registry to properly remove the standalone v 2.0 installation. But another says that the R2 upgrade will just remove it anyway. There is an ADFS Role in 2008 R2, but note that this is ADFS 1.1 not 2.0. And I never managed to make this role work anyway; probably because I didn't do all the uninstall stuff first. If you want to run ADFS 2.0 I suggest you follow the full uninstall and clean-up instructions before you upgrade to R2. Then, after you upgrade the O/S, just download and install the ADFS 2.0 setup file for 2008 R2 (make sure you select RTW\W2K8R2\amd64\AdfsSetup.exe on the download page) instead of enabling the built-in Federation Service Role.

Next, install the 72 updates for R2 (thank heavens for WSUS) and then install SP1. And then some more updates. But, finally, my primary (FSMO) domain controller was running again. And most of the 100 or so errors and warnings in Event Viewer had stopped re-occurring. Except for a couple of rather worrying ones. In particular: "The DHCP service has detected that it is running on a DC and has no credentials configured..." and "Volume Shadow Copy Service error: Unexpected error calling routine RegOpenKeyExW(-2147483646,SYSTEM\CurrentControlSet\Services\VSS\Diag,[account name]). hr = 0x80070005, Access is denied".
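If the numbers in that VSS message look cryptic, here is a minimal Python sketch for decoding them. It assumes the first argument to RegOpenKeyExW is one of the predefined registry root handles printed as a signed 32-bit integer (HKEY_LOCAL_MACHINE, 0x80000002, prints as -2147483646); the helper names are my own, not part of any Windows tool.

```python
# Predefined registry root handles as defined in the Windows SDK headers.
PREDEFINED_HKEYS = {
    0x80000000: "HKEY_CLASSES_ROOT",
    0x80000001: "HKEY_CURRENT_USER",
    0x80000002: "HKEY_LOCAL_MACHINE",
    0x80000003: "HKEY_USERS",
}


def to_hex32(signed_value):
    """Render a signed 32-bit integer as the unsigned hex form used for HRESULTs."""
    return "0x%08X" % (signed_value & 0xFFFFFFFF)


def hkey_name(signed_value):
    """Map a signed handle value, as printed in the log, to a registry hive name."""
    unsigned = signed_value & 0xFFFFFFFF  # reinterpret as unsigned 32-bit
    return PREDEFINED_HKEYS.get(unsigned, "unknown (%s)" % to_hex32(signed_value))


print(hkey_name(-2147483646))   # HKEY_LOCAL_MACHINE
print(to_hex32(-2147024891))    # 0x80070005, the "Access is denied" HRESULT
```

So the message is saying that something was refused access to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\VSS\Diag; it just doesn't always say who.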

Solving the VSS error is supposed to be easy - you can tell which account failed to access the Registry key from the message. Except that there is no account name in my error message. In this (not unknown) case, the trick with this VSS error, so they say, is to locate another error that occurred at the same time - which is usually the cause of the VSS error. In my case it seemed like it was the DHCP error, and this page on the Microsoft Support site explains how to fix it. I've never had this error before in Server 2008, but the fix they suggest seems to have cured the DHCP error.
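The "find whatever else went wrong at the same moment" trick is easy to do by eye for one error, but a rough Python sketch shows the idea for a longer log. Everything here is an assumption for illustration: the event records are made-up tuples of (timestamp, level, source, message), as you might get from exporting the log to a file, and the 30-second window is an arbitrary choice.

```python
from datetime import datetime, timedelta

# Made-up sample records: (timestamp, level, source, message).
events = [
    (datetime(2011, 3, 5, 9, 14, 2), "Error", "VSS", "Unexpected error calling routine RegOpenKeyExW"),
    (datetime(2011, 3, 5, 9, 14, 3), "Warning", "DHCP", "Running on a DC with no credentials configured"),
    (datetime(2011, 3, 5, 11, 40, 0), "Error", "DNS", "Name resolution timed out"),
]


def errors_near(events, source, window=timedelta(seconds=30)):
    """For each event from `source`, collect other errors/warnings logged within `window`."""
    results = []
    for when, level, src, message in events:
        if src != source:
            continue
        nearby = [
            e for e in events
            if e[2] != source
            and e[1] in ("Error", "Warning")
            and abs(e[0] - when) <= window
        ]
        results.append(((when, message), nearby))
    return results


for (when, message), nearby in errors_near(events, "VSS"):
    print(when, message)
    for other in nearby:
        print("   possibly related:", other[2], "-", other[3])
```

An empty "possibly related" list is a result too: it is exactly the situation where the VSS error turns up with nothing else alongside it.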

Deleting a DHCP entry in DHCP Manager and then viewing DNS Manager shows it removes that machine from the DNS as expected, and ipconfig /renew on that machine creates a new DHCP entry that replicates to DNS. And no errors in Event Viewer, which hopefully indicates that it's working as it should. However, this hasn't so far cured the VSS error, and now there are no other errors occurring at the same time. But after some searching I found this page that explains why it's happening and says that you can ignore it.

Next I can upgrade the backup domain controller, and for some reason I don't get the same DHCP error even though it also runs DHCP (with a separate address range in case the primary server is down). Very strange... unless it was initially an AD replication issue when only one DC was running. Who knows? Though I do get the same VSS error here, confirming that it wasn't actually the DHCP problem causing it last time.

Anyway, at last I can tackle the more nerve-wracking upgrade of the base O/Ss of the machines that host the VMs. This time setup stops with a warning that I have to stop the Hyper-V service. However, this blog post from the Hyper-V team says I can just ignore this message and they are correct - it worked. The VMs fired up again afterwards OK, though the Server 2003 one did require an update to the Hyper-V Integration Services, which means you have to stop it again and add the DVD drive to it in Hyper-V Manager because you forgot to do that first...

One remaining cause of concern is the error on the primary DC that "Name resolution for the name [FQDN of its own domain] timed out after none of the configured DNS servers responded". NSLookup finds it OK, Active Directory isn't complaining, and everything seems to be working at the moment so it's on the "pending" list. A web search reveals hundreds of reports of this error, and an equally vast range of suggestions for fixing it - including buying a new router and changing all the underlying transport settings for the TCP protocol. Think I'll give that a miss for the time being.

Of course, a few more upgrade annoyances arose over the next couple of days. On the file server that is also the music server the upgrade to R2 removes the Windows Media Service role. After the upgrade you have to download the Windows Server 2008 R2 Streaming Media Service role from Microsoft and install it, then enable the role in Server Management and configure the streaming endpoints again. And, of course, it's been so long since you did this last time you can't remember what the settings were. Don't depend on the help file to be much use either.

And as with other upgrades and service packs, the R2 upgrade silently re-enables all of the network connections in the Hyper-V host machine's base O/S, so that the connections to the outside world are enabled for the machine that is typically on the internal network (see this post for details). You need to go back into the base O/S's Network Connections dialog and disable those you don't want. However, in R2 you can un-tick the Allow management operating system to share this network adapter option in Virtual Network Manager to remove these duplicated connections from the base O/S so that updates and patches applied in the future do not re-enable them.

But of much more concern were the effects of the upgrade on my web server box. After it was all complete, patched, SP'd, and running again I decided to have a quick peep at the IIS and firewall settings. Without warning the update had enabled the FTP Service (which I don't run) and set it to auto-start, then added a heap of Allow rules to the Public profile to allow FTP in and out. Plus several more to allow DCOM in for remote management. As usual, after any update, remember to check your configuration for unexpected changes. If you don't need the FTP service, remove it as a Feature in Server Manager, which prevents it from automatically enabling the firewall rules.

And a day or so later I discovered that the R2 upgrade had also set the SMTP service to Manual start, so the websites and WSUS could no longer send email. The service started OK and so I set it to Automatic start and thought no more about it until WSUS began reporting three or four times a day that it was unable to send email. Yet testing it in the WSUS Email Options dialog reveals that it can send email. So I added the configuration settings in the IIS7 Manager for SMTP (even though I never had to do this before), and it made no difference. Every day I get an email from WSUS with all the newly downloaded updates listed, and three Event Log messages saying it can't send email. Perhaps next week it will start sending me emails to tell me it can't send email...
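Surprises like the FTP and SMTP start-mode changes are easier to spot if you record the service settings before the upgrade and compare them afterwards. Here is a minimal Python sketch of the comparison step only. It assumes you have already captured each snapshot as a simple mapping of service name to start mode; the names and values below are illustrative, not a dump from my servers.

```python
def diff_start_modes(before, after):
    """Return {service: (old_mode, new_mode)} for every service whose start mode changed."""
    changes = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name, "(absent)")
        new = after.get(name, "(absent)")
        if old != new:
            changes[name] = (old, new)
    return changes


# Illustrative snapshots taken before and after the upgrade.
before = {"FTPSVC": "Disabled", "SMTPSVC": "Auto", "W3SVC": "Auto"}
after = {"FTPSVC": "Auto", "SMTPSVC": "Manual", "W3SVC": "Auto"}

for service, (old, new) in diff_start_modes(before, after).items():
    print("%s: %s -> %s" % (service, old, new))
```

The same comparison works for any other before-and-after list, such as the enabled firewall rules per profile.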

Finally, by late Monday evening, everything was up and running again. OK, there are still a couple of Event Log errors and warnings to track down and fix, but mostly it all seems to be working. And, I guess, the whole process was a lot less painful than I'd expected. O/S upgrades have certainly improved over the years, and I have to say that the server guys really did an excellent job with this one. It was certainly worth it just to be able to run the latest roles, and - at least so far - I even have proper working mouse pointers in all the VMs!

What I did notice is how, for a short period post upgrade, life seems a lot more exciting. Well, at least the server-related segments of my day do. Each reboot is accompanied by that wonderful sense of anticipation: Have I broken it? Will it restart? Will I get some exciting new errors and warnings?

It's as though the new O/S is a bit delicate and you need to handle it gently for a while. Like when you've just glued the handle back on your wife's favorite mug you broke when doing the dishes, and you're not sure if it will all just fall to pieces again. Until you're really convinced it's settled down you don't want to click too quickly, or wave the mouse about too much. Or open too many applications at one go in case it gets annoyed, or just can't cope until it's finally unpacked its suitcase and settled in.

Or maybe I really do need to get a life...
