Random Disconnected Diatribes of a p&p Documentation Engineer
What's the best way to document an API? It's a question that came up when we were documenting the Enterprise Library 5.0 project a while ago, and has resurfaced recently with another project I unexpectedly found myself attached to. It's also one of those annoying questions that typically offer three dozen wildly varying answers, none of which really appears to provide the optimum result. Yet good documentation of APIs is vital for developers to get the best from the code.
While I'm not actually a developer as such, I do write quite a lot of code. Most of it is examples for others to use and reuse, though sometimes I take my life in my hands and actually write stuff that I run in a production environment. And, inevitably, most of the samples I write are concerned with the newest, undocumented, and often beta technologies. So all I have to work with is Visual Studio Object Browser, IntelliSense, and (if I'm feeling particularly inquisitive) .NET Reflector.
Of course, tools such as Sandcastle and others can generate the HTML docs from the source code automatically, and these will (hopefully) contain meaningful summaries and parameter descriptions written by the original class developer within the source code. So all I need do is provide a brief explanation of any particular intricacies when using the class or class member, and add a short sample of code that shows how that class member works. Surely I can turn out all the required content in a few hours...?
But it's generally not that simple once you start to think about what developers might expect to find when they hit F1 in Visual Studio, or Bing for a class or member reference page. For example:
In an ideal world I would write one or more examples for each member of each class in the API. But should I write samples that use several members of the class that I can reuse in more than one class member page? This sounds like a time-saver, but it generally results in a sample that is over-complicated, harder to understand, and liable to bury some members in the midst of a big block of code.
More to the point, do I actually have the resources available to write specific samples for every member of every class in an API that, when you include member overloads, might have many hundreds of individual pages? Years ago when I was documenting the API for Active Server Pages 1.0 (in the now almost forgotten pre-.NET era), it was easy enough to document the very few members of the five classes that made up ASP 1.0. But even a reasonably small framework such as Enterprise Library 5.0 has more than 1000 pages in the API reference section.
The path we took with Enterprise Library was to avoid writing samples in the API pages, and instead document the key scenarios (both the typical ones and some less common ones) in the main product documentation. This allows us to explain the scenario and show code and relevant details for the classes and class members that accomplish the required tasks. In fact, even getting this far only came about after some reconsideration of our documentation process (see Making a Use Case for Scenarios).
So, if I were documenting the file access classes in System.IO I could spend several months writing different and very short samples for each member of the File, FileInfo, Directory, DirectoryInfo, TextReader, FileStream, Path, and many more classes. Or I could try and write a few meaningful examples that use all the methods of a class and include them in several member pages, though it's hard to see how this would be either meaningful or easy to use as a learning aid. And it's certain to result in unrealistic examples that are very unlikely to be "copy and paste ready".
Instead, perhaps the way forward is to make more use of scenarios? For example, I could decide what the ten main things are that people will do with the File class; and then write specific targeted examples for each one. These examples would, of necessity, make use of several of the members of the class and so I would put them in the main page for the class instead of in the individual class member pages. And each one of these scenario solutions could be a complete example that is "copy and paste ready", or a detailed explanation and smaller code examples if that better suits the scenario. Each class member page would then have a link "scenarios and code examples" that points to the parent class page.
The problem is that people tell me developers expect to see code in the class member page, and just go somewhere else if there isn't any. What they don't tell me is how often developers look at the code and then go somewhere else because the one simple code example (or the much repeated over-complex example) doesn't satisfy their particular scenario.
For example, if you want to find out how to get the size of a disk file where do you start looking? In the list of members of the File class, or the FileInfo class? Or search for a File.Length property? Or a File.GetLength method? If the File class had a scenario "Find the properties of a disk file" you would probably figure that it would be a good place to look. The example would show that you need to create a FileInfo instance; and that you can then query the Length property of that instance.
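Just to illustrate the idea, here's a minimal sketch of what that scenario example might look like (the file path is, of course, just a placeholder):

```csharp
using System;
using System.IO;

class FilePropertiesScenario
{
    static void Main()
    {
        // Placeholder path used purely for illustration.
        string path = @"C:\Temp\report.txt";

        // The File class has no Length member; instead you create a
        // FileInfo instance for the file and query its properties.
        FileInfo info = new FileInfo(path);

        Console.WriteLine("Size: {0} bytes", info.Length);
        Console.WriteLine("Created: {0}", info.CreationTime);
        Console.WriteLine("Read-only: {0}", info.IsReadOnly);
    }
}
```

Short, specific, and copy-and-paste ready, which is exactly what a scenario page should be aiming for.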
Or, when using the SmtpClient class to send an email, one of the scenarios would be "Provide credentials for routing email through an SMTP server". That way the majority of examples would just use the default credentials, keeping them simple for the most typical scenarios. If the developer needs to create and set credentials, the specific scenario would show how to create different kinds of NetworkCredential instances that implement ICredentialsByHost for use with the SmtpClient class, but wouldn't need to include all the gunk for adding attachments and other non-relevant procedures.
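Again purely as a sketch (the server, port, and account details here are made up), that credentials scenario might boil down to something like this:

```csharp
using System.Net;
using System.Net.Mail;

class SendWithCredentials
{
    static void Main()
    {
        // Placeholder SMTP server and account details for illustration only.
        SmtpClient client = new SmtpClient("mail.example.com", 25);

        // Instead of relying on the default credentials, supply a
        // NetworkCredential (which implements ICredentialsByHost).
        client.UseDefaultCredentials = false;
        client.Credentials = new NetworkCredential("username", "password");

        MailMessage message = new MailMessage(
            "sender@example.com", "recipient@example.com",
            "Test message", "Sent using explicit SMTP credentials.");

        client.Send(message);
    }
}
```

Everything about attachments, HTML bodies, and the rest stays out of the way, which is the whole point of the scenario approach.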
I know it would be impossible to always have the exact scenario and code example that would satisfy the needs of every developer each time they use the API reference pages, but it does seem like the scenarios approach could cover the most common tasks and requirements. It could also be easily extended over time if other scenarios become obvious, or in response to specific demands. OK, so it would mean a couple of extra mouse clicks to find the code, but that code should more closely resemble the code you need to use, and be easier to adapt and include in your project.
Why not tell me what you think? How do you use an API reference, and - more important - what do you actually want to see in it?
In the days when I used to visit my Uncle Gerald, who was a keen gardener, he would often present me around this time of year with a large bundle of rhubarb and the instruction to "give these to your Mother and wish her a moving Easter". I suspect that the comment was somehow related to the laxative properties of rhubarb. We haven't had rhubarb in our house lately, but I still managed to have a moving Easter. I was moving all my VMs from a dead server to the backup one.
Yep. Woke up on Good Friday morning with the sun shining and plans for a nice relaxing day in the garden only to find the main server for my network sulking glumly in the corner of the server cabinet with no twinkly lights on the front and no whooshing of stale air from the back. Poke the "on" button and it runs for five seconds then dies again. Start to panic. Keep trying, no luck. Open the box and peer hopefully around inside. Nothing missing, no smoke or burnt bits, nothing obviously amiss.
Wiggle some wires and try again (the total extent of my hardware fault diagnosis capabilities). Disconnect the new hard drive I fitted a couple of weeks ago. Look in the BIOS logs, but they're empty. The most I could get it to do on one occasion was run as far as the desktop before it just died again. So, in desperation, phone a local Dell-approved engineer who offers to come and fix it the same day. But after three hours of testing, swapping components, general poking about with a multi-meter, and much huffing and mumbling, he comes to the sad conclusion that the motherboard is faulty. And a new one is going to cost around 500 pounds in real money. Plus shipping and fitting.
The server is only two and a half years old (see Hyper-Ventilation, Act I), and I buy Dell stuff because it usually outlasts the lifespan of the software I run and ends up being donated to a needy acquaintance (with the hard drives removed, of course). But I suppose the sometimes extreme temperatures reached in the server cabinet can't have helped, especially as we've had a couple of very warm years and last week was a scorcher here. Though it has left me trusting the backup server I bought at the same time rather less.
Ah, but surely there's no problem when a server fails? Just fire up the exported VM image backups on the other machine and I'm up and running again. Except that, unfortunately, I've been less than strict about setting things up generally on the network. Thing is, I was planning for a disaster such as a disk failure, which is surely more likely than a motherboard failure. With a disk failure it's just a matter of replacing the disk then restoring from a backup or importing the exported VMs. But a completely dead box raises lots of different issues. I know I should have nothing running within the Hyper-V host O/S, but somehow I ended up with one server having the backup domain controller running on the host O/S and the other (the main one) with the host O/S running WSUS, the SMTP server, Windows Media Services, the scheduled backup scripts, the website checker, and probably several other things I haven't discovered yet.
Therefore, while the main hosted server VMs (the FSMO domain controller, web server, ISA server, and local browser) fired up OK on the backup server, all the other stuff that makes the network work was gone. And then it got worse. The backup of the FSMO domain controller was a week old, and so it kept complaining that it didn't think the FSMO role was valid. And none of the recommended fixes using the GUI tools or ntdsutil worked. So I ended up junking the FSMO domain controller, forcing seizure of the roles on the backup domain controller, and then using ntdsutil to clean up the AD metadata. Afterwards, I discovered this document about virtualizing a domain controller which says "Do not use the Hyper-V Export feature to export a virtual machine that is running a domain controller" and explains why.
I certainly recommend you read the domain controller document. There's a ton of useful information in there, even though much is aimed at enterprise-level usage. However, when you get to the part about disabling write caching and using the virtual SCSI disk controller, look at this document that says you must use the virtual IDE controller for your start-up disk in a VM. But, coming back to the issue of backing up/exporting a VM'd domain controller, it looks like the correct answer is to run a regular automated backup within the DC's VM to a secure networked location instead. I've set it up for both the virtual and physical DCs to run direct to a local share and then get copied to the NAS drive, which will hopefully give me a fighting chance of getting my domain back next time. After you set up a scheduled backup in Windows Server Backup manager you can open Task Scheduler, find the task in the Microsoft | Windows | Backup folder, and change the schedule if you want something different from one or more times a day. And make sure any virtual DC VMs are set to always start up when the host server starts so that the FSMO DC can confirm it actually is the valid owner of the roles.
It does seem that a workable last-resort disaster recovery strategy, if a DC does fail, is to force its removal from the domain and rebuild it from scratch. As long as you have one DC still working, even if it's not the FSMO, you should still be able to get (most of) your domain back by using it to seize the FSMO roles that were held by the dead DC and then cleaning it up afterwards. However, I wouldn't recommend this as a back-up strategy.
So after spending most of the holiday weekend with my head in the server cabinet, I managed to get back to some level of normality. I'm still trying to resolve some of the issues, and still trying to figure out the ideal solution for virtualized and physical domain controllers. There's tons of varied advice on the web, and all of it seems to point to running multiple physical servers to overcome the problem of a virtualized DC not being available when a host server starts. Nobody is suggesting running Hyper-V on the domain controller host. However, my backup server that is valiantly and temporarily supporting the still working remnants of my network has both Domain Services (it's the FSMO domain controller) and Hyper-V roles enabled (it's hosting all the Hyper-V VMs).
Even though no-one seems to recommend this, they do grudgingly agree that it works, and it does seem to be one way to cope with redundancy and start-up issues on a very small and lightly loaded network like mine; so when I get a new server organized it will also be a DC. Meanwhile I've created a "server operations" VM that contains all the other stuff that I lost - WSUS, SMTP server, Media Services, scheduled backup scripts, web site monitoring, etc. That way all I actually need on the base hosting server is Active Directory (so it is a DC) and the Hyper-V role with the correct network configuration. Oh, and the correct UPS configuration. And probably more esoteric setup stuff I'll only find out about when I get there.
Mind you, after I complained to my Dell sales guy about the failed server he's done me an extremely good deal on a five year pro support warranty with full onsite maintenance for the new box. So next time it fails I can just phone them and tell them to come and fix it. And until it arrives and is working so that I again have some physical server redundancy, I can only ruminate as to whether the fear of waking up to a dead network is as good a laxative as rhubarb...
My wife has been asking me why I haven't written about the recent Royal Wedding. Mainly it's because, surprisingly, I didn't receive an invitation; and so was unable to apply my usual highly perceptive and amazingly incisive documentation engineering capabilities to the occasion without first-hand, on-site experience. So I decided to write about the Royal Mail instead.
It seems that an outside broadcast presenter at one of our local radio stations phoned Royal Mail to ask where the post boxes are located in his town so that he could post letters to his listeners as he travelled around the locality. They told him that the information was "not available to the public", so - just to see what would happen - he applied officially for the details under the Freedom of Information Act.
The letter he got back stated that "releasing information on the locations of post boxes would clearly be likely to prejudice the commercial interests of Royal Mail", and that such information "would undermine their commercial value, significantly reducing Royal Mail's ability to exploit the information commercially". They even said that there was "significant public interest" in keeping the information private. OK, so I'm only an insignificant member of the public, but I've never shown any interest at all in keeping the whereabouts of our local post box a secret...
Obviously nobody at Royal Mail uses a road atlas, phone directory, sat-nav, or mapping website, or they would have discovered that all of these show the locations of Royal Mail's post office branches. Surely these, each measuring several hundred square yards and often occupying prime city centre sites, are more "commercially valuable" than the two square feet of pavement (sidewalk) taken up by a post box? Should I write and tell them about this alarming leak of commercially valuable information?
Of course, it could be that they are right about keeping valuable commercial locations secret. Just in case I've emailed the press office of a couple of national supermarket chains and hi-fi retailers, all of whom have a "Find your nearest branch" page on their websites. I haven't had any replies yet, but I confidently expect this dangerous feature to disappear from their sites very soon. I mean, just think of the commercial value of the ten acre town-centre site our local Tesco store inhabits. And they even have the naivety to display a huge sign on the roof!
And the same could just as easily apply to us here at Microsoft. I'm sure that the domain name alone is worth a few bob (dollars), and the huge number of sites and pages that hang off it must be of not inconsiderable commercial value. I need to warn our IT people that they should immediately remove us from all the DNS servers around the world, and disguise the sites so that people don't encounter them by accident and reveal their location. Just think how that could undermine their commercial value!
Mind you, as our roving local radio reporter pointed out, several people probably already know where the post boxes in his town are. Let's face it, a five foot high bright red box that, in many cases has been there since Victorian times, is hard to disguise. And if you were that interested, you'd only need to follow a post van on its rounds to find them. They even help you by painting the words "Royal Mail" in big letters on the side of the vans.
And I've just realized why I didn't get my invitation to the Royal Wedding! Obviously nobody would tell Kate where to find a post box to send it...
"Welcome back! You join us as Alex is trying to decide whether to act out his Star Wars fantasy with an R2 detour (D2-er, get it? Maybe not). With several hundreds of newly acquired gigs in the servers, will he risk upgrading from the so-last-decade Windows Server 2008 to the shiny new R2 edition? Especially now SP1 is out there."
In fact, now that I have plenty of room for new Hyper-V VMs it seemed like it was worth a try. As long as ADPrep doesn't screw up my Active Directory I can export the existing domain controller and other server VMs and then upgrade imported-alongside copies. If it all goes fruit-dimensional I can just dump the new VMs and fire up the old ones again. And if it does all work out OK I'll be less worried about upgrading the physical machine installations of 2008 that host the VMs.
So early on a Saturday morning I start the process. I've always dreaded running ADPrep since the time I tried to upgrade a box that started life on NT4 as an Exchange Server, was upgraded to Windows 2000 Server, and then upgraded again to Windows 2003 Server. The NT to 2000 upgrade required two days playing with ADSIEdit afterwards, and the 2000 to 2003 upgrade destroyed the domain altogether. However, this time the ADPrep 2008 to R2 upgrade ran fine on both forest and domain, so it was all looking peachy.
Have you ever wondered why things that go well are compared to peaches, while things that go wrong are pear-shaped? Especially as my wife can confirm that I will have absolutely nothing to do with hairy fruit (but that's another story).
And now I can expand the size of the VMs' disks in Hyper-V Manager and then extend the volumes using the Storage Management console within each VM's O/S to get the requisite 15 GB of free space. Then bung in the DVD, cross my fingers and toes, mutter a short prayer to the god of operating system upgrades, and hit Install. Except that it says I have to stop or uninstall Active Directory Federation Services (ADFS) first.
So I go and read about upgrading ADFS. This doc on MSDN for upgrading and uninstalling ADFS goes through all the things you need to do with IIS configuration, PowerShell scripts, and editing the Registry to properly remove the standalone v 2.0 installation. But another says that the R2 upgrade will just remove it anyway. There is an ADFS Role in 2008 R2, but note that this is ADFS 1.1 not 2.0. And I never managed to make this role work anyway; probably because I didn't do all the uninstall stuff first. If you want to run ADFS 2.0 I suggest you follow the full uninstall and clean-up instructions before you upgrade to R2. Then, after you upgrade the O/S, just download and install the ADFS 2.0 setup file for 2008 R2 (make sure you select RTW\W2K8R2\amd64\AdfsSetup.exe on the download page) instead of enabling the built-in Federation Service Role.
Next, install the 72 updates for R2 (thank heavens for WSUS) and then install SP1. And then some more updates. But, finally, my primary (FSMO) domain controller was running again. And most of the 100 or so errors and warnings in Event Viewer had stopped re-occurring. Except for a couple of rather worrying ones. In particular: "The DHCP service has detected that it is running on a DC and has no credentials configured..." and "Volume Shadow Copy Service error: Unexpected error calling routine RegOpenKeyExW(-147483646,SYSTEM\CurrentControlSet\Services\VSS\Diag,[account name]). hr = 0x80070005, Access is denied".
Solving the VSS error is supposed to be easy - you can tell which account failed to access the Registry key from the message. Except that there is no account name in my error message. In this (not unknown) case the trick, so they say, is to locate another error that occurred at the same time, which is usually the cause of the VSS failure. In my case it seemed like it was the DHCP error, and this page on the Microsoft Support site explains how to fix it. I've never had this error before in Server 2008, but the fix they suggest seems to have cured the DHCP error.
Deleting a DHCP entry in DHCP Manager and then checking DNS Manager shows that the machine is removed from DNS as expected, and ipconfig /renew on that machine creates a new DHCP entry that replicates to DNS. And no errors in Event Viewer, which hopefully indicates that it's working as it should. However, this hasn't so far cured the VSS error, and now there are no other errors occurring at the same time. But after some searching I found this page that explains why it's happening and says that you can ignore it.
Next I can upgrade the backup domain controller, and for some reason I don't get the same DHCP error even though it also runs DHCP (with a separate address range in case the primary server is down). Very strange... unless it was initially an AD replication issue when only one DC was running. Who knows? Though I do get the same VSS error here, confirming that it wasn't actually the DHCP problem causing it last time.
Anyway, at last I can tackle the more nerve-wracking upgrade of the base O/Ss of the machines that host the VMs. This time setup stops with a warning that I have to stop the Hyper-V service. However, this blog post from the Hyper-V team says I can just ignore this message and they are correct - it worked. The VMs fired up again afterwards OK, though the Server 2003 one did require an update to the Hyper-V Integration Services; which means you have to stop it again and add the DVD drive to it in Hyper-V Manager because you forgot to do that first...
One remaining cause of concern is the error on the primary DC that "Name resolution for the name [FQDN of its own domain] timed out after none of the configured DNS servers responded". NSLookup finds it OK, Active Directory isn't complaining, and everything seems to be working at the moment so it's on the "pending" list. A web search reveals hundreds of reports of this error, and an equally vast range of suggestions for fixing it - including buying a new router and changing all the underlying transport settings for the TCP protocol. Think I'll give that a miss for the time being.
Of course, a few more upgrade annoyances arose over the next couple of days. On the file server that is also the music server, the upgrade to R2 removes the Windows Media Services role. After the upgrade you have to download the Windows Server 2008 R2 Streaming Media Services role from Microsoft and install it, then enable the role in Server Manager and configure the streaming endpoints again. And, of course, it's been so long since you did this last time you can't remember what the settings were. Don't depend on the help file to be much use either.
And as with other upgrades and service packs, the R2 upgrade silently re-enables all of the network connections in the Hyper-V host machine's base O/S, so that the connections to the outside world are enabled for the machine that is typically on the internal network (see this post for details). You need to go back into the base O/S's Network Connections dialog and disable those you don't want. However, in R2 you can un-tick the Allow management operating system to share this network adapter option in Virtual Network Manager to remove these duplicated connections from the base O/S so that updates and patches applied in the future do not re-enable them.
But of much more concern were the effects of the upgrade on my web server box. After it was all complete, patched, SP'd, and running again I decided to have a quick peep at the IIS and firewall settings. Without warning, the update had enabled the FTP Service (which I don't run) and set it to auto-start, then added a heap of Allow rules to the Public profile to allow FTP in and out. Plus several more to allow DCOM in for remote management. As usual, after any update, remember to check your configuration for unexpected changes. If you don't need the FTP service, remove it as a Feature in Server Manager, which prevents it from automatically enabling the firewall rules.
And a day or so later I discovered that the R2 upgrade had also set the SMTP service to Manual start, so the websites and WSUS could no longer send email. The service started OK and so I set it to Automatic start and thought no more about it until WSUS began reporting three or four times a day that it was unable to send email. Yet testing it in the WSUS Email Options dialog reveals that it can send email. So I added the configuration settings in the IIS7 Manager for SMTP (even though I never had to do this before), and it made no difference. Every day I get an email from WSUS with all the newly downloaded updates listed, and three Event Log messages saying it can't send email. Perhaps next week it will start sending me emails to tell me it can't send email...
Finally, by late Monday evening, everything was up and running again. OK, there are still a couple of Event Log errors and warnings to track down and fix, but mostly it all seems to be working. And, I guess, the whole process was a lot less painful than I'd expected. O/S upgrades have certainly improved over the years, and I have to say that the server guys really did an excellent job with this one. It was certainly worth it just to be able to run the latest roles, and - at least so far - I even have proper working mouse pointers in all the VMs!
What I did notice is how, for a short period post upgrade, life seems a lot more exciting. Well, at least the server-related segments of my day do. Each reboot is accompanied by that wonderful sense of anticipation: Have I broken it? Will it restart? Will I get some exciting new errors and warnings?
It's as though the new O/S is a bit delicate and you need to handle it gently for a while. Like when you've just glued the handle back on your wife's favorite mug you broke when doing the dishes, and you're not sure if it will all just fall to pieces again. Until you're really convinced it's settled down you don't want to click too quickly, or wave the mouse about too much. Or open too many applications at one go in case it gets annoyed, or just can't cope until it's finally unpacked its suitcase and settled in.
Or maybe I really do need to get a life...
Oh dear. Here in this desolate and forgotten outpost of the p&p empire it's pretend-to-be-a-sysadmin time all over again. Daily event viewer errors about the servers running out of disk space and shadow copies failing (mainly because I had to disable them due to lack of disk space) are gradually driving me crazy. Will I finally have to abandon my prized collection of Shaun The Sheep videos, or risk my life by deleting my wife's beloved archive of Motown music? And, worse still, can I face losing all those TV recordings of wonderful classic rock and punk concerts? Or maybe (warning: bad pun approaching) I just need to find some extra GIGs to store the gigs.
Yep, I finally decided it was time to bite the bullet and add some extra storage to the two main servers that run my network and, effectively, my life. Surprisingly, my two rather diminutive Dell T100 servers each had an empty drive bay and a spare SATA port available, though I'll admit I had to phone a friend and email him some photos of the innards to confirm this. And he managed to guide me into selecting a suitable model of drive and cable that had a reasonable chance of working. The drives even fitted into the spare bays with some cheap brackets I had the forethought to order. Of course, it was absolutely no surprise when Windows blithely took no notice of them after a reboot. I never really expect my computer upgrades to actually work. But at least the extra heat from them will help to stop the servers freezing up during next winter's ice-age.
However, after poking around in the BIOS and discovering that I needed to enable the SATA port, everything suddenly sprang into life. For less than fifty of our increasingly worthless English pounds each server now has 320 brand new gigs available - doubling the previous disk space. Amazing. And after some reshuffling of data, and managing to persuade WSUS to still work on a different drive, I was up and running again.
Mind you, setting the appropriate security permissions and creating the shares for drives and folders was an interesting experience. One tip if you want to know how many user-configured shares there are on a drive is to open the Shadow Copies tab of the drive's Properties dialog. It doesn't tell you where they are, but just type net share into a command window to get a list that includes the path - though it includes all the system shares as well. And if you intend to change the drive letter, do it before you create the shares. If not, they disappear from Windows Explorer, but continue to live as hidden shares pointing to the old drive letter. You have to create new shares with the same name and the required path, and accept the warning message about overwriting the existing ones.
And now I can move a couple of the Hyper-V VMs to a different drive as well, instead of having all four on one physical drive. Maybe then it won't take 20 minutes for each one to start up after the monthly patch update and reboot cycle. So, being paranoid, I check the security permissions on the existing VM drive and the new one before I start and discover that the drive root folder needs to have special permissions for the "Virtual Machines" account. So here's a challenge - try and add this account to the list in the Security tab of the Properties dialog for a drive. You'll find, as I did, that there is no such account. Not even the one named NT VIRTUAL MACHINES mentioned in a couple of blog posts. But as the MS blogs and TechNet pages say that you can just export a VM, move it, and then import it, there should be no problem. Maybe.
Of course, they also say you can use the same name for more than one VM as long as you don't reuse the existing VM ID (un-tick this option in the Import dialog). Or you can use the same ID if you don't intend to keep the original VM. Obviously I can't run both at the same time anyway as they have the same DNS name and SIDs. So should I export the VM to the new drive, remove it from Hyper-V Manager, and then import it with the same ID? Or import it alongside the original one in Hyper-V Manager but allow it to create a new ID and then delete the old one when I find out if it works?
As the VM in question is my main domain controller and schema master, I'm fairly keen not to destroy it. In the end I crossed all my fingers and toes and let it create a new ID. And, despite my fears, it just worked. The newly imported VM fired up and ran fine, even though there are two in Hyper-V Manager with the same name (to identify which is which, you can open the Settings dialog and check the path of the hard disk used by each VM). And the export/import process adds the required GUID-named account permission to the virtual disk file automatically (though not to the drive itself, but it seems to work fine without).
What's worrying is how I never really expect things like this to just work. Maybe it's something to do with the aggravation suffered over the years fighting with NT4 and Windows 2000 Server, and the associated Active Directory and Exchange Server hassles I encountered then. I really must be paranoid about this stuff because I even insist on installing my Windows Updates each month manually rather than just letting the boxes get on with it themselves. So it was nice to see that Hyper-V continues to live up to its promise, and I'm feeling even more secure that my backup strategy of regularly exporting the machines and keeping multiple copies scattered round the network will work when something does blow up.
So anyway, having gained all the new gigs I need, should I finally risk my sanity altogether and upgrade the servers and Hyper-V VMs from Windows Server 2008 to Windows Server 2008 R2? I abandoned that idea last time because I didn't have the required 15 or so gigs of spare disk space for each one. But it seemed like as good a time as any to have another go at testing my reasonably non-existent sysadmin capabilities. Maybe I would even get properly working mouse pointers in the VMs with R2 installed.
So as they say in all the best TV shows (and some of the very dire ones), "Tune in next week to see if Alex managed to destroy his network by upgrading it to Windows Server 2008 R2..."
One of the joys of being a documentation engineer is the variety of projects I tackle. At the moment I'm sharing my time between two projects at diametrically opposite ends of the complexity and target audience spectrums. It even seems as though it requires my brain to work on different wavelengths at the same time. It's almost like a Rainbow Coalition, except that there's only me doing it.
For the majority of my working week I'm fighting to understand the intricacies of claims-based applications that use ASP.NET, WCF, and WIF; and to create Hands-On-Lab documents that show how you can build applications that use tokens, federated identity, and claims-based authentication. In between I'm writing documents that describe what a developer does, the skills they require, and the tools and technologies they use for their everyday tasks.
What's worrying is I'm not sure I really know enough about either of the topics. Claims-based authentication is a simple enough concept, but the intricacies that come into play when you combine technologies such as ASP.NET sessions, browser cookies, WCF, WIF, ADFS, and Windows Azure Access Control Service can easily create a cloud (ouch!) of confusion. Add to that interfacing with Windows Phone and, just to make matters even more complicated, SharePoint Server, and it's easy to find yourself buried in extraneous detail.
What became clear during research on these topics was why some people complain that there is plenty of guidance, but it doesn't actually tell you anything useful. I must have watched a whole week's worth of videos and presentations, and got plenty of information on the concepts and the underlying processes. But converting this knowledge into an understanding of the code is not easy. One look at a SharePoint web.config file that's nearly 1000 lines long is enough to scare anybody I reckon. Simply understanding one area of the whole requires considerable effort and research.
Contrast that with the other task of describing what a software developer does. If you ask real developers what they do, chances are the answer will be something like "write code" or "build applications". Yet when you read articles on how development methodologies such as agile work, you soon come to the conclusion that developers don't really have time to write code at all. Their working day is filled with stand-up meetings, code reviews, customer feedback consultations, progressive design evolution, writing unit tests, and consulting with other team members.
And what's becoming even more evident is that everything now is a cross-cutting concern. At one time you could safely assume that someone using Visual Basic was writing a desktop application, and that the developer using ASP.NET was building a website. Or the guy with a beard and sandals was writing operating system software in C++. Now we seem to use every technology and every language for every type of application. Developers need to know about all of them, and weave all of the various implementation frameworks into every application. And find time to do all that while writing some code.
For example, just figuring out how our claims-based examples work means understanding the ASP.NET and C# code that runs on the server, the WCF mechanism that exposes the services the client consumes, the protocols for the tokens and claims, how ACS and ADFS work to issue tokens, how they interface with identity providers, how WIF authentication and authorization work on the client, and how it interfaces with the ASP.NET sessions to maintain the credentials. And don't get me started on the additional complexities involved in understanding how SharePoint 2010 implements multiple authentication methods, how it exposes its own token issuer for claims, and how that interacts (or, more significantly, doesn't) with SharePoint groups and profiles.
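Of course, once WIF has done its work the code you actually end up looking at can be disarmingly small. As a rough sketch (assuming the original WIF 1.0 API in the Microsoft.IdentityModel.Claims namespace, rather than the later built-in claims types), dumping the claims attached to the current principal is just a loop:

```csharp
using System;
using System.Threading;
using Microsoft.IdentityModel.Claims;   // WIF 1.0 assembly

public static class ClaimsDump
{
    // Lists the claims that WIF attached to the current principal after
    // validating the token from the issuer (ACS, ADFS, or a custom STS).
    public static void WriteClaims()
    {
        IClaimsPrincipal principal = (IClaimsPrincipal)Thread.CurrentPrincipal;
        IClaimsIdentity identity = (IClaimsIdentity)principal.Identity;

        foreach (Claim claim in identity.Claims)
        {
            // ClaimType is a URI such as ".../claims/emailaddress";
            // Value is the string the identity provider asserted.
            Console.WriteLine("{0} = {1} (issuer: {2})",
                claim.ClaimType, claim.Value, claim.Issuer);
        }
    }
}
```

It's everything that has to happen before that loop runs - the redirects, the token validation, the session cookie - that takes all the research.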
At one point in the documentation about developer tasks, I started creating a schematic that maps development tools, technologies, and languages onto the four main types of application: web, desktop, cloud, and phone. It starts out simple enough until you realize that you forgot about Silverlight and XNA, or still need to find a slot for FxCop and Expression Blend. And where do you put the Microsoft Baseline Security Analyzer? Meanwhile, Visual Studio seems to be all things to all people and to all tasks. Until you try to divide it up into the full versions, the Express editions, and the special editions such as the one for Windows Phone or the add-ons for WIF and Windows Azure. I don't think anybody has a screen large enough to display the final schematic, even if I could manage to create it.
And just like Rainbow Coalitions in government, it sometimes seems like there are lots of different actors all trying to present a single face to the world but, underneath, all working to a different script. Should I use Razor in WebMatrix for my new website, or MVC, or just plain old ASP.NET Web Forms? Is WPF the best technology for desktop applications, or should I still be using Windows Forms? Or do it in Silverlight in the browser. And in which language? It's a good thing that, here at p&p, we provide plenty of guidance on all these kinds of topics.
Still, at least I can create my technology matrix schematic using all of the colors of the rainbow for the myriad boxes and shaded sections, so it actually does look like a rainbow and provides a nice calming desktop background - even if it does nothing else more useful.
If I asked you what they manufacture in Seattle, I'd guess you'd say "software and aeroplanes". Obviously I'm biased, so Microsoft is the first name to spring to mind. And I discovered from a recent Boeing factory tour that they build a few 'planes there now and then. You might also, after some additional thought, throw in "coffee shops" (Starbucks) and "book stores" (Amazon). But I bet you didn't include "doorbells" in your list.
I know about the doorbells because I just bought a SpOre push button from a UK distributor and it proudly says "Made in Seattle" on the side of the box. Unless there is another Seattle somewhere else in the world, I'll assume that somebody expert in working with aluminium got fed up nailing wings onto 747s and left to set up on their own. Though you have to wonder about the train of thought when creating their business plan. "Hmmm, I'm an expert in building massively complex, high-tech, hugely expensive pieces of equipment so I think I'll make doorbells..."
But the point of this week's wandering ramble is not specifically doorbells (a subject in which, I'll admit, I'm not an expert). What started this was the time and effort required to actually find the item in the first place. We don't live anywhere near a city that contains one of those idiosyncratic designer showrooms, and I tend not to spend my weekends at building exhibitions. So, when my wife decides that she wants "something different" in terms of hardware, furniture, or other materialistic bauble that the average DIY store doesn't stock, I typically end up trailing through endless search engine results trying to track down products and suppliers.
Inevitably, what seems like an obvious set of search terms fails to locate the desired items. For example, rather than the usual "black plastic box and white button" that typifies the height of doorbell-push style here in England, searching for "contemporary doorbell push" just finds tons of entries for shopping comparison sites, ugly Victorian-style ironmongery, a few rather nasty chrome things, and (of course) hundreds of entries on EBay. I finally found the link to the SpOre distributor on what felt like page 93.
Much the same occurred when searching for somebody who could supply a decent secure aluminium front door to replace the wooden one we have now (which was already rotting away before the ice-age winter we just encountered here). It took many diligent hours Binging and Googling to find a particularly well-disguised construction suppliers contact site, which linked to a manufacturer in Germany, who finally forwarded my email to a garage door installation company here in England. When I looked at their site, it was obvious that they did exactly what we wanted, but there was pretty much zero chance of finding them directly through a web search.
And, not satisfied with all this aggro, it seems that the door manufacturers in Germany won't put a letter box slot in the door. They can't believe that anyone buying a properly insulated secure entrance door would want to cut a hole in it just for people to shove letters through (they tell me that only people in the UK do stupid things like this), so I have to figure out another way to provide our post lady with the necessary aperture for our mail. The answer is a proper "through the wall post box", and I'll refrain from describing the web search hell resulting from locating a UK supplier for one of these.
Of course, the reason for the web search hell is that I don't know the name of the company I want before I actually find it. If I search for "spore doorbells" or "hormann doors", the site I want is top of the list. Yet, despite entering a bewildering array of door-oriented search terms, all that comes up unless you include the manufacturer's name is a list of double-glazing companies advertising plastic panelled doors with flower patterns; or wooden doors that wouldn't look out of place on a medieval castle.
The problem is: how do you resolve this? There are obviously lots of very clever people working on the issue; and for website owners the solution is, I suppose, experience in the black art of search engine optimization (SEO). But there are only a limited number of obvious generic search terms - none of which are unique - compared to the millions of sites out there that may contain marginally relevant content. It seems that only searches for a product name (a registered trade mark) can really get you near to the top of the list. Even the sponsored links that most sites now offer are little help unless you can afford to pay whatever it costs to get your site listed whenever someone searches for a non-unique word such as "door". Meanwhile, most product and shopping comparison sites are more about how cheaply you can buy stuff than about helping you find what you are looking for.
One alternative is the tree-style organization of links. When done well, this can be a great way to help you find specific items. Most search engines have a Categories section that allows you to narrow the search by category, but the logic still depends on how the search engine analysed the page content. It's really just an intelligent filter over the millions of matching hits in the original list of results. It's easier, of course, if you only need to find something within a site that can manage the content and categorization directly. An example is the B&Q website at http://www.diy.com - and considering the vast number of lines they stock, it's remarkably easy to navigate down through the categories to find, for example, 25mm x 6mm posidrive zinc plated single thread woodscrews.
Mind you, tree navigation is not always ideal either. Some products will fit well in more than one category, while others may not logically fit into any category other than the useless "Miscellaneous" one. And once the tree gets to be very deep, it's easy to get lost - even when there is a breadcrumb indicator. It's like those automated telephone answering systems where you only find out that you should have pressed 3 at the main menu instead of 2 once you get two more levels down. And then you can't remember which option you chose last time when you start all over again. But at least with a phone system you can just select "speak to a customer advisor...".
I remember reading years ago about the Resource Description Framework (RDF). Now part of the W3C Semantic Web project, RDF has blossomed to encompass all kinds of techniques for navigating data and providing descriptive links across topic areas. It allows you to accurately define the categories, topics, and meaning of the content and how it relates to other content. So a site could accurately specify that it contained information in the categories "Construction/Doors/Entrance/Residential/Aluminium/Contemporary" and "Building Products/Installers/Windows and Doors/Residential/". And, best of all, RDF supports the notion of graphs of information, so that an RDF-aware search engine can make sensible decisions about selecting relevant information.
Yet it's hard to see how, without an unbelievably monumental retrofit effort across all sites, this can resolve the issue. It does seem that, for the foreseeable future, we are all destined to spend many wasted hours paging and clicking in vain.
Have you ever wondered what insurance companies do with all the money you pay them every month? It seems that one UK-based insurance company decided that a good way to use up some of the spare cash was to discover that, every day, people in the UK are carrying around over 2,000 tons of redundant keys. I'm surprised they didn't come up with some conclusion such as that this requires the unwarranted consumption of 10,000 gallons of fuel, which emits enough carbon to flood a small Pacific island.
It seems they questioned several hundred people about the number of keys they carry and which locks they fit, and the results indicate that everyone carries around two keys without knowing what they are for. However, I did a quick survey amongst six family members and friends and discovered only a single unknown key amongst all of us. So, on average, we are only carrying one sixth of an unknown key around. Extrapolating this percentage across the country means that there must be several hundred people carrying bunches of keys around when they don't know what any of them are for.
Of course, statisticians will tell you that you can't just average out results in this way - it's not mathematically logical. It's like saying that, because some cats have no tail, the average cat has 0.9 tails. Yet insurance companies rely on averages every day to calculate their charge rates. They use the figures for the historical average number of accident claims based on your age and driving record, or the average number of claims for flooding for houses in your street. Or your previous record of claims for accidental damage to your house contents.
What's worrying, however, is how these numbers affect your premiums. I just changed the address for my son's car insurance when he moved a quarter of a mile (to the next street in the same town) and the premium came down by nearly a quarter. I've suggested that he move house every month so, by next year, we won't be paying anything. Though it probably doesn't work like that...
Anyway, if the way they calculate premiums is already this accurate, just think what it will be like in a few years' time as more powerful processors, parallel algorithms, and quantum computing continue to improve the prediction accuracy. The inevitable result is that, when you apply for an insurance quote, the company will actually be able to tell exactly how much they will need to pay out during the term, and will charge you that plus a handsome profit margin. So you'll actually be better off not having insurance at all, and just paying the costs of the damage directly!
And this is the interesting point. Insurance is supposed to be about "shared risk". When insurance first started after the severe fires in London in the 17th Century, the premiums were based on the value of the property and the type of construction. Wood houses cost twice as much to insure as stone or brick houses. Other than that, everyone paid the same so they shared the costs equally. Of course, you can't really argue with the concept that people who have lower risk should pay less, or that people whose property is more valuable (and will therefore cost more to replace) should pay more. But I wonder if we are starting to take this to extremes.
Ah, you say, but even with the pinpoint accuracy of the current predictions of risk, they are still averages. So if you are lucky (or careful) you can beat the odds and not need to claim, while if you are unlucky (or careless) you will get back more than you paid in premiums. True, but next year they'll just use an even more powerful computer to recalculate the risk averages. Like a derivative function in calculus, the area of variability under the curve can only get smaller.
But I suppose it will be useful in one respect. When you get your motor insurance renewal telling you that this summer you will collide with a 2004 registered blue Volkswagen Beetle at a traffic signal in Little Worthing by the Sea, you can simply cancel your policy and use the money you save to go on holiday to Spain instead.