Random Disconnected Diatribes of a p&p Documentation Engineer
Reading in the newspaper this week about the technological advances in political campaigning set me wondering whether there is an ethics/success trade-off in most areas of work, as well as in life generally.
I don't mean cheating in order to win; it's more about how you balance what you do, with what you think people want you to do. The article I was reading focused on the area of national politics. Technologies that we in the IT world are familiar with are increasingly being used to determine the "mood of the people" and to target susceptible voters. In the U.S. they already use Big Data techniques to profile the population and to analyze sectors for specific actions. The same is happening here in Britain.
What I can't help wondering is whether this spells the end to true political conviction. If, as a party, you firmly believe that policy A is an absolutely necessary requirement for the country, and will provide the best future for the people, what happens when your data analysis reveals that it's not likely to be as popular as policy B? Do you try to adapt policy A to match the results from the data and sound like policy B, abandon it altogether in favour of policy C that is even more popular, or carry on regardless and hope that people will finally realize policy A is the best way to go?
Some of the greatest politicians of the past worked from a basis of pure conviction, and many achieved changes for the better. Some pushed on regardless and failed. Does the ability to get accurate feedback on the perceived desires of the population, or of specific and increasingly narrowly defined sectors, reduce the conviction that has always been at the heart of real politicians? Perhaps now, instead of relying on the experts that govern us to make a real difference to our lives, we just get the policies we deserve because we all just want what's best for each of us today - and politicians can discover what that is.
There's an ongoing discussion that the same is true of many large companies and organizations. They call it "short-termism" because public companies have to focus on what will look good in the next quarter's results in order to keep shareholders happy, rather than being able to take the long view and maximize success through long term changes. Even though governments generally get a longer term, such as five years, the same applies because it's pretty much impossible to make real changes in politics in such a short space of time.
Of course, there are some organizations where you don't need to worry about public opinion. In private companies you can, in theory, do all the long term planning you need because you have no shareholders to please. You just need to be able to stay in business as you plan and change for the future. In extreme cases, such as here in the European Union, you don't even need to worry what the public thinks. The central masters of the project can just do whatever they feel is right for the Union, and nobody gets to influence the decisions. Maybe the EU, and other non-democratic regions of the world, are the only place where the politics of conviction still apply.
So how does all this relate to our world of technology? As I read the article, it seemed a similar situation to the one we have in creating guidance and documentation for our products and services. Traditionally, the process of creating documentation for a software product revolved around explaining the features of the product. In many cases, this simply meant explaining what each of the menu options does, and how you use that feature.
I've recently installed a 4-channel DVR to monitor four bird nest boxes, and the instructions for the DVR follow just this pattern. There are over 100 pages that patiently explain every option in the multiple menus for setting up and using it, yet nowhere does it answer some obvious questions such as "do I need to enable alarms to make motion detection work?", "why is the hard disk light flashing when it's not recording anything?", and "why are there four video inputs but only two audio inputs?" And those are just the first three of the unanswered questions.
Over the years, we've learned to write documentation that is more focused on the customer's point of view instead. We start with scenarios for using the product, and develop these into procedures for achieving the most common tasks. Along the way we use examples and background information to try to help users understand the product. But, in many cases, the scenarios themselves come from our best guesses at what the user needs to know, and how they will use the product. It's still very much built from our opinions and a conviction that we know what the customer needs to know, rather than being based on what they tell us they actually want to know.
However, more recently, even this has started to change. The current thinking is that we should answer the questions users are asking now, rather than telling them what we think they need to know. It's become a data gathering exercise, and we use the data to maximize the impact we have by targeting effort at the most popular requirements. In most IT sectors and organizations, fast and flexible responsiveness is replacing principles and conviction.
Is it a good thing? I have to say that I'm not entirely persuaded so far. Perhaps, with the rate of change in modern service-based software and marketplace-delivered apps, this is the only way to go forward. Yet I can't help wondering if it just introduces randomness, which can dilute the structured approach to guidance that helps users get the most from the product.
Maybe if I could get a manual for my new DVR that answers my questions, I would be more convinced...
So I've temporarily escaped from Azure to lend a hand with, as Monty Python would say, something completely different. And it's a bit like coming home again, because I'm back with the Enterprise Library team. Some of them even remembered me from last time (though I'm not sure that's a huge advantage).
Enterprise Library has changed almost beyond recognition while I've been away. Several of the application blocks have gone, and there are some new ones. One even appeared and then disappeared again during my absence (the Autoscaling block). And the blocks are generally drifting apart so that they can be used stand-alone more easily, especially in environments such as Azure.
It's interesting that, when I first started work with the EntLib team, we were building the Composite Application Block (CAB) - parts of which sort of morphed into the Unity Dependency Injection mechanism. And the other separate application blocks were slowly becoming more tightly integrated into a cohesive whole. Through versions 3, 4, and 5 they became a one-stop solution for a range of cross-cutting concerns. But now one or two of the blocks are starting to reach adolescence, and break free to seek their fortune in the big wide world outside.
One of these fledglings is the block I'm working on now. The Semantic Logging Application Block is an interesting combination of bits that makes it easier to work with structured events. It allows you to capture events from classes based on the EventSource class introduced in .NET 4.5, play about with the events, and store them in a range of different logging destinations. As well as text files and databases, there's an event sink that writes events to Azure table storage (so I still haven't quite managed to escape from the cloud).
The latest version of the block itself is available from NuGet, and we should have the comprehensive documentation available any time now. It started out as a quick update of the existing docs to match the new features in the block, but has expanded into a refactoring of the content into a more logical form, and to provide a better user experience. Something I seem to spend my life doing - I keep hoping that the next version of Word will have an "Auto-Refactor" button on the ribbon.
More than anything, though, it's proving a useful experience in learning more about structured (or semantic) logging. I played with Event Tracing for Windows (ETW) a few times in the past when trying to consolidate event logs from my own servers, and gave up when the level of complexity surpassed my capabilities (it didn't take long). But EventSource seems easy to work with, and I've promised myself that every kludgy utility and tool I write in future will expose proper modern events with a structured and typed payload.
This means that I can use the clever and easy to configure Out-of-Process Host listener that comes with the Semantic Logging Application Block to write them all to a central database where I can play with them. And the neat thing is that, by doing this, I can record the details of the event but just have a nice useful error message for the user that reflects modern application practice. Such as "Warning! Your hovercraft is full of eels...", or maybe just "Oh dear, it's broken again..."
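In case you haven't met EventSource yet, the pattern is simple enough to sketch in a few lines. The class and event names here are my own invention, not from the block's documentation; it's just a minimal illustration of the typed-payload idea, complete with a suitably unhelpful user-facing message:

```csharp
using System.Diagnostics.Tracing;

// A minimal custom event source. The name and events are hypothetical -
// they just illustrate the pattern that listeners such as the Semantic
// Logging Application Block's Out-of-Process Host can consume.
[EventSource(Name = "MyCompany-Utilities")]
public sealed class UtilityEventSource : EventSource
{
    public static readonly UtilityEventSource Log = new UtilityEventSource();

    // Each event has a fixed id and a typed payload, so a sink can store
    // the values as separate, queryable columns rather than one big string.
    [Event(1, Level = EventLevel.Warning,
           Message = "Warning! Your hovercraft is full of {0}...")]
    public void HovercraftFault(string contents)
    {
        if (IsEnabled()) WriteEvent(1, contents);
    }
}
```

Raising the event is then just `UtilityEventSource.Log.HovercraftFault("eels")` - the user sees the friendly message, while the listener records the structured details.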
Probably there aren't many people who can remember when TVs had just six buttons and a volume knob. You simply tuned each of the buttons to one of the five available channels (which were helpfully numbered 1 to 5), hopefully in the correct order so you knew which channel you were watching, and tuned the sixth button to the output from your Betamax videocassette recorder.
As long as the aerial tied to your chimney wasn't blown down by the wind, or struck by lightning, that was it. You were partaking in the peak of technical media broadcasting advancement. Years, if not decades, could pass and you never had to change anything. It all just worked.
And then we went digital. Now I can get 594 channels on terrestrial FreeView and satellite-delivered FreeSat. Even more if I chose to pay for a Sky or Virgin Media TV package. Yet all I seem to have gained is more hassle. And, looking back at our viewing habits over the previous few weeks, pretty much all of the programs we watch are on the original five channels!
Of course, the list of channels includes many duplicates, with the current fascination for "+1" channels where it's the same schedule but an hour later (which is fun when you watch a live program like "News At Ten" that's on at 11 o'clock). Channel 5 even has a "+24" channel now, so you can watch yesterday's programs today. A breakthrough in entertainment provision, which may even be useful for the 1% of the population that doesn't have a video recorder. How long will it be before we get "+168" channels so you can watch last week's episode that you missed?
What's really annoying, however, is that I've chosen to fully partake in the modern technological "now" by using Media Center. Our new Mamba box (see Snakin' All Over) is amazing in that it happily tunes all the available FreeView and FreeSat channels and, if what it says it did last night is actually true, it can record three channels at the same time while you are watching a recorded program. I was convinced that it's not supposed to do more than two.
However, it also seems to have issues with starting recordings, and with losing channels or suddenly gaining extra copies of existing channels. For some reason this week we had three BBC1 channels in the guide, but ITV1 was blank. Another half an hour wasted fiddling with the channel list put that right, but why does it keep happening? I can only assume that the channel and schedule lists Media Center downloads every day contain something that fires off a channel update process. And helpfully sets all the new ones (or ones where the name changed slightly) to "selected" so that they appear in the guide's channel list. I suppose if it didn't pre-select them, you wouldn't know they had changed.
Talking with the ever-helpful Glen at QuietPC.com, who supplied the machine, was also very illuminating. Media Center is clever in that it combines the multiple digital signals for the same channel into one (you can see them in the Edit Sources list when you edit a channel). He suggested editing the list to make sure the first ones were those with the best signal, so that Media Center would not need to scan through them all when changing channels to start a recording.
Glen also suggested using the website King Of Sat to check or modify the frequencies when channels move.
This makes sense because Media Center does seem to take a few seconds to change channels. Probably it times out quite quickly when it doesn't detect a signal, pops up the warning box on screen, and then tries the other tuner on the same card. Which works, maybe because the card is now responding, and the program gets recorded. But when I checked yesterday for a channel where this happens, there is only one source in the Edit Sources list and it's showing "100%" signal strength.
And a channel that had worked fine all last week just came up as "No signal" yesterday. Yet looking in the Edit Sources list, the single source was shown as "100%". Today it's working again. Is this what we expected from the promise of a brave new digital future in broadcasting? I'm already limited to using Internet Radio because the DAB and FM signals are so poor here. How long will it be before I can get TV only over the Internet?
Mind you, Media Center itself can be really annoying sometimes. Yes it's a great system that generally works very well overall, and has some very clever features. But, during the "lost channel" episode this week, I tried to modify a manual recording by changing the channel number to a different one. It was set to use channel 913 (satellite ITV1) but I wanted to change it to use channel 1 (terrestrial ITV1). Yet all I got every time was the error message "You must choose a valid channel number." As channel 1 is in the guide and works fine, I can't see why it's invalid. Maybe because it uses a different tuner card, and the system checks only the channel list for the current tuner card?
It does seem that software in general doesn't always get completely tested in a real working environment. For example, I use Word all the time and - for an incredibly complex piece of software - it does what I expect and works fine. Yet, when I come to save a document for the first time onto the network server, I'm faced with an unresponsive Save dialog for up to 20 seconds. It seems that it's looking for existing Word docs so it can show me a list, which would be fine if they were on the local machine or there were only a few folders and docs to scan. But there are many hundreds on the network server, so it takes ages.
Perhaps, because I use software like this all day, I just expect too much. Maybe there is no such thing as perfect software...
I don't know if General Custer ever made a last stand against the Apache, but I feel like I have. My Apache is, of course, the Hadoop one. Or, to be technically accurate, Microsoft Azure HDInsight. And, going on experience so far, this is unlikely to actually be the last time I do it.
After six months of last year, and about the same this year, it seems like I've got stuck in some Big Data related cluster of my own. We produced a guide for planning and implementing HDInsight solutions last year, but it's so far out of date now that we might as well have been writing about custard rather than clusters. However we have finally managed to hit the streets with the updated version of the guide before HDInsight changes too much more (yes, I do suffer odd bouts of optimism).
What's become clear, however, is how much HDInsight is different from the typical Hadoop deployment. Yes, it's Hadoop inside (the Hortonworks version), but that's like saying battleships and HTML are the same because they both have anchors. Or cats and dogs are the same because they both have noses (you can probably see that I'm struggling for a metaphor here).
HDInsight stores all its data in Azure blob storage, which seems odd at first because the whole philosophy of Hadoop is distributed and replicated data storage. But when you come to examine the use cases and possibilities, all kinds of interesting opportunities appear. For example, you can kill off a cluster and leave the data in blob storage, then create a new cluster over the same data. If you specify a SQL Database instance to hold the metadata (the Hive and HCatalog definitions and other stuff) when you create the cluster, it remains after the cluster is deleted and you can create a new cluster that uses the same metadata. Perhaps they should have called it Phoenix instead.
We demonstrate just this scenario in our guide as a way to create an on-demand data warehouse that you can fire up when you need it, and shut down when you don't, to save running costs. And the nice thing is that you can still upload new data, or download the existing data, by accessing the Azure blob store directly. Of course, if you want to get the data out as Hive tables using ODBC you'll need to have the cluster running, but if you only need it once a month to run reports you can kill off the cluster in between.
But, more than that, you can use multiple storage accounts and containers to hold the data, and create a cluster over any combination of these. So you can have multiple versions of your data, and just fire up a cluster over the bits you want to process. Or have separate staging and production accounts for the data. Or create multiple containers and drip-feed data arriving as a stream into them, then create a cluster over some or all of them only when you need to process the data. Maybe use this technique to isolate different parts of the data from each other, or to separate the data into categories so that different users can access and query only the appropriate parts.
You can even fire up a cluster over somebody else's storage account as long as you have the storage name and key, so you could offer a Big Data analysis service to your customers. They create a storage account, prepare and upload their source data, and - when they are ready - you process it and put the results back in their storage account. Maybe I just invented a new market sector! If you exploit it and make a fortune, feel free to send me a few million dollars...
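The "upload directly to blob storage" part needs nothing HDInsight-specific at all - just the ordinary Azure storage client library. Here's a rough sketch; the account name, key, container, and paths are all placeholders of my own, not anything from a real deployment:

```csharp
using System.IO;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

class UploadToClusterStore
{
    static void Main()
    {
        // Placeholder connection string - use the storage account your
        // HDInsight cluster is (or will be) created over.
        var account = CloudStorageAccount.Parse(
            "DefaultEndpointsProtocol=https;AccountName=mydatastore;AccountKey=...");

        var container = account.CreateCloudBlobClient()
                               .GetContainerReference("mycontainer");

        // HDInsight maps blob paths to HDFS-style paths, so a blob uploaded
        // as "data/input/log1.txt" is visible to cluster jobs at that path.
        var blob = container.GetBlockBlobReference("data/input/log1.txt");
        using (var stream = File.OpenRead(@"C:\data\log1.txt"))
        {
            blob.UploadFromStream(stream);
        }
    }
}
```

The nice consequence is that the upload (or download) works whether or not a cluster currently exists over that container - which is exactly what makes the on-demand data warehouse scenario practical.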
Read the guide at http://msdn.microsoft.com/en-us/library/dn749874.aspx
One of the facts of life when you write technical documentation and guidance is that it will get reviewed by other people, resulting in regular changes to the content as you try to follow shifting advice, conflicting feedback, and suggestions that sometimes even make sense. It doesn't help, of course, if the technology you are documenting is also a moving target.
I don't profess to be an expert in all the technologies we cover, but I generally have a good grasp of the fundamentals for each one - such as what it's supposed to do, how it does it, and how you can use it. But I depend on reviewers and feedback to make sure I've covered all the relevant points, and that what I've written is accurate as well as being useful.
Over the years, I've come across many situations where it's useful to write defensively in order to minimize errors and illogical content, and to reduce the work required to get the stuff finished and out of the door. While they might not be applicable to everyone, here are a few things to think about:
And finally, my own hobby horse: always remember the phrase "Time flies like an arrow but fruit flies like a banana". Use "such as x" rather than "like x" when giving examples of things...
My new pet snake is installed, working, and really flies. Deathly silent, yet it instantly responds to every command. It's like somebody speeded up the world. Or at least speeded up my television. And, yes, this is a follow-on from last week's rambling post about our new "Mamba" Media Center box from QuietPC.com. In fact, even the title continues the not-quite-a-song theme.
The long and sometimes tortuous setup and installation is over. It's nestled neatly in the TV cabinet, and after a few days' use it really does seem to be a superb machine - and a significant upgrade from the old I-US Media Center box. OK so most of the setup hassle was my fault (more later) because I wanted it to be on my local domain and integrated with the network. It needs to have remote Event Log access turned on, my "failed recording" monitor service installed, a custom screensaver, auto logon, and a few other tweaks.
What surprised me, though, was the benefits from the new TV cards. The old box had only one PCI slot, whereas most modern tuner cards are PCI-E only these days so I had to choose between terrestrial (DVB-T) and satellite (DVB-S). And none supported HD. The new Mamba has a dual DVB-T2 (HD) and a dual DVB-S2 (HD) card. And, amazingly, Media Center accepted both, and tuned both of them, so that we now get all of the terrestrial and the satellite channels. You can still record from only two tuner instances concurrently (either on the same tuner card or one from each tuner card) and watch a previously recorded program at the same time. But it's wonderful to get back some old favorite channels that aren't on satellite, and to finally be able to get all the HD channels.
Of course, the actual tuning process is still a pain, and really does need to come closer to the capabilities offered by ordinary TVs that can detect broadcast update signals and automatically retune channels that move around. Media Center has the facility to add new channels, but it never seems to fully work. In the past, when they moved channels around, I had to do a complete re-setup of all the channels - which means getting back the 500+ I don't want and had removed from the guide, and having to go through the laborious process of finding listings for channels where the channel name and the listing name are slightly different. Though maybe in the Windows 8 version of Media Center it will work better. No doubt I'll find out in time.
The final setup process was made infuriatingly slow by a couple of unexpected hitches. For some reason, Media Center no longer has an option to start automatically when the system restarts from cold or when a user logs on. I have no idea why this option was removed, and it seems from a web search that lots of people are annoyed about it and have found an equally large number of kludges to fix it, including creating a profile and using a batch file in the \ProgramData\Microsoft\Windows\Start Menu\Programs\Startup folder. However, another solution seems obvious: create a scheduled task that runs at logon and executes the file %windir%\ehome\ehshell.exe, and set the taskbar to auto-hide.
But the most annoying quirk was that my custom screensaver that displays details of photos never appeared. All I got was a nausea-inducing scrolling, panning, and zooming screenful of black and white photos with odd ones occasionally appearing in colour - despite the Lock screen slideshow being turned off and my screensaver properly configured in Windows Personalization settings. I played with this for ages before finally searching the web for solutions. Most of which are totally confusing because they say to turn on the slideshow and then turn off the option to "show the lock screen instead of turning off the screen".
I even followed the advice on one site to use gpedit to disable the Lock screen altogether, but it made absolutely no difference. After I finally gave up and went back to configuring Media Center I found the screensaver option within the Media Center interface. Which is helpfully turned on by default. The Lock screen slideshow I was trying to get rid of wasn't actually the Lock screen at all. No wonder I had problems! After turning the Media Center screensaver off my own screensaver works fine. Doh!
I'm still not sure I'd recommend Media Center as a replacement for a normal TV to my non-technical friends, but it really is a superb system if you know something about computers, are prepared to fiddle with it, and accept the few shortcomings such as the usual need for updates and other maintenance tasks. Even the smart TVs I've seen can't compete with the full range of capabilities and flexibility of a powerful computer driving a big wall screen.
But I have to run. Now that I've got the "Dave" channel back again, there's ten episodes of "The Professionals" from 1978 I need to watch...
Media Center is alive and well! Yes, you can buy a proper no-noise Media Center appliance that just works out of the box, does satellite and terrestrial TV, and looks good on your TV stand. You can even watch the YouTube video on it that inspired the title of this week's rambling post.
Our current I-US Media Center box is struggling. It's had a selection of new hard disks, and is on its second button cell memory BIOS backup battery and third video card. It sounds like a bag of bolts when it's cool, and evolves into a jumbo jet by the time it's warmed up and is recording two concurrent TV programs. Vacuuming five years of accumulated crud from the heat-pipe radiator did help a bit, but the fact that it sometimes takes three reboots to find the hard drive makes me increasingly nervous about its longevity.
So it's being replaced by a shiny new one. Or, to be more accurate, a matt black new one - the attractive-in-an-understated-way Mamba from a local company here in the UK called QuietPC.com. It looks superb, feels really solid, and has an impressive component spec. Even the packaging is glorious, and you get all of the manuals and O/S disks you could ever want - including a magnified photocopy of the Windows product key in case your eyesight is failing and you can't read the label on the back of the case.
So far the initial setup experience has been excellent. It comes pre-configured for use as a TV with all the proper BIOS settings, there's no junkware installed, and it's absolutely silent when running - without showing any signs of getting beyond mildly warm. No doubt the separate power supply helps, and the solid aluminum case. We'll see what happens when it's under a real-life load recording two programs at the same time as my wife is watching yesterday's episode of Coronation Street. Mind you, the O/S is on an SSD and the main data drive is a hybrid beast with SSD cache, so it should be fairly speedy.
As usual, the hardest part of the setup is deciding on the name for the new PC. I was tempted by "Dendroaspis" (as in Dendroaspis polylepis, the black mamba tree snake), but that seems a bit too esoteric. There is a Finnish schlager band called Mamba, but their best-known songs are all unspellable words, and I'm not sure you can use accented letters in a BIOS network name anyway. I can imagine trying to solve some weird errors that might cause in my Active Directory and WSUS servers. In the end I settled on MAMBA-TV in case I forget what it is next time I'm doing my pretend-to-be-a-network-administrator thing.
The one "not quite fully prepared" bit is Media Center itself. I specified Windows 8.1 O/S, so I had to buy and install the Media Center add-on for a few pounds. Which would have been fine except that the credit card payment system they use was broken that day, so I ended up having to open a PayPal account and then close it again afterwards (part of reducing my attack surface). At one time you could just use PayPal to make a payment without needing to open an account - but I guess (like most other sites) they want to capture your personal information to sell to advertisers.
And how come the printer driver that Windows Update offers for my old Dell 5100CN printer is broken? Every attempt to print something just raises an error. I had this issue on another 8.1 machine, though it did start working after a few uninstall/reinstall passes. But as I'll rarely print from this machine, that's a minor issue. I installed a driver that's similar and works instead, even though it has fewer image adjustment options.
However, my custom screensaver that displays our photos when nothing else is happening is a necessity, and unfortunately I compiled it to use .NET 3.5. I had the usual error 0x800F0906 "Download failure" when it tried to install the .NET 3.5 framework because I use Windows Software Update Service (WSUS) to manage patching on my network. The solution, and a description of why it occurs, is in this blog post.
Now I just need to spend a day setting up the final bits and pieces, and adapting the wiring and the ventilation holes in the cabinet in the lounge where it will live. And configuring the TV setup and channel guide for the twenty or so channels that are worth watching out of the 600+ channels of junk that arrive over FreeSat...
It looks like our Big Data is getting smaller. At least that's the impression I get from the exhaustive investigation carried out by the editor on our current project. Of course, "data" is actually a plural word so perhaps it's just that each datum is getting smaller. Or maybe there are fewer (not "less") of them.
The reason for all of this shrinkage is that, according to our most recent documentation style guides, there are no capital letters in our large quantities of information. It's now only "big data" and no longer "Big Data". Is it just me that feels it somehow loses its impact when deprived of capitals? "Big Data" seems to be saying "look at me - see how huge and important I am!" But "big data" just looks like it's shrinking into a corner and hoping it won't be noticed.
And the enforced decapitalization makes some of my wonderfully delicate and skilfully crafted text sound plain odd. After I watched a documentary about how Big Data techniques are being used by many organizations around the world, I came up with:
Police forces are using Big Data techniques to predict crime patterns, researchers are using them to explore the human genome, particle physicists are using them to search for information about the structure of matter, and astronomers are using them to plot the entire universe. Perhaps the last of these really is a big Big Data solution!
But now that last sentence makes no sense at all, it just looks as though I had brain fade while I was writing it:
Perhaps the last of these really is a big big data solution!
I suppose it's all part of our ongoing drive to remove capital letters from the language - perhaps to suit the "txting" generation who don't know what a Shift (shift?) key is. My rants in An Upper Case of Indecisive Instruction, Hyphenless Decapitalization, and even I Can't Yell Any More obviously had no impact at all on our style guidance team.
perhaps it's time i gave up my crusade, and just be grateful i have a job at microsoft writing about sql server and azure...