Random Disconnected Diatribes of a p&p Documentation Engineer
A few influential people in our little world of Developer Guidance here at Microsoft have recently been avoiding the word "scenario". It seems that it's now so overused, and has so many apocryphal meanings, as to render it useless in terms of determining users' documentation requirements and for planning the creation of product guidance.
As my job description includes "creating scenario-focused guidance" and "exploring typical customer scenarios" this could be a bit of a problem (maybe I've become overused and apocryphal as well). Perhaps, in line with the current trend to make guidance simpler and less formal by using common words and "talking to the user", we should replace "scenario" with "needs" so that I can just "explore typical customer needs".
However, my US-English thesaurus doesn't list "needs" as an equivalent to "scenario", but it does list "situation", "state", "set-up", "picture", and "development" - none of which feel quite right. If I created "state-focused guidance" people will probably ask if it applied only to developers in Wisconsin, and "set-up focused guidance" wouldn't seem to be much use after you'd finished installing the application.
But where we had a struggle this week as we continue to develop the structure and plan for our upcoming guide on Big Data and HDInsight is with the difference between "scenario" and "case-study". We want to create some examples of using HDInsight that correspond to typical users' requirements, covering different types of data and query approach. For example, as well as the old chestnut of analyzing web log files, we want to do something with numerical data and social media content.
We have an outline of the examples, but I still need to decide how to present them. If I phrase each one as though it had been done by some fictitious corporation (yes, you guessed: Contoso) and show how they did it, it seems like it will be a "case study" that is specific to that organization's needs. But can it really be a case study if the organization doesn't actually exist?
If I phrase it as a step-by-step explanation of how you would do it yourself then it seems like it's an "example". And the code that we provide for download will be a "sample". The "scenario", meanwhile, looks rather like the umbrella under which all of this occurs. Maybe I'm just a parenthesis short of a lambda expression, but to me it appears as though there's a hierarchy of things here - something like:
Scenario -> Case Study -> Example -> Sample
where a scenario describes the requirements and a case study provides the solution by including an example of how the sample code was used.
The problem is that many people seem to be put off by case studies because the natural initial response is that it will be specific to somebody else's requirements rather than their own. But while this may be true of case studies that show real life implementations, such as how [insert name of global company here] saved 30% on pizza and cola by adopting Hyper-V, we're inventing a case study to resolve a scenario that we also made up.
However, we made up the scenario based on feedback from real users and advisory boards, so it must apply to a lot of people. Therefore the solution should also be relevant, especially where we explore different options and show alternative implementations - together with, of course, guidance on which to choose based on your own specific needs. So it can't be a case study because now we're covering several cases.
I was going to say "catering for several cases" there, but I was worried it would just make readers think about pizza and cola again.
So do we need a new word to replace the possibly deprecated "scenario", and what should it be? Obviously it's not "case study", and "example" just sounds too minimal. We could try falling back on the old technique of combining words, though "scexample" sounds a little dubious. Mind you, it could be worse. Once the marketing people get started on this we'll end up with some action-based, solution-oriented, brainstorm-generated word to replace "scenarios".
I'll probably have to call them "opportunities" instead...
I'm increasingly seeing how big the disconnect is between people who use computers occasionally just because they need to do stuff on the Internet, and those of us who live and breathe computing. And we're not talking stupid people here; I see it most weeks with friends and acquaintances that are fully capable of managing almost any other technological domestic device.
It's both interesting and worrying. Interesting because I'm part of a group within Microsoft working on a project that will help to discover more quickly and more accurately the issues people typically face when using our products. While here in p&p we are tilted towards the needs of developers, software designers, and system admins, I'm interested to learn how you can make operating systems and user-oriented software easier for home users to grasp. After all, we spend inordinate amounts of time and effort making them intuitive, and providing pop-up help pages and tips.
And it's worrying because, for most of the non-technical population, having to spend time learning how to use technology is a thing of the past. We want instant gratification. Few technological devices come with proper manuals these days anyway (it's all on a CD that gets lost within minutes of opening the box), and instead these devices have intuitive UIs that mean you don't need to resort to the help file. Although I have to admit that some do seem to make common tasks difficult - our new kitchen oven has so many knobs and buttons, and different cooking settings, that my wife keeps the instruction book handy just to decipher the strange symbols.
Coming back to computers, though, this week I encountered a perfect illustration of the issue. Some friends called round, complete with laptop and a list of questions, seeking my help. They could no longer print anything because the Print button and menu bar had disappeared from their web browser. Plus, the desktop shortcut that opened their email now just showed the Google search engine. And they were concerned that they'd lose all of their precious photos if the computer broke down or was lost because the backup software couldn't find them.
It took only a cursory examination to discover that Internet Explorer seemed to have disappeared, and now all their web shortcuts had a Chrome icon - which I assumed was down to the recent Adobe Flash upgrade (see Not So Shiny). Rather than uninstalling Chrome I just fired up Internet Explorer and used the Programs tab in the Internet Options dialog to make IE the default browser again. This meant that they now had a Print button on the toolbar, though I still had to mess about resetting the Home page and the link to Hotmail.
But they still couldn't figure out how to get the menu bar to appear in the browser. They never realized that you could click the little down arrow next to the Print button to see more printing options, and always did that before using the File option on the menu bar. I explained that they just had to press the Alt key to see the menu bar; but, other than mumbling something about extra room for the content of the page, I couldn't answer the subsequent question "Why?" So I reset the menu bar to be there all the time. Yet all of these operations are explained in the help file - if only I could persuade them to press the F1 key!
And then we got to the question of backing up their photos. Some while ago I'd given them an old USB thumb drive and copied their photos onto it. But now, every time you plug it in, the computer just displays a dialog saying "No more pictures found to import". It turns out that, when they bought a new printer a while ago, another friend had installed it - along with all of the accessory programs that came with it.
One of these programs was a utility that scanned for photos and displayed them for printing, and this program had helpfully set the autoplay option for USB thumb drives to run itself. Obviously it had imported all the photos from the thumb drive the first time it ran, and so there were no new ones. They thought the program was saving the photos onto the thumb drive, but examining it revealed that it contained only those I'd copied to it a year ago. None of their later photos from several trips abroad had been backed up.
To sort this out I had to go into the Autoplay settings in Control Panel to set it back to "Ask me every time", and then create a simple batch file in the root of the drive to copy the new photos. But, of course, there wasn't room on the thumb drive for all the new photos so we fetched a 500GB USB disk from a local store and I set up the batch file on that. Now all they need to do is run the batch file after loading new photos, music, videos, or documents onto the computer and they'll all get backed up automatically.
Except that, until you come to show someone how to do this, you don't realize how unintuitive it all is. Plug in the drive and wait for the autoplay dialog. Select "Open folder to view the files". Double-click on BackUpMyFiles (the name I gave to the BAT file), wait until the black window disappears, close the window showing the files on the USB drive, click the icon in the notification area that looks like a tall thin box with a tick on it, select "Disconnect storage drive D:", wait for the confirmation message, and finally unplug the drive. And if you forget to close the file window first you get an error that the device is still in use, but no indication of what to do about it.
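For anyone curious, the "copy only what's new" idea behind a backup script like this can be sketched in a few lines of Python. This is not the actual BAT file, just a minimal illustration; the function name and the two-second timestamp margin are my own assumptions.

```python
import shutil
from pathlib import Path


def back_up_new_files(source, destination):
    """Copy files that are missing from, or newer than, the backup copy."""
    source, destination = Path(source), Path(destination)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dst = destination / src.relative_to(source)
        # A two-second margin allows for the coarse timestamps that
        # FAT-formatted USB drives store.
        if dst.exists() and src.stat().st_mtime - dst.stat().st_mtime <= 2:
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 keeps the original timestamp
        copied.append(dst)
    return copied
```

You would call it with something like back_up_new_files("C:/Users/Me/Pictures", "D:/Backup/Pictures"), where both paths are hypothetical. Running it a second time with nothing new on the computer should copy nothing at all.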
OK, so this is Windows Vista, and thankfully Windows 8 can do all this through the cloud much more easily. But, despite my pleas to upgrade, they're unlikely to do so any time soon (probably only when the computer breaks down and has to be replaced). And they are the exception - most home user help requests I get are still for Windows XP.
Perhaps when I make my fortune and become a philanthropist my calling will be to upgrade everyone I know to Windows 8 for free, though whether it will install on my neighbor's ten year old HP tower computer (which doesn't even have a built-in CD drive) is questionable. And we'll probably be on Windows 23 by then anyway...
There's something rather disturbing about sitting on the sofa looking at a large brown water stain on the lounge ceiling while the mains electricity circuit breaker is occasionally tripping out. Somewhere in the back of your mind is the worrying thought that the two events might be connected. And that another stream of tradesmen, who will tear up the floor and bash holes in the ceiling, is imminent. And that's after three months of visits from men with toolboxes, and startlingly rapid deflation of my bank balance, during our recently completed house modernization saga.
Before I ramble on any further, however, I guess I should apologize both for the overuse of bad song titles (see also UPS Outside Your Head), and the strange focus on boxes with a big battery inside. As you probably guessed by now, we're back in what's turning out to be "interruptible-power-supply" land. And something I've never seen happen before with a UPS.
I've been using APC UPSs for more years than I like to remember, and generally they do what you expect. After a while, or when they get a bit too warm, the battery inside starts to expand and stops providing backup power, but for the rest of the time they just sit there - maybe flickering a few LEDs now and then, and beeping contentedly when the mains power goes off.
There are four 1000W ones in the server cabinet in my garage powering all of the hi-tech stuff you need to connect a few servers to the Internet and an internal network. One of the servers is a cold-swap backup Hyper-V host that can take over all the VMs if (or, more likely, when) the main host server dies. I power up the backup server once a week so it can sync Active Directory (and to check that it still works, you know what computers are like). When it's off I usually turn off its UPS as well, though that's still connected to the mains supply so the battery is kept charged.
But last week when I pressed the "1" button to turn it on (which automatically boots the server) it immediately tripped out the overload protector in the main fuse box for that ring main circuit. Could this be the cause of our mains electricity problem? But the server powered up fine running on battery so the inverter is obviously working, and the overload switch on the UPS was not tripped. Maybe it's a problem with the battery, though that seemed unlikely as it was powering the server, but I replaced it anyway and everything seemed to work again.
At least it did for a while. After scurrying repeatedly (often in complete darkness) for the fuse box reset button at various intervals during the next two days, and several other experiments trying to isolate the fault (isn't it amazing how many things you have plugged in around the house when you are trying to figure which one is playing up), I decided it needed more decisive action.
When I disconnected and removed the UPS from the cabinet it was quite warm, and an examination of the internal gubbins looking for stray wires or that familiar burning smell offered no clue. Could it just be the cold wet weather we've been having for a couple of weeks causing condensation inside? Yet we have cold wet weather every year (often it's the default climate setting here in England), so why should it suddenly happen this year?
However, after a day next to a radiator in a warm kitchen, I tested the UPS and it was working fine again. It looks like it didn't take well to having a hurricane of freezing cold and damp air blown over it for a week while the server was powered down and not drawing any current. Though it did start tripping the circuit breaker again a few days later after I reinstalled it, even though I left it turned on this time. But I suppose it makes a welcome change to the more usual situation of trying to keep everything in the server cabinet from melting during the rest of the year. Maybe it's time to invest in temperature-controlled fans?
And the large brown water stain on the lounge ceiling? I do seem to remember having a bathing accident a while ago, at the time when the bath panel had been removed to lay the bathroom floor, so I'm fervently hoping that was the cause...
Footnote: After the UPS tripped the circuit breaker again I gave up and replaced it (there are good deals on reconditioned ones around at the moment). Perhaps I'll give this one another try in the summer to see if it's got over its bad habit.
Thankfully I don't get many desperate requests for computer-related assistance from friends and acquaintances. Maybe they're frightened I'll ask for money (I won't), or they just don't want to bother me. And those I do get are usually the common issues that - other than the requisite amount of "fiddle time" - don't take much to fix. But this week I actually beat my previous record of fast fixing by solving a problem that just needed me to press Fn-F10.
Some friends had phoned to say that their ADSL connection wasn't working. They'd talked several times to their Internet provider on an expensive premium phone line, but it didn't seem to help. The router kept dropping out, it seems, though after a day the magic green light was back on. However, now their computer could not find any wireless connections.
The solution that their ADSL provider suggested was to "plug in the yellow wire". They had no idea which wire was meant, and so the support guy told them they would have to go out and buy one. And if that didn't work they'd need to replace their computer as it was obviously broken.
So my friends then phoned me because they were concerned that going into a computer store and asking for "a yellow wire" might give the impression that they didn't really know what they were talking about, and they might end up with the wrong thing. I guessed that what they needed was a CAT-5 Ethernet cable, and confused them completely by saying I had a spare "yellow wire" that was blue, and that they could have for free.
However, I suspected that the real problem with the wireless was that it was disabled in the Manage Network Connections dialog, turned off by the hardware switch, or the wireless card had failed. After I got them to search in vain for a switch on the outside of the computer (it doesn't have one) I toddled off to their house with my laptop (and the blue yellow wire) to investigate further. And, as you may have guessed, the problem was that the wireless connection on their computer was simply turned off.
Turns out that the support guy had told them to press "weird combinations of keys", which I assume included Fn-F10, to ensure that wireless was turned on. This made no difference at the time because the router had lost its connection, and all they'd achieved was to toggle it off. By the time I got there the line problems that were causing the router to drop out had obviously cleared because I could connect to it without any problem. And so fixing their computer just needed a quick prod on Fn-F10.
Ah, if only all the problems I tackle were as quick and easy to fix. Of course, after the computer was working OK again came the "extra" questions: "Can you make the browser open my email", "Where have all the photos I had on my camera gone", and of course "Can you make it run a bit faster"...
Some years ago I was forcefully introduced to the concept of statistical quality control, where the overall quality of a batch of items could be determined from an examination of a small sample. This came to mind as I've been immersed in watching demos of the new "Big Data" techniques for analyzing data.
My rude introduction to the topic came about many years ago in a different industry from IT. I was summoned to appear at a large manufacturer in York, England, to look at a delivery my company had made of glass divider panels for railway carriages. The goods inward store manager bluntly informed me that they were rejecting a delivery of 500 panels because they did not meet quality control standards.
"OK", I said, "show me some of the faults". However, it turned out that I was taking a simplistic view of the quality control process. I assumed that they unpacked them all, examined each one as they were about to fit it, and rejected any that were damaged or out of specification. What they actually do is look up the total quantity on a chart, which tells them how many to test. In my case it was 32. So they choose 32 at random and examine these. If more than one fails the test, they reject the whole batch because, statistically, there will be more than the acceptable number of defective items in the batch.
This struck me as odd because I knew that most of the batch would be perfect, some would be perhaps a little less than perfect (glass does, of course, get scratched easily), and only a few would be too bad to use. Our usual approach would be to simply replace any they found to be faulty as and when they came to fit them.
However, as the quality control manager patiently explained, this approach might work when you are installing windows in a house but isn't practical in most cases in manufacturing. If you get a delivery of 100,000 nuts and bolts, you can't examine them all - you just need to know that the number of faulty ones is below your preset acceptance level (perhaps 1%), and you simply throw away any faulty ones because it's not worth the hassle of getting replacements.
Of course, you won't find that exactly one in every 100 is faulty and the other 99 are perfect. You might find a whole box of faulty ones in the batch, or that half of the batch are faulty and by chance you just happened to have tested the good ones. It's all down to averages and random selection of the samples. What worried me as I watched the demos of data analysis with Hadoop-based methods was how easily you could mistakenly rely on numbers that are really only averages or trends.
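If you want to see how much luck is involved, the sums behind a sampling plan like the one in York can be sketched in Python. This is my own illustration using the hypergeometric distribution, not whatever method produced the manufacturer's chart; the function name is made up, and only the 500, 32, and "more than one fails" figures come from the story.

```python
from math import comb


def acceptance_probability(lot_size, defectives, sample_size, max_failures):
    """Chance that a random sample contains no more than max_failures bad items."""
    total = comb(lot_size, sample_size)
    accepted = sum(
        comb(defectives, k) * comb(lot_size - defectives, sample_size - k)
        for k in range(max_failures + 1)
    )
    return accepted / total


# Figures from the story: 500 panels, 32 tested, rejected if more than one fails.
for bad in (5, 25, 50, 250):
    chance = acceptance_probability(500, bad, 32, 1)
    print(f"{bad:3d} faulty panels in the lot: accepted {chance:.1%} of the time")
```

The interesting part is the middle of the range, where a batch with a worrying number of faulty panels still gets through fairly often simply because the random draw happened to miss them.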
For example, one demo used the AdventureWorks sample data to calculate the number of bicycles sold in each zip code area and then mapped this to a dataset obtained from Windows DataMarket containing the average ages of people in each zip code area. The result was that in one specific area people aged 50 to 60 were most likely to buy a bicycle. So the next advertising campaign for AdventureWorks in that area should be aimed at the older generation of consumers.
I did some back-of-an-envelope calculations for our street and I reckon that the average age is somewhere around the 45 to 55 mark. Yet the only people I see riding a bicycle are the couple across the road who are in their 30s, a lady probably in her late 20s that lives at the other end of the street, and lots of young children. I rather doubt that an advert showing two gray-haired pensioners enjoying the freedom of the outdoors by cycling through beautiful countryside on their new pannier-equipped sit-up-and-beg bicycles would actually increase sales around here. Though perhaps one showing grandparents giving their grandkids flashy new racing bikes for Christmas would work?
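To show how an average can point you at the wrong customers, here is a tiny Python sketch. The ages and bicycle ownership are invented for one imaginary street and are not taken from AdventureWorks or any DataMarket dataset; the function names are equally hypothetical.

```python
from statistics import mean

# Invented figures for one imaginary street: (age, owns_a_bicycle).
STREET = [
    (72, False), (68, False), (65, False), (61, False), (58, False),
    (55, False), (34, True), (32, True), (28, True), (27, False),
]


def average_age(residents):
    """Average age of everyone, which is all an area-level dataset gives you."""
    return mean(age for age, _ in residents)


def average_cyclist_age(residents):
    """Average age of only the people who actually own a bicycle."""
    return mean(age for age, cycles in residents if cycles)


print(f"Average age of residents: {average_age(STREET)}")
print(f"Average age of cyclists:  {average_cyclist_age(STREET):.1f}")
```

The street average lands on 50, yet every bicycle owner in this made-up sample is in their late 20s or 30s. Joining sales per area to average age per area tells you about the area, not about the people who bought the bicycles.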
Maybe "Big Data", Hadoop, and HDInsight do give us new ways to analyze the vast tracts of data that we're all collecting these days. But what's worrying is that, without applying some deep knowledge of statistical analysis techniques, will we actually get valid answers?
It's amazing how often you get the feeling with computers that someone has virtually trampled on your toes, or unceremoniously shoved you out of the way. The latest Patch Wednesday updates (here in England, patch Tuesday usually catches up with us on Wednesday) seemed to coincide with a driver update for NVidia cards to resolve a vulnerability, and since then I've been trying to clean up some picture files that NVidia feel they are free to dump into the My Pictures folder.
Of course, at first you have to wonder how just installing a video card can open up your computer to remote attacks that can take over the whole machine. Not that I had any idea up until last week that I actually had NVidia GeForce cards in my two Media Center boxes. Or, until I looked at the preference settings in the video driver console, that the driver checked for updates every day - and obviously not very successfully if it needs a Patch Wednesday to kick off the update process.
But what galled me was that, after the update, I had a new folder in the public pictures folder full of weird 3D sample files that the video card doesn't actually recognize as I didn't install the 3D driver. Because we use Media Center as our main TV and entertainment system, I like to manage the pictures folder so that we can browse our collection of digital photos, and they are also displayed by the screensaver. I don't really want some unusable files dropped in there, especially by an update that doesn't bother to ask for my permission.
And what's worse is that, on one of the two machines, I can't delete them. The owner, and the only account with permissions to delete them, is the built-in SYSTEM account - which is presumably used by the driver update program. I managed to add my domain admin account to the list of permissions and even take ownership, but I still can't delete them. I have no idea why. All I managed to do was set the hidden flag on the folder so that they don't show up in Media Center. Yet on the other machine I was able to delete the 3D sample files using my domain admin account.
Of course, it could just be that the flat file system of my hard drive doesn't recognize the extra dimension...
Some friends have just adopted a rather cute ginger cat and decided to name it Juno, perhaps after the Queen of the Roman Gods. Though it regularly leads to the interesting conversation: "What's your cat's name?" - "Juno" - "No I don't, that's why I'm asking"...
Meanwhile, here at p&p we're just starting on a project named after one of the new religions of information technology: Big Data. It seems like a confusing name for a technology if you ask me (though you probably didn't). Does it just consist of numbers higher than a billion, or words like "floccinaucinihilipilification" and "pseudopseudohypoparathyroidism"?
Or maybe what they really mean is Voluminous Data, where there's a lot of it. Too much, in fact, for an ordinary database to be able to handle and query in a respectable time. Though most of the examples I've seen so far revolve around analyzing web server log files. It's hard to see why you'd want to invent a whole new technology just for that.
Of course, what's at the root of all this excitement is the map/reduce pattern for querying large volumes of distributed data, though the technology now encompasses everything from highly distributed file systems (HDFS) to connectors for Excel and other products to allow analysis of the data. And, of course, the furry elephant named Hadoop that sits in the middle remembering everything.
Thankfully Microsoft has adopted a new name for its collection of technologies previously encompassed by Big Data. Now it's HDInsight, where I assume the "HD" means "highly distributed". There's a preview in Windows Azure and a local server-based version you can play with.
What's interesting is that when I first started playing with real computers (an IBM 360) all data was text files with fixed width columns that the code had to open and iterate through, parsing out the values. The company where I worked used to have four distinctly separate divisions, each with its own data formats, but these had now been melded into one company-wide sales division. To be able to assemble sales data we had a custom program written in RPG 2 that opened a couple of dozen files, read through them extracting data, and assembled the summaries we needed - we'd built something vaguely resembling the map/reduce pattern. Though we could only run it at night because it prevented most other things from working by locking all the files and soaking up all of the processing resources.
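For those who never had the pleasure, reading fixed-width files looks roughly like the Python sketch below. It is nothing like the original RPG 2 program; the column layout, the division names, and the helper functions are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical record layout: (field name, start column, end column).
LAYOUT = [("division", 0, 8), ("product", 8, 18), ("amount", 18, 26)]


def make_line(division, product, amount):
    """Build a sample record, padded to the column widths in LAYOUT."""
    return f"{division:<8}{product:<10}{amount:>8}"


def parse_record(line):
    """Slice one fixed-width line into named fields."""
    record = {name: line[start:end].strip() for name, start, end in LAYOUT}
    record["amount"] = int(record["amount"])
    return record


def summarize(files):
    """Read every record in every file and total the amounts per division."""
    totals = defaultdict(int)
    for lines in files:
        for line in lines:
            if line.strip():
                record = parse_record(line)
                totals[record["division"]] += record["amount"]
    return dict(totals)


# Two made-up "files", each a list of lines.
north_file = [make_line("NORTH", "WIDGET", 120), make_line("NORTH", "GADGET", 30)]
south_file = [make_line("SOUTH", "WIDGET", 75)]
print(summarize([north_file, south_file]))  # {'NORTH': 150, 'SOUTH': 75}
```

Every file has to be read from start to finish, and the code has to know the column positions in advance, which is exactly the kind of palaver a query language takes away.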
Thankfully relational databases and Structured Query Language (SQL) put paid to all that palaver. Now we had a proper system that could store vast amounts of data and run fast queries to extract exactly what we needed. In fact we could even do it from a PC. And yet here we are, with our highly distributed data and file systems, going back to the world of reading multiple files and aggregating the results by writing bits of custom code to generate map and reduce algorithms.
But I guess when you appreciate the reasons behind it, and start to grasp the concepts of the vast amounts of data involved, our new (old) approach starts to make sense. By taking the processing to the data, rather than moving the data around, you get distributed parallel processing across multiple nodes, and faster responses. And when you discover just how vast some of the data is, you realize that our modern relational and SQL-based approach just doesn't cut it.
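In miniature, the idea looks something like this Python sketch. It is only the general shape of the pattern, not Hadoop's or HDInsight's API; the log lines, the three "nodes", and the function names are all made up, and here the map step summarizes each node's data before anything is sent anywhere.

```python
from collections import Counter
from functools import reduce


def map_node(log_lines):
    """Map step: runs where the data lives and returns a small summary."""
    return Counter(line.split()[1] for line in log_lines if line.strip())


def reduce_counts(partials):
    """Reduce step: merge the per-node summaries into one result."""
    return reduce(lambda total, part: total + part, partials, Counter())


# Made-up log lines ("method path") held on three imaginary nodes.
NODES = [
    ["GET /home", "GET /about", "GET /home"],
    ["GET /home", "POST /login"],
    ["GET /about"],
]

# On a real cluster each map_node call would run in parallel on its own node.
partials = [map_node(lines) for lines in NODES]
hits = reduce_counts(partials)
print(hits.most_common())
```

Only the small per-node counts travel across the network, never the log files themselves, which is where the saving comes from when the files are enormous.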
Though there are some interesting questions that nobody I've spoken to so far has answered satisfactorily. What happens when you need more than just a simple aggregate result? It seems likely that the map function needs to produce a result set that is considerably smaller than the data it's working on, and if there is little correspondence between the data in each node the reduce function won't be able to do much reducing.
Maybe I just don't get it yet. And maybe that's why being just a "database programmer" is no longer good enough. Now, it seems, you need to be a "data scientist". You not only need to know about Database Theory, but Agile Manifesto and Spiral Dynamics as well according to DataScientists.net. You're going to spend the rest of your life organizing, packaging, and delivering data rather than writing programs that simply run SQL queries.
But it does seem that data scientists get paid a lot more, so maybe this Big Data thing really is a good idea after all...
So it's New Year resolution time again, and it's pretty clear after many previous unsuccessful iterations that the usual crop consisting of more exercise, better diet, and giving up smoking is a waste of time. Therefore, after several months of playing host to an assortment of builders and tradesmen, this year's resolution is more DIY.
What's annoying is that most tradesmen seem to be in a mad rush to get to the next job, and so don't have time for those little finishing touches (which, as my wife says I'm a perfectionist, are so important). Some days it really did feel like I might as well have done the job myself. For example, over the last several weeks I've been:
Meanwhile the people who delivered the rubbish skip and promised to come back for it the next morning left it here for a week so my front lawn now has a square hole that, after all the rain we've had, resembles a small swimming pool.
Realistically, though, many of the jobs they tackled were beyond my level of competence or patience. I can see that my attempts to plaster a ceiling or tile a floor would probably be a disaster, and completely rewiring a kitchen is likely to require some level of theoretical knowledge of the regulations that I don't have.
But, hopefully, it will be another fifteen years before we need to do anything else to the house. And I'll be retired long before then, so I'll have plenty of spare time. However, my wife says we're never going to go through this again - we're going to move house instead. Though I'm not sure that would be any less stressful...