Writing ... or Just Practicing?

Random Disconnected Diatribes of a p&p Documentation Engineer

  • Writing ... or Just Practicing?

    General Cluster's Probably Not Last Stand

    I don't know if General Custer ever made a last stand against the Apache, but I feel like I have. My Apache is, of course, the Hadoop one. Or, to be technically accurate, Microsoft Azure HDInsight. And, going on experience so far, this is unlikely to actually be the last time I do it.

    After six months of last year, and about the same this year, it seems like I've got stuck in some Big Data related cluster of my own. We produced a guide for planning and implementing HDInsight solutions last year, but it's so far out of date now that we might as well have been writing about custard rather than clusters. However, we have finally managed to hit the streets with the updated version of the guide before HDInsight changes too much more (yes, I do suffer odd bouts of optimism).

    What's become clear, however, is how much HDInsight is different from the typical Hadoop deployment. Yes, it's Hadoop inside (the Hortonworks version), but that's like saying battleships and HTML are the same because they both have anchors. Or cats and dogs are the same because they both have noses (you can probably see that I'm struggling for a metaphor here).

    HDInsight stores all its data in Azure blob storage, which seems odd at first because the whole philosophy of Hadoop is distributed and replicated data storage. But when you come to examine the use cases and possibilities, all kinds of interesting opportunities appear. For example, you can kill off a cluster and leave the data in blob storage, then create a new cluster over the same data. If you specify a SQL Database instance to hold the metadata (the Hive and HCatalog definitions and other stuff) when you create the cluster, it remains after the cluster is deleted and you can create a new cluster that uses the same metadata. Perhaps they should have called it Phoenix instead.

    We demonstrate just this scenario in our guide as a way to create an on-demand data warehouse that you can fire up when you need it, and shut down when you don't, to save running costs. And the nice thing is that you can still upload new data, or download the existing data, by accessing the Azure blob store directly. Of course, if you want to get the data out as Hive tables using ODBC you'll need to have the cluster running, but if you only need it once a month to run reports you can kill off the cluster in between.

    But, more than that, you can use multiple storage accounts and containers to hold the data, and create a cluster over any combination of these. So you can have multiple versions of your data, and just fire up a cluster over the bits you want to process. Or have separate staging and production accounts for the data. Or create multiple containers and drip-feed data arriving as a stream into them, then create a cluster over some or all of them only when you need to process the data. Maybe use this technique to isolate different parts of the data from each other, or to separate the data into categories so that different users can access and query only the appropriate parts.
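
    As a rough illustration of addressing data spread across several accounts and containers, here is a minimal Python sketch that builds the wasb:// URIs a cluster uses to reach blob storage. The account names, container names, and the helper function are all hypothetical, invented for this example; only the URI layout (container, then account, then path) reflects how HDInsight addresses blobs.

```python
def wasb_uri(account, container, path=""):
    """Build a wasb:// URI for a path inside an Azure blob container."""
    return "wasb://{0}@{1}.blob.core.windows.net/{2}".format(
        container, account, path.lstrip("/"))


# Hypothetical storage accounts and containers, for illustration only.
SOURCES = [
    ("stagingdata", "incoming-stream-01"),
    ("stagingdata", "incoming-stream-02"),
    ("proddata", "sales"),
]

# One URI per container the cluster should process.
uris = [wasb_uri(account, container, "logs/") for account, container in SOURCES]
for uri in uris:
    print(uri)
```

    Keeping the list of accounts and containers separate from the code that builds the URIs makes it easy to point a new cluster at a different combination of data each time.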

    You can even fire up a cluster over somebody else's storage account as long as you have the storage name and key, so you could offer a Big Data analysis service to your customers. They create a storage account, prepare and upload their source data, and - when they are ready - you process it and put the results back in their storage account. Maybe I just invented a new market sector! If you exploit it and make a fortune, feel free to send me a few million dollars...

    Read the guide at http://msdn.microsoft.com/en-us/library/dn749874.aspx

  • Writing ... or Just Practicing?

    Defensive Writing

    One of the facts of life when you write technical documentation and guidance is that it will get reviewed by other people, resulting in regular changes to the content as you try to follow shifting advice, conflicting feedback, and suggestions that sometimes even make sense. It doesn't help, of course, if the technology you are documenting is also a moving target.

    I don't profess to be an expert in all the technologies we cover, but I generally have a good grasp of the fundamentals for each one - such as what it's supposed to do, how it does it, and how you can use it. But I depend on reviewers and feedback to make sure I covered all the relevant points, and that what I've written is accurate as well as being useful.

    Over the years, I've come across many situations where it's useful to write defensively in order to minimize errors and illogical content, and to reduce the work required to get the stuff finished and out of the door. While they might not be applicable to everyone, here's a few things to think about:

    • Don't include a number in the introduction to a list. The number of items will change. A list of four items that starts "The following four types of data store..." will look silly after you add another one in response to reviewers' comments and forget to change it to "five".
    • Look out for overuse of your favorite words. I found I'd used "encompass" three times in one paragraph before I proof-read it. Use a thesaurus (even the Word built-in one) to find equivalent words and phrases.
    • Beware of including numerically accurate information such as "costs x" or "is x times faster" because it will be wrong next week. Words such as "considerably", "minimal", and "cost-effective" are often just as useful.
    • "At the time of writing..." makes sense only to you. Put the date in.
    • Check out commonly accepted information before you accept it. For example, Hadoop was originally created at Yahoo!, not by Apache.
    • Beware of your "only" positioning:
      • "Only developers can use feature x to confirm the results" (nobody else can use feature x to do this)
      • "Developers can only use feature x to confirm the results" (developers can't modify feature x)
      • "Developers can use only feature x to confirm the results" (developers can't use any other feature to do this)
      • "Developers can use feature x only to confirm the results" (developers can't use feature x for any other purpose)
    • Leave all the comments in until you go to release. You'll get conflicting feedback from different reviewers and you need the history to figure out why you changed something in the document, and what actually is correct.
    • Get inside your reviewers' heads. Some will comment only on the bits that interest them, so you can't assume the rest is accurate. Others will offer half-thought-through or off-topic suggestions that are not directly relevant (but might be useful elsewhere). Look out for hobby horses and special interest comments that will bloat the content without adding anything useful. Lazy comments such as "You need to cover other stuff here as well" that don't say what you missed probably need to go back to the reviewer, but don't expect much additional help.
    • Beware of the cleverness of word processors such as Microsoft Word. Removing a comment can sometimes insert a space, which can be a problem in code listings. Deleting the word before a word that starts with punctuation (such as ".NET") removes the space before the word. For example, deleting "Microsoft" from "the Microsoft .NET framework" results in "the.NET framework".
    • Minimize deep linking by linking to the home page of another site where possible, and tell the reader where to look, as long as it does not make the reference unusable. For example, "See the documentation for feature x on the [link]targetsite.com[/link] website" or "Search [link]targetsite.com[/link] for 'Configuring feature x'". You get fewer broken links this way.
    • If you use numbered figures or schematics, and don't (or can't) use the automatic figure numbering features of your word processor, minimize the times you reference the figure by number because you'll add and remove figures over time. You can start the paragraph before or after the figure with "This screenshot shows..." or "In this schematic, feature x is ...". Referring to a figure in a different topic by number is risk taking at the extreme.
    • Proof-read your content in more than one format. If you proof it only in your word processor, you'll probably miss errors that become obvious when it's displayed as HTML. I'm not sure why this is - perhaps it's to do with word positioning due to the layout and line wrap. Or familiarity with the content in the format and layout where it was originally created.
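
    Some of the checks in the list above are mechanical enough to automate. The following is only a sketch of how that might look in Python: the patterns and labels are my own hypothetical choices, and they cover just three of the points (counted list introductions, undated "at the time of writing", and figures referenced by number).

```python
import re

# Hypothetical patterns; illustrative only, not a complete style checker.
NUMBER_WORDS = r"(?:two|three|four|five|six|seven|eight|nine|ten|\d+)"
CHECKS = [
    (re.compile(r"\bthe following " + NUMBER_WORDS + r"\b", re.I),
     "counted list introduction"),
    (re.compile(r"\bat the time of writing\b", re.I),
     "undated 'at the time of writing'"),
    (re.compile(r"\b(?:figure|fig\.)\s+\d+\b", re.I),
     "figure referenced by number"),
]


def find_risky_phrases(text):
    """Return (line_number, label, matched_text) for each risky phrase."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for pattern, label in CHECKS:
            for match in pattern.finditer(line):
                findings.append((line_no, label, match.group(0)))
    return findings


sample = ("The following four types of data store are common.\n"
          "At the time of writing, see Figure 3.")
for finding in find_risky_phrases(sample):
    print(finding)
```

    A script such as this only flags candidates; deciding whether each one is really a problem still needs a human proof-reader.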

    And finally, my own hobby horse: always remember the phrase "Time flies like an arrow but fruit flies like a banana". Use "such as x" rather than "like x" when giving examples of things...

  • Writing ... or Just Practicing?

    Snakin' All Over

    My new pet snake is installed, working, and really flies. Deathly silent, yet it instantly responds to every command. It's like somebody speeded up the world. Or at least speeded up my television. And, yes, this is a follow-on from last week's rambling post about our new "Mamba" Media Center box from QuietPC.com. In fact, even the title continues the not-quite-a-song theme.

    The long and sometimes tortuous setup and installation is over. It's nestled neatly in the TV cabinet, and after a few days' use it really does seem to be a superb machine - and a significant upgrade from the old I-US Media Center box. OK so most of the setup hassle was my fault (more later) because I wanted it to be on my local domain and integrated with the network. It needs to have remote Event Log access turned on, my "failed recording" monitor service installed, a custom screensaver, auto logon, and a few other tweaks.

    What surprised me, though, were the benefits from the new TV cards. The old box had only one PCI slot, whereas most modern tuner cards are PCI-E only these days so I had to choose between terrestrial (DVB-T) and satellite (DVB-S). And none supported HD. The new Mamba has a dual DVB-T2 (HD) and a dual DVB-S2 (HD) card. And, amazingly, Media Center accepted both, and tuned both of them, so that we now get all of the terrestrial and the satellite channels. You can still record from only two tuner instances concurrently (either on the same tuner card or one from each tuner card) and watch a previously recorded program at the same time. But it's wonderful to get back some old favorite channels that aren't on satellite, and to finally be able to get all the HD channels.

    Of course, the actual tuning process is still a pain, and really does need to come closer to the capabilities offered by ordinary TVs that can detect broadcast update signals and automatically retune channels that move around. Media Center has the facility to add new channels, but it never seems to fully work. In the past, when they moved channels around, I had to do a complete re-setup of all the channels - which means getting back the 500+ I don't want and had removed from the guide, and having to go through the laborious process of finding listings for channels where the channel name and the listing name are slightly different. Though maybe in the Windows 8 version of Media Center it will work better. No doubt I'll find out in time.

    The final setup process was made more infuriatingly slow by a couple of unexpected hitches. For some reason, Media Center no longer has an option to start automatically when the system restarts from cold or when a user logs on. I have no idea why this option was removed, and it seems from a web search that lots of people are annoyed about it and have found an equally large number of kludges to fix it, including creating a profile and using a batch file in the \ProgramData\Microsoft\Windows\Start Menu\Programs\Startup folder. However, another solution seems to be obvious. Create a scheduled task that runs at logon and executes the file %windir%\ehome\ehshell.exe, and set the taskbar to auto-hide.
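
    For anyone who prefers to script the scheduled task rather than click through Task Scheduler, here is a minimal Python sketch that assembles a schtasks command with a run-at-logon trigger. The task name and the helper functions are hypothetical, and I'm assuming the default Media Center location under the Windows folder; it does nothing about the taskbar auto-hide setting.

```python
import os
import subprocess


def media_center_path():
    """Return the assumed default location of the Media Center shell."""
    windir = os.environ.get("windir", r"C:\Windows")
    return windir + r"\ehome\ehshell.exe"


def build_logon_task_command(task_name="Start Media Center", program=None):
    """Return the schtasks arguments that register a run-at-logon task."""
    if program is None:
        program = media_center_path()
    return [
        "schtasks", "/Create",
        "/TN", task_name,    # task name (hypothetical, pick your own)
        "/TR", program,      # program the task runs
        "/SC", "ONLOGON",    # trigger: whenever a user logs on
    ]


def register_logon_task():
    """Create the task; Windows only, and may need an elevated prompt."""
    subprocess.run(build_logon_task_command(), check=True)


print(" ".join(build_logon_task_command()))
```

    Printing the command first, and only calling register_logon_task() once it looks right, is a cheap way to avoid creating a task that points at the wrong file.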

    But the most annoying quirk was that my custom screensaver that displays details of photos never appeared. All I got was a nausea-inducing scrolling, panning, and zooming screenful of black and white photos with odd ones occasionally appearing in colour - despite the Lock screen slideshow being turned off and my screensaver properly configured in Windows Personalization settings. I played with this for ages before finally searching the web for solutions. Most of which are totally confusing because they say to turn on the slideshow and then turn off the option to "show the lock screen instead of turning off the screen".

    I even followed the advice on one site to use gpedit to disable the Lock screen altogether, but it made absolutely no difference. After I finally gave up and went back to configuring Media Center I found the screensaver option within the Media Center interface. Which is helpfully turned on by default. The Lock screen slideshow I was trying to get rid of wasn't actually the Lock screen at all. No wonder I had problems! After turning the Media Center screensaver off my own screensaver works fine. Doh!

    I'm still not sure I'd recommend Media Center as a replacement for a normal TV to my non-technical friends, but it really is a superb system if you know something about computers, are prepared to fiddle with it, and accept the few shortcomings such as the usual need for updates and other maintenance tasks. Even the smart TVs I've seen can't compete with the full range of capabilities and flexibility of a powerful computer driving a big wall screen.

    But I have to run. Now that I've got the "Dave" channel back again, there's ten episodes of "The Professionals" from 1978 I need to watch...

  • Writing ... or Just Practicing?

    Papa Loves Mamba

    Media Center is alive and well! Yes, you can buy a proper no-noise Media Center appliance that just works out of the box, does satellite and terrestrial TV, and looks good on your TV stand. You can even watch the YouTube video on it that inspired the title of this week's rambling post.

    Our current I-US Media Center box is struggling. It's had a selection of new hard disks, and is on its second button cell memory BIOS backup battery and third video card. It sounds like a bag of bolts when it's cool, and evolves into a jumbo jet by the time it's warmed up and is recording two concurrent TV programs. Vacuuming five years of accumulated crud from the heat-pipe radiator did help a bit, but the fact that it sometimes takes three reboots to find the hard drive makes me increasingly nervous about its longevity.

    So it's being replaced by a shiny new one. Or, to be more accurate, a matt black new one - the attractive-in-an-understated-way Mamba from a local company here in the UK called QuietPC.com. It looks superb, feels really solid, and has an impressive component spec. Even the packaging is glorious, and you get all of the manuals and O/S disks you could ever want - including a magnified photocopy of the Windows product key in case your eyesight is failing and you can't read the label on the back of the case.

    So far the initial setup experience has been excellent. It comes pre-configured for use as a TV with all the proper BIOS settings, there's no junkware installed, and it's absolutely silent when running - without showing any signs of getting beyond mildly warm. No doubt the separate power supply helps, and the solid aluminum case. We'll see what happens when it's under a real-life load recording two programs at the same time as my wife is watching yesterday's episode of Coronation Street. Mind you, the O/S is on an SSD and the main data drive is a hybrid beast with SSD cache, so it should be fairly speedy.

    As usual, the hardest part of the setup is deciding on the name for the new PC. I was tempted by "Dendroaspis" (as in Dendroaspis polylepis, the black mamba tree snake), but that seems a bit too esoteric. There is a Finnish schlager band called Mamba, but their best-known songs are all unspellable words, and I'm not sure you can use accented letters in a NetBIOS network name anyway. I can imagine trying to solve some weird errors that might cause in my Active Directory and WSUS servers. In the end I settled on MAMBA-TV in case I forget what it is next time I'm doing my pretend-to-be-a-network-administrator thing.

    The one "not quite fully prepared" bit is Media Center itself. I specified Windows 8.1 O/S, so I had to buy and install the Media Center add-on for a few pounds. Which would have been fine except that the credit card payment system they use was broken that day, so I ended up having to open a PayPal account and then close it again afterwards (part of reducing my attack surface). At one time you could just use PayPal to make a payment without needing to open an account - but I guess (like most other sites) they want to capture your personal information to sell to advertisers.

    And how come the printer driver that Windows Update offers for my old Dell 5100CN printer is broken? Every attempt to print something just raises an error. I had this issue on another 8.1 machine, though it did start working after a few uninstall/reinstall passes. But as I'll rarely print from this machine, that's a minor issue. I installed a driver that's similar and works instead, even though it has fewer image adjustment options.

    However, my custom screensaver that displays our photos when nothing else is happening is a necessity, and unfortunately I compiled it to use .NET 3.5. I had the usual error 0x800F0906 "Download failure" when it tried to install the .NET 3.5 framework because I use Windows Server Update Services (WSUS) to manage patching on my network. The solution, and a description of why it occurs, is in this blog post.

    Now I just need to spend a day setting up the final bits and pieces, and adapting the wiring and the ventilation holes in the cabinet in the lounge where it will live. And configuring the TV setup and channel guide for the twenty or so channels that are worth watching out of the 600+ channels of junk that arrive over Free-Sat...

  • Writing ... or Just Practicing?

    my data are getting littler

    It looks like our Big Data is getting smaller. At least that's the impression I get from the exhaustive investigation carried out by the editor on our current project. Of course, "data" is actually a plural word so perhaps it's just that each datum is getting smaller. Or maybe there are fewer (not "less") of them.

    The reason for all of this shrinkage is that, according to our most recent documentation style guides, there are no capital letters in our large quantities of information. It's now only "big data" and no longer "Big Data". Is it just me that feels it somehow loses its impact when deprived of capitals? "Big Data" seems to be saying "look at me - see how huge and important I am!" But "big data" just looks like it's shrinking into a corner and hoping it won't be noticed.

    And the enforced decapitalization makes some of my wonderfully delicate and skilfully crafted text sound plain odd. After I watched a documentary about how Big Data techniques are being used by many organizations around the world, I came up with:

    Police forces are using Big Data techniques to predict crime patterns, researchers are using them to explore the human genome, particle physicists are using them to search for information about the structure of matter, and astronomers are using them to plot the entire universe. Perhaps the last of these really is a big Big Data solution!

    But now that last sentence makes no sense at all; it just looks as though I had brain fade while I was writing it:

    Perhaps the last of these really is a big big data solution!

    I suppose it's all part of our ongoing drive to remove capital letters from the language - perhaps to suit the "txting" generation who don't know what a Shift (shift?) key is. My rants in An Upper Case of Indecisive Instruction, Hyphenless Decapitalization, and even I Can't Yell Any More obviously had no impact at all on our style guidance team.

    perhaps it's time i gave up my crusade, and just be grateful i have a job at microsoft writing about sql server and azure...

  • Writing ... or Just Practicing?

    Seventeen Syllable File Delete in ASP.NET

    The days when I played with ASP.NET web sites all the time have, perhaps unfortunately, gone. Now my day job has me delving into grown-up stuff such as the wonderful worlds of cloud service infrastructure, design patterns, systems and software architecture, Big Data, and even open source projects named after animals.

    So a simple task this week, as I applied my almost forgotten skills to get my custom kludged-together server status website to delete old log files, was thwarted by permission errors. And, rather strangely, it coincided with a week when I watched John Cooper Clarke reciting some of his famous haiku on the BBC TV program "Have I Got News For You". Including:

    Freezing sentiment
    in seventeen syllables
    is very diffic

    (See the original version on YouTube)

    So, here we go:

    ASPNET's gone
    So all I get is denied
    When I delete files

    And even:

    Not NETWORK SERVICE
    It was IUSR-machine
    but now something else

    Yep, I tried setting the target folder permissions to allow the ASPNET account to delete files, but there is no ASPNET account in Windows Server 2008 R2. Instead, I tried setting the permissions for the NETWORK SERVICE group, but that doesn't work either. And there is no IUSR_[machine-name] account anymore, so that was another dead end.

    So I went off and looked for the name of the account that the Default App Pool runs under, and found one I'd never heard of before: "ApplicationPoolIdentity". Which isn't actually an account. But various blogs say there is an IIS AppPool\[app-pool-name] account that you can configure. Err, no there isn't on my server. Another blog post suggested simply changing the Default App Pool to run under the NETWORK SERVICE account, but I didn't fancy that in case I broke something else.

    Yet another blog said that ApplicationPoolIdentity is a member of the Users group, and you should just set permission on that account group. Which didn't work either. Finally I found this page, which explains that all you need to do is set the permissions on the new IIS_IUSRS account. Which worked.

    In the end it had taken two minutes to add a "Delete" button and OnClick handler to the page, but the best part of an hour finding the right information, learning about how IIS works, and fiddling with permissions. But I guess that sometimes you do have to teach old dogs new tricks...
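
    The delete logic itself really is the easy part. As a rough sketch, written in Python rather than as the ASP.NET handler on my page, the following removes log files older than a given age and reports any it was denied access to instead of failing outright. The function name, the 30-day default, and the "*.log" pattern are all assumptions made for the example.

```python
import time
from pathlib import Path


def delete_old_logs(folder, max_age_days=30, pattern="*.log"):
    """Delete matching files older than max_age_days.

    Returns two lists of file names: those deleted, and those that could
    not be deleted because permission was denied.
    """
    cutoff = time.time() - max_age_days * 86400
    deleted, denied = [], []
    for path in sorted(Path(folder).glob(pattern)):
        # Skip directories and anything modified since the cutoff.
        if not path.is_file() or path.stat().st_mtime >= cutoff:
            continue
        try:
            path.unlink()
            deleted.append(path.name)
        except PermissionError:
            # The account running the code lacks delete rights here.
            denied.append(path.name)
    return deleted, denied
```

    A non-empty "denied" list is the symptom described above: the code is fine, but the account it runs under has no permission to delete from the folder.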

  • Writing ... or Just Practicing?

    The Internet Of Letterboxes

    It's currently fashionable here in England to castigate large companies for not voluntarily paying more tax than they are legally obliged to pay. Amongst the so-called offenders is Amazon, who have grown a huge order management, warehousing, and distribution network throughout the UK. Probably because we seem to be one of the countries that has wholly embraced the concept of online purchasing.

    I'm a regular Amazon customer for a whole range of goods, and I ignore the pleas of those who demand a boycott on "tax avoidance" grounds. Basically I'm on the side of those who feel that if the Government thinks they aren't collecting enough tax from a huge employer who pays for large expanses of real estate and employs lots of people, when that company is following the rules, then they need to change the rules rather than just complaining.

    And as a regular customer, I finally gave in and signed up for the "Prime" service where deliveries are made by Amazon themselves (or some subcontracted organization). It wasn't that expensive when I joined, though I may baulk at renewing now that it's gone up dramatically in price because you get the "Instant Video" streaming service in with the package (I've had it for four months and only managed to watch the first two minutes of one film to see if it worked).

    But what does amaze me is the efficiency of the delivery service. In the past they used to say that online purchasing would never really catch on because people wouldn't be prepared to wait for goods to be delivered, or they would get lost or damaged in the post, or wouldn't look anything like the picture when they arrived (though you could say the same about most microwaved instant meals).

    I preordered the latest Stephen Booth book some while ago. I love his books because they are set in the Derbyshire Dales and Peaks, right here in our part of the country, and I know all the towns and places his characters visit. In fact they even came to a café in our local town in one book, visited the aquarium in Matlock Bath where we went last year, and drove along the roads that we regularly use. And his books are a good read as well, of course.

    But, straying back onto topic, the amazement that prompted this rambling post was that I got an email yesterday to say that the new book had been dispatched. Then another at 9:25 AM today saying "We're going to deliver your order today. If there's nobody in when we arrive we'll post through your letter box if possible, leave with an available neighbour or in your preferred safe-place, if you've previously provided us with those details." OK, that in itself is not so amazing. But at 2:15 PM I got another email saying "Your order, containing the item(s) listed below, has been posted through your letterbox." I went and looked, and they were right!

    Never mind cloud-connected thermostats, online fridges, and the Internet of Things. I've got someone who emails me to tell me when to look in my letterbox...

  • Writing ... or Just Practicing?

    Bluetooth Made Me Un-picture My Contacts

    Can software become more complicated and yet still be easy to use? It seems that, unfortunately in some cases, it can - and does so whether you like it or not. I just spent two hours trying to fix the Bluetooth connection between my wife's phone and her car, and discovered just how unfortunate it can be.

    From a standing start only a few years ago my wife has dived head first into our exciting, online, socially-connected world. It took me ages at first just to persuade her that she needed a mobile phone. Now she's fully immersed in the digital delights of tablets, smartphones, email, Facebook, YouTube, and more. And it all seems to merge into some amorphous mass of transient information delivery with a useful lifespan of twenty minutes or less.

    Except for one aspect: contact information. Maybe it's something to do with the combination of Google, Facebook, Android, and Exchange ActiveSync on her phone, but her contacts list grows magically by the day - and every entry is populated with a photo. And entries get magically linked together, with data from multiple sources, so that figuring out how to edit one becomes a nightmare. Even when you do edit it, the stuff you changed seems to get switched back again the next day.

    Most of the time this isn't a problem. She loves that her phone shows her friends' latest photo when they call her, and that the People list has pictures that get updated automatically. I have to admit that it's all very clever stuff. However, her car doesn't seem to agree. Like many modern vehicles it has a Bluetooth hands-free connection for the phone, allowing you to make and answer calls while your phone is in your pocket. It displays all her contact phone numbers, including lists of the top 10 and a search feature. And it's voice-activated as well, so there's no loss of road/eye contact.

    Or it was until it stopped working last week. Now the car just complains that it can't find any phones, and prompts to "start pairing". Being a logical kind of person, I began my fault diagnosis by pairing my phone (exactly the same brand and model) and it worked fine, though it took ages to connect. After a bit of investigation, it seemed that the connection was held up while it was downloading the contacts list. My phone has photos for some contacts (I cloned the list from my wife's some while back so that I had all our "vital numbers" in my phone), and also includes several entries that are auto-populated from our corporate SharePoint because the ActiveSync is to my Microsoft email account.

    Aha! I wonder if it's the photos that are the problem? So I fire up OWA to delete the photos from the contacts as an experiment. But you can't. There seems to be no way other than deleting the contact and recreating it. Next, try "real" Outlook 2013. Again, no option in the Edit pane to remove a photo from a contact. And then I notice the list of "sources" from which the information for each contact is collated. How clever, and how annoying. It was only after a search of the web that I found you have to choose one of the entries in the "View Source" list, where you can right-click the photo and select "Remove picture".

    After half an hour of this multi-step rigmarole I had a list of photo-less contacts (except for the corporate contacts that it refuses to remove). And, back in the car, the phone connected and populated the contacts list in less than 30 seconds. So obviously that's the problem. The car tries to load the photo-populated contacts list that's multi-megabytes in size, times out part way through, and decides that it can't connect to the phone.

    Of course, there's a pop-up dialog when you pair a phone that asks if you want to allow access to the contacts list on the phone. Instead of saying yes, I tried saying no - thinking it would solve the problem. But then the phone becomes pretty much unusable through the voice-activated or in-car menu interface because there's no numbers, although you can answer incoming calls and dial numbers that you can remember. So it seems that you need to make a choice between pretty smiling faces for your contacts or usability in your car.

    Though, according to a recent survey I saw in the newspaper, only one person in ten actually remembers more than one phone number these days because the phone does all the harvesting and remembering of numbers automatically. Which was followed by another report that one in seven people become "highly stressed" if their phone battery runs down, and "would find life almost impossible" if they lost their phone!

    I admit that I worry about losing all the contacts that my wife and I have collected on our phones over time, but I reckon we're reasonably well protected because ActiveSync keeps the lists in our email accounts up to date, and I consciously export the contacts list to a file on a regular basis in case both of our email providers decide we're persona non grata at some point. But that's just my usual paranoia.

    Mind you, now I regularly have to suffer wife-generated complaints that our DECT house phones don't show the name of people when they call on the landline. If her mobile phone knows who the caller is, why doesn't the ordinary phone? Yes it has a contact list maintained by the base station that's available in all the handsets, and I did spend an evening entering the most commonly used numbers. But how do I justify to her that modern technology still has some wide disconnects, when a simple mobile phone can do everything by itself?

    Maybe it's time to switch over to IP Telephony in our house. I'm sure I've got an old Cisco router that does IPT in my collection of spare hardware. I wonder if it can do ActiveSync with an email contacts list...
