Random Disconnected Diatribes of a p&p Documentation Engineer
In the rapidly expanding realm of computing technologies, it's reasonable to assume that most developers have only a limited spread of knowledge. I regularly hear it said that keeping up with the welter of new frameworks, platforms, systems, and capabilities is almost impossible. Except that it's only occasionally you actually get abruptly confronted with this uncomfortable truth.
I freely admit that I'm fairly solidly fixed in the Microsoft world these days, but even then there are loads of areas that I have only viewed from afar. I've never tried to build an app for Windows 8 store; or played with StreamInsight and SQL Server Integration Services; or even seen an Xbox in action – never mind tried to write programs for it. And my experience with WPF, WCF, BizTalk, and SharePoint can optimistically be described as fleeting.
So the "how little you really know" event happened to me twice this week. The first was when writing map/reduce code for HDInsight. Unless you use the Hadoop streaming interface, or some fancy framework, the code has to be Java. Not a problem - Java isn't one of my strengths, but if you know a few procedural programming languages such as VB, C#, and Pascal (as opposed to functional languages such as Lisp) working in a different one is not a major problem.
In fact, a friend who is multi-lingual often remarks that, once you've learned a couple of foreign languages, adding new ones is much easier. So it is with programming languages. You just need to figure out the equivalent dictionary words (in programming terms, the objects and methods) and master the pronunciation (the programming runtime environment).
Yet, try as I might, I could not get my Java code to execute. It compiled fine without errors, and loaded. But it seems that I missed some fundamental stage between the compiler and the runtime environment. Perhaps because there are endless different examples and reference topics on the web that say different things, and the object libraries in Hadoop on HDInsight seem to bear no relationship to the online docs and examples.
I guess the days of being a "developer" in the IT world are long gone, but maybe even a specialization such as "web developer" is now a thing of the past. Perhaps we are an industry of increasingly narrow focused specializations, because each is so complex - and is just one of a rapidly expanding domain. Maybe now you need to be a "rapid Android app developer", or an "SEO optimization engineer", or even a "presentation style management administrator".
But I suppose this fragmentation is just like what's happened in other, much older professions. I probably wouldn't want an osteopath to fix my teeth, or a pulmonary hypertension cardiologist to write the prescription for my new spectacles...
How could they get it so wrong? I've been very happy with all the other Netgear hardware scattered across my network, including ADSL modems, switches, and the NAS box, but I'm beginning to wonder if the WNDR 4500v2 wireless router I upgraded to last year was such a great idea. Especially when a firmware update seems so problematic. It's a shame because, other than the management UI issues, it's a really nice piece of kit that seems to offer very solid, fast, and reliable wireless connectivity.
The latest problems all came about because I read of a serious security vulnerability in the wireless feature of Virgin cable modems, which it seems are based on Netgear wireless routers. I have wireless disabled in my Virgin modem, and you can't actually upgrade the firmware yourself anyway - I assume Virgin will do an automatic update at some point. But it prompted me to check for updated firmware for my Netgear wireless router (which I use as an access point for my network). Supposedly it checks automatically, but you can also kick off a version check manually.
So I did, and after 10 minutes it was obvious that it couldn't connect to the Netgear server. Maybe it uses some esoteric port that my firewall blocks, or maybe it's just broken. So I toddle over to the Netgear website and discover there is an update that fixes several issues and vulnerabilities. No problem; read the release notes, download it, and install it through the router's web UI. Which seems to have worked fine when everything comes back up again.
Interestingly, the release notes say you should do a full settings erase after upgrading, but then say that you should write down all the settings you changed from the default values, since you may need to re-enter them manually. My guess is that you'll definitely need to re-enter them afterwards. But mine is configured with a fixed IP address and set up as an access point, so I'd need to mess around plugging in wires just to reload the configuration from a previously saved config file (although this turned out to be the least of my worries).
Instead, after the update, I ran through the settings to confirm everything was as expected. It's nice to see that they have finally finished the UI section for setting up the router as an access point (see Missing The (Access) Point). And it actually does say "Access Point" in the main menu instead of the cryptic "AP Mode" entry. They even populated the empty section of the "help" pop-up. Though help sections for some other pages of the configuration seem to bear little relationship to the actual UI.
They also removed the link to configure the MAC address-based access control settings from its previous home, and now it lives in the main menu. And when I did find it, I was amazed (and seriously perturbed) to discover that it was completely disabled - and that half a dozen unrecognized devices were shown as connected. Reloading the previous configuration from a saved backup file made no difference. How on earth can they get away with that?
So I set about reconfiguring the access control using a list of MAC addresses I thankfully printed out a while ago. And realized what a hash they've made of what was a quite usable and informative approach in the previous version. Yes, after you turn on access control you can quickly allow or block any currently connected device. The list also shows the NETBIOS names of each device and the IP address on the network. Though several non-Windows devices don't show a name, and some don't show an IP address either. It does say in the UI that "intruders" will also show up in the list, but without a name how can you tell?
In the previous version it remembered all allowed devices and allowed you to add a description for each one so it was easy to see what they all are. In this new version you can create a list of "allowed devices that are not currently connected" and provide a description. Though you have to turn off any devices that are connected and reboot the router so they aren't shown in the "currently connected" list before you can add them as an allowed device with a description – otherwise you get a "duplicate MAC address error" message. And after all that effort, when they do connect again, the list doesn't show the name or description (even though the router now knows what they are) so you still don't know what's actually connected.
Besides which, it's a long multi-click routine to add each device to the allowed list, made worse by the fact that the list is hidden under a "Click here" link every time the page loads. And if you make a mistake and want to remove an item from this list you're back in the half-finished UI world. There's a checkbox next to each item and an "Add" button, plus a small unmarked blue square that turns out to be the "Delete" button when you adopt the usual practice of clicking wildly around the page to see what happens.
And then, as computers that are allowed access are shut down, they appear in the "allowed devices that are not currently connected" list. Except they often appeared with the last two segments of the MAC address set to "00" and no name/description. It's almost impossible to tell what's going on. Yet, strangely, after a few days it seems to have started remembering the names of devices - at least those that have a NETBIOS name - and successfully shuffles them from one list to another as they come online or go offline. Perhaps if I just leave it alone it will sort itself out.
You now also have to allow or block wired devices that are on your network, but don't use wireless. Where a device has both wired and wireless interfaces you have to allow both separately. Why? All this does is stop something physically connected to your network from trying to open the config UI. OK, it does add some extra security if you don't know who might get physical access to your network, but it seems perverse blocking this but still allowing wireless access to the config UI. I suspect that any intruders that manage to get into the premises will have more pressing things to do than plug their laptops into the router – even if they did remember to bring an Ethernet cable with them.
But at least Netgear did manage to populate the pop-up help section with useful advice about using the access control feature. Though it seems odd that they "strongly suggest" choosing the "Allow all new devices to connect automatically" option, rather than "Block all new devices from connecting". If you allow the connection of any previously unknown device that you didn't specifically add to the blocked list, what's the point in turning on the access control feature?
Mind you, MAC-based access control might be less vital if the router had the two most obvious security features that others seem to include - the ability to block access to the management UI from all non-wired connected devices (to prevent wireless intruders from accessing the configuration) and the ability to reduce the power of the wireless signal so that it doesn't fill the whole street. I was hoping to find these options in the updated firmware, but no luck. You can change the maximum speed of the wireless connection, but nowhere does it indicate if this changes the power of the signal.
Of course, I'm guessing that I'm in a very small minority of people who bother with setting up access control, and that millions of these routers will never see any firmware updates anyway because most users will set them up and never look at the management UI again until something breaks. Maybe the firmware updates should be applied automatically, as with Windows Update? Though an automatic update that automatically turns off security settings (as this one does) would be seriously worrying.
And should I actually be concerned about someone in the street connecting to my wireless network? They'd need to know the SSID (which I configure the router not to broadcast) and the passkey, though it seems that the latest firmware upgrade fixes a vulnerability that might allow intruders to bypass the authentication. Well, it would put them on my internal network behind the firewall, even though they'd need a username and password to connect to any other resource. It would also allow them to soak up some of my bandwidth, which could be a problem because one of my ISP connections is metered and chargeable beyond a certain limit.
Plus, with the increasing focus on ISPs blocking "inappropriate content" of various kinds, how long would it be before I get a visit from the thought police when my ISP records lots of attempted accesses to nefarious websites or illegal file sharing sites? I'm guessing that there will be plenty of technically savvy young people whose home connection is monitored or filtered, and who figure that someone else's Wi-Fi is an alternative source of connectivity.
However, it's increasingly the case that open Wi-Fi connections are popping up all over. When I first saw one or two appearing in my network connections dialog, bearing SSIDs that include the names of our major telcos, I wondered where they were coming from. The answer is that most new wireless routers include a guest network that is enabled by default. OK, so it's isolated from your own connection, but it shares your bandwidth. And I sincerely hope they also use a different IP address, or we're back with the thought police issue again. I haven't got round to testing this - I disable the guest network on all my routers, but I'll bet that most non-technical people don't even know it's there.
In fact it seems like a rather interesting (and somewhat insidious) way that the major telcos have found to widen Wi-Fi access without paying for it themselves, or even telling people what's happening. In most cases the customers have to pay for the router when subscribing to a package from an ISP, and they certainly pay for the electricity it uses. Though to be fair, and only because I have a business package, Virgin did tell me about the guest network capability of their modem. But that's because they punt it as an advantage - it allows visitors to my company premises to "enjoy the benefits of wireless connectivity".
Meanwhile I've discovered how hotels can afford to offer free Wi-Fi. During our recent trip to Iceland, the free hotel Wi-Fi required an email address and "click the link in the email" confirmation - which meant I had to use a real email address to avoid getting kicked off after 15 minutes. Since then I've been flooded with spam emails, all in Icelandic...
I really am trying to get used to the dumbed-down (sorry, I should say "user-friendly") move towards simple language and a less technical description of the options and features in modern software UIs. Messages such as "We're working on it" and "Something went wrong" feel like they would have been programmer's jokes only a few years ago, but now they are the accepted way to communicate with the "average user".
I came across another today on my Surface RT: "We've found new updates today, and we'll install them for you soon." No option to say "Well just do it now" or any indication of when "soon" might be. OK, so I can fire up the old Windows Update dialog from the Start screen and get all the usual functionality. But it's more the use of "we" that I find odd.
In the days when I wrote for Wrox Press here in England we used "we" extensively as a way to involve readers, and help them feel we were sharing their pain when programming or administering software. But when Wrox closed down and I started writing for US publishers I was told that you talked to readers, not worked with them. It was "you" not "we".
So does "we" in the software you use, rather than the books you read, mean something different? Are the programmers who wrote your O/S actually sharing your pain? I reckon the use of "we" is designed to make users think that there's a huge group of vigilant technical operators just waiting for them to turn their computer on and do something.
Maybe it's a bit like you see on those TV programs about nuclear power stations, or in NASA mission control, with hundreds of people fervently staring at banks of computer screens with slowly decrementing counters that determine when "soon" becomes "now" and they can "install them for you". Mike at desk 93 has just hit the big red button to install the latest updates for Mrs. Smith at 17 Willowlessgrove Avenue in Walmington-on-Sea, while Sarah at desk 426 is about to let Mr. Jones in Longleaf, North Carolina know that we've finally finished working on it.
Of course, what I see in real life is that the new simplified interface paradigm actually benefits most average users. And I'm sure that there's been a ton of research and market testing to prove it's only us technical geeks that find it annoying. In fact, I probably wouldn't have been quite so prompted to write this rambling diatribe had it not been for perusing the management UI of my Virgin cable modem to see if there was an update available (more on that next week).
As I was exploring I found the firewall settings page, and decided to check the configuration. Even when you choose "Advanced" mode, all you get is a drop-down list with three options: "Low", "Medium", and "High". And a pop-up help tip that says just "This will set how aggressive your firewall protection is". There's no indication of whether the setting covers inbound connections, outbound connections, or both, and what ports or protocols it affects.
The default aggression setting is "Low" and I wasn't sure if it would snarl at me and take a bite out of my leg if I chose "High", but I tried it anyway. Which resulted in nothing being able to connect to anything on the ‘Net. And on "Medium", everything seemed able to connect to everything (the same as on "Low"). In the end I left it set to "Low" – I've done a penetration test to prove all inbound ports are closed, and I have a configured firewall behind it in the load-balancing router, so I guess it's not really that important.
Mind you, I came across an interesting view on the use of "we" recently when talking on the phone to some sales guy. He said that you can tell the size of a company from whether people say "we" or "I". If it's a large organization, especially one with hundreds or even thousands of employees, the person talking to you will say "I" and "me", as in "send me some details of your interesting new product". If it's a tiny company or a one-man band, the person will say "we" and "us", as in "send us a free sample of your exciting new product" (i.e. no corporate gift policies).
But that's enough rambling from us for this week – we'll be working on it and writing again soon...
It's been many years since I switched from film to digital by selling my old Pentax SLR, extensive selection of quality lenses, and bag full of assorted attachments at some ludicrously low price. Since then my photographic arsenal has included several Olympus digicams. Yet I still haven't got the knack of successfully categorizing our ever-growing collection of photos.
At first it's easy, you just drop them into suitably named folders. Like most people, I suspect, I never quite get round to adding all the tags and other info that helps you search for photos. The problem comes as the collection grows. In our house, we use Media Center as the main TV, with a modified version of an old Coding4Fun screen saver sample (see "The Screensaver Ate My Memory") so that we get slideshows of photos at random when the system is idle. Yes, we actually get to see our photos regularly rather than them gathering virtual dust hidden away on a hard disk somewhere.
The screensaver presents them like the old Polaroid instant photos, with a caption containing the folder name and the date the photo was taken. However, increasingly I noticed that sometimes the date is wrong - usually because I fine-tuned the photo, scanned it from an old hard-copy print, or some friends sent it to us long after it was taken. But what really screwed things up was when, a few weeks ago, I was forced to reduce the total storage volume. I did it by running a macro in Paint Shop Pro that removes digital camera noise and shrinks most files by up to 60%.
As you can imagine, the result was that all of the photos now displayed the date I ran the macro, because - as I discovered by digging out the old source code - the screensaver reads the last-modified date of the file. No problem, I thought, just change it to read the created date instead.
If you haven't tried this, here's a tip: don't bother. I started off using the .NET File.GetCreationTime method, but that just gave some random result. So I dug back into the past and tried the old FileSystemObject we used to use in ASP scripts before the days of ASP.NET. And got the same result; obviously they use the same O/S functions. And if you get round to reading the blurb on the MSDN reference pages for the methods, you'll discover that you can't expect them to work. It says that NTFS caches the creation date, so it is only correct if you actually set it in code first - which is great if what you actually want to do is find out the date because you don't know what it is. Supposedly it only caches the value "for a short time", but waiting a day and rebooting the computer had no effect.
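To make the two timestamps concrete, here is a minimal Python sketch (not the screensaver's .NET code) that reads both for a file. The function name `file_dates` is made up for illustration, and the creation time is only returned where Python exposes it as `st_birthtime`, which is an assumption about the platform it runs on.

```python
import os
from datetime import datetime, timezone


def file_dates(path):
    """Return (modified, created) for a file as UTC datetimes.

    'created' is None when the platform does not expose a birth time.
    """
    info = os.stat(path)
    # The last-modified time changes whenever the file content is rewritten,
    # which is why batch-editing photos resets every caption date.
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    # st_birthtime is not available everywhere (an assumption to check on
    # your own system), so fall back to None rather than guessing.
    birth = getattr(info, "st_birthtime", None)
    created = None
    if birth is not None:
        created = datetime.fromtimestamp(birth, tz=timezone.utc)
    return modified, created
```

Neither value is the date the photo was actually taken, so a sketch like this only shows what the file system believes, not what the camera recorded.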
So, no problem, the cameras all know the date and time that the photo was taken and it will be in the EXIF properties of the file. Well it seems that everything you never really wanted to know, such as the aperture setting, shutter speed, quality setting, flash mode, lens manufacturer's name, and many other undecipherable values are there, but not the date and time - the field in the Origin settings for Date Taken is empty in every one. Err, why?
Ah, but the file name is a weird combination of letters and numbers (such as P0146752.jpg), which surely must be the date in some form of encoding. Well, after several hours looking at files, taking test photos, and playing Bletchley Park code-breaker, I couldn't figure it out.
In the end, I admitted defeat and decided that the obvious answer was to include some kind of tag in the filename that showed the month and year, and which could be easily extracted in the screensaver code for use in the caption. For some unaccountable reason I chose to add a tag of the form [t-MMM yyyy], so that the photos would have a filename such as P0356381[t-May 2013].jpg. It was easy to modify the screensaver to use the current folder name and the tag, so I get a caption such as "Garden Birds May 2013". The biggest job, of course, was going through all the photos adding the appropriate tag.
But it was worth it: now we get an accurate date for each photo, and my wife tells me when I got one wrong. The nice thing is that whatever I do with the file in terms of modifying it, copying it to some device that doesn't properly handle dates, or some other so-far-unforeseen action, I will always have the correct date.
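As a rough illustration of the tag idea, here is a Python sketch of pulling the tag out of a filename and combining it with the folder name. It is not the actual screensaver code; the names `TAG_PATTERN` and `caption_for`, and the example paths, are hypothetical.

```python
import re
from pathlib import PurePath

# Matches a tag such as "[t-May 2013]" anywhere in the file name and
# captures the text after "t-".
TAG_PATTERN = re.compile(r"\[t-([^\]]+)\]")


def caption_for(photo_path):
    """Build a caption from the containing folder name and the filename tag."""
    path = PurePath(photo_path)
    folder = path.parent.name
    match = TAG_PATTERN.search(path.name)
    if match is None:
        # No tag in the name: fall back to the folder name alone.
        return folder
    return f"{folder} {match.group(1)}"


print(caption_for("Photos/Garden Birds/P0356381[t-May 2013].jpg"))
# prints: Garden Birds May 2013
```

Because the caption comes from the name rather than from file system metadata, editing or copying the file leaves it unchanged.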
So, did marital harmony return to our house? Not quite. As my wife pointed out, when you try to view the photos in Media Center (or on any other connected device) they come out in some random order. The default alphanumeric filenames aren't in ascending order by date. So when you add new files to a folder, you have to search all through to find them. Oh dear.
What I should have done, of course, is put the date in the form yyyyMM at the start of the filename. But no problem, I can write a simple utility to rename the files automatically. In fact, I can even get it to add a suitable prefix (such as "201409") by reading the tag in the filename, as well as including an option to automatically generate the suffix tag for the screensaver by using the last-modified date of the file when I run it over new photos as I add them to our collection.
And, purely by luck, I've just finished working on our Cloud Design Patterns guide, which regularly reinforces the need to consider idempotency for operations that may be repeated. In my simple file renaming scenario, the issue is if I run the utility again over files that already have a tag, or both a tag and a prefix. Obviously I don't want more tags and prefixes adding to the filename, so it's vital that the code checks if the filename already contains a tag before it creates one from the last-modified date, and only translates this into a prefix if the filename doesn't already contain one (I haven't got round to implementing any actions that would update a tag or prefix).
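The idempotency check described above might look something like this Python sketch. It only computes the new name and leaves the actual renaming to the caller. The function name `normalized_name`, the underscore after the prefix, and the fallback to the last-modified date when a tag holds something other than a date are all assumptions for illustration, not details of the real utility.

```python
import os
import re
from datetime import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

ANY_TAG = re.compile(r"\[t-[^\]]+\]")                                 # any tag
DATE_TAG = re.compile(r"\[t-(" + "|".join(MONTHS) + r") (\d{4})\]")  # [t-MMM yyyy]
PREFIX = re.compile(r"^\d{6}_")                                       # yyyyMM_ (separator assumed)


def normalized_name(filename, modified):
    """Return the filename with a tag and a yyyyMM prefix, adding each once only."""
    stem, ext = os.path.splitext(filename)
    # Only create a tag from the last-modified date when none exists yet.
    if ANY_TAG.search(stem) is None:
        stem = f"{stem}[t-{MONTHS[modified.month - 1]} {modified.year}]"
    # Only add a prefix when the name does not already start with one.
    if PREFIX.match(stem) is None:
        date_tag = DATE_TAG.search(stem)
        if date_tag:
            year = int(date_tag.group(2))
            month = MONTHS.index(date_tag.group(1)) + 1
        else:
            # Tag is present but is not a date: fall back to the file date.
            year, month = modified.year, modified.month
        stem = f"{year:04d}{month:02d}_{stem}"
    return stem + ext


once = normalized_name("P0356381.jpg", datetime(2013, 5, 20))
twice = normalized_name(once, datetime(2014, 9, 1))
print(once)           # prints: 201305_P0356381[t-May 2013].jpg
print(once == twice)  # prints: True
```

Running it a second time, even with a different last-modified date, returns the same name, which is the property that matters when the utility is run repeatedly over the same folder.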
But after all this effort, I suppose I should have thought out the solution more thoroughly first, and just used the date prefix yyyyMM - and modified the screensaver to use that. But after a few days it occurred to me that I can put anything I like in the tag, not just a date, while the prefix will ensure that the photos still appear in ascending date order. So the effort wasn't totally wasted.
Though, afterwards, someone mentioned that I could just as easily have changed the name of the file to something meaningful and displayed that in the caption...
Well, we finally did it. After many months of redesign, reconsideration, rewrites, and recombination we've let loose on the web our first release of the Cloud Design Patterns guide.
The guide is a combination of design patterns that are especially applicable to cloud-hosted applications and services. It explores the patterns in detail, provides good practice advice for when to use each pattern (and the issues to be aware of), and many have a working code example based on Windows Azure that you can download and play with.
We also included a series of guidance topics that describe specific areas of concern around building applications for the cloud, particularly on how the distributed nature of these kinds of applications has an impact on design and implementation. These guidance topics include messaging, autoscaling, metering, and multiple data center deployment. There's also a series of topics related to distributed data management such as replication and synchronization, partitioning, consistency, and caching.
We had a huge amount of feedback from the product groups, advisors, and customers during the development of the guide. This not only helped us select the most useful and popular patterns and topics, but also ensured that the topics provide good practice advice and cover the edge cases that may not be immediately obvious in cloud applications.
For example, implementing features such as load-levelling and autoscaling requires you to consider many aspects of how this can affect your operations and costs, while partitioning data can have a big impact on maintenance and the performance of queries if you don't plan ahead and choose the appropriate partitioning strategy before you start.
Over time we expect to extend the range of patterns and topics in the guide. Let us know what you think. Or if you have a favorite topic or pattern that you think should be included, send me a note.
"Cloud Design Patterns" from Microsoft patterns & practices is at http://msdn.microsoft.com/en-us/library/dn568099.aspx.