Random Disconnected Diatribes of a p&p Documentation Engineer
So my Windows 8 adventure has been terminated after only a brief foray into the delights of the new O/S. It's annual review time here at Microsoft, which means I need to connect to the corporate network. But my Windows 8 machine can't do that because the TPM module is faulty, and I need to have BitLocker enabled before they'll allow me to talk to the big iron in Seattle. So the old hard disk with Vista has seen the light of day again (or, to be more accurate, the dark of inside my laptop) and I'm back in 2006.
I never really noticed how little I really need to use the corporate network now that most of the applications I use in my daily work are in the cloud. I use our corporate ADFS for federated authentication, allowing me to access all our working docs stored in the TFS service in the cloud, and to connect to the various third party sites that manage our internal processes. And because my daily work is centered on Windows Azure, all of the working sets we use are available without needing even a sniff of the internal systems on the corporate network.
Even my online storage and email are cloud-powered now, and I'm being urged to make more use of cloud-based systems such as Office 365 in my daily work. It's quite amazing to see how the cloud is creeping, almost unnoticed, into everything connected with our IT world. It's a real vindication of what we've been writing here in p&p about claims-based authentication, moving applications to the cloud, and building enterprise solutions in Windows Azure.
Of course, I still have Windows 8 on my RT tablet, so I'm not completely divorced from 2013. OK, so it can't connect to my email server or the corporate network, but it means I can continue to figure out how to do stuff with the new O/S. Though sometimes I still look like an amateur. I used the rather good camera to take some photos this week to email to a colleague. However, having poked about to find where they end up being stored, and then got one showing full screen, every attempt to share it through Hotmail gave an error that my email wasn't set up correctly.
And then I couldn't figure out how to go back. There's no back button until you poke the screen. Then it took several minutes of wildly experimental prodding and sliding to get my email inbox and the photo showing side by side, and then there seemed no way to drag the photo into an email. Maybe I need to read the instructions. In the end I dropped into the desktop and did it the old fashioned way. I love the style of Windows 8 and the way that you can do lots of things by poking and sliding, but it really doesn't seem intuitive sometimes - or maybe it's just too clever.
Of course, the more I use it the easier all this will be. Except that I discovered a major problem now that summer is almost here. It's pretty much unusable in the conservatory unless I huddle under an overcoat like somebody selling bootleg watches. The reflectivity of the screen means that all I can see is my ugly mug and the sky (complete with clouds, so it looks a bit like Windows XP desktop). Though I guess this is an issue with all touch-screen devices. I have to go indoors to be able to see the screen on my phone.
What's worrying is that my new company laptop, when it finally arrives, will have a touch screen. Perhaps I'll never see the garden and conservatory again. I'll need to lock myself away in a dark room, or work covered with a sheet like some character from a third rate horror movie. My wife can just lift one edge and slide my meals underneath, and tell visitors that her husband is in the conservatory under a sheet, stored away like some item of old furniture (though maybe that's not so far from the truth).
Can I buy a non-reflective cover for a laptop touch-screen?
It's safe to assume that nobody could accuse me of being an eco-warrior. I buy cars that have more engine than I need, and computers with more power than is required to run any software I might ever use. And I quite happily squander electricity on a waterfall in the pond and lights in the trees, just to make the garden look nice. The trouble is that the electricity company seems to think that I should pay an increasingly exorbitant price for it.
We've all seen those TV programs where people build eco-friendly houses that generate their own electricity, collect and re-use rainwater, and suck heat out of the ground instead of paying for gas. Meanwhile the government is gradually covering the entire countryside with huge windmills and solar farms in an attempt to meet some green target, yet all this free electricity just seems to cost more every month. The electricity company just sent me an estimate of my next year's bill, and they reckon it will be over 1,500 pounds. It's time I figured out how to either use less, or pay less, or even get some for free.
Working on the "no free lunch" principle, you'd guess that the only people who would benefit from the current fad around solar panels on the roof would be the installers and panel manufacturers. However, talking to some neighbors who have taken the plunge, they have seen a considerable reduction in electricity bills and get a payment for feed-in four times a year as well. So it's probably time I took the plunge and turned our south-facing roof into a miniature power station. At least it should generate enough to run my servers for a few months in the summer...
But what you have to wonder about is why, in a country that will supposedly be unable to keep the lights on beyond 2016, we are planning to throw huge amounts of money we don't have at a project that will eat electricity and blight thousands of people's lives. Nobody can produce a realistic rationalization for it, yet it will probably cost the best part of a hundred billion pounds by the time it's done.
At a time when the world is moving towards driverless cars, all-encompassing digital communication, increasing home working and localization, and the need to lead a green lifestyle, we're going to build a high-speed railway to connect the North and the Midlands with London. Yet we can't afford to build a high-speed broadband fixed and mobile system that would cost a tiny fraction of that amount.
OK, so I'm lucky because we happen to have cable here, but half a mile away our local post office is struggling to use modern retail technology over a 1MB ADSL line. My ADSL provider just sent me a beautiful color leaflet explaining how I will soon be able to watch loads of new sports channels over their Internet connection. At the bottom in very small writing it says that all I need is a 2MB ADSL line. Yes please, when can I have one? What's that? You're planning to have fibre-to-the-cabinet installed sometime in 2015? Super.
In the meantime I suppose I can just go to London on the high-speed train and watch it live instead...
I suppose it's a bit like that film The Matrix - you realize that you live in an ethereal and closed world only when you actually get to step outside of it. Or, like some people who have never been to another country, your view of the rest of the world is shaped just by what you see on TV. I guess I've been like that with open source stuff, and particularly Java; looking out incredulously from my little village of Microsoft technologies and products at the wide world beyond.
I started my computing days with what we euphemistically called a "home computer" (basically a games console with a keyboard), and progressed via a series of Amstrad computers to real PCs. At the time I was doing statistical and reporting work for a large manufacturing company, and had played with several MSDOS-based databases until I finally found nirvana with Windows 3.1 and Microsoft Access 1.0.
OK, so I'd learned a lot of computing theory in the meantime (such as a mixture of languages and programming theory), and I'd even written and sold technical, commercial, and business software. But most of it, especially after drifting deeper into the Windows way, was aimed at Microsoft operating systems and integration with Microsoft products. Gradually I'd been drawn in and captured by the Redmond magic.
It's only since I began work on our current HDInsight project that I've had to navigate deeper into the dark and scary jungles of open source; slashing away at the undergrowth of bewildering terminology with my virtual machete (Bing); wading knee-deep through murky and meandering streams of sometimes conflicting advice and guidance; and peering in amazement at the vast array of previously undiscovered wonders of nature such as ants, pigs, hives, zebra, and even a strange yellow elephant.
Sitting inside my nice comfy and well-defined Microsoft technology world, it's un-nerving to realize that until recently I never knew that all of this even existed. OK, so I've had dealings with Java, though mainly only under duress, and I've read about and even learnt a bit of other languages and frameworks such as Python and Ruby. In fact my first serious attempts at creating Windows DLLs were with Pascal (mainly because I could never get my head round C++). So Java code itself isn't really an issue.
No, where it all gets complicated is that almost all of the docs I read about Hadoop and the associated technologies are written by experts for experts. It seems like you need to know all about a whole range of topics and technologies before you can start learning about them. It's a bit like letting someone watch a medical drama series on TV and then giving them a scalpel, an operating theatre, and some (currently) live patients to practice on.
For example, I read endless articles about testing and debugging. It seems I should start by mocking out my objects (makes sense) and use a test runner to execute them within a single node local installation of Hadoop and with the Java virtual machine in advanced debug mode (I think). So I read about PowerMock, but it says I should use it with Mockito, but that says it's an extension of EasyMock. And I probably want to do it all from within Eclipse.
And to set up a local cluster I need to install Hadoop directly, although I have the HDInsight single node development environment already installed on my laptop. Can I use that instead? And if I want to run the Java VM in debug mode I probably want to use another program called ant to configure it. I'm sure I'll work all this out in time, though at the moment I'm wondering if it's going to be easier just to phone a friend (or ask the audience).
What's clear, however, is just how adaptable, configurable, and interoperable all this stuff is. I suppose I'm used to a strict product hierarchy and road map, nice dialog boxes and configuration web pages, and reams of beginner documentation for the Microsoft technologies that make up the confines of my own little world. Now I've strayed outside of the Microsoft Matrix, and discovered a whole wide world out there, I can't find a map. I'll probably spend the rest of my writing days blundering through the undergrowth, peering hopefully at every new edifice of civilization I can find, until - hopefully - it all starts to fall into place. Or until some kind soul (aka Program Manager) taps me on the shoulder and mutters the equivalent of "Dr. Livingstone I presume."
Yes, the adventure is fun at the moment, but I'm quite looking forward to going home again...
So this week we had another digital failure in our household. At least I'm happy about the fact it was much less embarrassing than some other technical discrepancies that seem to have befallen the worlds of science and engineering recently.
The rather expensive Soundbridge Internet radio I purchased a few years ago, in response to my wife's requirement for more "proper" rock music stations in a location where Digital Audio Broadcasting (DAB) cannot reach, has curled up and died. And it did so in the usual disappointing digital way - no puff of smoke, loud bang, or any external sign that something had gone wrong. It just refuses to turn on or do anything at all. No doubt following the modern failure paradigm for electrical goods that I've described before (see Why Doesn't Stuff Go Bang Any More?).
Of course, my local disaster doesn't by any stretch of the imagination compare to that of Queensland University's Pitch Drop Embarrassment. This fascinating experiment, complete with live webcam view, is now so famous that it got a mention not only on TheRegister.co.uk, but also a half-page spread in one of our national newspapers - which happened to mention that the last time anything exciting happened, several years ago and unseen by any human eye, they rushed to watch the video and only discovered at that point the webcam had failed. I notice in the pictures that they now have three webcams watching for the burst of activity that's confidently expected to occur sometime this year.
But a faulty webcam can't get near to competing with the issue at Sweden's Ringhals power station last year, where they could have saved a lot of money just by installing a webcam. It seems that someone accidentally left a vacuum cleaner inside one of the containment vessels, and nobody noticed when they started a pressure test. The resulting fire reportedly caused hundreds of millions of dollars of damage. At least the boss of the plant, Peter Gango, was able to quickly pinpoint the root cause of the event by telling reporters that "those items aren't supposed to be left in the containment when testing."
And another nice use of language to disguise webcam-related disasters has come to my notice recently. It seems that, before being abandoned altogether a couple of years ago, the wandering camera-equipped Spirit rover NASA sent to Mars had become permanently stuck in a sand drift. At which point it was re-designated from a "rover" to a "stationary research station." Maybe I should just re-designate my Soundbridge radio as a "non-audible listening device."
But instead I've taken the plunge and replaced the Soundbridge with an equivalent from Roberts Radio. It actually seems to work better than the Soundbridge (which doesn't appear to be available any longer), and has some neat features such as integration with Last.fm, and it even connects to a UPnP stream exposed by Windows Media Player. Though where we live it can't manage to drag a usable FM station out of the ether, and only finds a single DAB channel with enough meter bars to be usable.
Probably I should save money by avoiding all these high-tech devices, and use the cash to move house. Somewhere that gets radio and TV reception without needing huge aerials and signal boosters. Maybe somewhere like this...
You have to wonder whether the increasing use of tablet computers and touchscreens means we'll soon be back to the equivalent of a world that depends on stone axes and making fire by rubbing two sticks together. At the moment I'm doing my utmost to hang on to some semblance of advanced device interaction technique but there's a good chance that, in time, I'll also succumb.
This musing began when I noticed the gradual change in proddy-finger technique used by my wife with her new Nexus tablet. Previously, her frantic interactive facebooking and emailing sessions were rudely interrupted every few hours by the fact that the text on the screen became unreadable through a layer of greasy finger marks.
However, the fancy cover she bought to protect the device came with a neat pen-shaped, rubber-tipped stylus, and within days she'd become completely dependent on this. Now the screen is pristine after even the heaviest sessions of online social interaction. She tells me it's not only easier than using your finger, but more accurate and faster as well. And I have to admit that, after trying it out on my Surface RT, I can only agree.
But here's the thing. While I'm not the fastest or most accurate typist, I do manage to employ several fingers most of the time, and even a thumb or two for spaces now and then. And I can do it quite easily with the onscreen touch keyboard (in fact I'm doing it right now). However, watching my stylus-converted wife I realized that she was back in the world of one-fingered pecking using the equivalent of a pointed stick, rather than actually typing.
I suppose you could use a combination of fingers and stylus in the appropriate places, but that doesn't solve the greasy finger problem. Maybe the answer is gloves that have rubber tips of the correct flesh-matched consistency on all the fingers. Or just keep some wet wipes handy. Perhaps somebody already makes a cover for popular models of tablet computer that has a special holder for a packet of wet wipes.
Of course you could apply the "horses for courses" argument and say that some tasks should be carried out on a tablet, and others only on a real computer. During a recent discussion about applying Microsoft's Accessibility Standard (for example, you can't say "right-click" because there might not be a mouse) to a Hands-on Lab document we are creating, a colleague suggested that "nobody in their right mind would use a touch-screen device to run Visual Studio." OK, so basically I have to agree that writing programs in VS on a 7" tablet wouldn't be my idea of fun.
But many new laptops and convertible devices have a proper keyboard, a mouse track pad, and a touch screen. So I could just as easily be tempted into some proddy-finger action after typing a lambda expression, rather than reaching for the mouse. The comment I'm already hearing from converted users of convertible devices is that it's a real shock going back to a computer where finger-on-screen action results only in greasy fingerprints. Jabbing at onscreen buttons with an index finger is much quicker than grabbing a plastic desk-bound rodent, or scratching around on a track pad to find where you left the mouse-pointer last time - and then manoeuvring it around the screen.
And maybe this transition to touch-screen interaction is becoming more obvious through its impact on the industry as a whole. I recently read that Logitech, best known for its keyboards and mice, went from a profit of $37m a year ago to a loss of $24m last year. That's a lot of unsold mice. Though it's likely that the difference was also caused by a reduction in sales of other traditional accessories that we no longer seem to need.
For example, instead of a monitor riser stand we now crouch uncomfortably over the tiny screen of a desk-located or knee-bound laptop or tablet. We don't need an ergonomically designed keyboard with soft-touch keys any more, we just get finger-impact injuries and stiff shoulder muscles. No requirement for a carefully designed mouse means additional wear on elbow joints as we scroll and point all around the screen. And the lack of a cushioned wrist rest is certain to speed the development of RSI.
Of course, evolution will soon resolve these problems for us. In only a few thousand years the successful members of the human race will have developed a long cranked neck, thin pointy grease-free fingers, and even a much larger nose to support our Internet-enabled glasses.
Those of us with small noses and fat fingers who fail to evolve will, of course, be easy to identify. We'll be the ones searching Amazon for sharp stones and abrasive sticks...
So I'm still fighting to keep Media Center as the main TV system here at chez Derbyshire, despite the veritable aggro it seems to spew out at regular intervals. On top of the increasing vagueness of the program categorization in the guide (it seems to think that "The Only Way Is Essex" is a documentary), the habit of the TV tuners to go walkabout when asked to do three things at once causes huge disruption to my wife's carefully crafted soap recording schedule.
OK, so maybe the hardware is aging a little, and the TV tuner card isn't the best in the world, but I've yet to find a decent compact and quiet Media Center box to replace it. So once again this week I've been plying it with more custom kludge software in an attempt to stay ahead of the aggravation curve. So, yes, it's yet another post about how to manage Media Center...
I've tried using services that regularly reboot the machine on a fixed schedule, or that just restart hardware drivers and services, but none seemed to be the right solution. Both tend to fire up at inappropriate times (usually in the middle of a recording or when you're trying to watch something it did manage to record), and so I finally decided that a notification mechanism was the optimum answer.
Failures are not that common anyway, occurring generally a couple of times a month, so rebooting every night is overkill while rebooting once a week means an error might go on for five or six days (or until you reboot manually). I considered a system of restarting the computer only when an error is detected, but then you're back to the rebooting at inappropriate times problem - often it will continue to record most things when only one tuner has stopped working.
And, anyway, how do I detect when an error has occurred? The obvious way is to look in the Windows Event Logs. The log named Media Center (in the Applications and Services section) helpfully contains Error events when a tuner fails, and meaningful Warning events when a program fails to be recorded correctly. And I already have a utility that can read event logs, so I'm half way there.
So I set to and created a modified version of the Server Monitor service I use to monitor my main network servers by removing all of the unnecessary crud and adding a capability to specify three pairs of values for the event source name and the partial text of an event message that the service will search for in the logs. The service already supports running scripts, external programs, and sending email messages. It also writes its own events to the Application log so that you can set up Scheduled Tasks that run in the context of a specified user account (the service needs to run under the Local System account). This means that you can use batch files kicked off by an event-driven Scheduled Task to do things such as closing and restarting the Media Center UI.
Initially I played with firing off a script from the service to reboot the computer using the shutdown command, but (as noted above) it's not an ideal solution unless you need to be away from the house for a while and you need Media Center to look after itself. Instead I set it up to search twice a day for new events in the Media Center log with the source Recording or ehRecvr, and containing things like "failure" and "not recorded" in the error message, and to send an email listing all of the new warnings and errors.
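If you want a feel for the matching logic without digging through the service source, here's a minimal sketch in Python (not the language the service itself is written in). It assumes the events have already been read from the Media Center log into simple records; the LogEvent class, the WATCH_PAIRS list, and the function names are all made up for illustration, and the pairing of sources with text fragments is just one plausible arrangement of the values mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    """Hypothetical stand-in for one entry already read from the event log."""
    source: str
    level: str      # e.g. "Information", "Warning", "Error"
    message: str


# Assumed (event source, partial message text) pairs to watch for.
WATCH_PAIRS = [
    ("Recording", "not recorded"),
    ("ehRecvr", "failure"),
]


def matches(event, pairs=WATCH_PAIRS):
    """True if the event's source and message match any watched pair."""
    text = event.message.lower()
    return any(
        event.source.lower() == source.lower() and fragment.lower() in text
        for source, fragment in pairs
    )


def select_problem_events(events, pairs=WATCH_PAIRS):
    """Keep only warnings and errors that match one of the watched pairs."""
    return [
        e for e in events
        if e.level in ("Warning", "Error") and matches(e, pairs)
    ]
```

The comparison is case-insensitive on purpose, so a message saying "Failure" is still picked up by the fragment "failure".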
You'll need an SMTP server to actually send the emails. I set up my internal IIS 6.0 SMTP Service that sends emails from the various services and devices on my network (such as UPSs, the NAS, and the server monitor services) to allow relaying for the Media Center machine, or you can probably relay through your own email provider (such as Hotmail) instead.
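For the curious, sending this kind of notification through a relaying SMTP server takes only a few lines in most languages. The sketch below uses Python's standard smtplib and email modules rather than the code in my service; the host name, port, addresses, and function names are placeholders I've invented, and it assumes a server that accepts unauthenticated relay from the sending machine.

```python
import smtplib
from email.message import EmailMessage


def build_alert(lines, sender, recipient):
    """Build a plain-text message listing the new warnings and errors."""
    msg = EmailMessage()
    msg["Subject"] = f"Media Center: {len(lines)} new warning(s) or error(s)"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(lines))
    return msg


def send_alert(msg, host="smtp.example.local", port=25):
    """Hand the message to an SMTP relay (host and port are placeholders)."""
    with smtplib.SMTP(host, port, timeout=30) as server:
        server.send_message(msg)
```

Relaying through a public email provider would normally also need TLS and a login, which this sketch leaves out.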
So far it's worked well. You can look at the recording history within the Media Center ten-foot interface and see if there really is a problem, and reboot from the Settings section of the home screen when required. All you need is a smartphone or tablet by your side that can receive emails, and you'll never need to leave your armchair again.
However, deleting the history entries in Media Center doesn't seem to delete them from the Media Center log file - I guess that Media Center just listens for new events and stores them in its own database. To prevent being repeatedly reminded of old events after restarting the computer I added a configuration setting to the service that can be set to force it to ignore any existing events. Alternatively, you can simply clear the Media Center log when you reboot the machine.
If you want to access Windows Event Logs from a remote machine, such as one that has administrative privileges on a Media Center box that's running under a user account, you'll probably need to enable the inbound Windows Firewall rules for "Remote Event Log Management" (click the Advanced link in the Windows Firewall dialog or open "Windows Firewall with Advanced Security" from the Start menu).
You can download the service I created for free, including the Visual Studio 2010 source code, from here if you want to give it a try - and maybe even adapt it to suit your own requirements. There are no limitations on how you use the code, and there are loads of configuration settings so it might be useful in other scenarios as well.
But be ready for your partner to suddenly mention in the middle of a conversation that they "just got an email from the TV"...
One of the great things about being a technology pessimist is that you don't suffer that sinking feeling when something doesn't work as expected. And, of course, you get to experience a nice ripple of surprised elation when stuff actually does work as it's supposed to. This week I've experienced a roughly equal mixture of both.
I've rambled on in the past about the rather nice Dell Latitude E4300 that's been my main portable technology plaything over the last four years or so. It's got 4 gig of memory, a reasonably fast CPU, plenty of ports for my legacy peripherals, a superb matte screen with LED backlight, and all of this lives in a rather attractive metallic red case. And it's amazingly light and compact as well – easily the best laptop I've ever owned.
However, a constant stream of bleeding-edge pre-release software and myriad updates to the underlying Vista O/S (yes, I'm still a satisfied Vista user on some old kit) has taken its toll. Several things don't work very well, and it seems to spend the first half an hour of every day trying to collect all the wayward ones and zeros into some semblance of order - instead of doing anything useful such as opening my email inbox.
So I decided to experiment with the shiny new Windows 8 to see if I can live with it on a non-touch screen. I've been using it on the Surface RT that arrived a couple of weeks ago and I actually quite like it. In fact I'm writing this post on it right now. I'm also due to receive a super-duper touchy-feely ASUS laptop from work any day now, which will be great for collecting greasy fingerprints - and even for connecting to the corporate big iron when I need to do some internally-connected stuff.
So why not see if I can frighten the old Dell back to life by installing a scary new O/S? Or, even better, see if I can give it a whole new lease of life, like I did by upgrading an old XPS laptop to Windows 7 some while back. Mind you, there seemed no point in putting all this effort into it when the hard drive was probably one of the root causes of the arthritic performance, so before anything else I ordered a 128 gig SSD to replace it. The nice people at Crucial even do a proper upgrade kit for the E4300, so the hardware installation bit was only a ten minute job – including blowing the dust out of the innards after I took all the wrong panels off before I read the instructions.
After that installing Windows 8 was a breeze, and the change in performance is startling. It boots in less than ten seconds, and loads applications like they were already running in another window. Amazing. I've always been a bit reticent about SSDs after reading reports about doubtful reliability with the early ones, but it seems that even some parts of Windows Azure datacenters use them to maximize performance, so I guess they figured out how to make them more reliable.
It wasn't long before I'd got Office 2013, Visual Studio 2012, and all the other bits and pieces I use for my daily bread-winning tasks installed as well, and by now I was almost in ecstasy as the constant ripples of surprised elation overwhelmed my technological pessimism. Until I remembered why I'd never upgraded this machine to Windows 7 before – anything later than Vista must have BitLocker turned on to be allowed onto the corporate network.
But after some firkling in the BIOS I discovered a previously undiscovered TPM chip, and figured that maybe I could get BitLocker to work on this machine. It turned on OK, and quite happily encrypted my drive. However, every third reboot produced an error that "a compatible TPM module was not found" and I had to enter the recovery key (which is 48 characters long) to get it working. By teatime I'd decided that enough was enough, and turned BitLocker off again. It only took three hours to decrypt the drive, but at least I still had a working setup.
So it seems that this machine is destined to never see CorpNet again, even if its memory will live long in the Active Directory Users and Computers list as a tribute to many years' faithful service. I just hope the new laptop arrives before I need to get access again, though I suppose I can just drop the old hard disk with Vista on it into the Dell if needs be.
And because I won't be joining it back to the corporate domain, I decided I might as well join it to my own domain here in my remote home/office. That way it will pick up all the domain rules and configuration settings, including using my own Windows Server Update Services (WSUS) server instead of going out to Windows Update each time.
And, just to balance out the pessimism-based elation/disappointment ratio, all I got from the Windows Update dialog was the incredibly useful "Error 800b0001" message. A brief Binging revealed that the WSUS server needed an application of the KB 2720211 patch, which several people reckoned would fix the problem. Oh no, it doesn't. You also need the KB 2734608 patch, which openly advertises that it fixes the problem with talking to the new "hardened" Windows Update client in Windows 8.
So at last I have what feels and looks like a new laptop, and all I need to do now is learn the keyboard equivalents of the proddy-finger actions. I quickly figured that Windows key + Q gets you to the nearest thing to a Start menu - the Search box where you can find and start all your regular programs. And I pinned the Control Panel widget to my Start (Home?) screen, along with all the applications I use most of the time, just to make life easier.
You even get a new game in Windows 8 - shuffling the start screen tiles around to get them in the order you want is just like playing one of those old picture tile slider puzzles. You have to plan four or five moves ahead or they all shuffle around randomly when you come to move the last one, and you have to start all over again.
And I guess that, if by some remote chance you're still here, you'll now be wondering about the title of this post. What I can't figure is why, when it's set to automatically install updates, my WSUS server didn't automatically install the updates that make it work with Windows 8. It turns out that they aren't actually on Windows Update. They are optional service updates that can prevent WSUS from distributing any custom updates you might have created. But it would have been nice to have been warned about it. Seems you only find out when you have the same problem as me, and search the web for an answer. Or if you are interested enough in WSUS to sign up to some mailing list.
But perhaps it's a good thing that everything didn't go right at the first attempt. I'm not sure my constitution could cope with an excess of surprised elation ripples at my time of life...
Why would you want to use a "Big Data" solution? It's a question that we've been trying to answer in the first chapter of our forthcoming p&p guide to Windows Azure HDInsight. For a long while, everything we found on the web and in the original HDInsight docs on the website talked just about the volume and unstructured nature of the data as the justification.
Meanwhile the docs and presentations from Hortonworks (who created the Hadoop implementation behind HDInsight), and existing books about Big Data, all have comparison tables for a relational database and Hadoop/HDFS. And all of these concentrate on the differences in data volume, disk write speed, handling unstructured data formats, the point at which a schema is applied to the data, and how query processing is distributed.
OK, so this is valid and useful information, but it all comes back to the suggestion that you should choose a Big Data solution such as HDInsight only when a relational database just can't cut it. You know how it is - your boss just sent you 20 Petabytes of data as text strings and he's happy to wait till next Tuesday for an answer to his question. This probably happens to lots of people every week.
But when you really start to think about it, and talk to people who actually know about enterprise databases, data warehouse systems, and BI (Hi, Graeme) you discover that almost anything real people want to do is most likely perfectly possible using the kit that's already running in their datacenter. SQL Server Parallel Data Warehouse (PDW) can cope with Terabytes of data, which realistically is all that most people will have. Let's face it, the total size of the census data for the whole of the U.S. is only a few hundred GB.
And the fact that Hadoop "moves the processing to the data instead of moving the data to the database engine" doesn't mean that much if the database has fibre connections to the data store. Yes, the fact that Hadoop does it as distributed parallel tasks might speed up queries, but PDW is built to do that, and there's typically no shortage of cores in modern processors. I'll accept that you need some serious hardware for SQL Server and PDW if you are doing anything more than pretending to be a DBA, but for most organizations that already do proper BI this is pretty much the case.
So, like many people, I was starting to wonder if this Big Data thing was just another fad that would be gone in a year's time. Is it actually "Big Hype", or even an updated version of the famous old IBM misquote: "I think there is a world market for maybe five Big Data systems."
That was until I watched a presentation by Microsoft Technical Fellow Dave Campbell. The point he made is that it's not really about any of the comparisons related to volume, or structure, or parallel processing, or distributed storage. These are just technical details. What it's really about is getting insights from data. Which is probably why they called it HDInsight.
When your mind is wandering while you're in the shower or eating your cornflakes, and you suddenly get hit by inspiration about some question you might be able to answer by querying all that data you keep collecting, you've discovered what Big Data is all about. If you need to go and see your data architect, DBA, or data steward to implement your inspiration they'll tell you it will take two weeks to design the query, a week to update the data models, three days to cleanse and validate the source data, and a day to set up the report. And it's perfectly possible that, after all this, the report won't show anything useful. Or it will show that you should have asked a different question.
If you are a Douglas Adams fan, no doubt you are already mumbling "Deep Thought" to yourself.
In the presentation Dave talked about how you simply fire up a new cluster in HDInsight, load the data, and ask the question. If the answer is useful, then you know what query you need to get your DBA to create in your data warehouse. If the answer isn't useful, but suggests that a different question might be interesting, then you go ahead and ask that question. And if there are no questions that provide useful answers, then the only thing you've lost is a few hours of your time and the hourly cost of the "pay only for what you use" Windows Azure HDInsight cluster.
And it's fairly safe to assume there are more than five organizations in the world that would find this useful...