Small Bites of Big Data

Cindy Gross, SQLCAT PM

With the Hadoop on Azure CTP, a Hadoop cluster expires a few days after you create it, which frees up the resources for other CTP users. As a result, I usually create a new Hadoop cluster each time I do a demo or test, and there are a few settings that are easy to forget. For example, today I spun up a cluster and tried to use the Hive Pane add-in from Excel. I entered the connection information and hit the “OK” button.

Instead of seeing the expected option to choose the Hive table I saw this error:

Text version:

Error connecting to Hive server. Details:

SQL_ERROR Failed to connect to (cgross.cloudapp.net:10000): Could not connect client socket. Details: <Host: cgross.cloudapp.net Port: 10000> endpoint: 168.62.107.55, error: timed_out

A connection timeout like this can have many causes, but I’ve done this enough times to realize immediately that I never opened the ODBC Server port on this particular cluster. I opened port 10000 on the Hadoop cluster via the “Open Ports” tile and tried again. Success! I can now query my Hive data!
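
If you want to rule the port in or out before going back to Excel, you can test the TCP connection directly. Here is a minimal sketch in Python; the function name `port_is_open` and the 5 second timeout are my own choices for illustration, not part of the Hive add-in or the Hadoop on Azure portal.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Timeouts, refused connections, and DNS failures all end up here.
        return False
```

Calling `port_is_open("cgross.cloudapp.net", 10000)` with your own cluster name in place of mine should return False while the port is still closed and True once it has been opened. This only tells you that something is accepting TCP connections on the port, not that Hive itself is healthy.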

I hope you’ve enjoyed this small bite of big data! Look for more blog posts soon on the samples and other activities.

Note: the CTP and TAP programs are available for a limited time. Details of the usage and the availability of the CTP may change rapidly.

Other Small Bites of Big Data: http://blogs.msdn.com/b/cindygross/archive/tags/hadoop/