NOTE This post is one in a series on Hadoop for .NET Developers.
Because it offers rapid provisioning with no long-term commitment, the cloud is an excellent place to try your hand at a multi-node Hadoop cluster. If you are an MSDN subscriber, your benefits include access to cloud services, as described here, which you can use to put a cloud-based Hadoop cluster in place. In Azure, Microsoft's cloud, Hadoop is delivered as a service named HDInsight.
Please note that consumption of cloud services incurs fees, and while an MSDN subscription covers these up to a pre-defined amount, you can exceed your allocation and possibly incur expense. For this series of posts, I will assume you are an MSDN subscriber and understand your benefits and liabilities as these relate to the consumption of Azure services.
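To get a rough sense of how quickly a cluster can consume a fixed credit, a back-of-the-envelope calculation helps. The sketch below is only an illustration: the function name and every figure in it (credit amount, node count, hourly rate) are hypothetical placeholders, not actual MSDN benefit amounts or Azure prices, and it ignores storage and data transfer charges.

```python
def hours_until_credit_exhausted(credit, node_count, rate_per_node_hour):
    """Return how many hours a cluster can run before a fixed credit is used up.

    All inputs are assumptions supplied by the caller; this is not tied to
    any real pricing data.
    """
    if node_count <= 0 or rate_per_node_hour <= 0:
        raise ValueError("node_count and rate_per_node_hour must be positive")
    # Total hourly cost is the per-node rate multiplied by the number of nodes.
    return credit / (node_count * rate_per_node_hour)


# Hypothetical figures for illustration only; substitute your own benefit
# amount and the current published rates.
example_hours = hours_until_credit_exhausted(
    credit=100.0, node_count=5, rate_per_node_hour=0.25
)
print(f"Cluster can run for about {example_hours:.0f} hours")
```

Because compute is billed for as long as the cluster exists, deleting a cluster when you are not using it is the simplest way to stretch the allocation.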
NOTE If you are not an MSDN subscriber, you can still gain access to Azure through a trial subscription. Details on this are here.
To get started, you first need to obtain access to the HDInsight service. As of the time of this writing, HDInsight is in preview and available upon request. To check whether or not you currently have access to HDInsight in Azure, do the following:
If the HDInsight item is present, you have access. If it is not, you need to request access. To request access:
Please note that obtaining access can take a while, but once you have it, you can set up an HDInsight cluster. Steps for quickly creating a cluster are found here. I personally prefer to create a custom cluster, which requires me to:
At this point, the provisioning of the services begins. This process can take several minutes to complete. You can view the progress of this step by clicking the DETAILS icon on the Creating Cluster bar at the bottom of the portal page.
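If you would rather script the wait than watch the portal, the general pattern is a polling loop with a timeout. The following is a minimal sketch of that pattern only: `get_state` is a hypothetical callable you would supply yourself, and the "Running" state name is an assumption, not a documented HDInsight value.

```python
import time


def wait_until_ready(get_state, poll_seconds=30.0, timeout_seconds=1800.0,
                     sleep=time.sleep):
    """Poll get_state() until it returns "Running" or the timeout elapses.

    get_state is a caller-supplied (hypothetical) function that returns the
    current cluster state as a string. Returns True once the state is
    "Running" and False if the timeout is reached first.
    """
    waited = 0.0
    while True:
        if get_state() == "Running":  # assumed name of the ready state
            return True
        if waited >= timeout_seconds:
            return False
        # Pause between checks so the status source is not hammered.
        sleep(poll_seconds)
        waited += poll_seconds


# Usage with a stand-in status source that becomes ready on the third check.
fake_states = iter(["Creating", "Creating", "Running"])
ready = wait_until_ready(lambda: next(fake_states), poll_seconds=0.0)
print("Cluster ready:", ready)
```

The `sleep` parameter is injectable so the loop can be exercised without real delays; in practice you would leave it at its default.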
Once the process is complete, you have a working HDInsight cluster. The cluster consists of data nodes, a name node, and an associated storage account delivered through the Azure Storage service. The portal will show you the HDInsight cluster as soon as provisioning is completed, but to see the storage account, you may need to click the HOME link at the top of the portal and then click PORTAL from the Azure homepage to return to the portal page. The storage account should now be visible.
Once you are done with your HDInsight Azure cluster, you can delete it by returning to the Azure portal, locating the cluster, and clicking the associated DELETE icon. If the storage account associated with the cluster is not used for other purposes and you do not wish to access it further, you can delete it as well.