    Microsoft HDInsight Installation & Dependency Management


    It’s a rainy Saturday afternoon here in Seattle, and the kids are keeping themselves busy running around the Christmas tree, so I’ve got a little time to put together a post that addresses some questions that have come up a few times in the forums as well as in our internal discussion aliases for our on-premises install of HDInsight.

    We currently use the Web Platform Installer to take care of dependency management of the installation. 

    It’s important to point out that there are actually two key pieces that get installed: the Hortonworks Data Platform installer and the Microsoft HDInsight installer.

    The second has a dependency on the first.  The WebPI feed also contains a number of pre-requisites required to set up IIS and a few other things for the HDInsight dashboard.  Here’s what it looks like on a completely fresh Windows Server 2012 machine.  Most developer machines likely have some or most of the IIS pre-reqs installed.  We’re also working to clean up some of this to minimize installation & setup.
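    As a rough illustration of that dependency, here is a minimal sketch that derives an install order and a removal order from a dependency map. It assumes the two pieces are the Hortonworks Data Platform and Microsoft HDInsight packages described later in this post, with the Microsoft package depending on the Hortonworks one. The function names are made up for the example and this is not how WebPI resolves its feed.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumption: package names are taken from this post, and the only
# dependency edge is "Microsoft HDInsight depends on Hortonworks Data Platform".
DEPENDS_ON = {
    "Hortonworks Data Platform": set(),
    "Microsoft HDInsight": {"Hortonworks Data Platform"},
}


def install_order(depends_on):
    """Return packages so that every dependency precedes its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


def uninstall_order(depends_on):
    """Removal runs in the opposite direction: dependents go first."""
    return list(reversed(install_order(depends_on)))


if __name__ == "__main__":
    print("install:  ", install_order(DEPENDS_ON))
    print("uninstall:", uninstall_order(DEPENDS_ON))
```

    Reversing the install order is what puts the Microsoft package ahead of the Hortonworks one when removing things.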



    Let’s talk a little bit about what’s in each one.

    Hortonworks Data Platform installer

    This msi includes the core Hadoop bits (Map/Reduce, HDFS), as well as a number of other Apache projects in the Hadoop ecosystem.  The full list included in the current installer is:

    • Map Reduce
    • HDFS
    • Hive
    • Pig
    • HCatalog

    Each of these projects is packaged into a zip file that contains a PowerShell script that automates the installation and setup of the component.  There are more components in Hortonworks Data Platform, and the teams are working to get these packaged and included.
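    If you are curious what one of these component zips carries, a small sketch like the following lists the PowerShell scripts inside an archive. The file names in the demo are hypothetical placeholders built in memory; the actual layout of the zips shipped in the msi is not documented in this post.

```python
import io
import zipfile


def find_install_scripts(zip_source):
    """List the PowerShell scripts (*.ps1) inside a component zip.

    zip_source may be a path on disk or a file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".ps1")
        )


if __name__ == "__main__":
    # Build a throwaway archive in memory; these entry names are
    # hypothetical and only stand in for a real component package.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as demo:
        demo.writestr("component/install.ps1", "# placeholder script")
        demo.writestr("component/readme.txt", "placeholder")
    buffer.seek(0)
    print(find_install_scripts(buffer))
```

    Pointing the same function at a real zip path on disk would show which setup scripts that component ships with.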

    Microsoft HDInsight Installer

    This msi contains bits that are Microsoft specific, and may also contain additional Hadoop projects.  The current install (as of today) contains:

    • HDInsight dashboard
    • Sqoop
    • Isotope.js
    • Getting Started content

    These are packaged the same way as the Hadoop projects above.  Additionally, there is an installation PowerShell script here which will do some initialization of the single node installer, such as starting the services for the Hadoop components.

    Alternate Approaches

    The team discussed a number of potential factorings, and we very much welcome feedback here.  A few ideas that we’ve thought about:

    • Stable and Experimental packages.  This would allow us to set expectations around quality and stability of the bits. 
    • Decomposing every project into an individual msi
    • Integrating and building a Chocolatey package for these

    What Does This Mean For Me?

    What this means is that when you install HDInsight out of WebPI, you are installing two different msis.  We are revving the Microsoft msi every two weeks to pull in bug fixes (and, very shortly, to include some experimental features).  The Hortonworks msi will be revved on a different schedule, as the team there decides to release an update.  We are partnering closely with that team, and we will coordinate releases so that the combined installation will always work.

    More directly, this implies that if you want to uninstall completely, you will need to uninstall both packages from Add/Remove Programs.


    This also means that when we issue an update for the Microsoft HDInsight package, you don’t have to “lose” your cluster by uninstalling both products.  You should be able to simply uninstall & update the Microsoft HDInsight package. 

    The team would love to get more feedback on this approach, so, let us know what you think!

    Updating HDInsight Preview


    Today we’ve shipped an update to the single node HDInsight Server Preview that is installable via Web Platform Installer.  

    What Are We Doing?

    Every two weeks, we’re going to take a snapshot of the work in progress and update the public installer with this (provided it passes a basic level of validation).  This gives us the opportunity to rapidly get new bits in front of customers, experiment with new features, and address bugs in a timely fashion. 

    We’ve also snapped versions such that releases of the SDK as well as the One-Box installer will all share the same version.  As such, we’ve versioned the msi as

    How do I Get This?

    Simply install HDInsight out of WebPI, either onto a new machine or after uninstalling the previous version (in-place upgrade is not currently supported).  To completely uninstall previous versions, you will want to first uninstall “Microsoft HDInsight Community Technology Preview” and then the “Hortonworks Data Platform”.  Data in HDFS is not preserved across an uninstall/reinstall, so archive it prior to uninstalling.
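    One way to archive HDFS data before uninstalling is to copy it to the local file system with `hadoop fs -copyToLocal`. The sketch below only builds and prints that command by default. The HDFS path and local directory are hypothetical examples, and it assumes a `hadoop` executable can be found on your PATH (on Windows you may need to run from the Hadoop command prompt or adjust the executable name).

```python
import subprocess


def build_archive_command(hdfs_path, local_dir):
    """Build the `hadoop fs -copyToLocal` command as an argument list."""
    return ["hadoop", "fs", "-copyToLocal", hdfs_path, local_dir]


def archive_hdfs(hdfs_path, local_dir, dry_run=True):
    """Copy an HDFS path to a local directory, or just show the command.

    With dry_run=True (the default) nothing is executed.
    """
    command = build_archive_command(hdfs_path, local_dir)
    if dry_run:
        print("would run:", " ".join(command))
    else:
        # Assumption: `hadoop` resolves on PATH in this environment.
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    archive_hdfs("/user/demo", r"C:\hdfs-backup")
```

    Check that the copy completed and the local files look right before removing either package.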


    What’s New?

    There are no new features in this release.  Instead, we’ve worked to improve the install and setup experience and to address some bugs that have been reported on this alias as well as the forums:

    • IIS / Dashboard setup issues experienced on certain OS SKUs
    • Addition of HDInsight Dashboard link into Start Menu
    • Fix to Hive console for multiple line input errors
    • Clean up some uninstall issues
    • Minor changes to getting started content

    Hortonworks has also shipped an update to the HDP installation to address the bug encountered on Friday (documented here).  By removing both installations above and re-installing, you will get updated bits here.

    What if Something Doesn’t Work?

    Please report issues on the forums.  We’re also tracking a set of known issues via the release notes.

    When Will These Bits Be Updated In Azure?

    We’re using this as an opportunity to ship these bits early and often, and they will find their way into future service updates to Azure HDInsight, typically in the next monthly update.

    What’s Next?

    We’ll continue to address common issues as they are reported.  From a feature perspective, we hope the next update will include updates to the dashboard to include a new and improved Hive console.

