Over the past few years, the Microsoft Technology Center (MTC) has worked with many organizations across all industries that are taking a fresh look at their data to find new ways to engage with their customers and drive operational efficiencies. Like many of you, they are exploring the emerging space of Big Data.
These customers span a wide spectrum of knowledge when it comes to distributed scale-out compute technologies, but the constant has been a desire to implement a common platform approach to processing and analyzing large volumes of data. Enter Apache Hadoop. While Hadoop is not the only way to implement distributed scale-out processing, it has become a first approach for many organizations, and with good reason. A vast ecosystem of tools and technologies is integrated with the platform, which helps to ease the discovery, integration, and use of data.
Another consistent theme across our engagements has been a concern among customers that they don’t have the skills, hardware, or time to install and configure Hadoop at the speed that business demands. This is compounded by decreasing IT budgets, which require companies to justify new investments without a proven return on investment (ROI). Enter Windows Azure. Many of our customers are starting with a cloud-first approach by using the Windows Azure HDInsight Service to address these concerns and respond to their business demands. The HDInsight Service is a cloud-based service that delivers Hadoop and eliminates the burden of installing and configuring a standard Hadoop cluster. You pay only for the time you are using the service.
Zero to Hadoop in Minutes – Delivers Faster Time to Analysis and Insight
In a traditional customer data center, procuring hardware can often take weeks or more. There is also the process to rack, network, and configure the servers before our customers can even begin installing Hadoop. In short, all of this adds latency to the most important and valuable business need: data analysis.
The HDInsight Service will deliver an Apache Hadoop environment in about 10 to 15 minutes, significantly reducing the time it takes to get to the critical data analysis. With the HDInsight Service, you have a data center already configured and optimized for these workloads. Customers are not burdened with hardware setup, and the Hadoop cluster can be provisioned through an intuitive interface by specifying a few parameters: name of the cluster, size of the cluster, and administrator password.
Lower the Cost of Curiosity
Business leaders have ideas (often called “gut instincts”), and they need data to help validate and support their instincts. These data-driven and data-validated decisions are central to many Big Data engagements at the MTC. I recently met with a customer who stated that they have been “lucky” with past decisions, and they see a large risk that this luck will run out. They want a fast, agile, and cost-effective method of validating current direction or identifying when to change course. They cannot justify a large capital investment in people or hardware resources to set up and manage environments for “data experimentation.”
Creating clusters on demand over data that exists in Windows Azure Storage provides an incredibly agile method of delivering Big Data analysis with minimal investment. Customers create a cluster as needed and pay only for the time they use it. When they have satisfied their curiosity, answered their questions, or validated their intuition, customers can drop the cluster without incurring further costs to manage or maintain the Hadoop environment.
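To make the pay-for-what-you-use idea concrete, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption made for illustration: the hourly rate, the node count, and the usage pattern are hypothetical placeholders, not HDInsight pricing, and the function names are invented for this example. It simply compares a cluster that exists only while an analysis runs against one that is left running for a full month.

```python
def on_demand_cost(nodes: int, hours_per_run: float, runs: int,
                   rate_per_node_hour: float) -> float:
    """Cost when the cluster is created for each analysis run and dropped afterward."""
    return nodes * hours_per_run * runs * rate_per_node_hour


def always_on_cost(nodes: int, days: int, rate_per_node_hour: float) -> float:
    """Cost when the same cluster is left running around the clock."""
    return nodes * 24 * days * rate_per_node_hour


# Hypothetical figures for illustration only (not actual service pricing).
RATE_PER_NODE_HOUR = 0.50   # assumed cost of one node for one hour
NODES = 8                   # assumed cluster size

# Ten three-hour experiments in a month versus a cluster running all 30 days.
on_demand = on_demand_cost(NODES, hours_per_run=3, runs=10,
                           rate_per_node_hour=RATE_PER_NODE_HOUR)
always_on = always_on_cost(NODES, days=30,
                           rate_per_node_hour=RATE_PER_NODE_HOUR)

print(f"On-demand clusters: {on_demand:.2f}")
print(f"Always-on cluster:  {always_on:.2f}")
```

Swap in your own rate, cluster size, and expected number of experiments to see how the comparison changes for your workload.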
Spend Less Time Validating Infrastructure and More Time Validating Intuition
The ease of implementation and the speed of creating a standardized Hadoop infrastructure with the HDInsight Service are critical to delivering immediate value. Our customers have stated that their business users and internal customers will not give them the luxury of time to learn how to implement and configure these new infrastructures. The HDInsight Service is an immediate solution to keep business moving forward while the IT services team develops additional solutions that add value.
A cloud services model also eliminates the risk of designing your cluster with the wrong capacity. If an on-premises cluster is too small and performance does not meet the demands of your analysis, acquiring and adding hardware resources introduces more latency, and data analysis may be halted while the business waits for the new hardware. If the cluster is too large, you have overinvested and reduced the ROI. These risks can be eliminated with the cloud. Create the HDInsight Service cluster, and if more resources are needed, you are 10 to 15 minutes from a new, larger cluster.
The Experts at the MTC are Ready to Engage!
Visit the MTC and talk with the architects to learn how other organizations are using data in new and inventive ways, and how you can drive better customer experiences and improved operational decision making through data.
We can work with you to identify the right approach to data analysis, plan your architecture, and align your business with the right technologies. Our architects can meet directly with your experts to design the best approach to answering your questions in the context of your priorities, risks, and primary concerns. In addition, we facilitate hands-on labs that reinforce the concepts of Big Data in Windows Azure.
Lara Rubbelke (@SQLGal) loves data and brings her passion for designing solutions to customers as a Data Solutions Architect at the Microsoft Technology Center in Minneapolis, Minnesota. Her expertise involves Big Data, OLTP, and OLAP systems; data management; emerging NoSQL solutions; and the business intelligence life cycle. You will often find her delivering technical presentations at local, regional, and national events like TechEd, PASS Summit, and online MSDN and TechNet webcasts.