Tagged Content List
  • Blog Post: Run Jupyter Notebook on Cloudera

    In a previous blog post, we demonstrated how to enable the Hue Spark notebook with Livy on CDH. Here we provide instructions for running a Jupyter notebook on a CDH cluster. These steps have been verified on a default deployment of a Cloudera CDH cluster on Azure. At the time of this writing, the deployed...
  • Blog Post: Run Hue Spark Notebook on Cloudera

    When you deploy a CDH cluster using Cloudera Manager, you can use the Hue web UI to run, for example, Hive and Impala queries. But the Spark notebook is not configured out of the box. It turns out that installing and configuring Spark notebooks on CDH isn't as straightforward as described in their existing documentation...
  • Blog Post: Familiar R Now Available on Scalable Hadoop and Spark in the Cloud

    Author: Oliver Chiu (Product Marketing, Hadoop/Big Data and Data Warehousing). This post is a translation of "Microsoft brings the familiarity of R to the scalability of Hadoop and Spark in the Cloud", posted on March 29. Azure HDInsight, offered as part of Azure Data Lake, is a managed...
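  • Blog Post: Microsoft Makes Big Data and Analytics in the Cloud Easier

    This week, to explore the technology and business of big data and data science, thousands of people are taking part in San Jose's St […]...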
  • Blog Post: Resolving Spark 1.6.0 "java.lang.NullPointerException, not found: value sqlContext" error when running spark-shell on Windows 10 (64-bit)

    It is easy to follow the instructions and download Spark 1.6.0 (Jan 04 2016) with the "Pre-built for Hadoop 2.6 and later" package type. However, when you try to run spark-shell on your Windows 10 (64-bit) machine...
  • Blog Post: Exploring Silicon Valley :: 1st Meetup – Spark!

    Everyone, here are my impressions of the first Meetup I attended, organized by the SF Data Science group. I would also like to share two items: a poll that you can help me answer: I created...
  • Blog Post: 1st Meetup – Data Analytics :: Exploring Silicon Valley

    Hey! I confess I haven't even slept yet, and I'm thrilled about the day that is already beginning. It's hard to sleep with so much to do and so much fascinating material to study. It's striking that the more I study, the more I come to...
  • Blog Post: Real Time Analytics with Azure Event Hubs, Cloudera, and Azure SQL

    In this blog post, I will demonstrate how to ingest data from Azure Event Hubs into Spark Streaming running on Cloudera EDH, process the data in real time using Spark SQL, and write the results to Azure SQL database. Alternatively, the data can be processed using Impala. This example uses the same...
  • Blog Post: How to allow Spark to access Microsoft SQL Server

    Today we will look at configuring Spark to access Microsoft SQL Server through JDBC. On HDInsight, the Microsoft SQL Server JDBC jar is already installed. On Linux the path is /usr/hdp/ If you need more information or want to download the driver, you can start here: Microsoft...
  • Blog Post: A KMeans example for Spark MLlib on HDInsight

    Today we will take a look at Spark's MLlib module, its built-in machine learning library (see the Spark MLlib Guide). KMeans is a popular clustering method. Clustering methods are used when there is no class to be predicted and instances are instead divided into groups or clusters. The clusters hopefully will...
  • Blog Post: Using Spark on Azure - Part 2 - Enter Power BI

    As you have seen in Part 1, it is very easy to create a powerful Spark cluster and get some great data exploration capabilities right in the Zeppelin notebooks. But at some point you may want to visualize your data so you can share it with your colleagues. Power BI is an incredibly rich solution not only...
  • Blog Post: Microsoft Cloud Data Platform

    I recently had the opportunity to present the whole of Microsoft's data platform, focusing on its cloud components and on its Business Intelligence solutions for end users. I wanted to cover the subject exhaustively while offering...
  • Blog Post: Using Spark on Azure - Part 1

    The world of data is evolving quite rapidly; over the last few years, many new technologies such as Hadoop, Hive, Pig, and many more have seen the light of day. One project that has gotten quite some attention lately is Spark, which is described as follows: "Spark is a fast and general engine for large...