In our previous post we introduced you to Microsoft Codename “Data Explorer”, a cloud service that helps you gain insight into your data by letting you discover, enrich, and publish it. This post walks you through an end-to-end scenario using “Data Explorer”.
In this scenario, our business consultant (Anna) is tasked by Contoso Yogurt to help them decide where to open their next three stores in the Western Washington State area. To answer this question, Anna needs to consider various aspects of the target customers and the demographics of each potential location. Anna must also forecast how people near the new locations will react once the stores are open; that is, how people “feel” about these stores.
Anna already has access to most of the data sources that contain the information that she needs, either because she owns the data or because the data is part of her company’s information systems. Additionally, Anna knows that there are other useful pieces of information in the wild that she can leverage, such as web pages, forums, social networks or other places on the Internet. She is not yet sure how she might leverage some of this valuable data, but thanks to the power of “Data Explorer”, now she can put this data to use and gain new insights.
These are the different data sources that Anna already knows about and considers useful for the decision process:
Anna starts with the Welcome page, where she can start working with her data.
Discover
On the Welcome page, Anna clicks Dashboard to start working with her data.
Anna clicks on Add data source to start adding new data.
This brings Anna to the Add Data page, where she can add data from many different kinds of data sources. She can connect to network resources such as a database, consume content from a web page or a data feed, or use data from Windows Azure Marketplace. Alternatively, she can add data from her local machine in various formats (Excel, Access, Text, etc.) or even create data “ad-hoc” by typing or pasting text, or by creating formulas.
Anna starts off by connecting to the SQL Azure database where Contoso stores information about existing stores. To do this, she provides the Server and Database information, and a user name and password with permissions to access this database.
Next, Anna adds the Excel spreadsheet which contains potential new locations (shopping centers). She does this from the Add Data page as well. In this case, she uses File – Excel as the data source type, and she gets to indicate where the Excel file is currently located so that it can be uploaded.
Now that Anna has added her two data sources, she goes back to the Dashboard page and notices that her two existing data sources appear within the dashboard (on the right). In addition, a lot of useful information appears on this page, including classifications about the data that was imported. There are also some recommendations about potentially useful and relevant data sets from Azure Marketplace and Bing that she could leverage.
Anna finds these recommendations interesting and will incorporate some of them into her data later.
The next thing that Anna needs to do is to combine the information from the SQL Azure database regarding Contoso stores with the list of shopping centers for potential new locations that was in the Excel file added earlier. This is generally not a trivial task, but with “Data Explorer” she merely needs to select both sources in the list and click Mashup.
Enrich
Once Anna has clicked on Mashup, she is taken to the Mashup editor. This is where she can start shaping the data, enriching it by connecting the two different tables.
We will talk about the Mashup editor in greater detail in subsequent posts, but for now there are a few concepts that we want you to learn in order to understand the rest of this scenario… In the top-left corner of the editor, you can see the resource pane currently displaying the two resources that Anna is trying to mash up, namely ShoppingCenters and ContosoStoreTraq.
The New option above the resource pane allows you to add more data via the Add Data experience we covered above. The Merge option allows you to merge two resources into a single table.
Currently, the selected resource is ContosoStoreTraq, which is being previewed in the editor. Immediately below the ribbon are two gray boxes; we refer to this area as the Task Stream. Each box is a “Step” that transforms or refines the data in a particular way. From the ribbon you can select a wide variety of tasks and apply them to a given resource, thereby adding to its task stream. These tasks let you filter, order, rename columns, and otherwise transform the data as you choose. The tabular preview shows the result of adding each task to the task stream, giving you immediate visual feedback on how each step affects the shape of the data. You can click on each task to see how the preview looked at that step.
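One way to picture the task stream is as an ordered list of small transformations, where the preview is simply the result of running that list up to the selected step. The Python sketch below only illustrates that idea; it is not how “Data Explorer” is implemented, and the rows, column names, and step functions are all made up for the example.

```python
# Illustrative only: the rows, column names, and steps are hypothetical.
stores = [
    {"StoreName": "Bellevue", "City": "Bellevue", "Rating": 4},
    {"StoreName": "Tacoma", "City": "Tacoma", "Rating": 2},
    {"StoreName": "Seattle", "City": "Seattle", "Rating": 5},
]


def keep_rated_three_or_more(rows):
    """Filter step: keep stores with a rating of at least 3."""
    return [row for row in rows if row["Rating"] >= 3]


def sort_by_rating_descending(rows):
    """Ordering step: best-rated stores first."""
    return sorted(rows, key=lambda row: row["Rating"], reverse=True)


def rename_rating_column(rows):
    """Rename step: 'Rating' becomes 'PerformanceRating'."""
    return [
        {("PerformanceRating" if key == "Rating" else key): value
         for key, value in row.items()}
        for row in rows
    ]


# The "task stream": an ordered list of steps applied to one resource.
task_stream = [
    keep_rated_three_or_more,
    sort_by_rating_descending,
    rename_rating_column,
]


def preview(rows, steps, up_to):
    """Return the table as it looks after the first `up_to` steps."""
    for step in steps[:up_to]:
        rows = step(rows)
    return rows


# Clicking on step n in the editor corresponds to preview(..., n) here.
for n in range(len(task_stream) + 1):
    print(n, preview(stores, task_stream, n))
```

Because each step returns a new list instead of changing the original rows, every intermediate preview stays available, which is what makes stepping back and forth through the stream cheap in this sketch.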
You will also notice recommendations about datasets that are relevant to the data that you are currently working with (shown in blue in the bottom left). Since Anna is working with stores and shopping center locations, demographics data from Data Market and phone book information from Bing are provided as recommended datasets.
Next, Anna wants to combine these two resources because the information about existing stores contains a store performance rating. She would like to make this performance rating appear next to each shopping center where there is an existing Contoso Yogurt store. She can easily achieve that by adding a lookup column to ShoppingCenters, as displayed below…
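Conceptually, a lookup column copies one value from a second table into each row of the first, matching on a shared key. Here is a minimal Python sketch of that idea, not the “Data Explorer” formula itself. The column names (CenterName, Location, PerformanceRating) and the sample rows are assumptions, since this post does not list the real columns.

```python
# Hypothetical sample data; the real column names are not given in this post.
shopping_centers = [
    {"CenterName": "Northgate Mall", "Zip": "98125"},
    {"CenterName": "Bellevue Square", "Zip": "98004"},
    {"CenterName": "Tacoma Mall", "Zip": "98409"},
]
contoso_stores = [
    {"Location": "Bellevue Square", "PerformanceRating": 4.5},
    {"Location": "Tacoma Mall", "PerformanceRating": 3.0},
]


def add_lookup_column(rows, lookup_rows, key, lookup_key, value_column):
    """Return copies of `rows` with `value_column` looked up by key.

    Rows without a match get None, so every shopping center is kept.
    """
    index = {r[lookup_key]: r[value_column] for r in lookup_rows}
    return [{**row, value_column: index.get(row[key])} for row in rows]


centers_with_rating = add_lookup_column(
    shopping_centers, contoso_stores,
    key="CenterName", lookup_key="Location",
    value_column="PerformanceRating",
)

for row in centers_with_rating:
    print(row)
```

Shopping centers without an existing store keep their row and simply show an empty rating, which matches the goal of seeing the rating only where a Contoso Yogurt store already exists.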
After adding this lookup column, Anna selects the recommended data set about demographics and adds it to her mashup. She is then going to merge this resource with the ShoppingCenters resource, based on the Zip code.
Once she has merged these two resources using the Zip and PostalCode columns, Anna can incorporate another one of the recommended datasets; in this case, she is adding data from the Bing Phone Book to enrich her current data with phone numbers.
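Merging on Zip and PostalCode amounts to a join between two tables whose key columns have different names. The Python sketch below shows the general technique under stated assumptions: only the Zip and PostalCode column names come from this post, while the other columns and all values are invented placeholders, and the code is not what “Data Explorer” runs internally.

```python
# Only "Zip" and "PostalCode" come from the post; everything else is made up.
shopping_centers = [
    {"CenterName": "Northgate Mall", "Zip": "98125"},
    {"CenterName": "Bellevue Square", "Zip": "98004"},
    {"CenterName": "Tacoma Mall", "Zip": "98409"},
]
demographics = [
    {"PostalCode": "98004", "Population": 1000},
    {"PostalCode": "98409", "Population": 2000},
]


def merge_on(left_rows, right_rows, left_key, right_key):
    """Inner join: keep left rows that have a match, add the right columns."""
    by_key = {}
    for right in right_rows:
        by_key.setdefault(right[right_key], []).append(right)

    merged = []
    for left in left_rows:
        for right in by_key.get(left[left_key], []):
            # Drop the right-hand key so the zip code is not stored twice.
            extra = {k: v for k, v in right.items() if k != right_key}
            merged.append({**left, **extra})
    return merged


merged = merge_on(shopping_centers, demographics, "Zip", "PostalCode")
for row in merged:
    print(row)
```

Two design choices are worth noting. This sketch drops shopping centers that have no demographic match; keeping them with empty values would be an equally valid choice. The zip codes are also stored as text on both sides, so a value is never altered by numeric conversion and both key columns compare as the same type.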
Using the Bing Phone Book API, Anna is able to create a new column with the number of high schools within a ten-mile radius of each store.
She could also add some of the other recommended services to provide social sentiment for each of the shopping centers, so that she can pick the one that people like best…
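To make the “within a ten-mile radius” column concrete, here is a small Python sketch of the underlying calculation. It does not call the Bing Phone Book API, whose request format is not covered in this post. Instead it assumes you already have a list of high schools with coordinates, and it counts those within ten miles of each location using the haversine great-circle distance. All names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Hypothetical locations and schools; coordinates are illustrative only.
locations = [
    {"Name": "Center A", "Lat": 47.60, "Lon": -122.33},
    {"Name": "Center B", "Lat": 47.24, "Lon": -122.45},
]
high_schools = [
    {"Name": "School 1", "Lat": 47.65, "Lon": -122.30},
    {"Name": "School 2", "Lat": 47.25, "Lon": -122.44},
    {"Name": "School 3", "Lat": 47.58, "Lon": -122.40},
]


def count_within(location, places, radius_miles=10.0):
    """Count the places that lie within `radius_miles` of `location`."""
    return sum(
        1 for p in places
        if distance_miles(location["Lat"], location["Lon"],
                          p["Lat"], p["Lon"]) <= radius_miles
    )


# Add the new column to a copy of each location row.
with_counts = [
    {**loc, "HighSchoolsNearby": count_within(loc, high_schools)}
    for loc in locations
]
for row in with_counts:
    print(row["Name"], row["HighSchoolsNearby"])
```

A straight-line distance is only an approximation of how far a school really is from a store, since it ignores roads and water; for a first screening of candidate locations that is usually an acceptable simplification.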
Publish
Finally, Anna wants to share her findings with her colleagues, so she uses the Publish features of “Data Explorer”, which allow her to publish the results in many different formats (Excel, PowerPivot, etc.), as you can see in the following screenshot. We will explore the different publish mechanisms in detail in subsequent posts.
You will be able to try these capabilities and much more very soon. We are working hard every day so that by the time you get to try them, they are even more powerful! As a side effect of this, some parts of the user interface that you have seen in this post might look a bit different by the time you start using “Data Explorer”.
We hope this tour has helped you get a better understanding of the great opportunities “Data Explorer” brings to you and your data. If you still haven’t had the chance to sign up to try “Data Explorer”, you can follow this link to sign up.
Enjoy!