Jim O'Neil
Technology Evangelist
While the availability of Windows Azure and the announcement of Silverlight 4 were certainly highlights of PDC 09, I was especially intrigued by the introduction of “Dallas,” the code name for Microsoft’s Data-as-a-Service (DaaS) offering. So after the relatives had left and my turkey coma had subsided, I thought I’d give “Dallas” a whirl this Thanksgiving weekend.
“Dallas” is essentially a repository of data repositories, a service - built completely on Windows Azure - that allows consumers (developers and information workers) to discover, access, and purchase data, images, and real-time services.
“Dallas” essentially democratizes data, enabling a one-stop shopping place (via PinPoint) for all types of premium content. With “Dallas” one can opt in to a pay-as-you-grow type model, facilitating access to data that may have previously only been accessible via expensive subscriptions directly with the data provider.
Developers can access “Dallas” via REST-based APIs and Atom feeds or in raw format (as many of the content providers had made available pre-“Dallas”). The web-accessible “Dallas” Service Explorer (shown below) allows the consumer to explore the data as well as the HTTP URLs that are constructed and executed to retrieve the data set based on the user-provided parameters.
Since the data services are accessed via standard protocols (REST, HTTP, Atom), “Dallas” can be used by a vast variety of clients – PHP, Ruby, Java, etc. – in addition to .NET, of course. For .NET developers, the Service Explorer offers a convenient option to generate a C# proxy class that essentially provides a wrapper for the HTTP/REST API.
Current data providers (offering trial periods of access) include:
Other content providers coming soon include CitySearch, ESRI, and NAVTEQ. If you happen to be a content provider yourself, contact the “Dallas” team for partnership opportunities.
To get started with “Dallas,” access the Quick Start page under the Windows Azure Platform portal. You’ll first need to request an invitation code (associated with your Live ID); that code will provision you with an account key that you then submit as part of the HTTP header with each of your data requests.
Note: the account key is your private key and should not be shared with anyone. The unique user ID is a GUID representing individual users. Both are required when submitting a request for data.
While you’re exploring “Dallas” on your own, a single (or random) user ID is sufficient, but when you deploy your application, you may want to set up multiple user IDs to track requests by user for analysis and/or billing purposes.
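To make the per-user tracking idea concrete, here is a minimal sketch of handing out one GUID per application user. The DallasUserIds class and its For method are hypothetical names of my own, not part of “Dallas”, and the mapping lives only in memory; a deployed application would presumably persist it so the IDs stay stable across restarts.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper (not part of "Dallas"): assigns one GUID per
// application user so that requests can later be grouped by user.
public static class DallasUserIds
{
    // In-memory only; user names are compared case-insensitively.
    private static readonly ConcurrentDictionary<string, Guid> Ids =
        new ConcurrentDictionary<string, Guid>(StringComparer.OrdinalIgnoreCase);

    // Returns the same GUID every time it is called for a given user name.
    public static Guid For(string userName)
    {
        return Ids.GetOrAdd(userName, _ => Guid.NewGuid());
    }
}
```

The GUID returned by DallasUserIds.For("alice") could then be passed wherever a unique user ID is expected.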
Once you have your account key, you’ll have access to the “Dallas” Developer Portal (below), where you can view your current data subscriptions, other available providers (catalog), your account keys, and an access report (how many requests were made to each of your subscriptions).
This part’s pretty easy; the Catalog link brings you to the list of available (and forthcoming) providers. Find the ones that look interesting – they’re all free for the time being – and click the ‘subscribe’ link.
Once you’ve subscribed to a service it will be listed under Subscriptions in the “Dallas” Developer Portal, and you’ll find a link for each service through which you can explore its dataset via the “Dallas” Service Explorer.
I’ve subscribed to the Data.gov feed for 2006 and 2007 crime statistics, and below is a preview of the data in the “Dallas” Service Explorer.
I’ve numbered the five primary sections of the output as well.
One of the best ways to investigate HTTP traffic is via a web debugging proxy such as Fiddler. With Fiddler, you can fashion your own HTTP requests, execute them, and view the resulting HTTP response. By using the “Dallas” Service Explorer, I can see, for instance, that the URL for a request for 2007 crime statistics for Massachusetts should be issued as follows:
https://api.sqlazureservices.com/DataGovService.svc/crimes/Massachusetts?year=2007&$format=atom10
and the $accountKey and $uniqueUserID are required as request headers.
So via the Request Builder tab in Fiddler, I can build up the HTTP request as you see below. Note, the headers are separated from their values via a : (colon) not = (equals sign) as implied by the “Dallas” Service Explorer.
or, in raw format:
GET /DataGovService.svc/crimes/Massachusetts?year=2007&$format=atom10 HTTP/1.1
$accountKey: REDACTED
$uniqueUserID: 933f6f25-77fe-463a-bed1-c97de7fe3c8f
Host: api.sqlazureservices.com
Then, via the Inspectors tab in Fiddler, I can view the output, shown below as XML (Atom 1.0 is an XML-based format).
It should be clear then, that any development language capable of issuing HTTP requests and parsing XML output can easily invoke a “Dallas” service – Ruby, PHP, JavaScript, Python, and the list goes on.
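For instance, the same request could be assembled by hand in C# without the generated proxy. What follows is only a sketch: it uses HttpRequestMessage from System.Net.Http, which is my choice for illustration, and the DallasRequestSketch class and Build method are hypothetical names. It builds the request but does not send it, so the account key you pass in is never transmitted.

```csharp
using System;
using System.Net.Http;

// Hypothetical sketch: assembles the GET request for the 2007
// Massachusetts crime statistics feed without sending it.
public static class DallasRequestSketch
{
    public static HttpRequestMessage Build(string accountKey, Guid uniqueUserId)
    {
        // URL as reported by the "Dallas" Service Explorer.
        var uri = new Uri(
            "https://api.sqlazureservices.com/DataGovService.svc/crimes/" +
            "Massachusetts?year=2007&$format=atom10");

        var request = new HttpRequestMessage(HttpMethod.Get, uri);

        // Both values travel as HTTP request headers (name: value),
        // not as query-string parameters.
        request.Headers.TryAddWithoutValidation("$accountKey", accountKey);
        request.Headers.TryAddWithoutValidation("$uniqueUserID", uniqueUserId.ToString());

        return request;
    }
}
```

Passing the result to HttpClient.SendAsync would issue the call; the Host header is filled in from the URL at that point.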
When you use the option in the “Dallas” Service Explorer to create C# service classes, a .cs file is downloaded to your machine. That file includes one or more pairs of classes (in the namespace Microsoft.Dallas.Services). One of the classes is a data transfer object with properties corresponding to the data output, and the other is the actual proxy class that exposes a few methods:
constructor – which sets up the request URL and parameters, including the paging values.
InvokeWebService – a private method that creates and executes an HttpWebRequest (which returns results as an Atom feed) and extracts the Atom entries as an IEnumerable<XElement>.
Invoke – the publicly accessible method that synchronously calls InvokeWebService and transforms the result into a strongly-typed List whose element type is the other class in the generated C# file, the data transfer object.
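The extraction step can be pictured with a few lines of LINQ to XML. This is a sketch of the general technique, not the generated proxy’s actual code: the AtomEntrySketch class and its methods are hypothetical, and reading each entry’s title is just a stand-in for mapping entries onto a data transfer object.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sketch of pulling entries out of an Atom 1.0 feed.
public static class AtomEntrySketch
{
    // The Atom 1.0 XML namespace.
    private static readonly XNamespace Atom = "http://www.w3.org/2005/Atom";

    // Parses the feed text and returns every <entry> element in it.
    public static IEnumerable<XElement> ExtractEntries(string atomXml)
    {
        return XDocument.Parse(atomXml).Descendants(Atom + "entry");
    }

    // Projects each entry to its <title> text (empty string if absent),
    // standing in for the mapping to a strongly-typed list.
    public static List<string> Titles(string atomXml)
    {
        return ExtractEntries(atomXml)
            .Select(e => (string)e.Element(Atom + "title") ?? string.Empty)
            .ToList();
    }
}
```

Given the text of an Atom response, AtomEntrySketch.Titles(xml) returns one string per entry, in feed order.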
Pulling this C# file into a Windows Forms, WPF, or ASP.NET application is pretty straightforward. Here’s the entirety of the code needed to display the 2007 Massachusetts crime data in a Windows Forms DataGridView:
DataGovCrimeByCitiesService svc = new DataGovCrimeByCitiesService(
"REDACTED",
new Guid("ebcd3366-6c32-49cd-83df-356aae167692"));
var data = svc.Invoke("Massachusetts", null, "2007");
crimeGridView.DataSource = data;
While analyzing and manipulating the data feeds exposed by “Dallas” is cool in and of itself, the real power of this offering comes when you aggregate your own domain data with the public data. You might have sales figures that you are trying to interpret against economic or census data to make decisions on future store openings… or maybe you’d like to factor in up-to-date traffic conditions in your courier scheduling application… or perhaps you want to calculate insurance rates based on FBI crime statistics. With project “Dallas” making these large, vetted premium data sets available, such analyses become viable for practically any application and developer.
“Dallas” was just announced at PDC 09, so it’s a great time to get involved and watch (and participate) as the service evolves. You’ll need an invitation code to get started, but while you’re waiting for that you can check out the following resources as well:
What is “Dallas” (Channel 9)
Enrich your Applications with Data from Microsoft Project Code Name “Dallas” (PDC 09)
Intro to “Dallas” (Hands-on Lab)
“Dallas” team blog
Another very cool, and ‘rewarding,’ way to get to know “Dallas” and Windows Azure is to participate in the Pathfinder Innovation Challenge. NASA and the Jet Propulsion Laboratory, in conjunction with Microsoft, are sponsoring this contest for ages 14 and older to help foster an interest in and further the exploration of the Martian surface.
As you can read in the contest rules, there are four leagues of involvement, with various degrees of programming expertise recommended (beginning with League 1, which requires only an enthusiasm for the exploration of Mars!). Individuals or teams can compete for prizes ranging from a planetarium kit for a local secondary school to Zune HDs to a trip to the launch of the Mars Science Laboratory in September 2011.
The contest is ongoing, with deadlines of February 15, 2010 and April 16, 2010, depending on the League of competition. See the contest rules for more information on deadlines and submission procedures.