I haven't been writing much here, mostly because I've been way too busy but also because I couldn't discuss publicly many of the things I'm doing. Now that SharePoint 2010 has been announced and its feature set published everywhere, I can finally discuss one of the coolest things we've been up to lately.

SharePoint is a repository of resources (list items and documents in document libraries) that are collected and manipulated collaboratively. Resources have a bunch of security and business logic attached to them, such as who can see each item, who gets to change it, or whether a particular column in a list needs to conform to a particular validation formula.

When the SharePoint folks said they wanted a RESTful interface, this was great news...the system is just a perfect fit. Not only is it a perfect fit for RESTful services in general, but also for Astoria in particular. In the end SharePoint is very data-centric in nature, and it already supports queries and business logic as part of the uniform interface.

So we're really excited to announce that as of SharePoint 2010, every SharePoint server is an Astoria server out of the box. No configuration required or anything, just make sure the proper version of ADO.NET Data Services is in the box. For SharePoint 2010 beta, the "right" version is ADO.NET Data Services v1.5 CTP2. We'll put details out there for future iterations as they come.

Official write up in the Astoria team blog.

The SharePoint Data Service head is not just a side integration deal, it's a full-on REST-over-HTTP head for SharePoint. It supports browsing as well as modifying data using regular HTTP verbs (GET, PUT, DELETE, etc.), it does ETags for concurrency control, enforces business logic as part of side-effecting methods, and it handles the full range of Data Services conventions for URLs, Atom and JSON payload formats, etc. It also exposes full metadata like any other Astoria service, allowing Visual Studio and any other metadata-driven client to give you a great experience on the client side. Now if you need to get or manipulate data in SharePoint from any platform in any language, all you need is an HTTP stack.
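
As a rough illustration of "all you need is an HTTP stack", here is a minimal Python sketch that builds (but does not send) a conditional update for one resource. The host, service path, entity set, ETag value and JSON payload shape are all made-up assumptions for the example, not details of the SharePoint endpoint.

```python
import json
from urllib.request import Request

# Hypothetical address of one list item; host, path and entity set are made up.
ITEM_URL = "http://sharepoint.example.com/data.svc/Tasks(1)"


def build_update_request(item_url, etag, changes):
    """Build (but do not send) a conditional PUT for a single resource."""
    body = json.dumps(changes).encode("utf-8")
    return Request(
        item_url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            # Only apply the change if the server copy still has this ETag.
            "If-Match": etag,
        },
    )


# The ETag would normally come from an earlier GET of the same resource.
request = build_update_request(ITEM_URL, 'W/"3"', {"Title": "Review spec"})
print(request.get_method(), request.full_url)
# urllib stores header names capitalized, hence "If-match" here.
print(request.get_header("If-match"))
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose; if the ETag no longer matches, a server following the usual HTTP conventions would be expected to reject the update with a precondition failure.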

The other cool aspect is that the SharePoint folks were able to build this entirely on top of public bits, using our new fancy provider model for advanced data sources I discussed here some time ago.

SharePoint joins a growing family of Microsoft products that enable users to share data through the use of a simple RESTful interface that follows the Astoria conventions when needed (e.g. URLs, structured data in Atom). Another example on the server side is SQL Server Reporting Services, which in SQL Server 2008 R2 can now render any report as an Atom feed that follows the Astoria conventions (examples here). On the client, in addition to developer-oriented options such as .NET, Silverlight, AJAX, PHP, Java and more, we now also have PowerPivot (f.k.a. Gemini), which can pull data from any data service and do fancy analysis and publishing over it, making it trivial to bring data from Data Services into Microsoft Excel 2010 among other things.

I find this to be extremely important. A simple, uniform way of sharing data at the protocol level, and in a way that truly enables the lowest possible bar of entry, is key to enabling broad integration across products and breaking the data silos that form around applications.

I'll be talking about this at the SharePoint Conference 2009 in Las Vegas this week, and at the Professional Developers Conference (PDC 2009) in late November. If you're attending either of these and this sounds interesting, these sessions will drill into plenty of details.

-pablo

 

Just a few weeks after announcing the PHP toolkit for Data Services we now are happy to announce a Java toolkit for consuming Data Services that follow the Astoria RESTful data services pattern. The library ships as an extension to the existing Java Restlet library. I'm really happy to see more and more clients and servers come up that can be used to share data between systems.

You can check out the announcement in the Interoperability Team's blog.

-pablo

 

Yesterday we announced that the CTP 2 of the ADO.NET Data Services framework (yeah, Astoria) is available for download. We put in a ton of work on this release, ranging from adding better support for high-end services to making it easier to write applications in Silverlight by having cross-domain and proper data-binding support.

Official announcement is here, and includes details on where to download from, what to watch for (it's still a CTP, comes with fine-print :) ), and a list of the new and updated features.

Also, many folks in the Astoria Team have been writing a bunch of content to cover the new features, so watch the Astoria Team blog as well as related blogs for all sorts of samples and details to come up.

-pablo

 

Folks in the interoperability team at Microsoft just announced something they've been cooking for a while, developed by Persistent Systems: a client PHP toolkit for Astoria services. It follows more or less the model of the .NET client where you can run a tool during development to get code-gen based on a data service's metadata, and then it has a runtime library that can be used to send queries, obtain responses as PHP objects, track and submit changes, etc. They implemented both the Atom and JSON formats, and even threw in batching support.

The best part is that it's all written in PHP (both code-gen tool and runtime), so it runs in any environment where PHP and the toolkit dependencies (XML, XSLT, CURL) are available.

For more details check out the post in the interoperability team's blog, the kick off Channel 9 video, and the PHP toolkit project in codeplex.

-pablo

 

Given how much of what we do in my team is related to the web (ADO.NET Data Services, System.Xml, etc.), Mix is one of the events I look forward to every year, both to share some of the stuff we're working on and to hear from attendees who are building real-world applications.

If you are around, feel free to drop me a line if you want to chat. I'll be there through the whole event, and will be doing two talks on Friday:

  • “Modeling RESTful Data Services: Present and Future”: here I’ll discuss how to build RESTful data-centric services using today’s version of the Data Services framework as well as what’s coming in the future for Data Services.
  • “How I Learned to Stop Worrying and Love the Microsoft ADO.NET Entity Framework”: in this one I’ll talk about the Entity Framework and how some of its new features can make life easier when writing web applications. (no, I did not choose the title for this one)

-pablo

A few days ago we announced the big news about SQL Data Services (SDS) switching to being a full relational database on the cloud.

I’ve been a strong supporter of this path for a number of reasons. Relational databases are very well understood and there is a large base of expertise for them in the market. Also, a lot of the existing applications and libraries out there are ready to run against a relational database, so SDS is enabling them to be ported to the cloud with minimum (or perhaps sometimes, no) effort.

With SDS going relational, not only do you get to reuse all your knowledge and codebase in the cloud, but you also get all the benefits of a cloud-based infrastructure: high availability, piece-of-cake provisioning, pay-as-you-go growth, etc.

One of the concerns I read about is the impact on scalability. My observation is that when you look at most of the storage systems in the cloud, they don’t have some magic formula for scalability; the trick is partitioning. Some systems are smarter than others in how they partition data and how dynamic the partitioning scheme is to adapt to varying system workloads. But in the end, you need to partition your data such that it’s spread across a bunch of nodes; if across your system you never (or rarely) depend on cross-partition operations, then you have a sustainable scalability path. That is independent of the actual organization of the data (e.g. relational, flexible entities, etc.). The ACE model on top of SDS had partitioning embedded in the model through scale units that surfaced as “containers”. In the new SDS world you can just partition your data across nodes, where each node has full relational capabilities. So it’s similar (partitioning), but each node gives you very rich ways of organizing and interacting with your data (full SQL!).
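
To make the partitioning idea concrete, here is a minimal Python sketch of one common approach (stable hash partitioning). This is an illustration only, not how SDS itself places data; the node names and the choice of customer id as the partition key are assumptions made up for the example.

```python
import hashlib

# Hypothetical set of database nodes, each one a full relational database.
NODES = ["sds-node-0", "sds-node-1", "sds-node-2", "sds-node-3"]


def partition_for(key: str, node_count: int) -> int:
    """Map a partition key to a node index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def node_for_customer(customer_id: str) -> str:
    """All rows for one customer land on one node, so per-customer
    queries and transactions never cross partitions."""
    return NODES[partition_for(customer_id, len(NODES))]


for customer in ["alice", "bob", "carol"]:
    print(customer, "->", node_for_customer(customer))
```

The design choice that matters is the partition key: as long as the operations you care about stay within one key, adding nodes adds capacity. Note that a plain modulo scheme like this one moves most keys when the node count changes, which is one of the places where smarter partitioning schemes differ.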

The other concern I heard is around TDS, the SQL Server client-server protocol, and how it would play in the Internet. In many cases the actual application that connects to SDS will be running in Azure as “web” or “worker” roles, and things should go smoothly. For the scenarios where the client is connecting to SDS from across the web, there are two challenges: firewalls and latency.

The server side of TDS by default listens on TCP port 1433, which a lot of firewalls will just block; furthermore, TDS is not HTTP, so a packet-inspecting intermediary could choose not to let the traffic through, regardless of the port number. This could certainly create some trouble that will need to be addressed at some point.

From the latency perspective, the short story is that I think it’s fine. TDS follows a simple request/response model, so interactions between clients and servers are straightforward and not chatty at all (things are more complicated when MARS is enabled, but that’s another story). We have experience tuning TDS for large WANs with high latency and things work out well as long as you optimize for those scenarios (e.g. batch queries together, etc.).
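
A back-of-the-envelope model shows why batching matters on a high-latency link. The numbers below (round-trip time, per-statement server time, statement count) are illustrative assumptions, not measurements of TDS or SDS.

```python
def total_time_ms(round_trips, rtt_ms, statements, server_ms_per_statement):
    """Very rough model: network time plus server execution time."""
    return round_trips * rtt_ms + statements * server_ms_per_statement


# Illustrative assumptions only.
STATEMENTS = 20   # statements the client needs to run
RTT_MS = 80       # round-trip time across the Internet
SERVER_MS = 2     # server-side time per statement

# One request/response per statement versus one batch for all of them.
one_by_one = total_time_ms(STATEMENTS, RTT_MS, STATEMENTS, SERVER_MS)
batched = total_time_ms(1, RTT_MS, STATEMENTS, SERVER_MS)

print("one statement per round trip:", one_by_one, "ms")  # 1640 ms
print("single batch:", batched, "ms")                     # 120 ms
```

Under these assumptions almost all of the unbatched time is spent waiting on the network, which is why collapsing round trips is the first optimization to reach for.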

As a final note, there is the question about the SOAP/REST interfaces. In my opinion, whenever you’re building the kind of rich application that needs full SQL, the data in the database can rarely stand alone for direct access by consumers. Most of the time there is code in front (in the form of a middle tier) that manages access control, shaping, and even application-level constraints that don’t belong to the database. If you need a REST head on top of an SDS database, you can add ADO.NET Data Services to the equation, which will let you add all that logic fronting your data.

All in all, I’m really excited to see this happening. This gives Azure a whole spectrum of storage services, from blobs in Azure Blob Storage, to schema-less tables in Azure Table Storage, now to full relational with SQL Data Services.

-pablo

We announced two releases this week, kind of unusual but it worked out this way.

The first one is the first CTP of ADO.NET Data Services v1.5. This is the next version of "Astoria" or the ADO.NET Data Services framework, and it includes a number of enhancements that were requested both by the developer community and by some internal partners. More details about this upcoming release here.

The second one is the preview release of an exploration project we have been calling "Astoria Offline". This project sits at the intersection between Data Services, Sync Framework, SQL Express/Compact and the Entity Framework. It will be interesting to see what folks think about it. I'm personally very interested in this space so I'm happy to see this finally out. As the announcement says, this is not an "official" product or anything like that, but more like an early experiment to understand the problem space. More details about the release and pointers to the download page available in the announcement in the Data Services team blog.

Looking forward to hearing feedback about both of these releases and the technologies behind them.

-pablo

 

JSONP is a common way of making data accessible in client-side mashups even when the requests need to be cross-domain.

While the current version of the ADO.NET Data Services framework does not support this, it’s possible to build it on top. There are a couple of ways of doing this. Here is what’s probably the simplest way. There are some downsides to this approach, but overall it is the most straightforward path to get there.

The default transport layer for Data Services is WCF, which has many extensibility points across the stack. For the case of JSONP support, IDispatchMessageInspector comes in handy.

There are two things needed to support JSONP properly:

  • The ability to control the response format. Data Services uses standard HTTP content type negotiation to select what representation of a given resource should be sent to the client (e.g. JSON, Atom). That requires that the caller can set the Accept request header, which is not possible when doing the JSONP trick (which basically just uses <script> tags). We need to add the ability to use the query string in the URL to select format. (e.g. /People(1)/Friends?$orderby=Name&$format=json).
  • A new option to wrap the response in a callback if such callback was provided in the request (also in the query string). For example /People(1)/Friends?$orderby=Name&$format=json&$callback=loaded.

What we’ll do is register a message inspector and adjust the request/response when we see these new options coming in.

In order to support the $format=json option we can intercept the message before it gets dispatched to the Astoria runtime, at the IDispatchMessageInspector.AfterReceiveRequest method. If we see the query string option then we’ll a) strip it out from the URL so Data Services does not generate an error and b) change the “Accept” header to “application/json”, so the rest of the system just thinks that the client asked for a JSON response in the first place.
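
The request-side rewrite is independent of WCF, so it can be sketched in a few lines of Python. This is a transport-neutral illustration of the logic, not the WCF inspector itself; the function name is made up, and stripping and remembering $callback at this same stage is an assumption of the sketch.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, quote

# Only the format discussed in the text is mapped here.
FORMAT_TO_ACCEPT = {"json": "application/json"}


def rewrite_request(url, headers):
    """Return (clean_url, new_headers, callback_name_or_None)."""
    parts = urlsplit(url)
    new_headers = dict(headers)
    callback = None
    kept = []
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        if name == "$format":
            if value not in FORMAT_TO_ACCEPT:
                raise ValueError("unsupported $format: " + value)
            # Pretend the client asked for this type via content negotiation.
            new_headers["Accept"] = FORMAT_TO_ACCEPT[value]
        elif name == "$callback":
            callback = value  # remembered for the response stage
        else:
            kept.append((name, value))
    # Rebuild the query by hand so "$" stays readable in option names.
    query = "&".join(
        quote(n, safe="$") + "=" + quote(v, safe="") for n, v in kept
    )
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))
    return clean, new_headers, callback


url, hdrs, cb = rewrite_request(
    "/People(1)/Friends?$orderby=Name&$format=json&$callback=loaded",
    {"Accept": "*/*"},
)
print(url)   # /People(1)/Friends?$orderby=Name
print(hdrs)  # {'Accept': 'application/json'}
print(cb)    # loaded
```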

For the second part, where we need to wrap the response into a JavaScript call if the $callback option was used, we have the IDispatchMessageInspector.BeforeSendReply method, which gives us the perfect spot to rewrite the response. One unfortunate side-effect of this is that the response will get buffered and re-encoded; that said, in many cases this won’t make any noticeable difference.
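
The response-side step amounts to wrapping the already-serialized JSON in a function call. Here is a minimal Python sketch of that idea, again independent of WCF. The identifier check and the switch of content type to text/javascript are assumptions added for the sketch, not something the post specifies.

```python
import re

# Accept only plain (optionally dotted) JavaScript identifiers as callbacks,
# so arbitrary script cannot be injected through the query string.
_CALLBACK_NAME = re.compile(
    r"^[A-Za-z_$][A-Za-z0-9_$]*(\.[A-Za-z_$][A-Za-z0-9_$]*)*$"
)


def wrap_jsonp(json_body, callback):
    """Return (body, content_type) for the reply.

    Without a callback the JSON body is passed through untouched.
    """
    if callback is None:
        return json_body, "application/json"
    if not _CALLBACK_NAME.match(callback):
        raise ValueError("invalid callback name")
    # The whole body has to be available here, which is why the
    # reply ends up buffered rather than streamed.
    return callback + "(" + json_body + ");", "text/javascript"


body, content_type = wrap_jsonp('{"d": []}', "loaded")
print(body)          # loaded({"d": []});
print(content_type)  # text/javascript
```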

Finally, we need to register the inspector with WCF’s dispatchers. For that we create an attribute that implements IServiceBehavior, so we get called during service initialization. When we get called we can register our message inspector.

The net effect is that if you include this code in your project, you just need to add a single attribute to your Data Service to make it support JSONP:

[JSONPSupportBehavior]
public class SampleService : DataService<ContactsData>
{
    // your service code here...
}

Once that's in place you can use JSONP by adding $format and $callback to URLs, for example:

http://<host>/SampleService.svc/People?$format=json&$callback=cb

Of course, you can still use all the other Data Services URL options in addition to these.

The implementation and a small sample service are available at MSDN code gallery, here:

http://code.msdn.microsoft.com/DataServicesJSONP

 

-pablo

The announcement of Windows Azure is a big milestone for us in the Astoria team. We got a chance to add our little contribution to the platform by providing data service interfaces for a couple of the Azure services.

Currently there are two services that use the ADO.NET Data Services runtime: the Windows Azure Tables Service, which was announced this week as part of the whole Windows Azure story, and SQL Data Services, which has been around for a while but got a new experimental Data Services interface this week to coincide with the PDC.

These services -and others that will come in the future also based on Data Services- share a common aspect: they have extreme scalability requirements.

In order to enable them to use our Data Services server runtime we had to extend the data service framework to make it scale in various new dimensions. In the rest of this post I'll summarize some of the walls we hit and the changes we made to the system to handle these scenarios.

Things that already scaled

The Data Services runtime already incorporates many design principles that help with scalability.

For example, the system does not keep any required state between requests (we do cache stuff, but we can throw it away at any time), so scale out of the front-end servers of the storage systems is relatively straightforward. This allows the existing runtime to handle an arbitrarily large number of requests by throwing more front-ends at the problem (as long as the back-end systems can take it, of course).

Also, we don't make any assumptions around the size of the data and provide mechanisms to push down filters in requests to the data source, so in principle there are no limits to the amount of data that a data service may be fronting.

Hitting the scalability wall

While some things scaled, there are certain aspects in which we ran into a scalability wall that required a number of changes in the system.

Using .NET types to represent the shape of the services is great in a single application, but not-so-great if you have millions of users with hundreds or thousands of tables each. We needed another way of describing the "shape of the data in the service", that is the metadata or schema of the service.

Since you can't practically create a distinct type for every user/application/table in the system, that means that the instances of objects that represent data flowing through the data services runtime cannot be of a specific type for each entity type. Instead, we needed the flow format to be independent of the declared types.

Metadata and service schema

The data services runtime needs to know the "schema" of each service it exposes. That is, the list of entity-sets, the entity-types of the instances living in those entity sets and the relationships between the various entities.

In a typical data service, the service exposes data for a given application or domain-specific service, so the schema of the service is known and static (within a given version at least) and all the front-end servers simply share the same schema.

The way a service author specifies the schema of a service in the shipping version of the Data Services runtime is by using .NET classes or an Entity Framework model (which in turn generates .NET classes). That works great for application developers, because .NET classes are a simple and natural way of defining the shape of your objects.

Now, if the requirement is to be able to handle millions of applications, each of which can have hundreds or thousands of tables, does that mean that we have to create a .NET type for each service and for each table, and the corresponding number of properties and such? And if so, since the front-end systems are stateless and potentially don't have any affinity to parts of the data, does that mean that any given system may end up having to load up millions of types in memory? To complicate things further, once you load an assembly (the only container in which .NET types can exist), you can't unload it unless you unload the AppDomain.

.NET types are a great solution for the scenarios where the schema is known and more or less bounded, and will continue to be the primary way of creating services in that context. However, we needed something else to handle the high-end side of the spectrum.

To address this need we introduced a new interface that data services can optionally implement. We already had the internals of the system organized more or less like this, but didn't expose it in the first release. The idea is that there is a main split between the "upper half" of the runtime that deals with URL translation, LINQ expression tree generation, interceptors, policies and all aspects that make a Data Service look like a Data Service. The "bottom half" is the "data service provider", and is responsible for describing the shape of the service among other things. There are two built-in "data service providers", the Entity Framework provider which is what you use when you create a data service over an Entity Framework model, and the reflection-based provider which is what you use when creating a service on top of an arbitrary object graph. With the new change you can now create new implementations of these data service provider thingies that can obtain and manage metadata any way they want.

The way we interact with the provider is carefully designed to avoid requiring long-term state in the provider or the consumer of the provider in any way, while at the same time allowing the provider to do caching of metadata and control information if desired.

First, we never hold on to information returned by the provider beyond the scope of a single request. So for all we know the provider could be reloading all the metadata in every request. In practice, providers will probably cache this metadata in some way or another.

Second, we load metadata on demand and piecemeal. For example, during URI translation we do a small scale version of the usual binding and semantic analysis that any compiler does, and for that we need metadata. In those cases we don't load all the metadata, but only the pieces we need to do type checks, symbol lookups, etc.

Making metadata dynamic

Another aspect of metadata to consider is the fact that the shape of one of these data services can be altered at any time. For example, the Azure Table service has the concept of tables, and you can add and remove tables whenever you want.

The new scheme with custom data service providers makes this possible because we don't remember anything at all across requests. So all the provider needs to do when the underlying shape of the data changes is report a different schema on the next request, and the data services runtime will happily take it.
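
The contract between the runtime and a provider can be sketched in Python as follows. Every name here (the provider class, try_resolve_entity_set, handle_request) is hypothetical and only mirrors the behavior described above: metadata is looked up on demand, one piece at a time, and nothing is remembered by the runtime once a request ends.

```python
class InMemoryTableProvider:
    """Toy provider: owns the metadata and may change it at any time."""

    def __init__(self):
        self._tables = {}

    def create_table(self, name, property_names):
        self._tables[name] = frozenset(property_names)

    def drop_table(self, name):
        self._tables.pop(name, None)

    def try_resolve_entity_set(self, name):
        # Only the requested piece of metadata is returned; a real
        # provider could serve this from its own cache.
        return self._tables.get(name)


def handle_request(provider, path):
    """Toy runtime: asks the provider for exactly the metadata it needs."""
    set_name = path.strip("/").split("(")[0]
    schema = provider.try_resolve_entity_set(set_name)
    if schema is None:
        return 404, "unknown entity set " + set_name
    # 'schema' is a local variable: it is gone when the request ends.
    return 200, set_name + ": " + ", ".join(sorted(schema))


provider = InMemoryTableProvider()
print(handle_request(provider, "/Orders"))  # (404, 'unknown entity set Orders')
provider.create_table("Orders", ["Id", "Total"])
print(handle_request(provider, "/Orders"))  # (200, 'Orders: Id, Total')
provider.drop_table("Orders")
print(handle_request(provider, "/Orders"))  # (404, 'unknown entity set Orders')
```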

With .NET types this would have meant creating and re-distributing new types (or creating them on demand on each node), and dealing with not being able to unload the old types from memory. Clearly not an option at this scale.

Flow format independence

With the addition of the "data service provider" interfaces we no longer have .NET types to use for the instances of each entity-type that flows through the system (e.g. from the data source to the runtime via the IEnumerables returned in LINQ queries, and from there to the serialization stack).

Another important change we made in the system is that we no longer assume anything about the shape of each CLR object returned by the query. We treat instances just as "object" all over the code base. When we need to access a member, we use methods in the data service provider interface to do that, imagine something like GetPropertyValue(object o, string name).

That means it's now possible to use some form of generic record type across the system. Not only does this avoid the need for specific types, but it also allows providers to piggyback control information in the instances themselves, avoids copies from the original format into CLR objects just to flow them through our runtime, and brings a few more benefits.
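
Here is a small Python sketch of what flow format independence buys. The provider classes and the serialize function are hypothetical stand-ins; the only point is that the runtime-side code treats instances as opaque and always goes through the provider to read a property, in the spirit of GetPropertyValue(object o, string name).

```python
class DictRecordProvider:
    """Instances are generic records (plain dicts here)."""

    def get_property_value(self, instance, name):
        return instance[name]


class AttributeProvider:
    """Instances are ordinary typed objects."""

    def get_property_value(self, instance, name):
        return getattr(instance, name)


class Person:
    def __init__(self, name, age):
        self.Name = name
        self.Age = age


def serialize(provider, instance, property_names):
    """Runtime-side code: never touches the instance directly."""
    return {n: provider.get_property_value(instance, n) for n in property_names}


# A generic record can carry extra control information ("_etag" here)
# that the serializer simply never asks for.
record = {"Name": "Ana", "Age": 31, "_etag": "abc"}
print(serialize(DictRecordProvider(), record, ["Name", "Age"]))
print(serialize(AttributeProvider(), Person("Ana", 31), ["Name", "Age"]))
```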

Impact on LINQ expressions

While having flow format independence is great, it did complicate things for query formulation.

We typically translate URLs to expression trees, and since we have all the CLR types in the server that correspond to the entity types, all the expression trees are nice and clear.

When we're operating against unknown types we can't generate "typed" expression trees anymore. In those cases we still produce expression trees, but the member-access operations (and certain operators) are represented using custom calls to a well-known set of static members. The providers that enable this feature need to know about this and do proper translation of these expression trees.
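
The shape of such "untyped" trees can be illustrated with a toy model in Python. This is not LINQ and none of these node names come from the Data Services runtime; GetValueCall simply stands in for a call to a well-known helper that replaces direct member access, and evaluate plays the role of a provider that knows how to interpret it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constant:
    value: object


@dataclass(frozen=True)
class GetValueCall:
    """Replaces typed member access: 'get property <name> of the row'."""
    property_name: str


@dataclass(frozen=True)
class Equal:
    left: object
    right: object


def evaluate(node, record):
    """Provider-side translation: here, direct evaluation over dict records."""
    if isinstance(node, Constant):
        return node.value
    if isinstance(node, GetValueCall):
        return record.get(node.property_name)
    if isinstance(node, Equal):
        return evaluate(node.left, record) == evaluate(node.right, record)
    raise TypeError("unsupported node: " + type(node).__name__)


# Roughly what a filter such as $filter=Name eq 'Bob' could turn into.
predicate = Equal(GetValueCall("Name"), Constant("Bob"))

rows = [{"Name": "Bob", "Age": 40}, {"Name": "Ana", "Age": 31}]
print([r for r in rows if evaluate(predicate, r)])  # [{'Name': 'Bob', 'Age': 40}]
```

A provider backed by a real store would translate the same nodes into its own query language instead of evaluating them in memory.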

Extension to the data model

We made one more major change that, while not directly related to scalability, has a lot to do with the database/storage services in Windows Azure.

In the current version of Data Services types are "closed" in the sense that they have a structure that's final. You list a set of properties for each type and instances of that type cannot have properties added dynamically.

It turns out that the data services we have online have a more flexible model, where each entity has a fixed portion but also a dynamic portion. Typically the fixed portion includes a key of some sort and a version property. The dynamic portion is a property bag where you can add any name/typed-value pair.

We call these types that can be extended on a per-instance basis at runtime "open types". We introduced support for open types in the Data Services runtime such that you can mark a given entity type as "open" in metadata and that would cause the system to allow unknown properties to be set, as well as the use of unknown properties in queries (e.g. in filter predicates).
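
The difference between closed and open types can be sketched in a few lines of Python. The EntityType class and set_property function are hypothetical illustrations, not the metadata API of the Data Services runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityType:
    name: str
    declared: frozenset    # the fixed portion, e.g. key and version
    is_open: bool = False  # open types also accept undeclared properties


def set_property(entity_type, instance, prop, value):
    """Reject unknown properties on closed types, allow them on open ones."""
    if prop not in entity_type.declared and not entity_type.is_open:
        raise KeyError(prop + " is not declared on closed type " + entity_type.name)
    instance[prop] = value


closed = EntityType("Customer", frozenset({"Id", "Version"}))
opened = EntityType("Customer", frozenset({"Id", "Version"}), is_open=True)

row = {"Id": 1, "Version": 1}
set_property(opened, row, "FavoriteColor", "green")  # allowed: type is open
print(row)

try:
    set_property(closed, {"Id": 2, "Version": 1}, "FavoriteColor", "red")
except KeyError as error:
    print("rejected:", error)

# Queries over open types must tolerate rows that lack the property.
rows = [row, {"Id": 3, "Version": 1}]
print([r["Id"] for r in rows if r.get("FavoriteColor") == "green"])  # [1]
```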

There are a lot of details around open types that I won't go into here, maybe the topic for another post, but I wanted to point out the change because it was a significant addition.

What do these changes mean for developers?

What does all this mean to current users of data services? Well...not much other than some background on how the system is evolving. Other than open types, services created with custom metadata/custom flow formats are indistinguishable from the ones created the "classic" way.

Furthermore, we will preserve the existing model where creating a service based on some .NET objects or an Entity Framework schema is really straightforward, and we consider that our primary scenario for developers.

At the same time, addressing the needs for the highest-end services out there is important, so many (if not all) of these changes will eventually make it into the shipping product so that other folks out there can use them if they choose to. Beware that these interfaces are not designed to be "nice", but rather optimized for control and efficiency, so it may not be exactly a fun experience, but you'll get all the scalability you'll need out of them.

-pablo

Since we shipped ADO.NET Data Services v1 in .NET 3.5 SP1 (and actually before that as well) I've been working on a few things that I could share (such as offline/sync support for data services) and some that I couldn't discuss publicly until all the big plans were announced.

This week at PDC Microsoft announced Windows Azure. A lot has been written about it, so I won't go into the details.

On our side, in the data services team, we made our small contribution to the big picture.

The Windows Azure table service is a structured storage facility that's part of the core of Azure. Access to the table service is done through a data-services compatible RESTful interface that uses the Astoria conventions over an HTTP binding. That means that you can use either any client with an HTTP stack to talk to it, or the ADO.NET Data Services client, which does a nice job exposing data as .NET objects, letting you write simple queries using LINQ instead of URLs, etc.

Another cool thing about the table service (and the blobs and queuing service for that matter) is that they are accessible both from the virtual compute environment and from anywhere in the Internet. In both cases, if you're using .NET, you can use the data services client to interact with it. In the case of code running in the Windows Azure hosting environment, the client is already present (the environment includes .NET 3.5 SP1) so you can use it without worrying about taking new dependencies.

To find out more about the table service you can watch Brad's PDC session for a discussion of the service itself, and this other session that Niranjan and I did together (or "will do" if you're reading this before Wed in the PDC week) for a drill down on how to program the Windows Azure table service. If you're not at PDC no worries, these talks are accessible online.

On the next layer up from the core, the Windows Azure service layer, SQL Data Services also is making big announcements in this PDC. We're introducing more relational capabilities into the system, and also experimenting with a data services-compatible interface. This PDC talk from Patrick will discuss and demo the new interface, and you can follow how this effort goes here.

-pablo

During the design of Data Services (Astoria) v1 we did the transparent design thing. We're quite happy with the result, we got a lot of feedback and were able to adjust many aspects of the project based on that.

Now that we're in full swing with v2 design work, we're going to be posting regularly again (hopefully :).

Andy got a new tiny camera and he had to use it for something...so he suggested that we do short clips that may help explain design points. Sometimes a short explanation is just much better than a bunch of writing.

The first post with the write-up + short video format discusses "Friendly Feeds" and is here. If you get a chance to take a look, let us know what you think about the new style (and friendly feeds)!

-pablo

It's amazing how much information is there in our email archives. Now that we've shipped the thing, I thought I would share my summarized (still long), partial view of how the ADO.NET Data Services Framework ("Project Astoria") came to be. I left out a ton of partners, important events and features that came and went for the sake of brevity, but most key points in time are there.

Idea
Hack
Pitch
Prototype
Review
Announcement
Team
Design
Release
Future

Idea

July 2006. We were having lunch at the building 35 cafeteria with Alex Barnett (back then our Community Program Manager) and he brought up that a bunch of people out there were doing REST-based APIs. The question on the table was if there was anything interesting around entities (as in EDM entities) and REST. I didn't know enough about REST to answer anything interesting, so I agreed to go do some reading.

July 24th, 2006. Earliest date for which I have something documented. I found an email message I sent to Alex that basically said "I read about the REST thing...we could expose Entities as resources and might fit the REST model". My struggle at that point was whether there were valid scenarios for exposing a REST interface on top of entities without added semantics ("interceptors" came to the rescue later on).

Hack

August 24th, 2006. Finally found time to look at this. I wrote a quick hack overnight called, unimaginatively, "REST for Entities". It was a simple ASP.NET handler that would take data in EDM terms and give every entity on the system a URI, allow clients to retrieve the entities in multiple representations (XML and JSON, based on the HTTP accept header), and make changes to entities by using simple HTTP verbs (POST, PUT, DELETE). The use of URIs with added semantics driven by EDM schemas, support for multiple formats and the simple HTTP interface still remain core aspects of Astoria.

Pitch

September/October 2006. Alex pinged everyone and then some to see who'd hear us. He and I started to tour around Microsoft pitching the idea, testing it to see if it was interesting to folks, adjusting the pitch as we learned how to deliver the story. It was clear we needed some well-articulated content to move forward.

Late October 2006. Found the time to sit down and write a white paper, "Entities in the Cloud"; the paper focused on two observations: a) the rise of "application/service hybrids", typically consumer applications that became development platforms, and the key role of their service interfaces, typically data-driven, and b) how simple the interfaces to these things are...just URLs and basic HTTP functionality. The paper described how Astoria greatly facilitated the creation of such application/service hybrids and had the potential to create a small ecosystem of libraries and tools to consume them. It also explored early ideas around data-independence in the HTTP/data interface, synchronization and offline support for data services and higher order semantics.

November and December 2006. We were busy with other things (I think I was working on the Entity Framework at that point, either on transactions support or on LINQ to Entities), but we kept working on the pitch in the background so we could show a "vision" to folks that would eventually decide to fund the effort. In addition to the paper above, we also wrote a paper for "ThinkWeek" with Britt Johnston (now the PUM of the data programmability tools team). This was probably the first time we explicitly painted the picture of Astoria not only as a framework and common HTTP interface, but also as an online structured storage service (fast-forward to the present: our collaboration effort with SQL Server Data Services will land us closer to this initial vision than I would have ever expected).

Prototype

February-May 2007. We had been talking about announcing the plans for Astoria at Mix 2007, which was planned for May 2007. We wanted bits in people's hands, not just slideware. In less than 3 months we built the "Mix CTP" of Astoria as a side-project. It wasn't just the code, but the setup, the samples, the client libraries, the huge documents, the online service and more. I wrote a large chunk of these, and several folks helped with specific areas, such as Elisa Flasko with reviewing the documents and creating websites, Tim Mallalieu with parts of the URL parser, and Asad Khan with parts of the client library. During these months I had a lot of fun building the system from scratch, but didn't exactly have a life. In retrospect, it was a great choice: laser-focus on something and build it in a time-boxed manner.

First check-in was on February 16th, 2007.

A few key developments happened during the prototype building work:

  • Service definition: various folks independently influenced the way Astoria services are authored. At first it was declarative-only, which was falling short. During a few chats with Nikhil Kothari, the early forms of the current code-centric model came up. Also, during conversations with Anders Hejlsberg the thought of using LINQ as a layering mechanism came up.
  • WCF: While working on this, the WCF folks offered a better integration path. After a brief discussion Steve Maine, WCF wizard, got going into a week of non-stop hacking where we had many 2, 3 and 4 AM email threads. At the end, we had a nicely integrated story. Nothing like people that know their stuff and can get their hands dirty.
  • Client libraries: Another thing that happened here was the client library. It started as "let's cook up a couple of quick bindings for .NET and Javascript". I found an email from January 30th, 2007 that discusses the idea for the first time, and even mentions the possibility of supporting some form of LINQ-to-URL translation.
  • URL formats: with this work we introduced the second (but not last) URL format for Astoria. After a ton of community feedback we adjusted it to be what shipped in the final version later on.

Review

April 9th and 10th, 2007. You don't want to invest too much while locked into your office and then think you're solving a real problem. While we had run our plans by a lot of internal folks, we wanted to get some external perspective before the general announcement. So on April 9 and 10 we held the first Astoria SDR (software design review, common practice at Microsoft).

The organization of this goes back to January and February, and Alex did a superb job organizing the event and in particular picking the right set of folks to give us feedback on the problem space, the plan and the pitch.

The feedback was good overall, which was a relief. Right before the event I started to wonder if they would just say "this? you brought us all the way here for this?". Happily, they were a bit more interested than that :)

Announcement

April 30th, 2007. We announced Project Astoria at Mix 2007 in Las Vegas. On that Monday I gave a talk called "Accessing Data Services in the Cloud", which described the problem space and announced the downloadable Astoria framework as well as the experimental online service that we had hosted for everyone to access (with Astoria front-ends for Northwind, AdventureWorks, Encarta and TagSpace). The talk went great and feedback was good.

Throughout that week at Mix we had a crazy amount of activity. We talked with a ton of folks, did labs, repeated the talk on Wednesday, and more. There was even some press follow-up, which for a low profile project was a nice touch.

Team

The team formed sort of organically, so I don't have an exact timeline, but did find a few emails and old meetings in my schedule that are good reference points.

April 27th, 2007. The first actual interview for a member of the yet-to-exist Astoria Team was for Mike Flasko. Mike was working in the Windows networking group. He nailed the interview and became the Program Manager for Project Astoria and one of the driving forces that made the product possible. I couldn't think of a better PM for Astoria than Mike.

May 2007. For some reason, over the years whenever I'm working on cooking something Andy Conrad is also working on cooking something independently, and these things tend to be related. Around Mix 2007 Andy was working on Jasper, exploring dynamic languages and data access. Andy was the first internal member of the team, jumping-in as the Developer Lead for Astoria, and was very influential on how we built the system. Agile methodologies, morning scrum meetings and all that good stuff...

May-July 2007 and on. I won't go through each member of the team. You can read some of them in their own blogs like here and here, some in the Astoria team blog, and some don't write but you'll be using their code and relying on their tests whenever you use Astoria. We had an initial team with development, quality assurance and program management in place by early July, ready to start the official design. New folks continued to join the team throughout the project.

Design

July 5th-11th, 2007. To go from prototype mode to production-quality product mode, we started with a weeklong "design marathon" where we all got together and went from the problem statement and state of things on the web to a walk-through of what Astoria should look like and how it would work at a high level.

From there we set to build the product. While we kept the lessons learned in mind, we didn't use a single line of the prototype code in the real version of the product.

Design is an ongoing process, and we would hold design meetings two times a week, 2 hours each, through the whole product cycle.

Another thing we experimented with around design is how we communicate with the developer community. We tried, successfully in my opinion, to go with a "transparent design process".

Release

June 20th, 2008. We started to actually code stuff in late July 2007, and we worked on the product code, test code and functional specifications for around 9 months. The remaining time included bug fixing, performance work, standard overhead for packaging, etc. After almost a year of "official" (read non-prototype, non-side-project) work, on June 20th 2008 we got together to celebrate and claim victory over the ADO.NET Data Services Framework v1, aka Project Astoria.

August 11th, 2008. Service Pack 1 for Visual Studio 2008 and .NET Framework 3.5, containing a few new features including the Data Services Framework, becomes publicly available.

Future

Astoria v1 was an amazing ride, but we're far from done. What's in the future? In the short term we're working on closing and shipping the Silverlight version of our .NET client library and working together with the ASP.NET folks to refresh our AJAX libraries. As for the next release, some of us have already been working on that for a while...this deserves a post of its own, but the topics getting attention these days include support for synchronization and offline applications, extensions to Astoria to support the largest, busiest online services on the internet, changes for being a better Atom citizen and more.

-pablo

I've been sort of under a rock for a while, but I thought I'd come out for a minute to celebrate. Today we made available .NET 3.5 SP1 and Visual Studio 2008 SP1. There are two components in the release I spent a bunch of time on, which interestingly enough have very different origins and get to RTM through very different processes. One is the ADO.NET Entity Framework, which has been cooking for several years and survived controversies, comparisons with non-shipped previous attempts and other natural disasters; the other one is the ADO.NET Data Services Framework or Project Astoria, which was built, well...fast.

I won't go into details of the release; folks have discussed the Astoria, Entity Framework and general data-related features in the release already.

Why have I been under a rock? In the last few months I've been spending time working on various things related to Astoria, online services and data interfaces. Some I can discuss, some will need to wait a bit until the stakeholders are comfortable to talk about it publicly.

Moving Astoria as a framework forward: we were ready (modulo bug fixing and last minute tweaks) some time ago, and we've been thinking about the next steps for the Astoria framework. At Mix 08 we mentioned that we were working on "Astoria Offline" and showed a prototype. We've been working hard on that topic. There is also a bunch of features we want to take on for the next release. I'm sure we'll post something in the Astoria blog at some point about our thinking and give folks a chance to give feedback.

Online services: as you can imagine there are a number of things going on around online services these days, and a number of them involve Astoria one way or the other. I've been working with several of them, varying from providing guidance all the way to writing custom "v.next" versions of Astoria to experiment with their needs. An example of these efforts is the work we're doing to align SQL Server Data Services and the ADO.NET Data Services framework. We would like to see them as the "service" and the "framework" pieces, both using the same HTTP interface, same client interfaces, etc., so we've been spending a bunch of time exploring how to bring them together.

Anyway, there, a bit of a celebration.

-pablo

 

Roger just tagged me for this software development meme thing…it looks like Julia tagged him, Shawn tagged Julia, etc. so all the usual suspects have been down this path already. I’ll bite…

How old were you when you first started programming?
I got my first computer, a Commodore 64, when I was 10. I just had to figure out how things like games worked, so I made my way through Commodore BASIC then (later I figured that none of the games I used were actually written in BASIC…).

How did you get started in programming?
I kind of started twice…first when I got my first computer, I got a couple of books on BASIC. None of the programs I wrote back then did anything useful. I "started again" when I was in school studying industrial electronics, where I learned assembly (for the Motorola 6809) and then C for controlling microprocessors/microcontrollers. 

I loved their bottom-up training style in this place: we first learned to assemble by hand using the processor manual and literally enter hex numbers into the "kits" using a hex keyboard. Then we used an assembler. Then we learned this "simpler, shorter way" of writing programs (a subset of C), which we still were required to translate to assembly by hand on paper. Only at the very end were we allowed to use a C compiler.

What was your first language?
First language at all: BASIC. First language for a useful program: C. First language for a paid job: C++.

What was the first real program you wrote?
As part of a school project I designed the hardware and wrote the corresponding software for controlling direct-current electric motors. The software had feedback from the engine (electric current consumption, speed in RPM, temperature) and maintained constant speed on the motor and made sure it stayed within safe ranges of temperature and energy consumption. I also did some graphics of all the state information on the screen (a "Hercules" monochrome screen).

There is something interesting about writing software that can physically break things…

What languages have you used since you started programming?
I wrote commercial-grade software in 80x86 assembly, C, C++, Prolog, SQL, LotusScript, FoxPro, VB, Javascript, Java and C#. I've also written/modified small programs in other languages such as assembly language for various Motorola processors (6510, 6809, HC11), Haskell/Gofer, COBOL, BASIC and Pascal.

What was your first professional programming gig?
I wrote a small messaging application for Kodak Argentina as a freelancer. They wanted the thing to be really lightweight and really fast, so they wanted the whole thing in C/C++ with no dependencies. In retrospect I did all the wrong things on this one, such as re-inventing the wheel several times (wrote a small database from scratch, a synchronization-over-cc:mail infrastructure, etc.).

If you knew then what you know now, would you have started programming?
Absolutely. I still find it amazing that somebody actually pays me to do what I do. I get to spend most of my days working with smart folks solving hard problems. Couldn't ask for more.

If there is one thing you learned along the way that you would tell new developers, what would it be?
Well…I'll go with two things: first, stay close to the code where things are real, don't go into high-level-limbo-land, at least not too fast. Second, even if you're the best coder ever, you can't lock yourself in an office…software development is a social activity as much as it is a technical discipline, and it takes good interaction skills among team members to build great software.

What's the most fun you've ever had ... programming?
So hard to pick one…
 
There was this time when I led a project (and was one of the developers as well) to build a highly scalable rules-based expert system for credit risk analysis. Expert systems are CPU-bound, so making them scale (back then more than now) meant massive distribution. We built both the expert system from scratch (first in Prolog, then switched to Java and a custom inference engine we also built), and then the networking stack and client/server agents for job distribution and control. The result was a system where you could simply plug in more computers and you'd get more inferences/minute; the system was self-balancing, automatically adjusted itself for various nodes with different computing power, automatically recovered from failed nodes, and would scale pretty much as much as the network could take the load.

I must say though that after I thought I had had all the fun and that I "knew" how to build software, I joined Microsoft and got a different perspective, both from the people perspective and from the project-scale perspective. It's not just "programming" so it may not fit here, but being involved in building something like SQL Server is just too good…

Who's next?
Well, I can't resist the temptation of going with a few of the Astoria team folks, Andy, Mike, Marcelo, Phani.

-pablo

 

The news is out. The ADO.NET Data Services Framework (Astoria) and the ADO.NET Entity Framework will be shipping as part of .NET 3.5 SP1, and the Beta 1 release is now available. All the official blogs discussed the details already, including the Astoria team blog, ADO.NET team blog, Scott's, and many others out there.

Folks out there trying Astoria and the EFx have been working on bits from last December for a while. Finally we have a newer release for everyone to take a look, try stuff and send feedback.

In addition to the release of the framework and VS, we also put out the Data Services AJAX library on CodePlex.

This is the last beta before we're done, so give this release a shot and let us know what you find!

-pablo

 
