-
Are you at TechEd this week? If so, we have a few Astoria-related sessions going on. If you are at the event and want to chat about Astoria, drop me a note or swing by one of the sessions:
6/4 8:30am-9:45am
ADO.NET Data Services for the Web
(aka "Project Astoria") by Mike Flasko
6/5 Noon-12:45pm
ADO.NET Data Services Deep Dive
by Mike Flasko
We also have a booth and a number of EF-related sessions. For more details on these, check out this post.
-Mike Flasko
ADO.NET Data Services Framework, Program Manager
-
We have received a lot of questions lately about how to authenticate calls to an ADO.NET Data Service. Mike Taulty has created a nice post outlining some of the options for authenticating calls to data services. Check it out here: http://mtaulty.com/CommunityServer/blogs/mike_taultys_blog/archive/2008/05/27/10447.aspx
-Mike Flasko
ADO.NET Data Services Framework, Program Manager
-
So far Astoria has used “merge” semantics for update. That is, an “update” operation (an HTTP PUT request) replaces the values of the properties for the target entity that are specified in the input payload; this applies both to properties and links. If a property or link is not present in a PUT operation, it means “leave it with its current value”. To null-out a property or link it has to be present in the PUT body and has to have a null marker (null attribute for properties, null href for links).
While supporting merge is important and will remain part of Astoria, there are scenarios where we need “replace” semantics. That is, an operation where the entity properties are entirely replaced by those in the request body, even if the request body is partial. In particular, the AtomPub protocol requires PUT to have replace semantics.
Operation semantics
In a “replace” operation, each property in the target entity either takes the value specified in the payload (if any) or its default value. The meaning of “default value” for a “replace” operation is server implementation-specific. The Astoria server will likely use the CLR default values for each value type and null for references.
For operations on primitive values (e.g. PUT /Customers(123)/CompanyName), there is no practical difference between “merge” and “replace”.
A “replace” operation, just like a “merge” operation, cannot specify different values in the key properties. That is, keys remain non-updatable in “replace”. Similarly, a “replace” operation is subject to the same requirements from the ETags perspective as a “merge” operation.
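The difference between the two semantics can be sketched on a plain property bag. This is a minimal illustration, not Astoria code: the UpdateSemantics class, its method names, and the idea of passing the server-defined defaults in as a dictionary are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the Astoria libraries): an entity is modeled
// as a property bag so the two update semantics can be compared side by side.
public static class UpdateSemantics
{
    // Merge: properties present in the payload overwrite the stored values;
    // properties absent from the payload keep their current values.
    public static Dictionary<string, object> Merge(
        IDictionary<string, object> stored,
        IDictionary<string, object> payload,
        ICollection<string> keyNames)
    {
        var result = new Dictionary<string, object>(stored);
        foreach (var pair in payload)
        {
            RejectKeyChange(stored, pair, keyNames);
            result[pair.Key] = pair.Value; // an explicit null nulls the property out
        }
        return result;
    }

    // Replace: every non-key property takes the payload value if present,
    // otherwise the server-defined default for that property.
    public static Dictionary<string, object> Replace(
        IDictionary<string, object> stored,
        IDictionary<string, object> payload,
        IDictionary<string, object> defaults,
        ICollection<string> keyNames)
    {
        var result = new Dictionary<string, object>();
        foreach (var pair in stored)
        {
            if (keyNames.Contains(pair.Key))
            {
                result[pair.Key] = pair.Value; // keys are never reset
            }
            else
            {
                object fallback;
                defaults.TryGetValue(pair.Key, out fallback); // null when no default is listed
                result[pair.Key] = fallback;
            }
        }
        foreach (var pair in payload)
        {
            RejectKeyChange(stored, pair, keyNames);
            result[pair.Key] = pair.Value;
        }
        return result;
    }

    // Keys remain non-updatable under both semantics.
    private static void RejectKeyChange(
        IDictionary<string, object> stored,
        KeyValuePair<string, object> incoming,
        ICollection<string> keyNames)
    {
        if (keyNames.Contains(incoming.Key) &&
            !object.Equals(stored[incoming.Key], incoming.Value))
        {
            throw new InvalidOperationException("Key properties cannot be updated.");
        }
    }
}
```

With a stored customer that has ID, CompanyName and Phone, a payload carrying only CompanyName leaves Phone untouched under Merge and resets it to its default under Replace.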
About links inside entity payloads: “replace” operations replace the properties of the entities themselves, not their links. If you include a link in a “merge” or “replace” operation we’ll wire it up, but if you don’t, the entity keeps its existing links.
About directly-addressed links: links are currently atomic values (just the link itself), so there is no difference between replacing it and merging it when operating on link resources (e.g. /$links/…). Astoria servers should support both operations but keep identical semantics. The rest of this discussion won’t touch on links as stand-alone resources anymore.
Note that these operational semantics are the same regardless of the actual format (Atom, JSON, etc.) used to represent the resources being exchanged.
Finally, this in general does not apply to service operations, as the meaning of service operations is service-defined. It’s interesting to think about whether in the Astoria server library we introduce some mechanism to allow it to expose a MERGE-enabled URL, but that would still require the user-implementation to make sense of it.
HTTP interface
AtomPub specifically requires HTTP PUT to mean replace. So we adjusted the way Astoria interprets PUT to mean “replace”.
In order to request a “merge” operation we have two options:
1. Introduce a new HTTP method, “MERGE”, and rely on verb tunneling (POST + a header) for the cases where custom methods are not allowed. There has been talk about a PATCH verb in multiple circles including the AtomPub community, but it seems to be going in a somewhat different direction.
2. Introduce a new custom header, “DataServices-Merge” or something, that when set to “1” in a PUT request indicates that the server should merge the body with the server entity instead of replacing it.
While we’re not thrilled with the idea of introducing a new HTTP method, overloading PUT with an extra header seems to be very problematic. If nothing else, a server that does not support “merge” through headers would see the PUT as a regular “replace” request and perform an operation that’s not what the client expected. Other things break too. By contrast, if a server sees an actual MERGE request and cannot handle it, it can respond with 405 (Method Not Allowed), so the client learns that the operation is unsupported.
So we’re leaning toward MERGE and tunneling (we already support tunneling for PUT/DELETE in Astoria servers and clients).
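A client-side sketch of the two ways to send a merge request follows. It is only an illustration under stated assumptions: the MergeRequests class is hypothetical, the tunneling header name X-HTTP-Method is an assumption (the text above only says "POST + a header"), and the JSON content type is chosen arbitrarily for the example.

```csharp
using System;
using System.Net.Http;
using System.Text;

// Hypothetical helper: builds either a direct MERGE request or a tunneled one.
public static class MergeRequests
{
    // Assumption: the header used for verb tunneling is called X-HTTP-Method.
    public const string TunnelHeader = "X-HTTP-Method";

    public static HttpRequestMessage Create(Uri entityUri, string jsonBody, bool tunnel)
    {
        // Direct form: a custom HTTP method named MERGE.
        // Tunneled form: a plain POST that names the real method in a header,
        // for environments where custom methods are not allowed.
        var method = tunnel ? HttpMethod.Post : new HttpMethod("MERGE");
        var request = new HttpRequestMessage(method, entityUri);
        if (tunnel)
        {
            request.Headers.Add(TunnelHeader, "MERGE");
        }
        request.Content = new StringContent(jsonBody, Encoding.UTF8, "application/json");
        return request;
    }
}
```

A server that does not understand MERGE rejects the direct form outright, which is the failure behavior argued for above. The tunneled form depends on the server recognizing the header.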
Responses to “merge” and “replace” requests are identical.
.NET client
Using “merge” from the client has several advantages:
1. The server does not know what a client considers a “whole” entity. The client may be using entity types that contain a subset of the properties of the server-side version, either due to versioning mismatches or because the client is not interested in all of the properties.
2. The client is pretty much required to use “merge” to make links work. Since an entity might have been brought down to the client without its related entities expanded, one or more links won’t be present, and if we used replace we’d lose information on the server.
Based on the above, it would seem that the client should do “merge”, which will effectively result in “replacing the subset the client knows about” because the client always sends all the fields.
The problem now is servers that don’t implement the “merge” operation. One option is to require “merge” for the client to work, but that leaves too many interesting scenarios out. Since we already had a SaveChangesOptions enum that’s used as an argument to SaveChanges/BeginSaveChanges, we introduced SaveChangesOptions.UpdateAsReplace to indicate that you want the client to use PUT.
AJAX client
The AJAX client’s DataService.update method can be extended with a new argument “UpdateAsReplace” that enables use of replace when set to true. By default we would continue to do “merge” (this requires tweaking the library because currently DataService.update generates a PUT request).
If the new boolean adds one too many arguments to update(), we could alternatively add a knob to the DataService class; either way, we still made this change.
Astoria runtime-data source interaction
The only change in the interaction between Astoria and the data source should be that for the resource being replaced the system will call IUpdatable.ReplaceResource instead of IUpdatable.GetResource.
Pablo Castro
Software Architect
Microsoft Corporation
This post is part of the transparent design exercise in the Astoria Team. To understand how it works and how your feedback will be used please look at this post.
-
Do you want to work on the next generation of data access APIs for the web? If so, the Astoria and XML teams are hiring. If you want to get a feel for the types of problems our team thinks about and the solutions we build, check out the earlier posts on this blog as well as
http://msdn.microsoft.com/data and http://msdn.microsoft.com/xml
We have a range of job openings across disciplines (Development, Developer in Test and Program Management) available on the Astoria and XML teams. If you are interested in any of these positions, please email me (mike.flasko@microsoft.com) and Andy Conrad (aconrad@microsoft.com).
For more details on each of the open positions, please see:
ADO.NET Data Services PM
XML PM
XML PM
ADO.NET Data Services SDE
ADO.NET Data Services SDE
XML SDE
ADO.NET Data Services SDE/T
ADO.NET Data Services SDE/T
XML SDE/T
We look forward to talking with you...
Thanks,
Mike Flasko : ADO.NET Data Services ("Astoria"), Program Manager
-
We are very excited to announce that .NET 3.5 SP1 Beta 1 and Visual Studio 2008 SP1 Beta 1 are now available!
This beta marks the entry of the ADO.NET Data Services Framework as well as the ADO.NET Entity Framework as part of the overall .NET/Visual Studio product and will be the final beta before the RTM of both technologies.
The remainder of this post covers the changes and additions to the ADO.NET Data Services Framework since our last CTP, which shipped in Dec 2007 along with the ASP.NET 3.5 Extensions Preview. I'll try to summarize the changes and features below. We'll follow up as we go with some more details on the changes and what to expect post Beta 1.
Changes:
- Assembly and namespace changes. Now that we are part of the .NET Framework we have changed our assembly, namespace and API names to reflect the standard .NET naming conventions. The main assembly and namespace changes are:
  - All Microsoft.Data.*.dll assemblies have been renamed to System.Data.Services.dll (server) and System.Data.Services.Client.dll (client)
  - Anything in Microsoft.Web.* namespaces has moved to System.Data.Services (server types) and System.Data.Services.Client (client types)
  - The assemblies are now installed to the standard location for .NET 3.5 assemblies
- API Name Changes. The main API name changes are:
  - In general anything which was named WebData* has changed to DataService*
  - In general anything which was named ResourceSet* or Resource* was changed to EntitySet* and Entity*
  - WebDataService class changed to DataService
  - IWebDataServiceConfiguration changed to IDataServiceConfiguration
  - WebDataServiceContext class changed to DataServiceContext
  - WebDataQuery class changed to DataServiceQuery
  - ResourceActions enum changed to UpdateOperations
- Query Interceptor Changes. We changed the syntax of query interceptors to take 0 arguments and return a predicate (return type = Expression<Func<[EntityType],bool>>). An example interceptor that limits queries to categories whose names start with the letter "B" now looks like:
[QueryInterceptor("ProductCategory")]
public Expression<Func<ProductCategory, bool>> OnQueryProductCategory()
{
    return (pc) => pc.Name.StartsWith("B");
}
- Update Interceptor Changes. The ResourceActions enum changed name to UpdateOperations
- AJAX/Javascript Library. The Javascript library for data services which was part of the ASP.NET 3.5 Extensions preview is not part of this beta release. Instead, we will iterate in short intervals on this library, making intermediate drops available on http://codeplex. The first drop which works with this Beta 1 release is available here.
- Command line tool changes. The command line tool to generate client side types for a data service (webdatagen.exe) has been renamed to datasvcutil.exe and its parameter list has been simplified. You can now find this tool in the \Windows\Microsoft.Net\Framework\V3.5 directory.
- A bunch of bug fixes :)
- Tweaks to the ATOM payload format. We've made a few tweaks to the payload format based on feedback from the ATOM community. We've got a bit more to do here so please expect a bit of churn to the payload formats post Beta 1.
Features:
- Batching: data services now support the ability to group a set of requests into a "batch" to be sent to the server in a single HTTP request. The system supports the idea of an atomic group of operations as well as a loose group of operations without such guarantees. This release doesn't quite have what we're thinking in terms of a final design for this feature, but is quite representative of our thinking. An early write up of the feature is here.
- Optimistic Concurrency: Data services now support the notion of optimistic concurrency by passing concurrency token values using HTTP ETags and making conditional requests using HTTP If-* headers. Some notes on how this works are here. We'll also likely extend support post this Beta release to include use of the '*' character in conditional requests.
- New IUpdatable interface. As was the case in the last CTP, you can create a data service over relational databases using integration with the Entity Framework or you can expose any data source that has an IQueryable provider as a REST service. In the last CTP we had defined an IUpdatable interface which could be implemented to make such data sources read/write at the service tier. We have significantly changed this interface to make it easier to use. I've put this "change" in the list of features as we redesigned the API based on our team's reviews and user feedback. A write up of the new interface is here.
We look forward to your feedback...
-Mike Flasko
Program Manager, ADO.NET Data Services Framework
-
We have received a lot of feedback over the past few weeks asking when we will update the Silverlight library for data services. I thought I'd put up a short post to update everyone on where we are and what our thinking is.
I'll start by saying we are targeting Beta 2 of the Silverlight SDK (no dates to announce for this just yet) to have a version of the client library for ADO.NET Data Services. Given that we use many of the core pieces of Silverlight to enable data service interaction, we've waited for those to come into the platform so that we can start to round out our Silverlight experience.
The scenarios we're looking to enable in Beta 2 are: the ability to send async queries, inserts, updates and deletes for same domain requests. We're still working through how we'll enable these types of requests in cross domain scenarios.
If you have questions/comments about data services or our plans around Silverlight and data services please leave a comment here or on our online forum:
http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=1430&SiteID=1
Mike Flasko
Program Manager, ADO.NET Data Services Framework
-
Different applications have different requirements around consistency and how concurrent modifications are handled. I’ll oversimplify and put all these applications in two buckets: either you care about controlling concurrent changes or you don’t.
If you’re creating a REST interface to your data and don’t care about concurrency (e.g. no deep consistency rules, or nice units of change that change in whole consistent ways), then you can use the basic HTTP methods to retrieve (GET) and manipulate (POST, PUT, DELETE) resources directly without any more context than the representations of your resources. You get “last one wins” semantics on updates in this case.
On the other hand, if you do care about concurrency in your REST interface, there are more aspects to take into consideration. If your resources are atomic (no further structure that’s interesting from the concurrent changes perspective than the resource as a whole), then you can have an out-of-band mechanism for creating a “version number” for each resource (typically a monotonically increasing number) and use HTTP’s existing mechanism to ensure you only overwrite state that you know about. In HTTP you can attach an “entity tag” or ETag to your responses that contains an opaque value used to denote the version or state of a resource. Later on, when you want to modify a resource, you can use that value in an “if-match” request header to make sure that your knowledge about the state of the resource you’re modifying is still current. If it’s not, the resource on the server will have an ETag that doesn’t match the one you provided and you’ll get back a 412 “Precondition Failed” status code. All that is standard HTTP 1.1 stuff described in RFC 2616. (ETags are also used for caching and conditional GETs in addition to the scenario I described here.)
Now, REST data services that expose structured data have to deal with various challenges beyond the basics, which I’ll go into in detail below. While I discuss this in the context of the ADO.NET Data Services Framework (Astoria), I’m sure some of these problems apply to a broader set of applications.
Creating ETags: concurrency tokens
The data services framework has to deal with the fact that we don’t control the data sources that we expose through the REST interface. Sometimes each entity that we turn into a resource will have a nice clean property that’s a timestamp or similar and maps perfectly to ETag semantics (e.g. whenever we change the entity in a significant way the value of this property changes). However, often the schema of the underlying data is not under the control of the service developer so we have to work with what we have. What that means in practice is that you can tell the data services framework which properties of each entity type are “concurrency tokens”. Changing those values means that you changed the version of the resource.
The way you do that in the framework is by using an [ETag(props…)] attribute in your class or an annotation in your EDM schema. For types that don’t have any concurrency tokens we won’t generate ETags for the responses for those types, and they get “last one wins” update behavior.
Once you have indicated which property or properties are your concurrency tokens, we can produce ETags by using the values of those properties for the particular instance we’re returning.
During update the data services runtime works with the data source to determine whether the concurrency token values that were marshaled through ETags and if-match headers still match, and if so perform the update/delete operation. If they don’t match a 412 response is sent to the client.
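As a rough sketch of how an ETag could be derived from concurrency token values, consider the helper below. Everything in it is an assumption for illustration: the ConcurrencyETag class is hypothetical, and the formatting rules (strings single-quoted and URL-escaped, values joined with commas) are only modeled on the sample value 'A%20Bike%20Store' shown in this post. The framework's actual format may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper: turns the concurrency token values of one entity
// instance into a single opaque ETag string.
public static class ConcurrencyETag
{
    public static string Build(IEnumerable<object> tokenValues)
    {
        var parts = new List<string>();
        foreach (var value in tokenValues)
        {
            string text;
            if (value == null)
            {
                text = "null";
            }
            else if (value is string)
            {
                // Assumed convention: strings are URL-escaped and single-quoted.
                text = "'" + Uri.EscapeDataString((string)value) + "'";
            }
            else
            {
                // Other values use their culture-invariant text form.
                text = Uri.EscapeDataString(
                    Convert.ToString(value, CultureInfo.InvariantCulture));
            }
            parts.Add(text);
        }
        // Any change to any token value changes the resulting ETag.
        return string.Join(",", parts);
    }
}
```

Because the ETag is computed only from the designated properties, a change to any other property leaves it unchanged, which is exactly the "weak validator" situation discussed later in this post.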
Including ETags in headers and/or payloads
The HTTP spec describes the ETag response header to transport the entity tag for a given resource. That works great for us for cases where we respond with a single entity (e.g. an entry in Atom terms), but it doesn’t when we return a collection of entries from a URL (e.g. an Atom feed). For the latter scenario, we include the ETag as part of the resource representation (in the entry for Atom, in the “__metadata” property for JSON), for example:
<entry m:type="BikesModel.Customer" m:etag="'A%20Bike%20Store'">
<!-- rest of the entry -->
</entry>
Validation during side-effecting operations
Concurrency tokens are validated whenever you perform an operation that affects the state of an existing resource. In the data services REST interface that means HTTP PUT and DELETE methods.
As I mentioned above, validation happens during update processing by extracting the “original” values from the ETag (which was sent back through the if-match header) and comparing them with the data in the data source. If they are the same, we consider the whole resource the same and proceed with the modification.
An interesting question is whether presenting an ETag in an if-match header should be mandatory for resources that have concurrency tokens. Put another way: should the decision of whether it’s ok to potentially overwrite changes based on state knowledge be up to the client or restricted by the server? The HTTP spec defines a special value of “*” for the if-match header that effectively means “any value will match”. The behavior that we are planning for is that if an entity type has concurrency tokens then we’ll always require an “if-match” header in modification operations. The header value can be an actual ETag obtained through a GET request or “*” meaning “I know this type supports concurrency control, but I’ll overwrite it anyway”.
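The planned rule can be written down as a small decision function. This is a sketch under assumptions, not the runtime's implementation: the ConcurrencyCheck class and the Outcome enum are hypothetical names, and the post does not say which status code a missing if-match header would produce, so that case is reported as its own outcome rather than mapped to a code.

```csharp
using System;

// Hypothetical outcome of validating a PUT or DELETE against concurrency tokens.
public enum Outcome
{
    Proceed,            // go ahead with the modification
    MissingIfMatch,     // the type has tokens but no if-match header was sent
    PreconditionFailed  // tokens differ: respond with 412
}

public static class ConcurrencyCheck
{
    public static Outcome Validate(bool typeHasTokens, string ifMatch, string currentETag)
    {
        // Types without concurrency tokens get "last one wins" behavior.
        if (!typeHasTokens)
        {
            return Outcome.Proceed;
        }
        // Types with tokens always require an if-match header.
        if (string.IsNullOrEmpty(ifMatch))
        {
            return Outcome.MissingIfMatch;
        }
        // "*" means: overwrite regardless of the current state.
        if (ifMatch == "*")
        {
            return Outcome.Proceed;
        }
        return string.Equals(ifMatch, currentETag, StringComparison.Ordinal)
            ? Outcome.Proceed
            : Outcome.PreconditionFailed;
    }
}
```

Requiring the header keeps the overwrite decision explicit: a client that wants "last one wins" on a type with tokens has to say so with "*".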
Almost, but not quite, a perfect match
HTTP ETags and conditional operations are almost a perfect match to what we need to handle concurrent activity in RESTful data services. There are, however, a few glitches. This is where we get into the fine-print that’s not necessarily popular knowledge. Mike brought up many of these details I wasn’t aware of.
ETags can be “strong entity tags” or “weak entity tags”. Weak ETags are very similar to what happens when we have entities for which only some properties are designated concurrency tokens. From section 13.3.3 of the HTTP spec:
“However, there might be cases when a server prefers to change the
validator only on semantically significant changes, and not when
insignificant aspects of the entity change. A validator that does not
always change when the resource changes is a "weak validator." “
The problem is that weak ETags only apply to GET operations; they cannot be used for PUT/DELETE, which is what we’re trying to do.
For cases where you own the data, the data services framework can expose a compliant interface by using constructs such as timestamps (if using a database as a data source), where any change in the entity will be reflected in the ETag. You can also use a relaxed form of ETags where the entity might change but the ETag stays the same. It’s not completely HTTP compliant and may confuse intermediate systems, but it may be your only option in some scenarios.
As always, thoughts and feedback are welcome.
Pablo Castro
Software Architect
Microsoft Corporation
This post is part of the transparent design exercise in the Astoria Team. To understand how it works and how your feedback will be used please look at this post.
-
Around a year ago at the Mix 2007 conference we announced Project Astoria, an overall initiative to understand how data is used on the web and what frameworks, tools and services we could create to enable new and better applications in this space. Several things have resulted from that effort already.
One of these results is a unified pattern for exposing and manipulating data across various data-centric services using the Atom format and the AtomPub protocol, plus a set of common conventions for constructing URLs to point to resources. This pattern is shared by some Windows Live services as announced here and articulated in more detail here, and also by any services created with the ADO.NET Data Services Framework.
The ADO.NET Data Services Framework is another one of the results of this project. It’s a set of libraries and tools for the .NET Framework to create and consume data services over various data sources, from relational databases to XML content to arbitrary web services. These data services expose the same AtomPub-based interface and same URL conventions, so clients and tools equally apply to online services and to your own services, on the web or on-premises.
The other piece we discussed at Mix 2007 was an experimental online service. As we explained when we first announced it, this service wasn’t a real “online service” in the sense that it wasn’t backed by an internet-scale infrastructure and such. The goal of the service was to learn about data interfaces for online services.
Now, at Mix 2008, we announced our plans to offer a real internet-scale data service called SQL Server Data Services (SSDS). General information about SSDS can be found here, and you can also watch the talk that Nigel Ellis gave at Mix about it. SSDS is a real service that is being made available in a closed beta release. We are now taking registrations for participation in the SSDS beta program. Please go here to sign up. Customers are being accepted into the beta program on a rolling basis.
As part of our unified Data Services vision, we will provide a solution that enables seamless integration between on-premises deployments (software) and cloud-based deployments (services). We’re working on aligning aspects of SSDS and Astoria and this alignment will come over a series of updates to both Astoria and SSDS. For example, we will likely add AtomPub and JSON support to SSDS to match the results encoding of Astoria and are already working on extensions to EDM to incorporate the open content model of SSDS. We’ll be working to extend Astoria as needed to ensure it provides a great development experience over the SSDS service. It’s also worth noting how the Microsoft Sync Framework can be used to tie multiple deployments together; Nigel covers this in his talk. Stay tuned for more details.
Given that we have SSDS out there now, in the next week or so we will take down the experimental service we’ve hosted at http://astoria.sandbox.live.com for the last year. That service uses an old interface (it’s not even compatible with the current Astoria patterns), and it’s not meant for real use anyway.
Having a service out there with sample data is really handy though, so we’re exploring options to host a few read-only services somewhere for everybody to access for experimentation and demo purposes.
Are we done? Absolutely not. You’ll see more coming from Project Astoria over time. An example of things we’re exploring for the future is synchronization/offline capabilities for services and service clients, as we demo’ed in this Mix session (Astoria-related stuff starts at minute 35 or so, but the whole talk is really interesting).
Pablo Castro
Software Architect
Microsoft Corporation
-
An Astoria service allows reading/querying of data via the already-established IQueryable interface, which helps in abstracting Astoria from the underlying data source. But there is no existing interface for the update operations (CUD: create, update, delete). Hence we came up with the IUpdatable interface to support CUD operations and enable read-write services.
One of the main design goals for the IUpdatable interface was to make it resource independent. In other words, the methods that return objects representing resources can return anything. For Astoria, the returned object is an opaque object that represents the requested resource, and whenever we want to use the resource (reading or updating one of its values), we pass the same opaque object back to IUpdatable. The IUpdatable implementation needs to track the mapping between this opaque object and the actual object it represents. The only time we need the actual CLR instance of the resource is when we need to serialize the object, and we call a specific method on IUpdatable (ResolveResource) for that.
Let’s take a quick look at the IUpdatable interface:
public interface IUpdatable
{
    /// <summary>
    /// Creates the resource of the given type and belonging to the given container
    /// </summary>
    /// <param name="containerName">container name to which the resource belongs</param>
    /// <param name="fullTypeName">full type name i.e. Namespace qualified type name of the resource</param>
    /// <returns>object representing a resource of given type and belonging to the given container</returns>
    object CreateResource(string containerName, string fullTypeName);

    /// <summary>
    /// Gets the resource of the given type that the query points to
    /// </summary>
    /// <param name="query">query pointing to a particular resource</param>
    /// <param name="fullTypeName">full type name i.e. Namespace qualified type name of the resource</param>
    /// <returns>object representing a resource of given type and as referenced by the query</returns>
    object GetResource(IQueryable query, string fullTypeName);

    /// <summary>
    /// Gets the resource of the given type that the query points to. The resource returned contains the default values,
    /// and not the value as present in the server
    /// </summary>
    /// <param name="query">query pointing to a particular resource</param>
    /// <param name="fullTypeName">full type name i.e. Namespace qualified type name of the resource</param>
    /// <returns>object representing a resource of given type and belonging to the given container and containing default values</returns>
    object ReplaceResource(IQueryable query, string fullTypeName);

    /// <summary>
    /// Sets the value of the given property on the target object
    /// </summary>
    /// <param name="targetResource">target object which defines the property</param>
    /// <param name="propertyName">name of the property whose value needs to be updated</param>
    /// <param name="propertyValue">value of the property</param>
    void SetValue(object targetResource, string propertyName, object propertyValue);

    /// <summary>
    /// Gets the value of the given property on the target object
    /// </summary>
    /// <param name="targetResource">target object which defines the property</param>
    /// <param name="propertyName">name of the property whose value needs to be retrieved</param>
    /// <returns>the value of the property for the given target resource</returns>
    object GetValue(object targetResource, string propertyName);

    /// <summary>
    /// Sets the value of the given reference property on the target object
    /// </summary>
    /// <param name="targetResource">target object which defines the property</param>
    /// <param name="propertyName">name of the property whose value needs to be updated</param>
    /// <param name="propertyValue">value of the property</param>
    void SetReference(object targetResource, string propertyName, object propertyValue);

    /// <summary>
    /// Adds the given value to the collection
    /// </summary>
    /// <param name="targetResource">target object which defines the property</param>
    /// <param name="propertyName">name of the property whose value needs to be updated</param>
    /// <param name="resourceToBeAdded">value of the property which needs to be added</param>
    void AddReferenceToCollection(object targetResource, string propertyName, object resourceToBeAdded);

    /// <summary>
    /// Removes the given value from the collection
    /// </summary>
    /// <param name="targetResource">target object which defines the property</param>
    /// <param name="propertyName">name of the property whose value needs to be updated</param>
    /// <param name="resourceToBeRemoved">value of the property which needs to be removed</param>
    void RemoveReferenceFromCollection(object targetResource, string propertyName, object resourceToBeRemoved);

    /// <summary>
    /// Delete the given resource
    /// </summary>
    /// <param name="targetResource">resource that needs to be deleted</param>
    void DeleteResource(object targetResource);

    /// <summary>
    /// Saves all the pending changes made till now
    /// </summary>
    void SaveChanges();

    /// <summary>
    /// Returns the actual instance of the resource represented by the given resource object
    /// </summary>
    /// <param name="resource">object representing the resource whose instance needs to be fetched</param>
    /// <returns>Returns the actual instance of the resource represented by the given resource object</returns>
    object ResolveResource(object resource);
}
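To make the opaque-object idea concrete, here is a small in-memory sketch. It is not an implementation of IUpdatable and does not reference the Astoria assemblies: the HandleTracker class, its dictionary-backed store, and the choice to queue changes as delegates are all assumptions made for illustration. It borrows three method names from the interface only to show how handles, deferred changes and ResolveResource fit together.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: hands out opaque handles, maps them to real instances,
// and defers every change until SaveChanges is called.
public sealed class HandleTracker
{
    // The "data source": container name -> list of entities (property bags).
    private readonly Dictionary<string, List<Dictionary<string, object>>> store;

    // Mapping from opaque handle to the instance it stands for.
    private readonly Dictionary<object, Dictionary<string, object>> instances =
        new Dictionary<object, Dictionary<string, object>>();

    // Changes recorded so far but not yet applied.
    private readonly List<Action> pending = new List<Action>();

    public HandleTracker(Dictionary<string, List<Dictionary<string, object>>> store)
    {
        this.store = store;
    }

    // Returns a handle (a "cookie"), never the instance itself.
    // fullTypeName is unused here because the sketch has no type hierarchy.
    public object CreateResource(string containerName, string fullTypeName)
    {
        if (!store.ContainsKey(containerName))
        {
            throw new ArgumentException("Unknown container: " + containerName);
        }
        var handle = new object();
        var instance = new Dictionary<string, object>();
        instances[handle] = instance;
        pending.Add(() => store[containerName].Add(instance));
        return handle;
    }

    // Records a property change against the instance behind the handle.
    public void SetValue(object targetResource, string propertyName, object propertyValue)
    {
        var instance = instances[targetResource];
        pending.Add(() => instance[propertyName] = propertyValue);
    }

    // The one place where a handle is exchanged for the actual instance.
    public object ResolveResource(object resource)
    {
        return instances[resource];
    }

    // Applies all recorded changes. A real data source would wrap this
    // in a transaction to get all-or-nothing behavior.
    public void SaveChanges()
    {
        foreach (var change in pending)
        {
            change();
        }
        pending.Clear();
    }
}
```

Nothing reaches the store until SaveChanges runs, so a caller can create a resource and set several values on its handle, and the data source only sees the result once the whole request has been processed.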
Let’s go through each API one by one.
object CreateResource(string containerName, string fullTypeName) :- This is called when one tries to insert a new resource via the HTTP POST method. The first parameter names the container that the resource belongs to and the second parameter gives the namespace-qualified name of the resource type that needs to be created. The second parameter might not be that useful when there is no inheritance, since one can easily figure out the type from the container, but it is helpful in inheritance cases. The return value, as said before, need not be the actual CLR instance of the resource. It can be anything (a cookie) that only the IUpdatable implementor needs to understand.
object GetResource(IQueryable query, string fullTypeName) :- Get the given resource as resolved by the query and the namespace qualified name of the type that the query resolves to. In some cases, the full type name can be null. Look at the examples below to see the cases when the fullTypeName can be null. Again, the return type can be anything that represents the resource.
object ReplaceResource(IQueryable query, string fullTypeName) :- Very similar to the GetResource API, but this is used for replace-semantics and GetResource is used for merge-semantics. The implementation needs to return the resource referred by the query, but with default values for all non-key properties.
void SetValue(object targetResource, string propertyName, object propertyValue) :- Set the value of the property with the given name on the target resource to the given property value. The target resource is the opaque object returned by either CreateResource or GetResource. This method is called for scalar properties and complex properties only.
object GetValue(object targetResource, string propertyName) :- Gets the value of the property with the given name for the target resource. The target resource is the opaque object returned by either CreateResource or GetResource. This method is called for scalar properties or complex properties. If it’s a scalar property, we expect the returned object to be the actual value.
void SetReference(object targetResource, string propertyName, object propertyValue) :- This method is called to set the value of the navigation property with the given name on the target resource to the given propertyValue. The target resource and the propertyValue are opaque objects returned by the GetResource or CreateResource API. This method is called for navigation properties that refer to a single resource, that is, the 1 or 0..1 side of a relationship.
void AddReferenceToCollection(object targetResource, string propertyName, object resourceToBeAdded):- This method is called to add a resource to the collection navigation property with the given name on the target resource. Again, targetResource and resourceToBeAdded are opaque objects returned by the GetResource or CreateResource API. The navigation property represents the many side of a relationship. We generally call this operation binding, which means you are binding the resourceToBeAdded to targetResource. In other words, you are setting up a relationship between the two resource objects.
void RemoveReferenceFromCollection(object targetResource, string propertyName, object resourceToBeRemoved):- Very similar to the above method, except it removes the given resource from the collection navigation property on the target object. We generally call this operation unbinding, which means you are deleting the relationship between the two resource objects in question.
void DeleteResource(object targetResource):- This actually deletes the given resource. Again, the targetResource is the opaque object returned by GetResource or CreateResource API.
void SaveChanges():- This saves all the changes that have been made so far using the above APIs. The IUpdatable implementation needs to track all changes until this API is called and then save all of them when it is called. The IUpdatable implementation is expected to save all the changes or nothing.
object ResolveResource(object resource):- This API is called whenever we want to resolve the opaque object returned by the CreateResource or GetResource API into the actual CLR instance. It is normally called after SaveChanges, when we want to serialize out the resource (for POST methods). It is also called when there are UpdateInterceptors that need to be invoked with the actual CLR resource instances, or when the provider supports optimistic concurrency and the resource type has concurrency tokens (defined via ETag properties in the CLR-based provider).
This blog post has already become bigger than I intended it to be. In my next post, I will try to come up with some examples and the sequence of calls made on the IUpdatable interface for each of them. I have purposefully kept this post simple so that people can get the basic idea of the IUpdatable interface. There are a few features, like ETags and update interceptors, that can cause additional calls on this interface. I will try to cover them in upcoming posts.
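To make the "opaque token" and "track everything until SaveChanges" ideas a little more concrete, here is a minimal in-memory sketch. It is written in Python for brevity and is not the .NET interface itself: the class InMemoryUpdatable, the Token type, the "Categories" container and the key-based lookup (standing in for the IQueryable argument) are all assumptions made for illustration. Only the method names mirror the IUpdatable members described above.

```python
import copy


class Token:
    """Opaque cookie handed back to the caller; only the provider looks inside."""

    def __init__(self, container, key, values):
        self.container = container
        self.key = key          # None until the resource has been saved
        self.values = values    # pending property values


class InMemoryUpdatable:
    def __init__(self):
        self.store = {}         # container name -> {key: property dict}
        self.pending = []       # changes recorded until save_changes()
        self.next_key = 1

    def create_resource(self, container_name, full_type_name):
        token = Token(container_name, None, {"__type": full_type_name})
        self.pending.append(("insert", token))
        return token

    def get_resource(self, container_name, key):
        # Hypothetical stand-in for the IQueryable argument: look up by key.
        current = self.store[container_name][key]
        token = Token(container_name, key, dict(current))
        self.pending.append(("update", token))
        return token

    def set_value(self, target, property_name, property_value):
        target.values[property_name] = property_value

    def get_value(self, target, property_name):
        return target.values[property_name]

    def delete_resource(self, target):
        self.pending.append(("delete", target))

    def save_changes(self):
        # Apply everything to a copy first, so that a failure part-way
        # through leaves the store untouched (all changes or nothing).
        working = copy.deepcopy(self.store)
        next_key = self.next_key
        for action, token in self.pending:
            rows = working.setdefault(token.container, {})
            if action == "insert":
                token.key = next_key
                next_key += 1
                rows[token.key] = dict(token.values)
            elif action == "update":
                rows[token.key] = dict(token.values)
            else:  # delete
                del rows[token.key]
        self.store, self.next_key, self.pending = working, next_key, []

    def resolve_resource(self, token):
        # Turn the opaque token into the actual stored instance.
        return self.store[token.container][token.key]


provider = InMemoryUpdatable()
category = provider.create_resource("Categories", "Sample.Category")
provider.set_value(category, "CategoryName", "Software")
provider.save_changes()
saved = provider.resolve_resource(category)
```

Nothing reaches the store until save_changes runs, and the caller never inspects the token; it only passes it back to the provider.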
Pratik Patel
Developer, ADO.NET Data Services Framework
(Project Astoria)
This post is part of the transparent design exercise in the Astoria Team. To understand how it works and how your feedback will be used please look at this post.
-
We have received a fair amount of feedback regarding a number of use cases where it would be beneficial to enable a client of a data service to “batch” up a group of operations and send them to the data service in a single HTTP request. This reduces the number of roundtrips to the data service for apps that need to make numerous requests to the data service to perform a given action and allows a set of operations to be logically grouped together.
Below is the design we have landed on. Note: We will have something close to this design in our next CTP/Beta release of Astoria, but it's not quite there yet.
Background
The base ADO.NET Data Services Framework semantics provide two mechanisms to query a data service and two to send updates to it. At a high level they are:
Query:
1) Send an HTTP GET request to a URI representing a resource (or set of resources) and receive in the response a representation of the resource (or set of resources). Example: a GET request to /Customers(1) returns a single customer entity in the response
2) Same as #1, but add the $expand query string operator to the request so that resources related to the resource(s) specified in the request URI are returned in the response as well. Example: a GET request to /Products(1)?$expand=Category,Parts returns product #1 as well as the Parts and Category associated with the product in the response
Update:
1) Insert / update / delete a single resource per HTTP request by sending POST, PUT or DELETE requests to a data service
2) Insert a new resource and related resources in a single request. Example: a POST to /Customers can insert a Customer and related Orders in a single request by inlining the related orders in the request body
Why do we need batching?
Now assume you have the following situation: a single “Save” button per page in a RIA line-of-business application. Contoso Solutions is building an online Silverlight-based order entry system for its sales force. Any given sale requires a number of entities within the data service to be inserted and/or updated. Some of the entities are associated via navigation properties, while others have no relation to the other entities being acted on as part of processing the sale. The user experience Contoso wants to enable is to paint the full order processing information on a single screen and include a “save changes” button at the bottom to persist all the updates made to create the order.
In this case, the $expand operation cannot be used to pull down all the data to paint the order entry screen in a single HTTP request. Also, the update operations cannot easily be persisted as an atomic set of operations to the underlying data store.
Batching Design
To support batching, ADO.NET Data Services has added a new $batch URI which accepts batch requests and returns batch responses. Logically, a batch request is a group of 0 or more QueryOperations and 0 or more ChangeSets. A QueryOperation is analogous to a simple "non batch" query request. A ChangeSet is a group of unordered, atomic CUD (insert, update and delete) operations (i.e. all operations succeed or none do).
Now that we have the logical model (a Batch is a collection of ordered QueryOperations and ChangeSets), we needed a wire representation. After a bit of exploring various ways to represent batches in ATOM, JSON, etc., Yaron Goland pointed out to us that there is already a well-defined way to represent multiple HTTP requests in a single request, using multipart/mixed MIME messages and the MIME type application/http. This turned out to be just what we needed and enables us to easily encapsulate binary and text-based content in a request or response. Also, it looks like using multipart MIME for batching has been explored (with pretty positive feedback) a number of times in the blogosphere, so perhaps we'll all land on something generally applicable. Instead of describing this, let's just look at an example.
Example - Batch Request:
The example assumes the batch request is sent to a data service located at: http://foo.com/dataservice.svc
The Batch example contains the following operations (in order):
- A Change Set which contains the following operations in order:
    - POST operation
    - PUT operation
- A Query Operation
- A Query Operation
POST /dataservice.svc/$batch HTTP/1.1
Host: foo.com
Content-Type: multipart/mixed; boundary=batch(36522ad7-fc75-4b56-8c71-56071383e77b)

--batch(36522ad7-fc75-4b56-8c71-56071383e77b)
Content-Type: multipart/mixed; boundary=changeset(77162fcd-b8da-41ac-a9f8-9357efbbd621)
Content-Length: ###

--changeset(77162fcd-b8da-41ac-a9f8-9357efbbd621)
Content-Type: application/http
Content-Transfer-Encoding: binary

POST /dataservice.svc/Categories HTTP/1.1
Host: foo.com
Content-Type: application/atom+xml;type=entry
Content-Length: ###

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<entry xmlns:d="http://schemas.microsoft.com/ado/..."
       xmlns:m="http://schemas.microsoft.com/ado/.../metadata"
       xmlns="http://www.w3.org/2005/Atom">
  …
  <content type="application/xml">
    <d:CategoryName>Software</d:CategoryName>
    <d:Description d:null="true" />
    <d:Picture d:null="true" />
  </content>
</entry>
--changeset(77162fcd-b8da-41ac-a9f8-9357efbbd621)
Content-Type: application/http
Content-Transfer-Encoding: binary

PUT /dataservice.svc/Categories(5) HTTP/1.1
Host: foo.com
Content-Type: application/atom+xml;type=entry
If-Match: xxxxx
Content-Length: ###

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<entry xmlns:d="http://schemas.microsoft.com/ado/..."
       xmlns:m="http://schemas.microsoft.com/ado/.../metadata"
       xmlns="http://www.w3.org/2005/Atom">
  …
  <content type="application/xml">
    <d:CategoryID>5</d:CategoryID>
    <d:CategoryName>UpdateCategoryName</d:CategoryName>
  </content>
</entry>
--changeset(77162fcd-b8da-41ac-a9f8-9357efbbd621)--
--batch(36522ad7-fc75-4b56-8c71-56071383e77b)
Content-Type: application/http
Content-Transfer-Encoding: binary

GET /dataservice.svc/Categories(5) HTTP/1.1
Host: foo.com

--batch(36522ad7-fc75-4b56-8c71-56071383e77b)
Content-Type: application/http
Content-Transfer-Encoding: binary

GET /dataservice.svc/Categories(6) HTTP/1.1
Host: foo.com

--batch(36522ad7-fc75-4b56-8c71-56071383e77b)--
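The nesting in the request above (a batch whose parts are either a single application/http request or a nested change set) can be sketched with a few string helpers. This is only an illustration in Python, not a client library: the helper names are made up, Content-Length headers are omitted, and the boundary strings use a "batch_<guid>" form rather than the parenthesized form shown above, which is my own simplification.

```python
import uuid

CRLF = "\r\n"


def http_part(request_text):
    """Wrap one HTTP request as an application/http MIME part."""
    headers = [
        "Content-Type: application/http",
        "Content-Transfer-Encoding: binary",
    ]
    # Blank line between the MIME headers and the embedded request.
    return CRLF.join(headers) + CRLF + CRLF + request_text + CRLF


def multipart(boundary, parts):
    """Prefix each part with the delimiter and add the closing delimiter."""
    body = ""
    for part in parts:
        body += "--" + boundary + CRLF + part
    return body + "--" + boundary + "--" + CRLF


def change_set(requests):
    """Group insert/update/delete requests into a nested multipart/mixed part."""
    boundary = "changeset_" + str(uuid.uuid4())
    inner = multipart(boundary, [http_part(r) for r in requests])
    header = "Content-Type: multipart/mixed; boundary=" + boundary
    return header + CRLF + CRLF + inner


def batch(parts):
    """Return (content_type, body) for a POST to the $batch URI."""
    boundary = "batch_" + str(uuid.uuid4())
    content_type = "multipart/mixed; boundary=" + boundary
    return content_type, multipart(boundary, parts)


post = (
    "POST /dataservice.svc/Categories HTTP/1.1" + CRLF
    + "Host: foo.com" + CRLF
    + "Content-Type: application/atom+xml;type=entry" + CRLF + CRLF
    + "<entry>...</entry>"
)
get = "GET /dataservice.svc/Categories(5) HTTP/1.1" + CRLF + "Host: foo.com" + CRLF

content_type, body = batch([change_set([post]), http_part(get)])
```

The returned content_type goes on the outer POST to /dataservice.svc/$batch and body becomes its payload.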
Batch Response
Now that we have seen what a request looks like, the response is pretty much the mirror image of the request (it also uses multipart/mixed), with a MIME part containing the associated HTTP response for each operation in the batch request. The exception to this rule is the response to a ChangeSet. Since ChangeSets are atomic, if an operation in the set fails, the response for the ChangeSet is a single HTTP response instead of a nested multipart/mixed collection of responses.
This is already getting a bit long, so I'll cut this off here. In a future post we'll walk through an end to end request + response and talk about how to cross reference operations within a batch request. What do you think so far? Are we overlooking/missing something?
Mike Flasko
ADO.NET Data Services, Program Manager
-
(sorry, tricky problem -> long write-up)
One of the few things pending in the server library of the ADO.NET Data Services Framework is query caching to help with performance. Here is a brief explanation of why we need it and a couple of design options. Feedback is welcome.
Query processing in Astoria
To give a little bit of context, let me first briefly go through the process that takes place when Astoria receives a URL and works its magic to turn it into query results ready for serialization in the HTTP response.
Data sources are hooked into Astoria services through LINQ expression trees. That means that the role of Astoria during query processing is to take a URL and translate it into an expression tree that we can then ask the data source to execute to give us results. The general flow is more or less like this:
URL -> Expression Tree -> [Data Source Execution and Materialization] -> Objects -> Serialization (Atom/JSON)
The thing in brackets is data source-specific. For example, if using the ADO.NET Entity Framework it would look like this:
URL -> Expression Tree -> Canonical Query Tree -> View expansion/Query simplification -> Canonical Query Tree -> SQL -> Rows (DataReader) -> Entities (DataReader) -> Objects -> Serialization (Atom/JSON)
This is not a cheap thing to do in every request, hence this discussion about query caching.
Why query caching
The processing pipeline that I showed above can be expensive. In particular, all that query translation between different tree types as well as all that analysis for view expansion and query simplification is an expensive, CPU-bound activity that we want to avoid as much as we can.
To help with this we are planning to introduce query caching, similar to what database systems do (e.g. the SQL Server “proc cache”). The idea of query caching is that for commonly requested URLs we’d bypass most of the processing required to setup the query for execution, and we’d go as directly as possible to the execution phase.
Since Astoria is a generic framework that works on many data sources, we have to enable this in a way that allows different data sources to plug in their own query caching capabilities, if they have any.
Data source-independent query compilation
In order to implement caching, we need a way of doing our own work for translation, then tell the data source to do its own work on the expression trees, and then save that work to be used every time we have to respond to the “same” request (the definition of “same” is well…difficult, more about this later); that is, we need a way of compiling queries.
You can imagine an interface with two methods for this:
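interface IQueryCompilationProvider
{
    // Pseudocode: the number of type parameters (T1 … Tn) depends on how many
    // parameters the query takes.
    object CompileQuery(Expression<Func<T1, T2, …, Tn, TResult>> query);
    // compiledQuery is the opaque token returned by CompileQuery.
    IEnumerable ExecuteCompiledQuery(object compiledQuery);
}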
The idea is that each data source can give whatever meaning it wants to “compiling” a query, and they can return us an opaque token (object) that we’d pass back along with parameter values in order to execute the query. Ideally you’d do as much work as possible. For example, there is a CompiledQuery class in LINQ to SQL and LINQ to Entities that can do the translation all the way to a SQL statement only once and then re-use it in all subsequent executions (re-binding parameter values).
Query caching design, part 1: for simple cases, a simple design does it
In the simpler cases, where the data service does not have customizations, the URL -> Expression Tree translation is 1:1. The only consideration in this case would be to parameterize the URLs so that we don’t fragment the query cache with URLs that are the same query with different constants (e.g. /Customers(1) vs /Customers(2), or /Customers?$filter=City eq 'Seattle' and /Customers?$filter=City eq 'Las Vegas').
In this case we can keep a simple map of parameterized URL to compiled query. If we don’t find a given URL, we go through the full translation process to produce an expression tree, and then (assuming the data source supports compilation) call CompileQuery to obtain an opaque compiled query token. On subsequent executions we look up the URL, find the compiled query token, bind the constants that we turned into parameters to parameter values and execute the query directly.
The rest is standard caching stuff…keep a map, have a limit and an eviction policy, make it “spike resistant”, etc.
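As a rough sketch of the simple design (in Python, not the server implementation), the map could look like the following. The regular expression used to spot constants, the "@pN" placeholder syntax, the least-recently-used eviction and the compile_query callback are all assumptions picked for illustration; the real URL parser would know exactly where constants can appear.

```python
import re
from collections import OrderedDict

# Constants that become parameters: quoted strings first, then bare numbers
# that are not part of an identifier.
_CONSTANT = re.compile(r"'(?:[^']|'')*'|(?<![\w$])\d+(?:\.\d+)?")


def parameterize(url):
    """Replace constants in a URL with placeholders; return (key, values)."""
    values = []

    def replace(match):
        values.append(match.group(0))
        return "@p%d" % (len(values) - 1)

    return _CONSTANT.sub(replace, url), values


class QueryCache:
    """Map of parameterized URL -> compiled query, with LRU eviction."""

    def __init__(self, compile_query, capacity=100):
        self.compile_query = compile_query   # the expensive translation step
        self.capacity = capacity
        self.entries = OrderedDict()

    def execute(self, url):
        key, values = parameterize(url)
        compiled = self.entries.get(key)
        if compiled is None:
            compiled = self.compile_query(key)
            self.entries[key] = compiled
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)   # evict least recently used
        else:
            self.entries.move_to_end(key)          # mark as recently used
        return compiled(values)


compilations = []


def fake_compile(key):
    # Hypothetical stand-in for URL translation plus CompileQuery.
    compilations.append(key)
    return lambda values: (key, values)


cache = QueryCache(fake_compile)
cache.execute("/Customers(1)")
cache.execute("/Customers(2)")   # reuses the entry compiled for /Customers(1)
```

Both requests map to the key /Customers(@p0), so the expensive step runs once and only the bound value differs.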
Of course, life is rarely that simple…
Query interceptors
So far we’ve assumed that there is a 1:1 correspondence between (parameterized) URLs and expression trees. Astoria has a nice feature called “query interceptors” that causes that correspondence to break.
A query interceptor allows the service developer to introduce a custom filter predicate for each entity set that’s exposed through the service interface. For example, if you had a Customers table and a CustomerAccess table that indicates which user-ids can see which customers, you could implement entity-level security for customers by adding this interceptor:
[QueryInterceptor("Customers")]
Expression<Func<Customer, bool>> QueryCustomers()
{
    // retrieve the user from the environment (e.g. the currently logged-in user)
    UserDescriptor u = GetUser();
    if (u.IsAdmin)
        return c => true; // can see all customers
    else
        return c => c.CustomerAccess.Any(ca => ca.UserID == u.UserID);
}
Interceptors are invoked whenever a URL involves the entity-set the interceptor is bound to, regardless of how. The system injects a Where operator with the predicate returned by the interceptor. This includes top level entity-set access (e.g. /Customers), access of subsets through link traversal (e.g. /SalesPeople(123)/AssignedCustomers), and inline expansion (e.g. /SalesPeople(123)?$expand=AssignedCustomers).
Note that now URLs and expression trees are no longer 1:1. There are two kinds of differences that might show up:
a) Same query but different parameters. For example, if user 1 and user 2, both not administrators, fetch the URL /Customers, then both requests will produce identical query trees, with the only difference that the value of “u.UserID” will be different. The difference with the parameterization of the URLs is that in this case we’re not the ones parsing the input.
b) Different query. In the example above, if one of the users accessing the URL /Customers is an administrator and the other is not, the filter predicates used for each of them will be different.
Now the caching thing got complicated.
Side-note: I’ve excluded update interceptors in this discussion because they don’t participate in query composition. They are a key part of enforcing access control the way I described above though.
Query caching design, part 2
We really wanted to make query caching work without any API surface other than maybe a configuration knob for the size of the cache. That’s not looking great at this point, but we do have a few options on the table.
The essence of the problem is the fact that query interceptors need to be able to capture data from the execution environment at the time a given request is being processed. The main example of this is grabbing the user credentials for a given request (e.g. Thread.CurrentThread.CurrentPrincipal, or the user-id extracted from an encrypted cookie/custom HTTP header), but there can be other scenarios as well.
There are a couple of approaches that could do the trick, but they come with their trade-offs:
Option I: a bit of extra magic for a nicer API.
We could let users write interceptors just like I showed above. Those interceptors mix references to environment data and the description of the filter predicate in a single construct, a LINQ expression tree that has references to external variables in the reference closure. What the developer writes looks like this (copied from above):
[QueryInterceptor("Customers")]
Expression<Func<Customer, bool>> QueryCustomers()
{
    // retrieve the user from the environment (e.g. the currently logged-in user)
    UserDescriptor u = GetUser();
    if (u.IsAdmin)
        return c => true; // can see all customers
    else
        return c => c.CustomerAccess.Any(ca => ca.UserID == u.UserID);
}
To cache queries we would have to:
· On every request invoke all interceptors that are needed
· Extract all uncorrelated subexpressions from each interceptor filter predicate and turn them into constants (do “funcletization” in LINQ jargon), then replace the constants with generic parameters. Now we have a little tree for each interceptor
· Now use a parameterized URL + all the interceptor trees as the caching key.
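The steps above can be sketched as follows, again in Python and under loud assumptions: expression trees are modelled as nested tuples, the trees are taken to be already funcletized (captured variables such as u.UserID have been evaluated into ("const", value) nodes), and the node names and the interceptor function are invented for the example.

```python
def parameterize_constants(tree, values=None):
    """Replace every ("const", v) node with ("param", i); collect the values."""
    if values is None:
        values = []
    if isinstance(tree, tuple) and tree and tree[0] == "const":
        values.append(tree[1])
        return ("param", len(values) - 1), values
    if isinstance(tree, tuple):
        children = []
        for child in tree:
            new_child, _ = parameterize_constants(child, values)
            children.append(new_child)
        return tuple(children), values
    return tree, values   # operator names, member names, etc.


def interceptor(user_id, is_admin):
    """Hypothetical stand-in for the QueryCustomers interceptor."""
    if is_admin:
        return ("const", True)
    return ("any", "CustomerAccess",
            ("eq", ("member", "UserID"), ("const", user_id)))


def cache_key(parameterized_url, trees):
    """Key = parameterized URL + the shape of every interceptor tree."""
    shapes, values = [], []
    for tree in trees:
        shape, tree_values = parameterize_constants(tree)
        shapes.append(shape)
        values.extend(tree_values)
    return (parameterized_url, tuple(shapes)), values


alice_key, alice_values = cache_key("/Customers", [interceptor("alice", False)])
bob_key, bob_values = cache_key("/Customers", [interceptor("bob", False)])
admin_key, admin_values = cache_key("/Customers", [interceptor("root", True)])
```

Two non-administrators end up with the same key and different parameter values (case a above), while the administrator gets a different key because the predicate has a different shape (case b). Note that every interceptor still runs on every request just to compute the key.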
Option II: explicitness at the cost of API complexity
The other approach is to explicitly separate the definition of the filter predicate from the per-execution state that comes from the environment. The developer would write two pieces, a statically defined filter predicate and a method that sets up per-request state, as follows:
[QueryInterceptor("Customers")]
static Expression<Func<Customer, CustomersDbContext, Dictionary<string, object>, bool>> QueryCustomers()
{
    return (c, ctx, state) => (bool)state["IsAdmin"] ||
        c.CustomerAccess.Any(ca => ca.UserID == (string)state["UserID"]);
}
protected override void OnStartRequest(RequestDescriptor descriptor, Dictionary<string, object> state)
{
    // retrieve the user from the environment (e.g. the currently logged-in user)
    UserDescriptor u = GetUser();
    state["IsAdmin"] = u.IsAdmin;
    state["UserID"] = u.UserID;
}
As you can see, the definition of the filter predicate gets a bit trickier (more parameters in the lambda expression in particular).
Of course, since at this point the interceptor is called only once and can’t use any request-bound context, we may as well just remove the idea of a method altogether, and move the filter setup to the service initialization, where we already have APIs for configuring service policies. So the code would become:
static void InitializeService(IDataServiceConfiguration config)
{
    // other policy initialization
    // ...
    // filters
    config.SetEntitySetFilterPredicate<Customer>("Customers",
        (c, ctx, state) => (bool)state["IsAdmin"] ||
            c.CustomerAccess.Any(ca => ca.UserID == (string)state["UserID"]));
}
protected override void OnStartRequest(RequestDescriptor descriptor, Dictionary<string, object> state)
{
    // retrieve the user from the environment (e.g. the currently logged-in user)
    UserDescriptor u = GetUser();
    state["IsAdmin"] = u.IsAdmin;
    state["UserID"] = u.UserID;
}
The idea is that OnStartRequest (or whatever is a nice name for that) would explicitly capture the state that any and all interceptors would use. Interceptors then become static constructs that are set up during initialization. An important detail is that the predicate expression can no longer refer to variables in the environment. Instead, for anything that’s request-specific it needs to access the “state” object, which is set up with data on a per-request basis.
This yields a very efficient system, because now we can do minimal work on repeated requests that result in the same query, even in the presence of interceptors. On the other hand, the code that developers have to write is pretty tricky…
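The split between a predicate defined once and state captured per request can be illustrated with a small Python sketch. It evaluates the filter over an in-memory list rather than building an expression tree, it drops the ctx parameter, and all the names (Customer, customers_filter, on_start_request, query_customers) are hypothetical.

```python
from collections import namedtuple

Customer = namedtuple("Customer", ["name", "allowed_user_ids"])


def customers_filter(customer, state):
    """Defined once at start-up; reads request data only through `state`."""
    return state["IsAdmin"] or state["UserID"] in customer.allowed_user_ids


def on_start_request(user_id, is_admin):
    """Runs per request and captures everything the filters may need."""
    return {"IsAdmin": is_admin, "UserID": user_id}


def query_customers(customers, state):
    # The same filter object is reused for every request; only `state` varies.
    return [c.name for c in customers if customers_filter(c, state)]


customers = [Customer("Contoso", {"alice"}), Customer("Fabrikam", {"bob"})]
alice_view = query_customers(customers, on_start_request("alice", False))
admin_view = query_customers(customers, on_start_request("root", True))
```

Because the filter never changes between requests, a cache keyed on the parameterized URL alone is enough; the state dictionary plays the role of the query parameters.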
Is it better to take a performance hit and leave it more usable? Is there a middle ground that we didn’t consider?
If you made it reading this far, you’re probably one of the few :). In any case, feedback is very welcome.
Pablo Castro
Software Architect
Microsoft Corporation
-
We’re gearing up for the next release of the .NET Framework, and we are looking for people who have a passion for building great frameworks to help with the effort. Since we shipped the .NET Framework 3.5 we’ve been working on projects like the ADO.NET Data Services Framework (aka Astoria) and LINQ to XML support in Silverlight. If you’re interested in designing the next set of APIs for data and want to work for a team that’s focused on shipping technologies and having fun, we’re interested in hearing from you. Drop us a line through the team blog or directly to me and we’ll get back to you.
Carl Perry
Lead Program Manager
cperry@microsoft.com
-
It's been a year since Pablo first announced Project Astoria at MIX 07. Since then we've started from scratch and built a production version of the product, which is now known as the ADO.NET Data Services Framework. At the MIX conference this time around we'll have a bunch of sessions to talk about how the ADO.NET Data Services Framework has evolved into a platform to create your own services and to talk to those from Windows Live. A snippet from the Windows Live blog is below and we'll have a lot more to show at the conference next week!
"At MIX we are enabling several new Live services with AtomPub endpoints which enable any HTTP-aware application to easily consume Atom feeds of photos and for unstructured application storage (see below for more details). Or you can use any Atom-aware public tools or libraries, such as .NET WCF Syndication to read or write these cloud service-based feeds.
In addition, these same protocols and the same services are now ADO.NET Data Services (formerly known as “Project Astoria”) compatible. This means we now support LINQ queries from .NET code directly against our service endpoints, leveraging a large amount of existing knowledge and tooling shared with on-premise SQL deployments...."
Pablo, Andy and I will be heading to MIX this year from the Astoria team. So, if you are at the conference and want to chat all things data services, drop us a note or we'll likely bump into you at the talks and open space (discussion) areas....
The data services focused sessions are:
Wed, March 5th - RESTful Data Services with the ADO.NET Data Services Framework
Fri, March 7th - Accessing Windows Live Services via AtomPub
Fri, March 7th - Building RESTful Real World Applications with the ADO.NET Data Services Framework
See you at MIX,
Mike Flasko
Program Manager, ADO.NET Data Services Framework
-
While going through application scenarios for the ADO.NET Data Services Framework (Project Astoria) one of the first things we noticed is that data-centric applications usually want to bring down graphs of related resources in each interaction with the server. For example, if you are retrieving a resource that represents an "Event", you may want to also bring in the set of related Contact resources that are invited or the "Venue" resource where the event will take place. This write up briefly describes how we model associations between resources as Atom links and proposes a usage pattern of the atom:link element to support retrieving resource graphs in a single response. We're looking for feedback on the approach and also to get folks thinking about inlined content and whether it should be considered an extension to Atom.
More context on Astoria support for Atom here:
http://blogs.msdn.com/astoriateam/archive/2008/02/13/atompub-support-in-the-ado-net-data-services-framework.aspx
1. Links for modeling associations between resources
Related resources can be seen at the instance level as "links" in Atom terms. Of course, from the data application development perspective, it's interesting to make this discoverable at the service description (schema) level. In Astoria data services the underlying model is the Entity Data Model (EDM), which describes data in terms of "Entities" (instances of Entity Types) and associations between entities. In the context of the Atom interface, Entities are mapped to entries and Associations to links. So by looking at the service description a developer can discover the links that will be present in an entry of a given type.
We model related entries or feeds using a link with a "rel" attribute of "related", and with a "type" of either "application/atom+xml;type=feed" or "application/atom+xml;type=entry" depending on the cardinality of the other end of the association.
One tricky aspect is that we need to indicate which association it is. At the model level we have a "navigation property" that identifies the starting "end" of the association (e.g. "Attendees", "Venue"). We currently put that name in the "title" attribute of the link. That solution is not perfect, as we try not to overload constructs that are for human-readable content. However, the alternative is to use a custom attribute, and we've been trying not to introduce custom attributes unless absolutely needed. Another option would be to use different "rel" values to specify the relationship, which feels natural but makes it much less likely that generic processors will be able to do something interesting with it.
Do these trade-offs sound reasonable? Are any of the other options more appropriate?
Continuing with the Events sample, this is what an entry (/Events(456)) with links looks like:
<entry xml:base="http://localhost:81/EventsSample/" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xmlns="http://www.w3.org/2005/Atom" m:type="EventsSample.Event">
<id>http://localhost:81/EventsSample/Events(456)</id>
<title type="text"></title>
<updated>2008-02-17T02:52:38Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Event" href="Events(456)" />
<link rel="related" type="application/atom+xml;type=entry" title="Venue" href="Events(456)/Venue" />
<link rel="related" type="application/atom+xml;type=feed" title="Attendees" href="Events(456)/Attendees" />
<content type="application/xml">
<d:EventID m:type="Int32">456</d:EventID>
<d:Name>Big Party</d:Name>
<d:NoteToAttendees>It's going to be a great party!</d:NoteToAttendees>
<d:DateAndTime m:type="DateTime">2008-03-05T06:00:00</d:DateAndTime>
</content>
</entry>
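As a quick illustration of how a generic client could discover navigation properties from an entry like this one, the sketch below (Python, standard library only) collects every link with rel="related" and keys it by the "title" attribute. The sample document is a trimmed copy of the entry above and the function name navigation_links is made up for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

ENTRY = """<entry xml:base="http://localhost:81/EventsSample/"
    xmlns="http://www.w3.org/2005/Atom">
  <id>http://localhost:81/EventsSample/Events(456)</id>
  <link rel="edit" title="Event" href="Events(456)" />
  <link rel="related" type="application/atom+xml;type=entry"
        title="Venue" href="Events(456)/Venue" />
  <link rel="related" type="application/atom+xml;type=feed"
        title="Attendees" href="Events(456)/Attendees" />
</entry>"""


def navigation_links(entry_xml):
    """Map navigation property name -> (href, 'entry' or 'feed')."""
    entry = ET.fromstring(entry_xml)
    links = {}
    for link in entry.findall(ATOM + "link"):
        if link.get("rel") != "related":
            continue   # skip "edit" and other non-association links
        # The cardinality of the other end is carried in the type parameter.
        kind = link.get("type", "").split("type=")[-1]
        links[link.get("title")] = (link.get("href"), kind)
    return links


links = navigation_links(ENTRY)
```

Here links maps "Venue" to a single-entry link and "Attendees" to a feed link, which is all a client needs to decide whether to expect one related resource or many.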
From the data modification perspective, links pointing to other resources in the service can be specified in the payload of POST and PUT operations, to establish links between the resource being manipulated and other existing resources.
2. Expanding links inline
As I summarized at the beginning of this note, we want to enable clients to request whole sub-graphs of data starting at some resource or set of resources. There are two aspects that need to be addressed: how does the client indicate that it wants one or more links expanded and how are the expanded links represented on the response.
How link expansion is requested is outside of the atom-syntax problem space, so I'll just briefly state what we currently do in case you have an opinion: data services support the query string option "$expand" to request link expansion. So you could say "/Events?$expand=Attendees" to retrieve all Events and all contacts that are attendees for each of them, or "/Events(456)?$expand=Attendees" to retrieve a single event (with key 456) and its attendees. Expand syntax allows for deep expands such as "Attendees/BestFriend" (expand Attendees, and on the expanded entry or entries expand BestFriend) and wide expands such as "Venue, Attendees/BestFriend", meaning expand two immediate links, and for the Attendees one further expand its BestFriend link.
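The deep and wide forms combine into a small tree of links to expand. A minimal sketch of that reading in Python (the function name parse_expand and the nested-dictionary result are my own choices, not how the server represents it):

```python
def parse_expand(expand):
    """Turn an $expand value into a nested dict of links to expand."""
    tree = {}
    for path in expand.split(","):          # wide: comma-separated paths
        node = tree
        for segment in path.strip().split("/"):   # deep: slash-separated
            if segment:
                node = node.setdefault(segment, {})
    return tree


expand_tree = parse_expand("Venue, Attendees/BestFriend")
```

For the value above the result has two top-level links, Venue and Attendees, with BestFriend nested under Attendees.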
For representing expanded links we put the expanded content inside the link element itself. According to section 4.2.7 of RFC 4287:
"The "atom:link" element defines a reference from an entry or feed to a Web resource. This specification assigns no meaning to the content (if any) of this element."
So it seems that adding content to the link element is not disallowed and at the same time it does not overlap with any existing semantics given to such construct. Based on that we thought it would be the perfect place for this information, as the link itself already contains the metadata about the link that we needed.
When a client indicates that the target of a link should be expanded, the server responds with the Atom representation of the resources pointed at by the links, wrapped in an <m:inline> element. For example, for "/Events(456)?$expand=Attendees,Venue" the response would be:
<entry xml:base="http://localhost:81/EventsSample/" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xmlns="http://www.w3.org/2005/Atom" m:type="EventsSample.Event">
<id>http://localhost:81/EventsSample/Events(456)</id>
<title type="text"></title>
<updated>2008-02-17T03:01:18Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Event" href="Events(456)" />
<link rel="related" type="application/atom+xml;type=entry" title="Venue" href="Events(456)/Venue">
<m:inline>
<entry m:type="EventsSample.Venue">
<id>http://localhost:81/EventsSample/Venues(789)</id>
<title type="text"></title>
<updated>2008-02-17T03:01:18Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Venue" href="Venues(789)" />
<link rel="related" type="application/atom+xml;type=entry" title="SalesContact" href="Venues(789)/SalesContact" />
<content type="application/xml">
<d:VenueID m:type="Int32">789</d:VenueID>
<d:Name>The Cool Place</d:Name>
<d:Description>Great place for parties!</d:Description>
<d:Capacity m:type="Int32">1500</d:Capacity>
<d:Type>Nightclub</d:Type>
</content>
</entry>
</m:inline>
</link>
<link rel="related" type="application/atom+xml;type=feed" title="Attendees" href="Events(456)/Attendees">
<m:inline>
<feed>
<title type="text">Attendees</title>
<id>http://localhost:81/EventsSample/Events(456)/Attendees</id>
<updated>2008-02-17T03:01:18Z</updated>
<link rel="self" title="Attendees" href="Events(456)/Attendees" />
<entry m:type="EventsSample.Contact">
<id>http://localhost:81/EventsSample/Contacts(123)</id>
<title type="text"></title>
<updated>2008-02-17T03:01:18Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Contact" href="Contacts(123)" />
<link rel="related" type="application/atom+xml;type=entry" title="BestFriend" href="Contacts(123)/BestFriend" />
<content type="application/xml">
<d:ContactID m:type="Int32">123</d:ContactID>
<d:FirstName>John123</d:FirstName>
<d:LastName>Doe123</d:LastName>
<d:EmailAddress>jd123@foo.com</d:EmailAddress>
<d:Phone>123-456-123</d:Phone>
<d:BirthDate m:type="Nullable`1[System.DateTime]">1990-04-01T00:00:00</d:BirthDate>
</content>
</entry>
<entry m:type="EventsSample.Contact">
<id>http://localhost:81/EventsSample/Contacts(124)</id>
<title type="text"></title>
<updated>2008-02-17T03:01:18Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Contact" href="Contacts(124)" />
<link rel="related" type="application/atom+xml;type=entry" title="BestFriend" href="Contacts(124)/BestFriend" />
<content type="application/xml">
<d:ContactID m:type="Int32">124</d:ContactID>
<d:FirstName>John124</d:FirstName>
<d:LastName>Doe124</d:LastName>
<d:EmailAddress>jd124@foo.com</d:EmailAddress>
<d:Phone>123-456-124</d:Phone>
<d:BirthDate m:type="Nullable`1[System.DateTime]">1990-05-01T00:00:00</d:BirthDate>
</content>
</entry>
<!-- more entries for contacts that will be -->
<!-- attendees in this party -->
</feed>
</m:inline>
</link>
<content type="application/xml">
<d:EventID m:type="Int32">456</d:EventID>
<d:Name>Big Party</d:Name>
<d:NoteToAttendees>It's going to be a great party!</d:NoteToAttendees>
<d:DateAndTime m:type="DateTime">2008-03-05T06:00:00</d:DateAndTime>
</content>
</entry>
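As a rough illustration of how a client might consume a response shaped like the one above, here is a minimal Python sketch that walks the link elements of an entry and collects the ids of whatever was inlined under each one. The trimmed-down SAMPLE payload and the expanded_links helper are made up for this example; they are not part of Astoria or of any client library.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
D = "http://schemas.microsoft.com/ado/2007/08/dataservices"
M = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"

# Trimmed-down version of the expanded Event entry (illustrative only).
SAMPLE = f"""<entry xmlns="{ATOM}" xmlns:d="{D}" xmlns:m="{M}">
  <id>http://localhost:81/EventsSample/Events(456)</id>
  <link rel="edit" title="Event" href="Events(456)" />
  <link rel="related" title="Venue" href="Events(456)/Venue">
    <m:inline>
      <entry>
        <id>http://localhost:81/EventsSample/Venues(789)</id>
        <content type="application/xml"><d:Name>The Cool Place</d:Name></content>
      </entry>
    </m:inline>
  </link>
  <link rel="related" title="Attendees" href="Events(456)/Attendees">
    <m:inline>
      <feed>
        <entry><id>http://localhost:81/EventsSample/Contacts(123)</id></entry>
        <entry><id>http://localhost:81/EventsSample/Contacts(124)</id></entry>
      </feed>
    </m:inline>
  </link>
</entry>"""


def expanded_links(entry):
    """Map each expanded link's title to the ids of the entries inlined under it."""
    result = {}
    for link in entry.findall(f"{{{ATOM}}}link"):
        inline = link.find(f"{{{M}}}inline")
        if inline is None:
            continue  # this link was not expanded
        # An inline element holds either a single entry or a feed of entries.
        entries = inline.findall(f"{{{ATOM}}}entry")
        entries += inline.findall(f"{{{ATOM}}}feed/{{{ATOM}}}entry")
        result[link.get("title")] = [e.findtext(f"{{{ATOM}}}id") for e in entries]
    return result


expanded = expanded_links(ET.fromstring(SAMPLE))
print(expanded)
```

Links that were not expanded (such as the "edit" link) carry no inline child, so a client that does not know about the extension can keep treating them as ordinary Atom links.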
I focused on the GET operations above. We think it would be better to stay away from attempting to support full modification operations on expanded graphs. In particular, we do not handle PUT on more than one entry at a time today. We do support POSTing an expanded graph, and we simply create all the nested entries and link them to the parent entry, creating the whole graph in a single operation.
Feedback in general about this approach would be greatly appreciated.
Pablo Castro
Technical Lead
Microsoft Corporation
This post is part of the transparent design exercise in the Astoria Team. To understand how it works and how your feedback will be used please look at this post.
-
We have been looking for the last few months at adding first-class support for AtomPub to Project Astoria (we briefly touched on it before here). We are at a point where we have some parts of the AtomPub story and their initial implementation running (and we'll share fresh experimental bits soon), some parts on the design board and other parts that we haven’t explored yet. I wanted to write a few words to explain why we think AtomPub support is important and enumerate the challenges we face. Guidance and general feedback on the reasoning, the approach and on the details is very much appreciated.
Why are we looking at AtomPub?
Astoria data services can work with different payload formats and, to some extent, different user-level details of the protocol on top of HTTP. For example, we support a JSON payload format that should make the life of folks writing AJAX applications a bit easier. While we have a couple of these kinds of ad-hoc formats, we wanted to support a pre-established format and protocol as our primary interface.
If you look at the underlying data model for Astoria, it boils down to two constructs: resources (addressable using URLs) and links between those resources. The resources are grouped into containers that are also addressable. The mapping to Atom entries, links and feeds is so straightforward that it is hard to ignore. Of course, the devil is in the details and we'll get to that later on.
The interaction model in Astoria is just plain HTTP, using the usual methods for creating, updating, deleting and retrieving resources. Furthermore, we use other HTTP constructs such as "ETags" for concurrency checks, "location" to know where a POSTed resource lives, and so on. All of these also map naturally to AtomPub.
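To make the ETag-based concurrency check concrete, here is a minimal in-memory Python sketch of the pattern: a read hands back an ETag, and an update only succeeds if the ETag supplied by the client (as it would be in an If-Match header) still matches the stored resource. The Store class, the PreconditionFailed exception and the hash-based ETag are assumptions made for illustration; the text above does not say how Astoria computes ETags.

```python
import hashlib


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 (Precondition Failed) response."""


def etag_for(resource):
    # Assumption: derive the ETag from a hash of the property values.
    body = repr(sorted(resource.items())).encode("utf-8")
    return '"' + hashlib.sha1(body).hexdigest() + '"'


class Store:
    """Toy resource store keyed by URL (hypothetical, not an Astoria API)."""

    def __init__(self):
        self._resources = {}

    def post(self, collection, resource):
        # Create the resource and return where it lives (the "Location").
        url = f"{collection}({len(self._resources) + 1})"
        self._resources[url] = dict(resource)
        return url

    def get(self, url):
        resource = self._resources[url]
        return dict(resource), etag_for(resource)

    def put(self, url, resource, if_match):
        # Reject the update if the resource changed since the client read it.
        if if_match != etag_for(self._resources[url]):
            raise PreconditionFailed(url)
        self._resources[url] = dict(resource)
        return etag_for(resource)


store = Store()
url = store.post("Venues", {"Name": "The Cool Place", "Capacity": 1500})

# Two clients read the same version of the resource.
venue_a, etag_a = store.get(url)
venue_b, etag_b = store.get(url)

# Client A updates first and gets a fresh ETag back.
venue_a["Capacity"] = 2000
new_etag = store.put(url, venue_a, if_match=etag_a)

# Client B still holds the stale ETag, so its update is rejected.
venue_b["Name"] = "The Cooler Place"
try:
    store.put(url, venue_b, if_match=etag_b)
    conflict = False
except PreconditionFailed:
    conflict = True
print("conflict detected:", conflict)
```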
From our (Microsoft) perspective, you could imagine a world where our own consumer and infrastructure services in Windows Live could speak AtomPub with the same idioms as Astoria services, and thus could both have a standards-based interface and also use the same development tools and runtime components that work with any Astoria-based server. This would mean fewer clients and development tools for us to create and more opportunity for our partners in the libraries and tools ecosystem out there.
How are we approaching this?
We are simply mapping whatever we can to regular AtomPub elements. Sometimes that is trivial, sometimes we need to use extensions and sometimes we leave AtomPub alone and build an application-level feature on top. Here is an initial list of aspects we are dealing with in one way or the other. We’ll also post elaborations of each one of these to the appropriate Atom syntax|protocol mailing lists.
a) Mapping the data model: how do we map Astoria’s underlying data model, the Entity Data Model, to Atom constructs? This is quite straightforward but it deserves a look for completeness.
b) We use just the regular format/protocol whenever we can, and we would be interested in validating our use with folks out there.
c) Using AtomPub constructs and extensibility mechanisms to enable Astoria features:
· Inline expansion of links (“GET a given entry and all the entries related through this named link”; how do we represent such a request, and the answer to it, in Atom?).
· Properties for entries that are media link entries and thus cannot carry any more structured data in the <content> element
· HTTP methods acting on bindings between resources (links) in addition to resources themselves
· Optimistic concurrency over HTTP, use of ETags and in general guaranteeing consistency when required
· Request batching (e.g. how does a client send a set of PUT/POST/DELETE operations to the server in a single go?)
d) Astoria design patterns that are not AtomPub format/protocol concepts or extensions:
· Astoria gives semantics to URLs and has a specific syntax to construct them
· How metadata that describes the structure of a service's endpoints is exposed. This goes from being able to find entry points (e.g. collections in service documents) to having a way of discovering the structure of entries that contain structured data
e) How do we deal with aspects that AtomPub does not handle by design or just because it has not been needed so far?
· What to do with fields that may not have a backing value in the input source (e.g. updated, author).
· Replace versus merge semantics during updates
f) High-level client libraries. How high-level can we make clients so they can consume AtomPub-based Astoria services but still feel that they are working against regular objects and have general integration with the development environment?
There are probably more, but I think this is a good starting list.
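The "replace versus merge" item in (e) is easy to state in code. The sketch below models an entity as a plain Python dict and contrasts the two update styles: merge leaves properties missing from the payload untouched, while replace resets them to a default. The function names and the DEFAULTS table are hypothetical, and the choice of defaults is server-specific; the values here just mirror the idea of zero for value types and null for references.

```python
def merge_update(current, payload):
    """Merge: properties absent from the payload keep their current value."""
    updated = dict(current)
    updated.update(payload)
    return updated


def replace_update(current, payload, defaults):
    """Replace: properties absent from the payload fall back to a default."""
    return {name: payload.get(name, defaults[name]) for name in current}


# Assumed defaults for a Venue: zero for the integer, None (null) otherwise.
DEFAULTS = {"Name": None, "Description": None, "Capacity": 0}

venue = {
    "Name": "The Cool Place",
    "Description": "Great place for parties!",
    "Capacity": 1500,
}
payload = {"Capacity": 2000}  # a partial update body

merged = merge_update(venue, payload)
replaced = replace_update(venue, payload, DEFAULTS)
print(merged)
print(replaced)
```

The sketch ignores key properties, which a real server would not reset on a replace.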
Where do we go from here?
The folks in the AtomPub community understand this the best, so we’ll take our questions to the atom-syntax and atom-protocol lists to hear opinions there. We’ll probably track posts and comments in the Astoria blog as well so people who follow it can keep track of what’s going on in this space.
This post is part of the transparent design exercise in the Astoria Team. To understand how it works and how your feedback will be used please look at this post.
Pablo Castro
Technical Lead
Microsoft Corporation