-
We are very excited to announce that .NET 3.5 SP1 Beta 1 and Visual Studio 2008 SP1 Beta 1 are now available!
This beta marks the entry of the ADO.NET Data Services Framework as well as the ADO.NET Entity Framework as part of the overall .NET/Visual Studio product and will be the final beta before the RTM of both technologies.
The remainder of this post covers the changes and additions to the ADO.NET Data Services Framework since the last CTP in Dec 07.
Since our last CTP in Dec 2007, released along with the ASP.NET 3.5 Extensions Preview, there have been a number of changes and added features. I'll try to summarize them below. We'll follow up as we go with more details on the changes and what to expect post Beta 1.
Changes:
- Assembly and namespace changes. Now that we are part of the .NET Framework we have changed our assembly, namespace and API names to reflect the standard .NET naming conventions. The main assembly and namespace changes are:
  - All Microsoft.Data.*.dll assemblies have been renamed to System.Data.Services.dll (server) and System.Data.Services.Client.dll (client)
  - Anything in the Microsoft.Web.* namespaces has moved to System.Data.Services (server types) and System.Data.Services.Client (client types)
  - The assemblies are now installed to the standard location for .NET 3.5 assemblies
- API Name Changes. The main API name changes are:
  - In general, anything which was named WebData* has changed to DataService*
  - In general, anything which was named ResourceSet* or Resource* has changed to EntitySet* or Entity*
  - WebDataService class changed to DataService
  - IWebDataServiceConfiguration changed to IDataServiceConfiguration
  - WebDataServiceContext class changed to DataServiceContext
  - WebDataQuery class changed to DataServiceQuery
  - ResourceActions enum changed to UpdateOperations
- Query Interceptor Changes. We changed the syntax of query interceptors to take 0 arguments and return a predicate (return type = Expression<Func<[EntityType],bool>>). An example interceptor that restricts queries to categories whose names start with the letter "B" now looks like:
  [QueryInterceptor("ProductCategory")]
  public Expression<Func<ProductCategory, bool>> OnQueryProductCategory()
  {
      return (pc) => pc.Name.StartsWith("B");
  }
- Update Interceptor Changes. The ResourceActions enum changed name to UpdateOperations
- AJAX/Javascript Library. The Javascript library for data services which was part of the ASP.NET 3.5 Extensions preview is not part of this beta release. Instead, we will iterate in short intervals on this library, making intermediate drops available on http://codeplex. The first drop which works with this Beta 1 release is available here.
- Command line tool changes. The command line tool to generate client side types for a data service (webdatagen.exe) has been renamed to datasvcutil.exe and its parameter list has been simplified. You can now find this tool in the \Windows\Microsoft.Net\Framework\V3.5 directory
- A bunch of bug fixes :)
- Tweaks to the ATOM payload format. We've made a few tweaks to the payload format based on feedback from the ATOM community. We've got a bit more to do here so please expect a bit of churn to the payload formats post Beta 1.
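To illustrate why a predicate-returning query interceptor is convenient, here is a minimal sketch that composes such a predicate into a LINQ query. It does not use the data services runtime at all: the ProductCategory class, the InterceptorSketch class and the FilteredNames helper are hypothetical stand-ins, and an in-memory list plays the role of the data source. How the framework itself applies the predicate is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity type used only for this sketch.
public class ProductCategory
{
    public string Name { get; set; }
}

public static class InterceptorSketch
{
    // Same shape as the interceptor in the list above: no arguments, returns a predicate.
    public static Expression<Func<ProductCategory, bool>> OnQueryProductCategory()
    {
        return pc => pc.Name.StartsWith("B");
    }

    // Assumption: a host can compose the returned predicate into any query
    // over the entity set with a plain Where call.
    public static List<string> FilteredNames(IEnumerable<ProductCategory> source)
    {
        return source.AsQueryable()
                     .Where(OnQueryProductCategory())
                     .Select(pc => pc.Name)
                     .ToList();
    }
}
```

Because the interceptor returns an expression tree rather than filtering results itself, the filter can in principle be pushed down to whatever IQueryable provider sits underneath.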
Features:
- Batching: data services now support the ability to group a set of requests into a "batch" to be sent to the server in a single HTTP request. The system supports the idea of an atomic group of operations as well as a loose group of operations without such guarantees. This release doesn't quite have what we're thinking in terms of a final design for this feature, but is quite representative of our thinking. An early write up of the feature is here.
- Optimistic Concurrency: Data services now support the notion of optimistic concurrency by passing concurrency token values using HTTP ETags and making conditional requests using HTTP If-* request headers. Some notes on how this works are here. We'll also likely extend support post this Beta release to include use of the '*' character in conditional requests.
- New IUpdatable interface. As was the case in the last CTP, you can create a data service over relational databases using integration with the Entity Framework or you can expose any data source that has an IQueryable provider as a REST service. In the last CTP we had defined an IUpdatable interface which could be implemented to make such data sources read/write at the service tier. We have significantly changed this interface to make it easier to use. I've put this "change" in the list of features as we redesigned the API based on our team's reviews and user feedback. A write up of the new interface is here.
We look forward to your feedback...
-Mike Flasko
Program Manager, ADO.NET Data Services Framework
-
We have received a lot of feedback over the past few weeks asking when we will update the Silverlight library for data services. I thought I'd put up a short post to update everyone on where we are and what our thinking is.
I'll start by saying we are targeting Beta 2 of the Silverlight SDK (no dates to announce for this just yet) to have a version of the client library for ADO.NET Data Services. Given that we use many of the core pieces of Silverlight to enable data service interaction, we've waited for those to come into the platform so that we can start to round out our Silverlight experience.
The scenarios we're looking to enable in Beta 2 are: the ability to send async queries, inserts, updates and deletes for same domain requests. We're still working through how we'll enable these types of requests in cross domain scenarios.
If you have questions/comments about data services or our plans around Silverlight and data services please leave a comment here or on our online forum:
http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=1430&SiteID=1
Mike Flasko
Program Manager, ADO.NET Data Services Framework
-
Different applications have different requirements around consistency and how concurrent modifications are handled. I’ll oversimplify and put all these applications in two buckets: either you care about controlling concurrent changes or you don’t.
If you’re creating a REST interface to your data and don’t care about concurrency (e.g. no deep consistency rules, or nice units of change that change in whole consistent ways), then you can use the basic HTTP methods to retrieve (GET) and manipulate (POST, PUT, DELETE) resources directly without any more context than the representations of your resources. You get “last one wins” semantics on updates in this case.
On the other hand, if you do care about concurrency in your REST interface, there are more aspects to take into consideration. If your resources are atomic (no further structure that’s interesting from the concurrent changes perspective than the resource as a whole), then you can have an out-of-band mechanism for creating a “version number” for each resource, typically a monotonically increasing number, and use HTTP’s existing mechanism to ensure you only overwrite state that you know about. In HTTP you can attach an “entity tag” or ETag to your responses; it contains an opaque value used to denote the version or state of a resource. Later on, when you want to modify a resource, you can use that value in an “if-match” request header to make sure that your knowledge about the state of the resource you’re modifying is still current. If it’s not, the resource on the server will have an ETag that doesn’t match the one you provided and you’ll get back a 412 “Precondition Failed” status code. All of that is standard HTTP 1.1 described in RFC 2616. (ETags are also used for caching and conditional GETs in addition to the scenario described here.)
Now, REST data services that expose structured data have to deal with various challenges beyond the basics, which I’ll go into in detail below. While I discuss this in the context of the ADO.NET Data Services Framework (Astoria), I’m sure some of these problems apply to a broader set of applications.
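As a rough illustration of the version-number approach, the sketch below keeps a monotonically increasing version per resource in memory and rejects a conditional update with 412 when the caller's ETag is stale. The VersionedStore class, its method names, and the 200 and 404 return values are all assumptions made for this example; it is not how any particular server implements conditional requests.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory store; each resource carries a version number
// that is exposed to clients as a quoted ETag string.
public class VersionedStore
{
    private sealed class Entry
    {
        public string Value;
        public int Version;
    }

    private readonly Dictionary<string, Entry> entries = new Dictionary<string, Entry>();

    // Unconditional write ("last one wins"); returns the new ETag.
    public string Put(string key, string value)
    {
        Entry entry;
        if (!entries.TryGetValue(key, out entry))
        {
            entry = new Entry();
            entries[key] = entry;
        }
        entry.Value = value;
        entry.Version++;
        return ETagOf(entry);
    }

    // Conditional write: 412 if the supplied ETag no longer matches,
    // 404 if the resource does not exist, 200 (illustrative) on success.
    public int PutIfMatch(string key, string ifMatch, string value)
    {
        Entry entry;
        if (!entries.TryGetValue(key, out entry)) return 404;
        if (ETagOf(entry) != ifMatch) return 412;
        entry.Value = value;
        entry.Version++;
        return 200;
    }

    public string GetValue(string key)
    {
        return entries[key].Value;
    }

    private static string ETagOf(Entry entry)
    {
        return "\"" + entry.Version + "\"";
    }
}
```

Two clients that both read version "1" and then both try to write will see one 200 and one 412, which is exactly the lost-update protection that plain PUT does not give you.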
Creating ETags: concurrency tokens
The data services framework has to deal with the fact that we don’t control the data sources that we expose through the REST interface. Sometimes each entity that we turn into a resource will have a nice clean property that’s a timestamp or similar and maps perfectly to ETag semantics (e.g. whenever we change the entity in a significant way the value of this property changes). However, often the schema of the underlying data is not under the control of the service developer so we have to work with what we have. What that means in practice is that you can tell the data services framework which properties of each entity type are “concurrency tokens”. Changing those values means that you changed the version of the resource.
The way you do that in the framework is by using an [ETag(props…)] attribute in your class or an annotation in your EDM schema. For types that don’t have any concurrency tokens we won’t generate ETags for the responses for those types, and they get “last one wins” update behavior.
Once you’ve indicated which property or properties are your concurrency tokens, we can produce ETags by using the values of those properties for the particular instance we’re returning.
During an update, the data services runtime works with the data source to determine whether the concurrency token values that were marshaled through ETags and if-match headers still match, and if so performs the update/delete operation. If they don’t match, a 412 response is sent to the client.
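The following sketch shows one way an ETag could be derived from concurrency token values and later compared against the current values. The ConcurrencyTokenETag class and the quoting and escaping convention (each value URL-escaped, wrapped in single quotes, comma-separated) are assumptions for illustration, loosely modeled on ETag values such as 'A%20Bike%20Store'; the framework's actual encoding may differ.

```csharp
using System;
using System.Linq;

// Hypothetical helper: builds an ETag string from concurrency token values.
public static class ConcurrencyTokenETag
{
    // Assumed convention: escape each value, quote it, join with commas.
    public static string Build(params string[] tokenValues)
    {
        return string.Join(",",
            tokenValues.Select(v => "'" + Uri.EscapeDataString(v) + "'"));
    }

    // True when the ETag sent back by the client still describes the
    // current token values in the data source.
    public static bool StillMatches(string ifMatch, params string[] currentTokenValues)
    {
        return string.Equals(ifMatch, Build(currentTokenValues), StringComparison.Ordinal);
    }
}
```

When StillMatches returns false, the update or delete would be rejected with a 412 response as described above.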
Including ETags in headers and/or payloads
The HTTP spec describes the ETag response header to transport the entity tag for a given resource. That works great for us for cases where we respond with a single entity (e.g. an entry in Atom terms), but it doesn’t when we return a collection of entries from a URL (e.g. an Atom feed). For the latter scenario, we include the ETag as part of the resource representation (in the entry for Atom, in the “__metadata” property for JSON), for example:
<entry m:type="BikesModel.Customer" m:etag="'A%20Bike%20Store'">
<!-- rest of the entry -->
</entry>
Validation during side-effecting operations
Concurrency tokens are validated whenever you perform an operation that affects the state of an existing resource. In the data services REST interface that means HTTP PUT and DELETE methods.
As I mentioned above, validation happens during update processing by extracting the “original” values from the ETag (which was sent back through the if-match header) and comparing them with the data in the data source. If they are the same, we consider the whole resource the same and proceed with the modification.
An interesting question is whether presenting an ETag in an if-match header should be mandatory for resources that have concurrency tokens. Put another way: should the decision of whether it’s ok to potentially overwrite changes based on state knowledge be up to the client or restricted by the server? The HTTP spec defines a special value of “*” for the if-match header that effectively means “any value will match”. The behavior that we are planning for is that if an entity type has concurrency tokens then we’ll always require an “if-match” header in modification operations. The header value can be an actual ETag obtained through a GET request or “*”, meaning “I know this type supports concurrency control, but I’ll overwrite it anyway”.
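The planned rule can be sketched as a small decision function. The UpdateDecision enum and IfMatchPolicy class are hypothetical names, and the sketch deliberately reports "missing header" as its own outcome rather than guessing which HTTP status the framework would use for it.

```csharp
using System;

// Hypothetical outcomes for a PUT or DELETE against an existing resource.
public enum UpdateDecision
{
    Proceed,
    MissingIfMatch,
    PreconditionFailed
}

public static class IfMatchPolicy
{
    public static UpdateDecision Evaluate(
        bool typeHasConcurrencyTokens, string ifMatchHeader, string currentETag)
    {
        // Types without concurrency tokens get "last one wins" behavior.
        if (!typeHasConcurrencyTokens) return UpdateDecision.Proceed;

        // With concurrency tokens, the header is always required.
        if (string.IsNullOrEmpty(ifMatchHeader)) return UpdateDecision.MissingIfMatch;

        // "*" means the client knowingly overwrites whatever is there.
        if (ifMatchHeader == "*") return UpdateDecision.Proceed;

        return ifMatchHeader == currentETag
            ? UpdateDecision.Proceed
            : UpdateDecision.PreconditionFailed; // maps to a 412 response
    }
}
```

The design choice here is that the server decides whether concurrency control applies, while the client can still opt out explicitly with "*".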
Almost, but not quite, a perfect match
HTTP ETags and conditional operations are almost a perfect match to what we need to handle concurrent activity in RESTful data services. There are, however, a few glitches. This is where we get into the fine-print that’s not necessarily popular knowledge. Mike brought up many of these details I wasn’t aware of.
ETags can be “strong entity tags” or “weak entity tags”. Weak ETags are very similar to what happens when we have entities for which only some properties are designated concurrency tokens. From section 13.3.3 of the HTTP spec:
“However, there might be cases when a server prefers to change the
validator only on semantically significant changes, and not when
insignificant aspects of the entity change. A validator that does not
always change when the resource changes is a "weak validator." “
The problem is that weak ETags only apply to GET operations; they cannot be used for PUT/DELETE, which is what we’re trying to do.
For cases where you own the data, the data services framework can expose a compliant interface by using constructs such as timestamps (if using a database as a data source), where any change in the entity will be reflected in the ETag. You can also use a relaxed form of ETags where the entity might change but the ETag stays the same. It’s not completely HTTP compliant and may confuse intermediate systems, but it may be your only option in some scenarios.
As always, thoughts and feedback are welcome.
Pablo Castro
Software Architect
Microsoft Corporation
This post is part of the transparent design exercise in the Astoria Team. To understand how it works and how your feedback will be used please look at this post.
-
Around a year ago at the Mix 2007 conference we announced Project Astoria, an overall initiative to understand how data is used on the web and what frameworks, tools and services we could create to enable new and better applications in this space. Several things have resulted from that effort already.
One of these results is a unified pattern for exposing and manipulating data across various data-centric services using the Atom format and the AtomPub protocol, plus a set of common conventions for constructing URLs to point to resources. This pattern is shared by some Windows Live services as announced here and articulated in more detail here, and also by any services created with the ADO.NET Data Services Framework.
The ADO.NET Data Services Framework is another one of the results of this project. It’s a set of libraries and tools for the .NET Framework to create and consume data services over various data sources, from relational databases to XML content to arbitrary web services. These data services expose the same AtomPub-based interface and same URL conventions, so clients and tools equally apply to online services and to your own services, on the web or on-premises.
The other piece we discussed at Mix 2007 was an experimental online service. As we explained when we first announced it, this service wasn’t a real “online service” in the sense that it wasn’t backed by an internet-scale infrastructure and such. The goal of the service was to learn about data interfaces for online services.
Now, at Mix 2008, we announced our plans to offer a real internet-scale data service called SQL Server Data Services (SSDS). General information about SSDS can be found here, and you can also watch the talk that Nigel Ellis gave at Mix about it. SSDS is a real service that is being made available in a closed beta release. We are now taking registrations for participation in the SSDS beta program. Please go here to sign up. Customers are being accepted into the beta program on a rolling basis.
As part of our unified Data Services vision, we will provide a solution that enables seamless integration between on-premises deployments (software) and cloud-based deployments (services). We’re working on aligning aspects of SSDS and Astoria, and this alignment will come over a series of updates to both Astoria and SSDS. For example, we will likely add AtomPub and JSON support to SSDS to match the results encoding of Astoria, and we are already working on extensions to EDM to incorporate the open content model of SSDS. We’ll be working to extend Astoria as needed to ensure it provides a great development experience over the SSDS service. It’s also worth noting how the Microsoft Sync Framework can be used to tie multiple deployments together; Nigel covers this in his talk. Stay tuned for more details.
Given that we have SSDS out there now, in the next week or so we will take down the experimental service we’ve hosted at http://astoria.sandbox.live.com for the last year. That service uses an old interface (it’s not even compatible with the current Astoria patterns), and it’s not meant for real use anyway.
Having a service out there with sample data is really handy though, so we’re exploring options to host a few read-only services somewhere for everybody to access for experimentation and demo purposes.
Are we done? Absolutely not. You’ll see more coming from Project Astoria over time. An example of things we’re exploring for the future is synchronization/offline capabilities for services and service clients, as we demo’ed in this Mix session (Astoria-related stuff starts at minute 35 or so, but the whole talk is really interesting).
Pablo Castro
Software Architect
Microsoft Corporation
-
An Astoria service allows reading/querying of data via the already-established IQueryable interface, which helps abstract Astoria from the underlying data source. But there is no existing interface for the update operations (CUD: create, update and delete). Hence we came up with the IUpdatable interface to support CUD operations and enable read-write services.
One of the main goals while designing the IUpdatable interface was to make it resource independent. In other words, the methods that return objects representing resources can return anything. For Astoria, the returned object is an opaque object that represents the requested resource, and whenever we want to use the resource (reading or updating a value on it), we pass the same opaque object back to IUpdatable. The actual implementation of IUpdatable needs to track the mapping between this opaque object and the actual object it represents. The only time we need the actual CLR instance of the resource is when we need to serialize the object, and we call a specific method on IUpdatable (ResolveResource) for that.
Let’s take a quick look at the IUpdatable interface:
using System.Linq;

public interface IUpdatable
{
    /// <summary>
    /// Creates the resource of the given type and belonging to the given container
    /// </summary>
    /// <param name="containerName">container name to which the resource belongs</param>
    /// <param name="fullTypeName">full type name i.e. Namespace qualified type name of the resource</param>
    /// <returns>object representing a resource of given type and belonging to the given container</returns>
    object CreateResource(string containerName, string fullTypeName);

    /// <summary>
    /// Gets the resource of the given type that the query points to
    /// </summary>
    /// <param name="query">query pointing to a particular resource</param>
    /// <param name="fullTypeName">full type name i.e. Namespace qualified type name of the resource</param>
    /// <returns>object representing a resource of given type and as referenced by the query</returns>
    object GetResource(IQueryable query, string fullTypeName);

    /// <summary>
    /// Gets the resource of the given type that the query points to. The resource returned contains the default values,
    /// and not the value as present in the server
    /// </summary>
    /// <param name="query">query pointing to a particular resource</param>
    /// <param name="fullTypeName">full type name i.e. Namespace qualified type name of the resource</param>
    /// <returns>object representing a resource of given type, as referenced by the query and containing default values</returns>
    object ReplaceResource(IQueryable query, string fullTypeName);

    /// <summary>
    /// Sets the value of the given property on the target object
    /// </summary>
    /// <param name="targetResource">target object which defines the property</param>
    /// <param name="propertyName">name of the property whose value needs to be updated</param>
    /// <param name="propertyValue">value of the property</param>
    void SetValue(object targetResource, string propertyName, object propertyValue);

    /// <summary>
    /// Gets the value of the given property on the target object
    /// </summary>
    /// <param name="targetResource">target object which defines the property</param>
    /// <param name="propertyName">name of the property whose value needs to be retrieved</param>
    /// <returns>the value of the property for the given target resource</returns>
    object GetValue(object targetResource, string propertyName);

    /// <summary>
    /// Sets the value of the given reference property on the target object
    /// </summary>
    /// <param name="targetResource">target object which defines the property</param>
    /// <param name="propertyName">name of the property whose value needs to be updated</param>
    /// <param name="propertyValue">value of the property</param>
    void SetReference(object targetResource, string propertyName, object propertyValue);

    /// <summary>
    /// Adds the given value to the collection
    /// </summary>
    /// <param name="targetResource">target object which defines the property</param>
    /// <param name="propertyName">name of the property whose value needs to be updated</param>
    /// <param name="resourceToBeAdded">value of the property which needs to be added</param>
    void AddReferenceToCollection(object targetResource, string propertyName, object resourceToBeAdded);

    /// <summary>
    /// Removes the given value from the collection
    /// </summary>
    /// <param name="targetResource">target object which defines the property</param>
    /// <param name="propertyName">name of the property whose value needs to be updated</param>
    /// <param name="resourceToBeRemoved">value of the property which needs to be removed</param>
    void RemoveReferenceFromCollection(object targetResource, string propertyName, object resourceToBeRemoved);

    /// <summary>
    /// Deletes the given resource
    /// </summary>
    /// <param name="targetResource">resource that needs to be deleted</param>
    void DeleteResource(object targetResource);

    /// <summary>
    /// Saves all the pending changes made till now
    /// </summary>
    void SaveChanges();

    /// <summary>
    /// Returns the actual instance of the resource represented by the given resource object
    /// </summary>
    /// <param name="resource">object representing the resource whose instance needs to be fetched</param>
    /// <returns>Returns the actual instance of the resource represented by the given resource object</returns>
    object ResolveResource(object resource);
}
Let’s go through each API one by one.
object CreateResource(string containerName, string fullTypeName) :- This is called when one tries to insert a new resource via the POST HTTP method. The first parameter names the container that the resource belongs to, and the second parameter gives the namespace-qualified name of the resource type that needs to be created. The second parameter might not be that useful when there is no inheritance, since one can easily figure out the type from the container, but it is helpful in inheritance cases. The return value, as said before, need not be the actual CLR instance of the resource. It can be anything (a cookie) that only the IUpdatable implementor needs to understand.
object GetResource(IQueryable query, string fullTypeName) :- Get the given resource as resolved by the query and the namespace qualified name of the type that the query resolves to. In some cases, the full type name can be null. Look at the examples below to see the cases when the fullTypeName can be null. Again, the return type can be anything that represents the resource.
object ReplaceResource(IQueryable query, string fullTypeName) :- Very similar to the GetResource API, but this is used for replace-semantics and GetResource is used for merge-semantics. The implementation needs to return the resource referred by the query, but with default values for all non-key properties.
void SetValue(object targetResource, string propertyName, object propertyValue) :- Set the value of the property with the given name on the target resource to the given property value. The target resource is the opaque object returned by either CreateResource or GetResource. This method is called for scalar properties and complex properties only.
object GetValue(object targetResource, string propertyName) :- Gets the value of the property with the given name for the target resource. The target resource is the opaque object returned by either CreateResource or GetResource. This method is called for scalar properties or complex properties. If it’s a scalar property, we expect the returned object to be the actual value.
void SetReference(object targetResource, string propertyName, object propertyValue) :- This method is called to set the value of the navigation property with the given name on the target resource to the given propertyValue. The target resource and the propertyValue are opaque objects returned by the GetResource or CreateResource API. This method is called for navigation properties that refer to a single resource, representing the 1 or 0..1 side of a relationship.
void AddReferenceToCollection(object targetResource, string propertyName, object resourceToBeAdded):- This method is called to add a resource to the collection navigation property with the given name on the target resource. Again, targetResource and resourceToBeAdded are opaque objects returned by the GetResource or CreateResource API. The navigation property represents the many side of a relationship. We generally call this operation binding, which means you are binding the resourceToBeAdded to the targetResource. In other words, you are setting up a relationship between the two resource objects.
void RemoveReferenceFromCollection(object targetResource, string propertyName, object resourceToBeRemoved):- Very similar to the above method, except it removes the given resource from the collection navigation property on the target object. We generally call this operation unbinding, which means you are deleting the relationship between the two resource objects in question.
void DeleteResource(object targetResource):- This actually deletes the given resource. Again, the targetResource is the opaque object returned by GetResource or CreateResource API.
void SaveChanges():- This actually saves all the changes that have been made till now using the above APIs. The IUpdatable implementation needs to track all changes until this API is called and then save all of them when it is called. The IUpdatable implementation is expected to save all the changes or nothing.
object ResolveResource(object resource):- This API is called whenever we want to resolve the opaque object returned by the CreateResource or GetResource API into the actual CLR instance. This is normally called after SaveChanges, when we want to serialize out the resource (for POST methods). This method is also called if there are update interceptors that need to be invoked with the actual CLR resource instances, or if the provider supports optimistic concurrency and the resource type has concurrency tokens (defined via ETag properties in a CLR-based provider).
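To make the opaque-object idea concrete, here is a deliberately partial sketch over an in-memory store. It covers only CreateResource, SetValue, GetValue, SaveChanges and ResolveResource, does not implement the real IUpdatable interface, and represents each resource as a property dictionary rather than a CLR entity class. The InMemoryUpdatableSketch class and its Committed field are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, partial sketch of the IUpdatable pattern: callers only ever
// hold opaque cookies, and nothing is committed until SaveChanges.
public class InMemoryUpdatableSketch
{
    // Maps each opaque cookie to the actual resource it stands for.
    private readonly Dictionary<object, Dictionary<string, object>> instances =
        new Dictionary<object, Dictionary<string, object>>();

    // Inserts recorded since the last SaveChanges: (container name, cookie).
    private readonly List<KeyValuePair<string, object>> pendingInserts =
        new List<KeyValuePair<string, object>>();

    // "Saved" data, keyed by container name.
    public readonly Dictionary<string, List<Dictionary<string, object>>> Committed =
        new Dictionary<string, List<Dictionary<string, object>>>();

    public object CreateResource(string containerName, string fullTypeName)
    {
        // fullTypeName is ignored here; a real provider would use it
        // to pick the right type in an inheritance hierarchy.
        object cookie = new object();
        instances[cookie] = new Dictionary<string, object>();
        pendingInserts.Add(new KeyValuePair<string, object>(containerName, cookie));
        return cookie;
    }

    public void SetValue(object targetResource, string propertyName, object propertyValue)
    {
        instances[targetResource][propertyName] = propertyValue;
    }

    public object GetValue(object targetResource, string propertyName)
    {
        return instances[targetResource][propertyName];
    }

    public void SaveChanges()
    {
        // In memory this is trivially all-or-nothing; a real provider
        // would wrap the equivalent work in a transaction.
        foreach (KeyValuePair<string, object> insert in pendingInserts)
        {
            List<Dictionary<string, object>> container;
            if (!Committed.TryGetValue(insert.Key, out container))
            {
                container = new List<Dictionary<string, object>>();
                Committed[insert.Key] = container;
            }
            container.Add(instances[insert.Value]);
        }
        pendingInserts.Clear();
    }

    // Turns the cookie back into the actual instance, e.g. for serialization.
    public object ResolveResource(object resource)
    {
        return instances[resource];
    }
}
```

A POST would roughly translate into CreateResource, a SetValue call per property, SaveChanges, and finally ResolveResource to serialize the response.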
This post has already become bigger than I intended it to be. In my next post, I will try to come up with some examples and the sequence of calls made on the IUpdatable interface for each of them. I have purposely kept this post simple so that people can get the basic idea of the IUpdatable interface. There are a few features, like ETags and update interceptors, that can cause additional calls on this interface. I will try to cover them in upcoming posts.
Pratik Patel
Developer, ADO.NET Data Services Framework
(Project Astoria)
This post is part of the transparent design exercise in the Astoria Team. To understand how it works and how your feedback will be used please look at this post.
-
We have received a fair amount of feedback regarding a number of use cases where it would be beneficial to enable a client of a data service to “batch” up a group of operations and send them to the data service in a single HTTP request. This reduces the number of roundtrips to the data service for apps that need to make numerous requests to the data service to perform a given action and allows a set of operations to be logically grouped together.
Below is the design we have landed on. Note: We will have something close to this design in our next CTP/Beta release of Astoria, but it’s not quite there yet.
Background
The base ADO.NET Data Services Framework semantics provide two mechanisms to query and send updates to a data service. From a high-level they are:
Query:
1) Send an HTTP GET request to a URI representing a resource (or set of resources) and receive in the response a representation of the resource (or set of resources). Example: a GET request to /Customers(1) returns a single customer entity in the response
2) Same as #1, but add the $expand query string operator to the request to ask that resources related to the resource(s) specified in the request URI be returned in the response as well. Example: a GET request to /Products(1)?$expand=Category,Parts returns product #1 as well as the Parts and Category associated with the product in the response
Update:
1) Insert / update / delete a single resource per HTTP request by sending POST, PUT or DELETE requests to a data service
2) Insert a new resource and related resources in a single request. Example: a POST to /Customers can insert a Customer and related Orders in a single request by inlining the related orders in the request body
Why do we need batching?
Now assume you have the following situation: a single “Save” button per page in a RIA line-of-business application. Contoso Solutions is building an online Silverlight-based order entry system for its salesforce. Any given sale requires a number of entities within the data service to be inserted and/or updated. Some of the entities are associated via navigation properties while others have no relation to the other entities being acted on as part of processing the sale. The user experience Contoso wants to enable is to paint the full order processing information on a single screen and include a “save changes” button at the bottom to persist all the updates made to create the order.
In this case, the $expand operation cannot be used to pull down all the data to paint the order entry screen in a single HTTP request. Also, the update operations cannot easily be persisted as an atomic set of operations to the underlying data store.
Batching Design
To support batching, ADO.NET Data Services has added a new $batch URI which will accept batch requests and return batch responses. Logically a batch request is a group of 0 or more QueryOperations and 0 or more ChangeSets. A QueryOperation is analogous to a simple "non batch" query request. A ChangeSet is an unordered group of CUD (create, update and delete) operations that is applied atomically (i.e. either all of its operations succeed or none do).
Now that we have the logical model (a Batch is a collection of ordered QueryOps and ChangeSets), we needed a wire representation. After a bit of exploring various ways to represent batches in ATOM, JSON, etc., Yaron Goland pointed out to us that there is already a well defined way to represent multiple HTTP requests in a single request using multipart/mixed MIME messages and the MIME type application/http. This turned out to be just what we needed and enables us to easily encapsulate binary and text based content in a request or response. Also, it looks like using multipart/mime for batching has been explored (with pretty positive feedback) a number of times in the blogosphere, so perhaps we'll all land on something generally applicable. Instead of describing this, let's just look at an example.
Example - Batch Request:
The example assumes the batch request is sent to a data service located at: http://foo.com/dataservice.svc
The Batch example contains the following operations (in order):
- A Change Set which contains the following operations in order:
- POST operation
- PUT operation
- A Query Operation
- A Query Operation
POST /dataservice.svc/$batch HTTP/1.1
Host: foo.com
Content-Type: multipart/mixed; boundary=batch(36522ad7-fc75-4b56-8c71-56071383e77b)
--batch(36522ad7-fc75-4b56-8c71-56071383e77b)
Content-Type: multipart/mixed; boundary=changeset(77162fcd-b8da-41ac-a9f8-9357efbbd621)
Content-Length: ###
--changeset(77162fcd-b8da-41ac-a9f8-9357efbbd621)
Content-Type: application/http
Content-Transfer-Encoding:binary
POST /dataservice.svc/Categories HTTP/1.1
Host: foo.com
Content-Type: application/atom+xml;type=entry
Content-Length: ###
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<entry xmlns:d="http://schemas.microsoft.com/ado/..."
xmlns:m="http://schemas.microsoft.com/ado/.../metadata"
xmlns="http://www.w3.org/2005/Atom">
…
<content type="application/xml">
<d:CategoryName>Software</d:CategoryName>
<d:Description d:null="true" />
<d:Picture d:null="true" />
</content>
</entry>
--changeset(77162fcd-b8da-41ac-a9f8-9357efbbd621)
Content-Type: application/http
Content-Transfer-Encoding:binary
PUT /Categories(5) HTTP/1.1
Host: foo.com
Content-Type: application/atom+xml;type=entry
If-Match: xxxxx
Content-Length: ###
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<entry xmlns:d="http://schemas.microsoft.com/ado/..."
xmlns:m="http://schemas.microsoft.com/ado/.../metadata"
xmlns="http://www.w3.org/2005/Atom">
…
<content type="application/xml">
<d:CategoryID>5</d:CategoryID>
<d:CategoryName>UpdateCategoryName</d:CategoryName>
</content>
</entry>
--changeset(77162fcd-b8da-41ac-a9f8-9357efbbd621)--
--batch(36522ad7-fc75-4b56-8c71-56071383e77b)
Content-Type: application/http
Content-Transfer-Encoding:binary
GET /Categories(5) HTTP/1.1
Host: foo.com
--batch(36522ad7-fc75-4b56-8c71-56071383e77b)
Content-Type: application/http
Content-Transfer-Encoding:binary
GET /Categories(6) HTTP/1.1
Host: foo.com
--batch(36522ad7-fc75-4b56-8c71-56071383e77b)--
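To make the nesting concrete, here is a minimal sketch (in Python, purely for illustration) of how a client might assemble a request shaped like the one above. The helper names `http_part`, `change_set`, `multipart` and `build_batch` are assumptions made for this sketch and are not part of the ADO.NET Data Services client library; the Atom payload is a placeholder, and Content-Length headers are omitted for brevity.

```python
import uuid


def http_part(request_line, headers=None, body=""):
    """Wrap one HTTP request as an application/http MIME part."""
    lines = [
        "Content-Type: application/http",
        "Content-Transfer-Encoding: binary",
        "",                      # blank line ends the MIME part headers
        request_line,
    ]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    lines.append("")             # blank line ends the inner request headers
    if body:
        lines.append(body)
    return "\r\n".join(lines)


def multipart(boundary, parts):
    """Join already-built MIME parts using the given boundary."""
    out = []
    for part in parts:
        out.append(f"--{boundary}")
        out.append(part)
    out.append(f"--{boundary}--")
    return "\r\n".join(out)


def change_set(operations, boundary=None):
    """Nest CUD operations in their own multipart/mixed part."""
    boundary = boundary or f"changeset({uuid.uuid4()})"
    return "\r\n".join([
        f"Content-Type: multipart/mixed; boundary={boundary}",
        "",
        multipart(boundary, operations),
    ])


def build_batch(parts, boundary=None):
    """Return (Content-Type header value, body) for the outer POST to $batch."""
    boundary = boundary or f"batch({uuid.uuid4()})"
    return f"multipart/mixed; boundary={boundary}", multipart(boundary, parts)


# One ChangeSet (a single POST, placeholder payload) followed by one query.
operations = [
    http_part(
        "POST /dataservice.svc/Categories HTTP/1.1",
        {"Host": "foo.com", "Content-Type": "application/atom+xml;type=entry"},
        "<entry>...</entry>",
    ),
]
content_type, body = build_batch(
    [
        change_set(operations, boundary="changeset(1)"),
        http_part("GET /Categories(5) HTTP/1.1", {"Host": "foo.com"}),
    ],
    boundary="batch(1)",
)
```

The fixed boundaries in the usage lines are only there to keep the output predictable; in practice each boundary would be generated per request, as in the GUID-based boundaries shown in the example.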
Batch Response
Now that we have seen what a request looks like, the response is pretty much the mirror image of the request (it also uses multipart MIME), with a MIME part containing the associated HTTP response for each operation in the batch request. The exception to this rule is for responses to ChangeSets. Since ChangeSets are atomic, if any operation in the set fails, the response for the ChangeSet is a single HTTP response instead of a nested multipart/mixed collection of responses.
This is already getting a bit long, so I'll cut this off here. In a future post we'll walk through an end to end request + response and talk about how to cross reference operations within a batch request. What do you think so far? Are we overlooking/missing something?
Mike Flasko
ADO.NET Data Services, Program Manager
This post is part of the transparent design exercise in the Astoria Team. To understand how it works and how your feedback will be used please look at this post.
-
(sorry, tricky problem -> long write-up)
One of the few things pending in the server library of the ADO.NET Data Services Framework is query caching to help with performance. Here is a brief explanation of why we need it and a couple of design options. Feedback is welcome.
Query processing in Astoria
To give a little bit of context, let me first briefly go through the process that takes place when Astoria receives a URL and works its magic to turn it into query results ready for serialization in the HTTP response.
Data sources are hooked into Astoria services through LINQ expression trees. That means that the role of Astoria during query processing is to take a URL and translate it into an expression tree that we can then ask the data source to execute and give us results. The general flow is more or less like this:
URL -> Expression Tree -> [Data Source Execution and Materialization] -> Objects -> Serialization (Atom/JSON)
The thing in brackets is data source-specific. For example, if using the ADO.NET Entity Framework it would look like this:
URL -> Expression Tree -> Canonical Query Tree -> View expansion/Query simplification -> Canonical Query Tree -> SQL -> Rows (DataReader) -> Entities (DataReader) -> Objects -> Serialization (Atom/JSON)
This is not a cheap thing to do in every request, hence this discussion about query caching.
Why query caching
The processing pipeline that I showed above can be expensive. In particular, all that query translation between different tree types as well as all that analysis for view expansion and query simplification is an expensive, CPU-bound activity that we want to avoid as much as we can.
To help with this we are planning to introduce query caching, similar to what database systems do (e.g. the SQL Server “proc cache”). The idea of query caching is that for commonly requested URLs we’d bypass most of the processing required to set up the query for execution, and we’d go as directly as possible to the execution phase.
Since Astoria is a generic framework that works on many data sources, we have to enable this in a way that allows different data sources to plug in their query caching capabilities if they have any.
Data source-independent query compilation
In order to implement caching, we need a way of doing our own work for translation, then tell the data source to do its own work on the expression trees, and then save that work to be used every time we have to respond to the “same” request (the definition of “same” is well…difficult, more about this later); that is, we need a way of compiling queries.
You can imagine an interface with two methods for this:
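interface IQueryCompilationProvider
{
    // returns an opaque token that represents the compiled query
    object CompileQuery(Expression<Func<T1, T2, …, Tn, TResult>> query);
    // executes a query previously returned by CompileQuery
    IEnumerable ExecuteCompiledQuery(object compiledQuery);
}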
The idea is that each data source can give whatever meaning it wants to “compiling” a query, and they can return us an opaque token (object) that we’d pass back along with parameter values in order to execute the query. Ideally you’d do as much work as possible. For example, there is a CompiledQuery class in LINQ to SQL and LINQ to Entities that can do the translation all the way to a SQL statement only once and then re-use it in all subsequent executions (re-binding parameter values).
Query caching design, part 1: for simple cases, a simple design does it
In the simpler cases where the data service has no customizations, the URL -> Expression Tree translation is 1:1. The only consideration in this case would be to parameterize the URLs so that we don’t fragment the query cache with URLs that are the same query with different constants (e.g. /Customers(1) vs /Customers(2), or /Customers?$filter=City eq 'Seattle' vs /Customers?$filter=City eq 'Las Vegas').
In this case we can keep a simple map of parameterized URL to compiled query. If we don’t find a given URL, we go through the full translation process to produce an expression tree, and then (assuming the data source supports compilation) call CompileQuery to obtain an opaque compiled query token. On subsequent executions we look up the URL, find the compiled query token, bind the constants that we turned into parameters to parameter values, and execute the query directly.
The rest is standard caching stuff…keep a map, have a limit and an eviction policy, make it “spike resistant”, etc.
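As a rough illustration of the simple design, the sketch below (in Python, for brevity) parameterizes URLs with a regular expression and keeps a small least-recently-used map from URL template to compiled token. Everything here is an assumption made for the example: the `@p0` placeholder syntax, the `parameterize` and `QueryCache` names, and the regex, which only recognizes quoted strings and integers and is much cruder than a real URL parser would be.

```python
import re
from collections import OrderedDict

# Literals treated as constants: quoted strings first, then bare integers.
_LITERAL = re.compile(r"'[^']*'|\b\d+\b")


def parameterize(url):
    """Return (template, values) with each constant replaced by a placeholder."""
    values = []

    def replace(match):
        text = match.group(0)
        values.append(text[1:-1] if text.startswith("'") else int(text))
        return f"@p{len(values) - 1}"

    return _LITERAL.sub(replace, url), values


class QueryCache:
    """Maps URL templates to compiled-query tokens with LRU eviction."""

    def __init__(self, compile_query, capacity=128):
        self._compile = compile_query   # hypothetical stand-in for CompileQuery
        self._capacity = capacity
        self._entries = OrderedDict()
        self.misses = 0

    def lookup(self, url):
        """Return (token, parameter values), compiling only on a cache miss."""
        template, values = parameterize(url)
        if template in self._entries:
            self._entries.move_to_end(template)      # mark as recently used
            token = self._entries[template]
        else:
            self.misses += 1
            token = self._compile(template)
            self._entries[template] = token
            if len(self._entries) > self._capacity:
                self._entries.popitem(last=False)    # evict the oldest entry
        return token, values


cache = QueryCache(compile_query=lambda template: ("compiled", template), capacity=2)
first = cache.lookup("/Customers(1)")
second = cache.lookup("/Customers(2)")   # same template, so no recompilation
```

Both lookups share the template `/Customers(@p0)`, so the second one only rebinds the parameter value.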
Of course, life is rarely that simple…
Query interceptors
So far we’ve assumed that there is a 1:1 correspondence between (parameterized) URLs and expression trees. Astoria has a nice feature called “query interceptors” that causes that correspondence to break.
A query interceptor allows the service developer to introduce a custom filter predicate for each entity set that’s exposed through the service interface. For example, if you had a Customers table and a CustomerAccess table that indicates which user-ids can see which customers, you could implement entity-level security for customers by adding this interceptor:
[QueryInterceptor("Customers")]
Expression<Func<Customer, bool>> QueryCustomers()
{
    // retrieve the user from the environment (e.g. the currently logged-in user)
    UserDescriptor u = GetUser();
    if (u.IsAdmin)
        return c => true; // can see all customers
    else
        return c => c.CustomerAccess.Any(ca => ca.UserID == u.UserID);
}
Interceptors are invoked whenever a URL involves the entity-set the interceptor is bound to, regardless of how. The system injects a Where operator with the predicate returned by the interceptor. This includes top level entity-set access (e.g. /Customers), access of subsets through link traversal (e.g. /SalesPeople(123)/AssignedCustomers), and inline expansion (e.g. /SalesPeople(123)?$expand=AssignedCustomers).
Note that now URLs and expression trees are no longer 1:1. There are two kinds of differences that might show up:
a) Same query but different parameters. For example, if user 1 and user 2, both not administrators, fetch the URL /Customers, then both requests will produce identical query trees, with the only difference that the value of “u.UserID” will be different. The difference with the parameterization of the URLs is that in this case we’re not the ones parsing the input.
b) Different query. In the example above, if one of the users accessing the URL /Customers is an administrator and the other is not, the filter predicates used in each of them will be different.
Now the caching thing got complicated.
Side-note: I’ve excluded update interceptors in this discussion because they don’t participate in query composition. They are a key part of enforcing access control the way I described above though.
Query caching design, part 2
We really wanted to make query caching work without any API surface other than maybe a configuration knob for the size of the cache. That’s not looking great at this point, but we do have a few options on the table.
The essence of the problem is the fact that query interceptors need to be able to capture data from the execution environment at the time a given request is being processed. The main example of this is grabbing the user credentials for a given request (e.g. Thread.CurrentThread.CurrentPrincipal, or the user-id extracted from an encrypted cookie/custom HTTP header), but there can be other scenarios as well.
There are a couple of approaches that could do the trick, but they come with trade-offs:
Option I: a bit of extra magic for a nicer API.
We could let users write interceptors just like I showed above. Those interceptors mix references to environment data and the description of the filter predicate in a single construct, a LINQ expression tree that has references to external variables in the reference closure. What the developer writes looks like this (copied from above):
[QueryInterceptor("Customers")]
Expression<Func<Customer, bool>> QueryCustomers()
{
    // retrieve the user from the environment (e.g. the currently logged-in user)
    UserDescriptor u = GetUser();
    if (u.IsAdmin)
        return c => true; // can see all customers
    else
        return c => c.CustomerAccess.Any(ca => ca.UserID == u.UserID);
}
To cache queries we would have to:
· On every request, invoke all interceptors that are needed
· Extract all uncorrelated subexpressions from each interceptor filter predicate and turn them into constants (do “funcletization” in LINQ jargon), then replace the constants with generic parameters. Now we have a little tree for each interceptor
· Now use the parameterized URL + all the interceptor trees as the caching key.
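The steps above can be sketched with a toy expression tree. The Python below is only an illustration under stated assumptions: predicates are modeled as nested tuples rather than LINQ expression trees, `("const", value)` stands for an already funcletized constant, and the `lift_constants`, `cache_key` and `customers_interceptor` names are invented for the example.

```python
def lift_constants(tree, values):
    """Replace every ("const", v) node with ("param", i), collecting v in values."""
    if isinstance(tree, tuple):
        if len(tree) == 2 and tree[0] == "const":
            values.append(tree[1])
            return ("param", len(values) - 1)
        return tuple(lift_constants(node, values) for node in tree)
    return tree


def cache_key(url_template, interceptor_trees):
    """Build the cache key from the URL template plus the shape of each tree."""
    values = []
    shapes = tuple(lift_constants(tree, values) for tree in interceptor_trees)
    return (url_template, shapes), values


def customers_interceptor(user):
    """Toy counterpart of QueryCustomers, returning a tuple-based predicate tree."""
    if user["is_admin"]:
        return ("const", True)
    return ("any", "CustomerAccess",
            ("eq", ("member", "UserID"), ("const", user["user_id"])))


key_user1, values_user1 = cache_key(
    "/Customers", [customers_interceptor({"is_admin": False, "user_id": 1})])
key_user2, values_user2 = cache_key(
    "/Customers", [customers_interceptor({"is_admin": False, "user_id": 2})])
key_admin, _ = cache_key(
    "/Customers", [customers_interceptor({"is_admin": True, "user_id": 3})])
```

The two non-administrator requests produce the same key with different parameter values (case a above), while the administrator request produces a different key (case b). Note that the interceptors still run and the trees are still walked on every request, which is the cost of this option.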
Option II: explicitness at the cost of API complexity
The other approach is to explicitly separate the definition of the filter predicate from the per-execution state that comes from the environment. The developer would write two pieces, a statically defined filter predicate and a method that sets up per-request state, as follows:
[QueryInterceptor("Customers")]
static Expression<Func<Customer, CustomersDbContext, Dictionary<string, object>, bool>> QueryCustomers()
{
    return (c, ctx, state) => (bool)state["IsAdmin"] ||
        c.CustomerAccess.Any(ca => ca.UserID == (string)state["UserID"]);
}
override void OnStartRequest(RequestDescriptor descriptor, Dictionary<string, object> state)
{
    // retrieve the user from the environment (e.g. the currently logged-in user)
    UserDescriptor u = GetUser();
    state["IsAdmin"] = u.IsAdmin;
    state["UserID"] = u.UserID;
}
As you can see, the definition of the filter predicate gets a bit trickier (more parameters in the lambda expression in particular).
Of course, since at this point the interceptor is called only once and can’t use any request-bound context, we may as well remove the idea of a method altogether and move the filter setup to the service initialization, where we already have APIs for configuring service policies. So the code would become:
static void InitializeService(IDataServiceConfiguration config)
{
    // other policy initialization
    // ...
    // filters
    config.SetEntitySetFilterPredicate<Customer>("Customers",
        (c, ctx, state) => (bool)state["IsAdmin"] ||
            c.CustomerAccess.Any(ca => ca.UserID == (string)state["UserID"]));
}
override void OnStartRequest(RequestDescriptor descriptor, Dictionary<string, object> state)
{
    // retrieve the user from the environment (e.g. the currently logged-in user)
    UserDescriptor u = GetUser();
    state["IsAdmin"] = u.IsAdmin;
    state["UserID"] = u.UserID;
}
The idea is that OnStartRequest (or whatever is a nice name for that) would explicitly capture the state any and all interceptors would use. Then interceptors become static constructs that are set up during initialization. An important detail is that now the predicate expression cannot refer to variables in the environment any more. Instead, for anything that’s request-specific it needs to access the “state” object, which is then set up with data on a per-request basis.
This yields a very efficient system, because now we can do minimal work on repeated requests that result in the same query, even in the presence of interceptors. On the other hand, the code that developers have to write is pretty tricky…
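To show why this split is cache-friendly, here is a small Python analogue, offered only as a sketch. Plain dictionaries stand in for entities, and the names `customers_filter`, `FILTERS`, `on_start_request` and `run_query` are assumptions invented for the example; the point is simply that the predicate is registered once and reads request data only through `state`.

```python
def customers_filter(customer, ctx, state):
    """Static predicate: request-specific data is read only through state."""
    return state["IsAdmin"] or any(
        access["UserID"] == state["UserID"]
        for access in customer["CustomerAccess"]
    )


# Registered once at service initialization, never rebuilt per request.
FILTERS = {"Customers": customers_filter}


def on_start_request(user):
    """Capture the per-request state that the static predicates will read."""
    return {"IsAdmin": user["is_admin"], "UserID": user["user_id"]}


def run_query(entity_set, rows, user, ctx=None):
    """Apply the registered filter for entity_set using freshly captured state."""
    state = on_start_request(user)
    predicate = FILTERS[entity_set]
    return [row for row in rows if predicate(row, ctx, state)]


rows = [
    {"CustomerID": 1, "CustomerAccess": [{"UserID": "alice"}]},
    {"CustomerID": 2, "CustomerAccess": [{"UserID": "bob"}]},
]
visible_to_alice = run_query("Customers", rows, {"is_admin": False, "user_id": "alice"})
visible_to_admin = run_query("Customers", rows, {"is_admin": True, "user_id": "root"})
```

Because the predicate object never changes between requests, a cache in this design could be keyed on the parameterized URL alone, with `state` bound like any other parameter.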
Is it better to take a performance hit and leave it more usable? Is there a middle ground that we didn’t consider?
If you made it this far, you’re probably one of the few :). In any case, feedback is very welcome.
Pablo Castro
Software Architect
Microsoft Corporation
-
We’re gearing up for the next release of the .NET Framework, and we are looking for people who have a passion for building great frameworks to help with the effort. Since we shipped the .NET Framework 3.5 we’ve been working on projects like the ADO.NET Data Services Framework (aka Astoria) and LINQ to XML support in Silverlight. If you’re interested in designing the next set of APIs for data and want to work for a team that’s focused on shipping technologies and having fun, we’re interested in hearing from you. Drop us a line through the team blog or directly to me and we’ll get back to you.
Carl Perry
Lead Program Manager
cperry@microsoft.com
-
It's been a year since Pablo first announced Project Astoria at MIX 07. Since then we've started from scratch and built a production version of the product, which is now known as the ADO.NET Data Services Framework. At the MIX conference this time around we'll have a bunch of sessions to talk about how the ADO.NET Data Services Framework has evolved into a platform to create your own services and to talk to those from Windows Live. A snippet from the Windows Live blog is below and we'll have a lot more to show at the conference next week!
"At MIX we are enabling several new Live services with AtomPub endpoints which enable any HTTP-aware application to easily consume Atom feeds of photos and for unstructured application storage (see below for more details). Or you can use any Atom-aware public tools or libraries, such as .NET WCF Syndication to read or write these cloud service-based feeds.
In addition, these same protocols and the same services are now ADO.NET Data Services (formerly known as “Project Astoria”) compatible. This means we now support LINQ queries from .NET code directly against our service endpoints, leveraging a large amount of existing knowledge and tooling shared with on-premise SQL deployments...."
Pablo, Andy and I will be heading to MIX this year from the Astoria team. So, if you are at the conference and want to chat all things data services, drop us a note or we'll likely bump into you at the talks and open space (discussion) areas....
The data services focused sessions are:
Wed, March 5th - RESTful Data Services with the ADO.NET Data Services Framework
Fri, March 7th - Accessing Windows Live Services via AtomPub
Fri, March 7th - Building RESTful Real World Applications with the ADO.NET Data Services Framework
See you at MIX,
Mike Flasko
Program Manager, ADO.NET Data Services Framework
-
While going through application scenarios for the ADO.NET Data Services Framework (Project Astoria) one of the first things we noticed is that data-centric applications usually want to bring down graphs of related resources in each interaction with the server. For example, if you are retrieving a resource that represents an "Event", you may want to also bring in the set of related Contact resources that are invited or the "Venue" resource where the event will take place. This write up briefly describes how we model associations between resources as Atom links and proposes a usage pattern of the atom:link element to support retrieving resource graphs in a single response. We're looking for feedback on the approach and also to get folks thinking about inlined content and whether it should be considered an extension to Atom.
More context on Astoria support for Atom here:
http://blogs.msdn.com/astoriateam/archive/2008/02/13/atompub-support-in-the-ado-net-data-services-framework.aspx
1. Links for modeling associations between resources
Related resources can be seen at the instance level as "links" in Atom terms. Of course, from the data application development perspective, it's interesting to make this discoverable at the service description (schema) level. In Astoria data services the underlying model is the Entity Data Model (EDM), which describes data in terms of "Entities" (instances of Entity Types) and associations between entities. In the context of the Atom interface, Entities are mapped to entries and Associations to links. So by looking at the service description a developer can discover the links that will be present in an entry of a given type.
We model related entries or feeds using a link with a "rel" attribute of "related", and with a "type" of either "application/atom+xml;type=feed" or "application/atom+xml;type=entry" depending on the cardinality of the other end of the association.
One tricky aspect is that we need to indicate which association it is. At the model level we have a "navigation property" that identifies the starting "end" of the association (e.g. "Attendees", "Venue"). We currently put that name in the "title" attribute of the link. That solution is not perfect, as we try not to overload constructs that are for human-readable content. However, the alternative is to use a custom attribute, and we've been trying not to introduce custom attributes unless absolutely needed. Another option would be to use different "rel" values to specify the relationship, which feels natural but makes it much less likely that generic processors will be able to do something interesting with it.
Do these trade-offs sound reasonable? Are any of the other options more appropriate?
Continuing with the Events sample, this is what an entry (/Events(456)) with links looks like:
<entry xml:base="http://localhost:81/EventsSample/" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xmlns="http://www.w3.org/2005/Atom" m:type="EventsSample.Event">
<id>http://localhost:81/EventsSample/Events(456)</id>
<title type="text"></title>
<updated>2008-02-17T02:52:38Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Event" href="Events(456)" />
<link rel="related" type="application/atom+xml;type=entry" title="Venue" href="Events(456)/Venue" />
<link rel="related" type="application/atom+xml;type=feed" title="Attendees" href="Events(456)/Attendees" />
<content type="application/xml">
<d:EventID m:type="Int32">456</d:EventID>
<d:Name>Big Party</d:Name>
<d:NoteToAttendees>It's going to be a great party!</d:NoteToAttendees>
<d:DateAndTime m:type="DateTime">2008-03-05T06:00:00</d:DateAndTime>
</content>
</entry>
From the data modification perspective, links pointing to other resources in the service can be specified in the payload of POST and PUT operations, to establish links between the resource being manipulated and other existing resources.
2. Expanding links inline
As I summarized at the beginning of this note, we want to enable clients to request whole sub-graphs of data starting at some resource or set of resources. There are two aspects that need to be addressed: how does the client indicate that it wants one or more links expanded and how are the expanded links represented on the response.
How link expansion is requested is outside of the atom-syntax problem space, so I'll just briefly state what we currently do in case you have an opinion: data services support the query string option "$expand" to request link expansion. So you could say "/Events?$expand=Attendees" to retrieve all Events and all contacts that are attendees for each of them, or "/Events(456)?$expand=Attendees" to retrieve a single event (with key 456) and its attendees. Expand syntax allows for deep expands such as "Attendees/BestFriend" (expand Attendees, and on each expanded entry expand BestFriend) and wide expands such as "Venue, Attendees/BestFriend", meaning expand two immediate links and, for Attendees, further expand its BestFriend link.
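One way to picture deep and wide expands is as a small tree of link names. The Python sketch below parses an $expand value into nested dictionaries; the `parse_expand` name and the dictionary representation are assumptions made for illustration, not how the server represents expansions.

```python
def parse_expand(value):
    """Parse an $expand value into a nested dict of link names.

    Commas separate sibling links (wide expand); slashes descend into the
    expanded entries (deep expand).
    """
    tree = {}
    for path in value.split(","):
        node = tree
        for segment in path.strip().split("/"):
            if not segment:
                raise ValueError(f"empty segment in $expand path {path!r}")
            node = node.setdefault(segment, {})
    return tree


wide_and_deep = parse_expand("Venue, Attendees/BestFriend")
```

For the wide-and-deep example, `Venue` and `Attendees` end up as siblings at the top level, with `BestFriend` nested under `Attendees`.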
For representing expanded links we put the expanded content inside the link element itself. According to section 4.2.7 of RFC 4287:
"The "atom:link" element defines a reference from an entry or feed to a Web resource. This specification assigns no meaning to the content (if any) of this element."
So it seems that adding content to the link element is not disallowed and at the same time it does not overlap with any existing semantics given to such construct. Based on that we thought it would be the perfect place for this information, as the link itself already contains the metadata about the link that we needed.
When a client indicates that the target of a link should be expanded, the server responds with the Atom representation of the resources pointed at by links wrapped in an <m:inline> element. For example, for "/Events(456)?$expand=Attendees,Venue" the response would be:
<entry xml:base="http://localhost:81/EventsSample/" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xmlns="http://www.w3.org/2005/Atom" m:type="EventsSample.Event">
<id>http://localhost:81/EventsSample/Events(456)</id>
<title type="text"></title>
<updated>2008-02-17T03:01:18Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Event" href="Events(456)" />
<link rel="related" type="application/atom+xml;type=entry" title="Venue" href="Events(456)/Venue">
<m:inline>
<entry m:type="EventsSample.Venue">
<id>http://localhost:81/EventsSample/Venues(789)</id>
<title type="text"></title>
<updated>2008-02-17T03:01:18Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Venue" href="Venues(789)" />
<link rel="related" type="application/atom+xml;type=entry" title="SalesContact" href="Venues(789)/SalesContact" />
<content type="application/xml">
<d:VenueID m:type="Int32">789</d:VenueID>
<d:Name>The Cool Place</d:Name>
<d:Description>Great place for parties!</d:Description>
<d:Capacity m:type="Int32">1500</d:Capacity>
<d:Type>Nightclub</d:Type>
</content>
</entry>
</m:inline>
</link>
<link rel="related" type="application/atom+xml;type=feed" title="Attendees" href="Events(456)/Attendees">
<m:inline>
<feed>
<title type="text">Attendees</title>
<id>http://localhost:81/EventsSample/Events(456)/Attendees</id>
<updated>2008-02-17T03:01:18Z</updated>
<link rel="self" title="Attendees" href="Events(456)/Attendees" />
<entry m:type="EventsSample.Contact">
<id>http://localhost:81/EventsSample/Contacts(123)</id>
<title type="text"></title>
<updated>2008-02-17T03:01:18Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Contact" href="Contacts(123)" />
<link rel="related" type="application/atom+xml;type=entry" title="BestFriend" href="Contacts(123)/BestFriend" />
<content type="application/xml">
<d:ContactID m:type="Int32">123</d:ContactID>
<d:FirstName>John123</d:FirstName>
<d:LastName>Doe123</d:LastName>
<d:EmailAddress>jd123@foo.com</d:EmailAddress>
<d:Phone>123-456-123</d:Phone>
<d:BirthDate m:type="Nullable`1[System.DateTime]">1990-04-01T00:00:00</d:BirthDate>
</content>
</entry>
<entry m:type="EventsSample.Contact">
<id>http://localhost:81/EventsSample/Contacts(124)</id>
<title type="text"></title>
<updated>2008-02-17T03:01:18Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Contact" href="Contacts(124)" />
<link rel="related" type="application/atom+xml;type=entry" title="BestFriend" href="Contacts(124)/BestFriend" />
<content type="application/xml">
<d:ContactID m:type="Int32">124</d:ContactID>
<d:FirstName>John124</d:FirstName>
<d:LastName>Doe124</d:LastName>
<d:EmailAddress>jd124@foo.com</d:EmailAddress>
<d:Phone>123-456-124</d:Phone>
<d:BirthDate m:type="Nullable`1[System.DateTime]">1990-05-01T00:00:00</d:BirthDate>
</content>
</entry>
<!-- more entries for contacts that will be -->
<!-- attendees in this party -->
</feed>
</m:inline>
</link>
<content type="application/xml">
<d:EventID m:type="Int32">456</d:EventID>
<d:Name>Big Party</d:Name>
<d:NoteToAttendees>It's going to be a great party!</d:NoteToAttendees>
<d:DateAndTime m:type="DateTime">2008-03-05T06:00:00</d:DateAndTime>
</content>
</entry>
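From the client side, a generic XML parser is enough to pick the expanded content out of the links. The sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed-down sample modeled on the response above; the `related_links` function and the shape of its result are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
META = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"


def related_links(entry):
    """Map each navigation property (the link title) to its href and inlined entry ids."""
    result = {}
    for link in entry.findall(f"{{{ATOM}}}link"):
        if link.get("rel") != "related":
            continue
        inline = link.find(f"{{{META}}}inline")
        entries = []
        if inline is not None:
            # A single inlined entry, or a feed of entries, one level deep.
            entries = inline.findall(f"{{{ATOM}}}entry")
            entries += inline.findall(f"{{{ATOM}}}feed/{{{ATOM}}}entry")
        result[link.get("title")] = {
            "href": link.get("href"),
            "is_feed": link.get("type", "").endswith("type=feed"),
            "expanded": inline is not None,
            "ids": [e.findtext(f"{{{ATOM}}}id") for e in entries],
        }
    return result


# Trimmed-down sample: Venue is expanded inline, Attendees is not.
SAMPLE = f"""
<entry xmlns="{ATOM}" xmlns:m="{META}">
  <link rel="edit" title="Event" href="Events(456)" />
  <link rel="related" type="application/atom+xml;type=entry" title="Venue" href="Events(456)/Venue">
    <m:inline>
      <entry><id>http://localhost:81/EventsSample/Venues(789)</id></entry>
    </m:inline>
  </link>
  <link rel="related" type="application/atom+xml;type=feed" title="Attendees" href="Events(456)/Attendees" />
</entry>
"""
links = related_links(ET.fromstring(SAMPLE))
```

A processor that knows nothing about the inline extension can still ignore the link content and follow the href, which is part of the appeal of putting the expanded data inside the link element.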
I focused on the GET operations above. We think it would be better to stay away from attempting to support full modification operations on expanded graphs. In particular, we do not handle PUT on more than one entry at a time today. We do support POSTing an expanded graph, and we simply create all the nested entries and link them to the parent entry, creating the whole graph in a single operation.
Feedback in general about this approach would be greatly appreciated.
Pablo Castro
Technical Lead
Microsoft Corporation
-
We have been looking for the last few months at adding first-class support for AtomPub to Project Astoria (we briefly touched on it before here). We are at a point where we have some parts of the AtomPub story and their initial implementation running (and we'll share fresh experimental bits soon), some parts on the design board and other parts that we haven’t explored yet. I wanted to write a few words to explain why we think AtomPub support is important and enumerate the challenges we face. Guidance and general feedback on the reasoning, the approach and on the details is very much appreciated.
Why are we looking at AtomPub?
Astoria data services can work with different payload formats and to some level different user-level details of the protocol on top of HTTP. For example, we support a JSON payload format that should make the life of folks writing AJAX applications a bit easier. While we have a couple of these kinds of ad-hoc formats, we wanted to support a pre-established format and protocol as our primary interface.
If you look at the underlying data model for Astoria, it boils down to two constructs: resources (addressable using URLs) and links between those resources. The resources are grouped into containers that are also addressable. The mapping to Atom entries, links and feeds is so straightforward that it is hard to ignore. Of course, the devil is in the details and we'll get to that later on.
The interaction model in Astoria is just plain HTTP, using the usual methods for creating, updating, deleting and retrieving resources. Furthermore, we use other HTTP constructs such as "ETags" for concurrency checks, "location" to know where a POSTed resource lives, and so on. All of these also map naturally to AtomPub.
From our (Microsoft) perspective, you could imagine a world where our own consumer and infrastructure services in Windows Live could speak AtomPub with the same idioms as Astoria services, and thus could both have a standards-based interface and also use the same development tools and runtime components that work with any Astoria-based server. This would mean fewer clients/development tools for us to create and more opportunity for our partners in the libraries and tools ecosystem out there.
How are we approaching this?
We are simply mapping whatever we can to regular AtomPub elements. Sometimes that is trivial, sometimes we need to use extensions and sometimes we leave AtomPub alone and build an application-level feature on top. Here is an initial list of aspects we are dealing with in one way or the other. We’ll also post elaborations of each one of these to the appropriate Atom syntax|protocol mailing lists.
a) Mapping the data model: how do we map Astoria’s underlying data model, the Entity Data Model, to Atom constructs. This is quite straightforward but it deserves a look for completeness.
b) We use just the regular format/protocol whenever we can; we would be interested in validating our use with folks out there
c) Using AtomPub constructs and extensibility mechanisms to enable Astoria features:
· Inline expansion of links (“GET a given entry and all the entries related through this named link”: how do we represent a request and the answer to such a request in Atom?).
· Properties for entries that are media link entries and thus cannot carry any more structured data in the <content> element
· HTTP methods acting on bindings between resources (links) in addition to resources themselves
· Optimistic concurrency over HTTP, use of ETags and in general guaranteeing consistency when required
· Request batching (e.g. how does a client send a set of PUT/POST/DELETE operations to the server in a single go?)
d) Astoria design patterns that are not AtomPub format/protocol concepts or extensions:
· Astoria gives semantics to URLs and has a specific syntax to construct them
· How metadata that describes the structure of service endpoints is exposed. This ranges from being able to find entry points (e.g. collections in service documents) to having a way of discovering the structure of entries that contain structured data
e) How do we deal with aspects that AtomPub does not handle by design or just because it has not been needed so far?
· What to do with fields that may not have a backing value in the input source (e.g. updated, author).
· Replace versus merge semantics during updates
f) High-level client libraries. How high-level can we make clients so they can consume AtomPub-based Astoria services but still feel that they are working against regular objects and have general integration with the development environment?
There are probably more, but I think this is a good starting list.
Where do we go from here?
The folks in the AtomPub community understand this the best, so we’ll take our questions to the atom-syntax and atom-protocol lists to hear opinions there. We’ll probably track posts and comments in the Astoria blog as well so people that follow it can keep track of what’s going on in this space.
Pablo Castro
Technical Lead
Microsoft Corporation
-
I've responded to a few posts on our online forums asking what the motivations were for building Astoria. After one of our recent forum replies, someone left a comment saying the replies would make a good blog post. So, what follows is a few of my responses to those forum questions, appended together and touched up a bit so they can be read as a single post.
In general, the goal of the ADO.NET Data Services Framework is to create a simple REST-based framework for exposing data-centric services. We built the framework in part from an analysis of traditional websites, and then looked at how architectures were changing with the move to AJAX and RIA-based applications. One key observation the team made was that in traditional approaches to web development, the information exchanged between a client (ex. a web browser) and the mid-tier was a combination of presentation + behavior + data (ex. an HTML file with JavaScript and inline HTML tables of data), and that the core interactions to retrieve raw data were between the mid-tier and the backend store (ex. SQL Server or other). When we looked at RIA, AJAX, smart client, etc. applications, it became apparent that these architectures pushed much more "smarts" to the client tier: the client first retrieves the presentation + behavior information (as a DLL in the case of a Silverlight application) and then, as the user interacts with the client application, the app turns back (ex. background async call) to the mid-tier to retrieve the data needed to drive the user experience. This is nothing new (separation of presentation + behavior from data), but it’s interesting to note that it is now not only a best practice but mandated by the architectures of today’s web and RIA apps. From this we looked at how such clients could consume data from the mid-tier today and how we could help improve the experience for the developer. A few areas came up:
1) Creating and maintaining rich data-oriented services with current approaches requires a significant developer investment
2) Building general purpose client libs/tools with current approaches to data-centric services is hard
For #1, imagine you wanted to expose the data in your CRM database to your client tier application. Further assume you want to enable typical application scenarios like retrieving sorted views of the data, paging over the data, filtering, etc. To expose this data as a set of callable remote methods (using current approaches to developing web services) you would need to write a large number of methods to expose each of the entities in your CRM DB (customers, orders, etc.) and then add additional methods for each to retrieve entities by key, sort them, page over them, and so on. ADO.NET Data Services addresses this issue by allowing you to declaratively state the contract of such a data-centric service: you tell us the schema of the data, and the data services technology automatically creates the required remote endpoints, enabling paging, sorting, etc. with no code from the developer. Then, as you change your data model, your service endpoints change with it.
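To make the paging and sorting point a bit more concrete, here is a minimal Python sketch of how a client might compose request URIs against such a service. It is only an illustration: the service root `http://host/crm.svc` and the `Customers` entity set are hypothetical, the helper names are made up for this example, and it assumes the `$orderby`, `$top` and `$skip` query options from the data services URI conventions.

```python
from urllib.parse import quote, urlencode


def build_query_uri(service_root, entity_set, orderby=None, top=None, skip=None):
    """Compose a query URI for an entity set (hypothetical helper).

    Assumes the $orderby, $top and $skip query options; only the options
    that were actually supplied are added to the query string.
    """
    options = {}
    if orderby is not None:
        options["$orderby"] = orderby
    if top is not None:
        options["$top"] = top
    if skip is not None:
        options["$skip"] = skip

    uri = f"{service_root.rstrip('/')}/{quote(entity_set)}"
    if options:
        # Keep "$" and "," readable instead of percent-encoding them.
        uri += "?" + urlencode(options, safe="$,", quote_via=quote)
    return uri


def page_uri(service_root, entity_set, orderby, page, page_size):
    """URI for a zero-based page of a sorted view of an entity set."""
    return build_query_uri(
        service_root, entity_set,
        orderby=orderby, top=page_size, skip=page * page_size,
    )


if __name__ == "__main__":
    # Hypothetical service root and entity set, third page of ten customers.
    print(page_uri("http://host/crm.svc", "Customers", "CompanyName", 2, 10))
```

On the service side, none of these sorted or paged views would need a hand-written method; the client simply asks for the slice it wants.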
For #2 above, an interesting artifact of a REST-based approach to web services is that it promotes creating a uniform interface. That is, how you address items in an ADO.NET Data Service (i.e. how to construct URIs), how you interact with data (using HTTP verbs), etc. is the same across any ADO.NET Data Service, regardless of the data it exposes. This uniform interface enables code reuse against your web services, such that one can create reusable client libraries and UI widgets for all of them. For example, the ADO.NET Data Services team is doing this by shipping .NET, Silverlight, AJAX, etc. libraries which can talk to any data service. In addition, the uniform interface enables us to add features such as LINQ to ADO.NET Data Services, since the translation of LINQ query statements to URIs is stable and well known.
So far I’ve talked mainly about how Astoria fits into the mid-tier and makes aspects of app development easier for AJAX/RIA/etc. developers. Another key trend that drove us to build the ADO.NET Data Services Framework was the observation that an ever increasing amount of data is being stored in the cloud, and web-based APIs to access that data seem to be growing by the minute. A goal of the ADO.NET Data Services Framework is to fit into this use case by providing a way of easily creating a service that has a RESTful API and a uniform interface that plays well with libraries and tools, just like in the app (AJAX, Silverlight, etc.) use case described above.
This post is already getting a bit long, but in addition to the items noted above, the general advantages of REST-based approaches also apply, such as rich integration with HTTP, so that you can leverage the existing HTTP infrastructure (ex. HTTP proxies) deployed at large.
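As a rough sketch of what the uniform interface buys a library author, the Python snippet below maps generic create/read/update/delete operations to an HTTP verb and a URI without knowing anything about the data behind the service. The names here (`Request`, `describe`, the `http://host/crm.svc` root, the `Customers` and `Orders` sets) are hypothetical, and the `Customers('ALFKI')` key syntax and quote escaping are assumptions for illustration rather than a statement of the exact URI rules.

```python
from dataclasses import dataclass

# Assumed mapping of generic operations to HTTP verbs.
VERBS = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}


@dataclass(frozen=True)
class Request:
    """The verb and URI a generic client would send (hypothetical type)."""
    method: str
    uri: str


def format_key(key):
    """Render a key value for use in a URI.

    Assumption: string keys are single-quoted (embedded quotes doubled)
    and numeric keys are written as-is.
    """
    if isinstance(key, str):
        return "'" + key.replace("'", "''") + "'"
    return str(key)


def describe(operation, service_root, entity_set, key=None):
    """Describe the request for an operation on any entity set."""
    if operation not in VERBS:
        raise ValueError(f"unknown operation: {operation}")
    if operation == "create" and key is not None:
        raise ValueError("create targets the entity set, not a single entity")
    if operation in ("update", "delete") and key is None:
        raise ValueError(f"{operation} requires an entity key")

    uri = f"{service_root.rstrip('/')}/{entity_set}"
    if key is not None:
        uri += f"({format_key(key)})"
    return Request(VERBS[operation], uri)


if __name__ == "__main__":
    root = "http://host/crm.svc"  # hypothetical service root
    print(describe("read", root, "Customers", "ALFKI"))
    print(describe("create", root, "Orders"))
    print(describe("delete", root, "Orders", 10248))
```

Because `describe` never looks at what an entity set contains, the same function would serve a CRM service, a product catalog, or anything else exposed the same way, which is the property that makes shared client libraries practical.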
After writing this, I couldn't help but wonder how this aligns with the general view the developer community has of Astoria now that we have released a few CTPs and everyone can try out the bits first hand. After reading this post, did this describe your overall view of Astoria and its target use cases?
- Mike Flasko
Program Manager, ADO.NET Data Services Framework
-
This past week we attended to some very high priority issues (shown below). We got a few good shots of our team ....
This isn't quite the whole Astoria team (for the next set of pics we'll have to get one of the entire team), but the folks in the picture are:
Back row (left to right): Andy Conrad, Carl Perry, Marcelo Ruiz, Mike Flasko (me), Pratik Patel
Front row (left to right): Chris Robinson, Shyam Pather, Pablo Castro
The pic below has the same folks as the one above, except it adds Phani Raj on the far right.
Waiting in the cold to hit the slopes ..... I'm not posting any after photos as we looked a bit rough after the day :)
-Mike Flasko
Program Manager, ADO.NET Data Services
-
We made a few tweaks to our URI syntax to clean it up in the last CTP of ADO.NET Data Services. Marcelo details them here: Updates to URL syntax for December CTP of ADO.NET Data Services
-Mike
-
Marcelo from our team has posted a nice write-up detailing how the $filter query string operator works in ADO.NET Data Services and highlighting a number of the functions it supports. Check out his post here: $filter Query Option in ADO.NET Data Services
-Mike
Program Manager, ADO.NET Data Services