(sorry, tricky problem -> long write-up)
One of the few things pending in the server library of the ADO.NET Data Services Framework is query caching to help with performance. Here is a brief explanation of why we need it and a couple of design options. Feedback is welcome.
Query processing in Astoria
To give a little bit of context, let me first briefly go through the process that takes place when Astoria receives a URL and works its magic to turn it into query results ready for serialization in the HTTP response.
Data sources are hooked into Astoria services through LINQ expression trees. That means that the role of Astoria during query processing is to take a URL and translate it into an expression tree that we can then ask the data source to execute and give us results. The general flow is more or less like this:
URL -> Expression Tree -> [Data Source Execution and Materialization] -> Objects -> Serialization (Atom/JSON)
The thing in brackets is data source-specific. For example, if using the ADO.NET Entity Framework it would look like this:
URL -> Expression Tree -> Canonical Query Tree -> View expansion/Query simplification -> Canonical Query Tree -> SQL -> Rows (DataReader) -> Entities (DataReader) -> Objects -> Serialization (Atom/JSON)
This is not a cheap thing to do in every request, hence this discussion about query caching.
Why query caching
The processing pipeline that I showed above can be expensive. In particular, all that query translation between different tree types as well as all that analysis for view expansion and query simplification is an expensive, CPU-bound activity that we want to avoid as much as we can.
To help with this we are planning to introduce query caching, similar to what database systems do (e.g. the SQL Server “proc cache”). The idea of query caching is that for commonly requested URLs we’d bypass most of the processing required to setup the query for execution, and we’d go as directly as possible to the execution phase.
Since Astoria is a generic framework that works over many data sources, we have to enable this in a way that lets different data sources plug in their own query caching capabilities, if they have them.
Data source-independent query compilation
In order to implement caching, we need a way to do our own translation work, then tell the data source to do its own work on the expression trees, and then save all of that work to be reused every time we have to respond to the "same" request (the definition of "same" is, well… difficult; more on this later). In other words, we need a way of compiling queries.
You can imagine an interface with two methods for this:
object CompileQuery(Expression<Func<T1, T2, …, Tn, TResult>> query);
IEnumerable ExecuteCompiledQuery(object compiledQuery);
The idea is that each data source can give whatever meaning it wants to "compiling" a query, returning an opaque token (object) that we'd pass back, along with parameter values, in order to execute the query. Ideally as much work as possible happens at compile time. For example, there is a CompiledQuery class in LINQ to SQL and LINQ to Entities that does the translation all the way to a SQL statement only once and then reuses it in all subsequent executions (re-binding parameter values).
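To make that concrete, here is a sketch of how the existing LINQ to Entities CompiledQuery class is used; NorthwindContext and its Customers set are hypothetical model types standing in for whatever the service exposes:

```csharp
// the expensive translation to SQL happens once, when the delegate is built;
// NorthwindContext/Customer are illustrative model types
static readonly Func<NorthwindContext, string, IQueryable<Customer>> customersByCity =
    CompiledQuery.Compile((NorthwindContext ctx, string city) =>
        ctx.Customers.Where(c => c.City == city));

// each invocation just re-binds the parameter values and executes
var seattleCustomers = customersByCity(context, "Seattle");
```

This is exactly the shape of work we'd like to capture behind the CompileQuery/ExecuteCompiledQuery pair above, without taking a dependency on any one data source.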
Query caching design, part 1: for simple cases, a simple design does it
In the simpler cases where the data service has no customizations, the URL -> Expression Tree translation is 1:1. The only consideration in this case is to parameterize the URLs so that we don't fragment the query cache with URLs that represent the same query with different constants (e.g. /Customers(1) vs. /Customers(2), or /Customers?$filter=City eq 'Seattle' vs. /Customers?$filter=City eq 'Las Vegas').
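As a minimal sketch of what parameterization means (this is not the real URL parser, just an illustration), we can lift string and numeric literals out of the URL and collect them as positional arguments:

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// illustrative sketch only: normalize literals out of a URL so that requests
// differing only in constants map to the same cache entry
static string ParameterizeUrl(string url, List<object> args)
{
    // quoted string literals first, then bare numeric literals
    url = Regex.Replace(url, @"'[^']*'",
        m => { args.Add(m.Value.Trim('\'')); return "@p" + (args.Count - 1); });
    url = Regex.Replace(url, @"\b\d+\b",
        m => { args.Add(int.Parse(m.Value)); return "@p" + (args.Count - 1); });
    return url;
}

// ParameterizeUrl("/Customers(1)", args) and ParameterizeUrl("/Customers(2)", args)
// both normalize to "/Customers(@p0)", with different values in args
```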
In this case we can keep a simple map of parameterized URL to compiled query. If we don't find a given URL, we go through the full translation process to produce an expression tree, and then (assuming the data source supports compilation) call CompileQuery to obtain a compiled query opaque token. On subsequent executions we look up the URL, find the compiled query token, bind the constants that we turned into parameters to parameter values, and execute the query directly.
The rest is standard caching stuff…keep a map, have a limit and an eviction policy, make it “spike resistant”, etc.
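Putting the pieces together, the lookup path could look roughly like this; Parameterize, TranslateUrlToExpression and dataSource are illustrative stand-ins for the real machinery, and the args parameter on execution is an assumption about how bound values would flow through:

```csharp
// illustrative sketch of the cache lookup path; cache limits, eviction and
// thread safety are elided
readonly Dictionary<string, object> queryCache = new Dictionary<string, object>();

IEnumerable ExecuteRequest(string url)
{
    List<object> args = new List<object>();
    string key = Parameterize(url, args); // constants -> parameters

    object compiled;
    if (!queryCache.TryGetValue(key, out compiled))
    {
        // cache miss: full URL -> Expression Tree translation, then let the
        // data source do its own (expensive) compilation work once
        LambdaExpression tree = TranslateUrlToExpression(key);
        compiled = dataSource.CompileQuery(tree);
        queryCache[key] = compiled;
    }

    // hit or miss, execution just binds the parameter values and runs
    return dataSource.ExecuteCompiledQuery(compiled, args);
}
```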
Of course, life is rarely that simple…
So far we’ve assumed that there is a 1:1 correspondence between (parameterized) URLs and expression trees. Astoria has a nice feature called “query interceptors” that causes that correspondence to break.
A query interceptor allows the service developer to introduce a custom filter predicate for each entity set that’s exposed through the service interface. For example, if you had a Customers table and a CustomerAccess table that indicates which user-ids can see which customers, you could implement entity-level security for customers by adding this interceptor:
[QueryInterceptor("Customers")]
public Expression<Func<Customer, bool>> QueryCustomers()
{
    // retrieve the user from the environment (e.g. the currently logged-in user)
    UserDescriptor u = GetUser();
    if (u.IsAdmin)
        return c => true; // administrators can see all customers
    else
        return c => c.CustomerAccess.Any(ca => ca.UserID == u.UserID);
}
Interceptors are invoked whenever a URL involves the entity set the interceptor is bound to, regardless of how that set is reached. The system injects a Where operator with the predicate returned by the interceptor. This includes top-level entity-set access (e.g. /Customers), access of subsets through link traversal (e.g. /SalesPeople(123)/AssignedCustomers), and inline expansion (e.g. /SalesPeople(123)?$expand=AssignedCustomers).
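Conceptually the injection is just a Where composed on top of whatever IQueryable the URL resolved to; in this sketch, service and context are illustrative stand-ins for the service instance and the underlying data context:

```csharp
// conceptual sketch: whatever source the URL resolved to, the system
// composes the interceptor's predicate on top of it
Expression<Func<Customer, bool>> predicate = service.QueryCustomers();

IQueryable<Customer> source = context.Customers;         // e.g. /Customers
IQueryable<Customer> filtered = source.Where(predicate); // injected Where
```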
Note that now URLs and expression trees are no longer 1:1. There are two kinds of differences that might show up:
a) Same query but different parameters. For example, if user 1 and user 2, neither of them an administrator, fetch the URL /Customers, then both requests produce identical query trees; the only difference is the value of "u.UserID". Unlike URL parameterization, in this case we are not the ones parsing the input, so we cannot simply substitute parameters for constants up front.
b) Different query. In the example above, if one of the users accessing the URL /Customers is administrator and the other is not, the filter predicates used in each of them will be different.
Now the caching thing got complicated.
Side-note: I’ve excluded update interceptors in this discussion because they don’t participate in query composition. They are a key part of enforcing access control the way I described above though.
Query caching design, part 2
We really wanted to make query caching work without any API surface other than maybe a configuration knob for the size of the cache. That’s not looking great at this point, but we do have a few options on the table.
The essence of the problem is the fact that query interceptors need to be able to capture data from the execution environment at the time a given request is being processed. The main example of this is grabbing the user credentials for a given request (e.g. Thread.CurrentThread.CurrentPrincipal, or the user-id extracted from an encrypted cookie/custom HTTP header), but there can be other scenarios as well.
There are a couple of approaches that could do the trick, but they come with trade-offs:
Option I: a bit of extra magic for a nicer API.
We could let users write interceptors just like the one shown above. Those interceptors mix references to environment data and the description of the filter predicate in a single construct: a LINQ expression tree that references external variables through its closure.
To cache queries we would have to:
· On every request, invoke all the interceptors that are needed
· Extract all uncorrelated subexpressions from each interceptor filter predicate and turn them into constants (do "funcletization" in LINQ jargon), then replace those constants with parameters. Now we have a little tree for each interceptor
· Use the parameterized URL plus all the interceptor trees as the caching key.
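As a sketch of the funcletization step, a visitor over the predicate can evaluate the closure member accesses to values and replace them with parameters; ExpressionVisitor here stands in for the tree-walking infrastructure, and the class is illustrative, not the real implementation:

```csharp
// hedged sketch of "funcletization": member accesses rooted in a closure
// constant (e.g. u.UserID) are evaluated to values and replaced with
// parameters, so user 1's and user 2's otherwise-identical predicates
// produce the same cache key
class Funcletizer : ExpressionVisitor
{
    public readonly List<object> Values = new List<object>();
    public readonly List<ParameterExpression> Parameters = new List<ParameterExpression>();

    protected override Expression VisitMember(MemberExpression node)
    {
        // uncorrelated subexpression: it does not depend on the query's range variable
        if (node.Expression is ConstantExpression)
        {
            object value = Expression.Lambda(node).Compile().DynamicInvoke();
            ParameterExpression p = Expression.Parameter(node.Type, "p" + Parameters.Count);
            Values.Add(value);
            Parameters.Add(p);
            return p; // only the parameter value varies per request, not the tree shape
        }
        return base.VisitMember(node);
    }
}
```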
Option II: explicitness at the cost of API complexity
The other approach is to explicitly separate the definition of the filter predicate from the per-execution state that comes from the environment. The developer would write two pieces, a statically-defined filter predicate and a method that sets up per-request state, as follows:
static Expression<Func<Customer, CustomersDbContext, Dictionary<string, object>, bool>> QueryCustomers()
{
    return (c, ctx, state) => (bool)state["IsAdmin"] ||
        c.CustomerAccess.Any(ca => ca.UserID == (int)state["UserID"]); // cast assumes an int user id
}

override void OnStartRequest(RequestDescriptor descriptor, Dictionary<string, object> state)
{
    // capture environment data once per request
    UserDescriptor u = GetUser();
    state["IsAdmin"] = u.IsAdmin;
    state["UserID"] = u.UserID;
}
As you can see, the definition of the filter predicate gets a bit trickier (more parameters in the lambda expression in particular).
Of course, since at this point the interceptor is called only once and can't use any request-bound context, we might as well remove the idea of a method altogether and move the filter setup to service initialization, where we already have APIs for configuring service policies. So the code would become:
static void InitializeService(IDataServiceConfiguration config)
{
    // ...other policy initialization...

    // register the static filter predicate (hypothetical registration API)
    config.RegisterQueryInterceptor<Customer>("Customers",
        (c, ctx, state) => (bool)state["IsAdmin"] ||
            c.CustomerAccess.Any(ca => ca.UserID == (int)state["UserID"]));
}

override void OnStartRequest(RequestDescriptor descriptor, Dictionary<string, object> state)
{
    UserDescriptor u = GetUser();
    state["IsAdmin"] = u.IsAdmin;
    state["UserID"] = u.UserID;
}
The idea is that OnStartRequest (or whatever ends up being a nice name for it) explicitly captures the state that any and all interceptors will use. Interceptors then become static constructs that are set up during initialization. An important detail is that the predicate expression can no longer refer to variables in the environment. Instead, anything request-specific must be accessed through the "state" object, which is populated with data on a per-request basis.
This yields a very efficient system, because now we can do minimal work on repeated requests that result in the same query, even in the presence of interceptors. On the other hand, the code that developers have to write is pretty tricky…
Is it better to take a performance hit and leave it more usable? Is there a middle ground that we didn’t consider?
If you made it this far, you're probably one of the few :). In any case, feedback is very welcome.
This post is part of the transparent design exercise in the Astoria Team. To understand how it works and how your feedback will be used please look at this post.