(A few days ago I decided to investigate implementing an IQueryable provider. The first reference I always find for this task is Matt Warren's blog post

http://blogs.msdn.com/b/mattwar/archive/2007/07/30/linq-building-an-iqueryable-provider-part-i.aspx

There's nothing wrong with Matt's post, except that even on my third read of his introduction I found it very hard to understand what he's saying. So for me, and for everyone like me, I'm writing an alternative introduction. It doesn't replace his series in any way; I just think that if you are going to read his series, reading this as well can help with some of the concepts.)

Fact #1: If you really want to do a custom IQueryable where you provide the execution logic, there are actually two interfaces to implement: IQueryable and IQueryProvider.

But here is the counter-fact: if you don't want lots of fancy execution logic and only need something that implements IQueryable to satisfy some inheritance rules, you can do the quick and dirty IQueryable<T> implementation, where you forward the calls to some other IQueryable such as EnumerableQuery<T>, like this:

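using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// ForwardingQueryable is a placeholder name; use whatever your class is called.
public class ForwardingQueryable<T> : IQueryable<T>
{
    private readonly IQueryable<T> _queryable;

    // data is an IEnumerable<T> holding the dataset
    public ForwardingQueryable(IEnumerable<T> data)
    {
        _queryable = new EnumerableQuery<T>(data);
    }

    public Type ElementType { get { return typeof(T); } }
    public Expression Expression { get { return _queryable.Expression; } }
    public IQueryProvider Provider { get { return _queryable.Provider; } }

    // IQueryable<T> is also IEnumerable<T>, so enumeration is forwarded too.
    public IEnumerator<T> GetEnumerator() { return _queryable.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}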

Fact #2: There are two common varieties of objects which implement IQueryable: Leaves and Branches.

Or Terms and Expressions. Same idea. Either way…

Fact #3: An IQueryable is an Expression.

As in a System.Linq.Expressions.Expression.

An IQueryable is notionally equivalent to some expression (which is a piece of code) that might either look like this:

var tempQueryable = (IQueryable)BLARG;

or it might look like this:

var tempQueryable = BLARG.Where(x => x.IsNice);

In the first case the IQueryable is BLARG, or to put it another way, it is the trivial expression returning BLARG.

In the second case, the IQueryable is created from a non-trivial expression: Where(x => x.IsNice) acting on BLARG. The method which created this expression is the Queryable.Where() extension method.

We can think of the first one as a leaf node in the tree, or an expression term. We can think of the second as a branch in the tree (the expression tree), or a non-trivial expression.

You can retrieve 'the' expression that the IQueryable represents from the IQueryable itself, by reading the IQueryable.Expression property.
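To make the leaf/branch idea concrete, here is a small sketch using in-memory data. The Item class, the LeafAndBranch class and its Describe() method are names I made up for illustration, and an array turned into a queryable with AsQueryable() stands in for BLARG.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical element type standing in for whatever BLARG contains.
public class Item
{
    public bool IsNice { get; set; }
}

public static class LeafAndBranch
{
    // Returns the node types of the leaf expression, the branch expression,
    // and the branch's first argument (which is the leaf again).
    public static ExpressionType[] Describe()
    {
        IQueryable<Item> blarg = new[] { new Item { IsNice = true } }.AsQueryable();
        IQueryable<Item> nice = blarg.Where(x => x.IsNice);

        var call = (MethodCallExpression)nice.Expression;
        return new[]
        {
            blarg.Expression.NodeType,   // Constant: the trivial expression returning BLARG
            nice.Expression.NodeType,    // Call: Queryable.Where(...) acting on BLARG
            call.Arguments[0].NodeType   // Constant: the leaf sits underneath the branch
        };
    }
}
```

Calling LeafAndBranch.Describe() should give Constant, Call, Constant: the leaf is a constant node, and the Where() branch is a method call node with that same leaf as its first argument.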

Once we understand this fact, we can ask ourselves why we need the IQueryable interface at all. Why are we not just using Expressions themselves? Well,

  • Reason #1: an IQueryable carries with it one other very important thing, which is knowledge of how to execute (or interpret) the expression. This is IQueryable.Provider.
  • Reason #2: the IQueryable doesn't literally have to return the exact same expression it was created with. It just has to be something that is logically the same, evaluates to the same result, and that the query provider knows how to work with.

Fact #4: Everything you do with IQueryables until you enumerate them is just building more and bigger expressions.


If BLARG is an IQueryable<T>, BLARG.Where(x => x.IsNice) doesn't actually do anything except build a bigger expression.
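One way to convince yourself of this is to count how often the predicate runs. The sketch below is mine, not Matt's: DeferredDemo, Calls and IsNice are made-up names, and the data is a plain in-memory list behind AsQueryable().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the predicate has actually been run.
    public static int Calls;

    public static bool IsNice(int x)
    {
        Calls++;
        return x % 2 == 0;
    }

    // Returns the call count before enumerating, the call count after
    // enumerating, and the number of results.
    public static int[] Run()
    {
        Calls = 0;
        IQueryable<int> blarg = new List<int> { 1, 2, 3, 4 }.AsQueryable();

        // Only builds a bigger expression; IsNice is not called here.
        IQueryable<int> nice = blarg.Where(x => IsNice(x));
        int before = Calls;

        // Enumerating is what finally executes the expression.
        List<int> results = nice.ToList();
        int after = Calls;

        return new[] { before, after, results.Count };
    }
}
```

DeferredDemo.Run() should report 0 calls before the ToList(), 4 calls after it, and 2 results.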

Fact #5: The IQueryable extension methods build bigger expressions by calling IQueryProvider.CreateQuery().


I imagine this extensibility point is useful if you wanted to do fun stuff like optimize the expression online as it is being built… But otherwise it seems a little redundant.
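As a rough illustration of what an extension method like Where() does under the hood, here is a hand-rolled version. This is a sketch of the idea rather than the framework's actual source; CreateQueryByHand and its method names are made up, and it is hard-wired to int to keep things short.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class CreateQueryByHand
{
    // Does by hand roughly what source.Where(predicate) does.
    public static IQueryable<int> Where(IQueryable<int> source,
                                        Expression<Func<int, bool>> predicate)
    {
        // 1. Build a node meaning "Queryable.Where(<source expression>, predicate)".
        MethodCallExpression call = Expression.Call(
            typeof(Queryable), "Where",
            new[] { typeof(int) },
            source.Expression, Expression.Quote(predicate));

        // 2. Ask the source's own provider to wrap that node in a new IQueryable.
        return source.Provider.CreateQuery<int>(call);
    }

    public static int[] Run()
    {
        IQueryable<int> blarg = new[] { 1, 2, 3, 4 }.AsQueryable();
        return Where(blarg, x => x > 2).ToArray();
    }
}
```

CreateQueryByHand.Run() should return 3 and 4, the same as blarg.Where(x => x > 2) would. Notice that the provider gets to decide what kind of IQueryable comes back, which is the extensibility point mentioned above.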

Fact #6: It's easy to implement IQueryable.


In his first blog post, Matt provides a template class called Query<T> for doing this. It can correctly implement IQueryable, given an arbitrary IQueryProvider.

Fact #7: It's also easy to implement IQueryProvider.CreateQuery().


Matt provides template code for doing this too.
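To give a feel for how little is involved in Facts #6 and #7, here is my own stripped-down sketch. It is not Matt's code: SketchQuery<T>, SketchProvider and SketchDemo are made-up names, the non-generic CreateQuery() takes a shortcut to find the element type, and Execute() is deliberately left unimplemented.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// A queryable that only stores an expression and a provider.
public class SketchQuery<T> : IQueryable<T>
{
    public SketchQuery(IQueryProvider provider)
    {
        Provider = provider;
        Expression = Expression.Constant(this); // leaf: the trivial expression
    }

    public SketchQuery(IQueryProvider provider, Expression expression)
    {
        Provider = provider;
        Expression = expression;                // branch: built by CreateQuery
    }

    public Type ElementType { get { return typeof(T); } }
    public Expression Expression { get; private set; }
    public IQueryProvider Provider { get; private set; }

    // Enumerating is the point where the provider finally has to execute.
    public IEnumerator<T> GetEnumerator()
    {
        return ((IEnumerable<T>)Provider.Execute(Expression)).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public class SketchProvider : IQueryProvider
{
    public IQueryable<TElement> CreateQuery<TElement>(Expression expression)
    {
        return new SketchQuery<TElement>(this, expression);
    }

    public IQueryable CreateQuery(Expression expression)
    {
        // Simplifying assumption: expression.Type is a generic type whose
        // single type argument is the element type, e.g. IQueryable<X>.
        Type elementType = expression.Type.GetGenericArguments()[0];
        Type queryType = typeof(SketchQuery<>).MakeGenericType(elementType);
        return (IQueryable)Activator.CreateInstance(queryType, this, expression);
    }

    public TResult Execute<TResult>(Expression expression)
    {
        return (TResult)Execute(expression);
    }

    public object Execute(Expression expression)
    {
        // The real work (an interpreter or translator) would go here.
        throw new NotSupportedException("Execution is not part of this sketch.");
    }
}

public static class SketchDemo
{
    // Builds a query against the sketch provider without ever executing it.
    public static IQueryable<int> Build()
    {
        IQueryable<int> blarg = new SketchQuery<int>(new SketchProvider());
        return blarg.Where(x => x > 1).Select(x => x * 2);
    }
}
```

SketchDemo.Build() works even though Execute() throws, because Where() and Select() only ever call CreateQuery(). The result is another SketchQuery<int> whose Expression is the Select call sitting on top of the Where call. Enumerating it is what would hit the NotSupportedException.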

Fact #8: IQueryable, System.Linq.Expressions and IQueryProvider.Execute() are kinda like the interpreter pattern.

Do you know your design patterns? Don't worry, this one only takes a minute.

http://en.wikipedia.org/wiki/Interpreter_pattern

Laying out the mapping from Wikipedia's diagram to IQueryable, we have a close correspondence with one interesting difference:

  • AbstractExpression, TerminalExpression, and NonterminalExpression all map to IQueryable (or IQueryable.Expression)
  • 'Interpret(Context)' is like IQueryProvider.Execute(). But instead of the expression knowing how to interpret itself, that task has been farmed out to a different object, the query provider. This design makes sense for LINQ scenarios because a) the expressions we're dealing with are fairly complex, and knowledge of how to optimally execute the entire tree is more than any single node can hold, and b) the expressions don't even know which domain (SQL? Enumerables? Something else?) they are dealing with. Expressions need to be able to vary their behavior across domains.
  • Context doesn't exist explicitly - it is just the current state of your program.

Therefore, to implement IQueryProvider.Execute() you just have to write an interpreter. :)
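To show what "just write an interpreter" can look like at toy scale, here is a sketch that evaluates a tiny subset of expression trees (int constants, parameters, +, * and >). TinyInterpreter and its method names are made up, and a real provider would have to handle far more node types than this.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// The interpreter lives outside the expression nodes, as a query provider's would.
public static class TinyInterpreter
{
    // 'env' maps parameter names to their current values.
    public static object Interpret(Expression node, IDictionary<string, object> env)
    {
        switch (node.NodeType)
        {
            case ExpressionType.Constant:   // terminal expression
                return ((ConstantExpression)node).Value;

            case ExpressionType.Parameter:  // terminal expression, looked up in env
                return env[((ParameterExpression)node).Name];

            case ExpressionType.Add:        // nonterminal: interpret children, then combine
            case ExpressionType.Multiply:
            case ExpressionType.GreaterThan:
                var binary = (BinaryExpression)node;
                int left = (int)Interpret(binary.Left, env);
                int right = (int)Interpret(binary.Right, env);
                if (node.NodeType == ExpressionType.Add) return left + right;
                if (node.NodeType == ExpressionType.Multiply) return left * right;
                return left > right;

            default:
                throw new NotSupportedException("Node type not handled: " + node.NodeType);
        }
    }

    // Interprets the body of x => x * 2 + 1 with x bound to 5.
    public static object Run()
    {
        Expression<Func<int, int>> f = x => x * 2 + 1;
        var env = new Dictionary<string, object> { { "x", 5 } };
        return Interpret(f.Body, env);
    }
}
```

TinyInterpreter.Run() should give 11. The expression nodes themselves contain no evaluation logic at all; everything lives in the interpreter, which is exactly the twist on the pattern described above.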

Fact #9: Actually, there's more than one way…

The way I've described things so far is to just build up the expression tree and postpone interpreting and evaluating it until the very last minute, in IQueryProvider.Execute(). There's really nothing to stop you, however, from analyzing and optimizing along the way, in IQueryProvider.CreateQuery().

Epilog

Hopefully you and I can now easily get Matt's posts. The rest of my post is just a few more helpful things I noticed before you jump in.

Tip #1: ExpressionVisitor is in the framework now

When Matt wrote his blog posts, this class didn't exist, so he had to write his own, but it was later released in .NET 4.0.

http://msdn.microsoft.com/en-us/library/system.linq.expressions.expressionvisitor.aspx

This class lets you implement logic for walking the expression tree just by writing handlers for the node types you care about. You are thereby saved from writing all of the repetitive 'look up what the expression type is, jump to the right code for walking the children of that expression type, and recurse' boilerplate.
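Here is a minimal sketch of a visitor built on the framework's ExpressionVisitor. MethodNameCollector and VisitorDemo are made-up names; the only handler written is the one for method calls, and the base class takes care of walking everything else.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Only the node type we care about gets a handler; the base class walks the rest.
public class MethodNameCollector : ExpressionVisitor
{
    public readonly List<string> Names = new List<string>();

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        Names.Add(node.Method.Name);
        return base.VisitMethodCall(node); // keep walking into the arguments
    }
}

public static class VisitorDemo
{
    // Collects the names of the query operators in a small query.
    public static List<string> Run()
    {
        IQueryable<int> query = new[] { 1, 2, 3 }.AsQueryable()
            .Where(x => x > 1)
            .Select(x => x * 2);

        var collector = new MethodNameCollector();
        collector.Visit(query.Expression);
        return collector.Names;
    }
}
```

VisitorDemo.Run() should return "Select" followed by "Where". The outermost call comes first because the tree is rooted at the last operator applied.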

Tip #2: There are all sorts of crazy things you can do in LINQ expressions which make writing a complete IQueryProvider.Execute() hard.

  • local variable references
  • selecting temporary objects which you continue to refer to in the expression
  • method invocations in query expressions, which may or may not be translatable to your problem domain, depending on what you are querying against
  • detecting and handling joins, if what you are querying against is not objects in memory
  • handling where and select occurring in either order
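To illustrate the first item in the list, here is a sketch of what a local variable reference looks like once it ends up inside an expression tree. CapturedLocalDemo is a made-up name, and compiling the sub-expression is just one way of getting the value back, not necessarily the best one for a real provider.

```csharp
using System;
using System.Linq.Expressions;

public static class CapturedLocalDemo
{
    // Returns the node type of the captured local, the node type of the
    // object it hangs off, and the value recovered from it.
    public static object[] Run()
    {
        int threshold = 3; // a local variable referenced from the query lambda
        Expression<Func<int, bool>> predicate = x => x > threshold;

        var comparison = (BinaryExpression)predicate.Body;
        Expression right = comparison.Right;

        // The local shows up as a field access on a compiler-generated closure
        // object, not as a plain constant 3.
        ExpressionType holder = ((MemberExpression)right).Expression.NodeType;

        // One way to get the value back: compile and run just that sub-expression.
        int value = Expression.Lambda<Func<int>>(right).Compile()();

        return new object[] { right.NodeType, holder, value };
    }
}
```

CapturedLocalDemo.Run() should give MemberAccess, Constant and 3. A provider that translates to something like SQL has to spot these nodes and turn them into plain values before translating, which is part of what makes a complete Execute() hard.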

Tip #3: Matt wrote an expression toolkit that includes some debugging aids


http://blogs.msdn.com/b/mattwar/archive/2008/11/17/building-a-linq-iqueryable-provider-part-xii.aspx

Tip #4: If you just want to intercept and modify a query, then let some other provider evaluate it, there's a NuGet package for that


See https://www.nuget.org/packages/QueryInterceptor