For a while now I've been thinking that the best way to get better at API & protocol design is to try to articulate how you do design.
Articulating your thought process has a number of significant benefits:
Clearly these are compelling.
So I thought I'd give this a go by trying to document an idealized version of the thought process for Any/All in OData...
OData had no way to allow people to query entities based on properties of a related collection, and a lot of people, myself included, wanted to change this...
I always look for things that I can borrow ideas from. Pattern matching, if you will, with something proven. Here you are limited only by your experience and imagination; there is an almost limitless supply of inspiration. The broader your experience, the more source material you have and the higher the probability that you will see a useful pattern.
If you need one (and you shouldn’t), this is yet another reason to keep learning new languages, paradigms and frameworks!
Sometimes there is something obvious you can pattern match with, other times you really have to stretch...
In this case the problem is really easy. Our source material is LINQ. LINQ already allows you to write queries like this:
from movie in Movies where movie.Actors.Any(a => a.Name == "Zack") select movie;
Which filters movies by the actors (a related collection) in the movie, and that is exactly what we need in OData.
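For readers who don't use C#, the same filter can be sketched in Python, where the built-in any() plays the role of LINQ's Any. The Movie and Actor classes and the sample data below are hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Actor:
    name: str


@dataclass
class Movie:
    name: str
    actors: List[Actor] = field(default_factory=list)


# Hypothetical sample data, not taken from any real service.
movies = [
    Movie("Movie A", [Actor("Zack"), Actor("Anna")]),
    Movie("Movie B", [Actor("Ben")]),
]

# Keep a movie if at least one of its actors is named Zack.
zack_movies = [m for m in movies if any(a.name == "Zack" for a in m.actors)]
print([m.name for m in zack_movies])  # ['Movie A']
```

The predicate runs once per actor, and the movie is kept as soon as one actor matches. That is the behaviour we want to express in a URL.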
My next step is generally to take the ideas from the source material and try to translate them to my problem space. As you do this you'll start to notice the differences between source and problem, some insignificant, some troublesome.
This is where judgment comes in: you need to know when to go back to the drawing board. For me, I know I'm on thin ice if I start saying things like 'well, if I ignore X and Y, and imagine that Z is like this'.
But don't give up too early either. Often the biggest wins are gained by comparing things that seem quite different until you look a little deeper and find a way to ignore the differences that don't really matter. For example, relational databases and the web seem very different until you focus on foreign keys vs hyperlinks, and pull vs push seems different until you read about IQueryable vs IObservable.
Notice that not giving up here is all about having a tolerance for ambiguity: the higher your tolerance, the further you can stretch an idea. That is vital if you want to be really creative.
In the case of Any/All it turns out that the inspiration and problem domain are very similar, and the differences are insignificant.
So how do you do the translation?
In OData predicate filters are expressed via $filter, so we need to convert this LINQ predicate:
(movie) => movie.Actors.Any(actor => actor.Name == "Zack")
into something we can put in an OData $filter.
Let's attack this systematically from left to right. In LINQ you need to name the lambda variable in the predicate (i.e. movie), but in OData there is no need to name the thing you are filtering; it is implicit. For example, this:
from movie in Movies where movie.Name == "Donnie Darko" select movie
is expressed like this in OData:
~/Movies/?$filter=Name eq 'Donnie Darko'
Notice there is no variable name, we access the Name of the movie implicitly.
So we can skip the ‘movie’.
Next, in LINQ the Any method is a built-in extension method called directly off the collection using '.Any'. In OData '/' is used in place of '.', and built-in methods are all lowercase, so that points to something like Actors/any(...), with the predicate still to be worked out.
As mentioned previously in OData we don't name variables, everything is implicit. Which means we can ignore the actor variable. That leaves only the filter, which we can convert using existing LINQ to OData conversion rules, to yield something like this:
~/Movies/?$filter=Actors/any(Name eq 'Zack')
There is a good chance you lost something important in this transformation, so my next step is generally to assess what information has been lost in translation, paying particular attention to things that are important enough that your source material had specific constructs to capture them.
As you notice these differences you either need to convince yourself the differences don’t matter, or you need to add something new to your solution to bring it back.
In our case you'll remember that in LINQ the actor being tested in the Any method had a name (i.e. 'actor'), yet in our current OData design it doesn't.
Is this important?
Yes it is! Unlike the root filter, where there is only one variable in scope (i.e. the movie), inside an Any/All there are potentially two variables in scope (i.e. the movie and the actor). And if neither is named we won't be able to distinguish between them!
For example this query, which finds any movies with the same name as any actors who star in the movie, is unambiguous in LINQ:
from movie in Movies where movie.Actors.Any(actor => actor.Name == movie.Name) select movie;
But our proposed equivalent is clearly nonsensical:
~/Movies/?$filter=Actors/any(Name eq Name)
It seems we need a way to refer to both the inner (actor) and outer variables (movie) explicitly.
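To make the ambiguity concrete, here is a rough Python sketch. It assumes one plausible reading of the implicit rule, namely that an unqualified Name always resolves against the innermost item. That rule and the sample data are assumptions for illustration, not something OData defines.

```python
# Hypothetical sample data: one movie shares its name with an actor, one doesn't.
movies = [
    {"Name": "Zack", "Actors": [{"Name": "Zack"}]},
    {"Name": "Movie B", "Actors": [{"Name": "Ben"}]},
]

# What the LINQ query means: compare the actor's name with the movie's name.
intended = [
    m["Name"] for m in movies
    if any(a["Name"] == m["Name"] for a in m["Actors"])
]

# What "Name eq Name" means if both names bind to the innermost item (the actor):
# the comparison is trivially true for every actor.
implicit = [
    m["Name"] for m in movies
    if any(a["Name"] == a["Name"] for a in m["Actors"])
]

print(intended)  # ['Zack']
print(implicit)  # ['Zack', 'Movie B']
```

Under that reading the filter degenerates into "movies that have at least one actor", which is not what the LINQ query asks for.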
Now we can't change the way existing OData queries work without breaking clients and servers already deployed, which means we can't explicitly name the outer variable. We can, however, introduce a way to refer to it implicitly. This should be a reserved name so it can't collide with any properties. OData already uses the $ prefix for reserved names (e.g. $filter, $skip etc.), so we could try something like $it. This results in something like this:
~/Movies/?$filter=Actors/any(Name eq $it/Name)
And now the query is unambiguous again.
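One way to picture what $it buys us is a tiny path resolver. This is only a sketch: the resolve function, its parameters, and the dictionary-based data are hypothetical and say nothing about how a real OData server evaluates filters.

```python
def resolve(path, inner, outer):
    """Resolve a '/'-separated property path.

    A leading '$it' segment starts from the outer item (the movie);
    anything else starts from the inner item (the actor).
    """
    segments = path.split("/")
    if segments[0] == "$it":
        current, segments = outer, segments[1:]
    else:
        current = inner
    for segment in segments:
        current = current[segment]
    return current


# Hypothetical sample data.
movie = {"Name": "Movie B", "Actors": [{"Name": "Ben"}]}
actor = movie["Actors"][0]

print(resolve("Name", actor, movie))      # Ben
print(resolve("$it/Name", actor, movie))  # Movie B
```

With the reserved prefix, the two sides of Name eq $it/Name resolve against different items, so the comparison is meaningful again.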
But unfortunately we aren't done yet. We need to make sure nested calls work too, for example this:
from movie in Movies where movie.Actors.Any(actor => actor.Name == movie.Name && actor.Awards.Any(award => award.Name == "Oscar")) select movie;
If we translate this, with the current design we get this:
~/Movies/?$filter=Actors/any(Name eq $it/Name and Awards/any(Name eq 'Oscar'))
But now it is unclear whether Name eq 'Oscar' refers to the actor or the award. Perhaps we need to be able to name the actor and award variables too. Here we are not restricted by the past, Any/All is new, so it can include a way to explicitly name the variable. Again we look at LINQ for inspiration:
award => award.Name == "Oscar"
Clearly we need something like '=>' that is URI friendly and compatible with the current OData design. It turns out ':' is a good candidate, because it works well in query strings and isn’t currently used by OData; even better, there is a precedent in Python lambdas (notice the pattern matching again). So the final proposal is something like this:
~/Movies/?$filter=Actors/any(actor: actor/Name eq $it/Name)
Or for the nested scenario:
~/Movies/?$filter=Actors/any(actor: actor/Name eq $it/Name and actor/Awards/any(award: award/Name eq 'Oscar'))
And again nothing is ambiguous.
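If it helps, the final syntax can be generated mechanically. The helper below, any_filter, is a hypothetical name used only for this sketch. It just formats strings in the proposed shape, using OData's lowercase and operator, and it does not validate or URL-encode anything.

```python
def any_filter(collection, variable, predicate):
    """Format '<collection>/any(<variable>: <predicate>)' in the proposed shape."""
    return f"{collection}/any({variable}: {predicate})"


# Build the nested query from the inside out.
awards = any_filter("actor/Awards", "award", "award/Name eq 'Oscar'")
actors = any_filter("Actors", "actor", f"actor/Name eq $it/Name and {awards}")
url = f"~/Movies/?$filter={actors}"

print(url)
```

Each nesting level introduces its own variable name, so every property access inside an any call is qualified and there is nothing left to guess.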
In this case the design feels good, so we are done.
But clearly that won't always be the case. However you should at least know whether the design is looking promising or not. If not it is back to the drawing board.
If it does look promising, I would essentially repeat steps 1-3 again using some other inspiration to inform further tweaking of the design, hopefully to something that feels complete.
While writing this up I definitely teased out a number of things that had previously been unconscious, and I'm sure this will help me going forward. Hopefully it helps you too.
Next time I do this I'll explore something that involves more creativity... and I'll try to tease out more of the unconscious tools I use.
What do you think about all this? Do you have any suggested improvements?