Eugenio Pace in his 5th article in a series on SSDS gives a pattern for cross container queries. The article can be found here. This is an important scenario since SSDS partitions data into containers to scale out the system. But it actually limits scenarios like cross container queries and transactions. Cross container queries can be loosely grouped into 2 patterns:
- Fan out queries: where a query can be run against every container in parallel and the results returned and a union-all performed at the client. This is an important scenario for us and we are looking at how we can make this pattern run efficiently in SSDS.
- Queries that have cross container dependencies: join, sort and groupby across containers fall in this category. Map-reduce is one option as it is a technique that is known to scale well when data is partitioned. Again this is an important scenario and you can expect to see us make progress here as well.
Question for you, how far do you think fan out queries get you in your scenario?