A couple of weeks ago I had a discussion with a co-worker about the proper way to asynchronously iterate over data in Azure tables. Exploring the different options was very interesting and helped us understand the pros and cons of each asynchronous strategy. So over the next few weeks I'll go over each option we looked at in more detail. But first, a little background so you understand the basic problem.

When you retrieve rows from an Azure table, you can of course fetch all the rows first and then do whatever you need to do with the result. However, if there are a lot of rows, that means a (relatively) long wait for the data, which will also use a lot of memory. If you do not need all of that data at once, or if the first thing you do is filter the data down to just a few records, then retrieving the records asynchronously would be a good thing. If you know a little about Azure, you know that when you retrieve records from Azure tables you may or may not get a continuation token, something that most people never deal with manually. In this case, though, you might want to, for example by using the BeginExecuteSegmented method. That method gives you asynchronous access to zero or more rows at a time.
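To make the continuation token idea concrete, here is a minimal sketch of the read loop. It does not call the real Azure storage API: FakeTable, Segment<T> and GetSegmentAsync are hypothetical stand-ins I made up for illustration, and the token is just an integer offset. The real service returns an opaque token and may hand back a segment with zero rows, which the loop below would also tolerate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for one segment returned by the table service.
public sealed class Segment<T>
{
    public Segment(IReadOnlyList<T> results, int? continuationToken)
    {
        Results = results;
        ContinuationToken = continuationToken;
    }

    public IReadOnlyList<T> Results { get; }

    // null means "no more segments"; otherwise pass it to the next call.
    public int? ContinuationToken { get; }
}

// Hypothetical in-memory "table" that hands out rows a few at a time.
public sealed class FakeTable
{
    private const int SegmentSize = 4;
    private readonly List<int> _rows = Enumerable.Range(1, 10).ToList();

    public async Task<Segment<int>> GetSegmentAsync(int? token)
    {
        await Task.Delay(10); // simulate network latency
        int start = token ?? 0;
        List<int> rows = _rows.Skip(start).Take(SegmentSize).ToList();
        int next = start + rows.Count;
        return new Segment<int>(rows, next < _rows.Count ? next : (int?)null);
    }
}

public static class Program
{
    // Keeps only the even rows, so the full table is never held in memory.
    public static async Task<List<int>> ReadEvenRowsAsync(FakeTable table)
    {
        var matches = new List<int>();
        int? token = null;
        do
        {
            Segment<int> segment = await table.GetSegmentAsync(token);
            matches.AddRange(segment.Results.Where(r => r % 2 == 0));
            token = segment.ContinuationToken;
        } while (token != null);
        return matches;
    }

    public static async Task Main()
    {
        List<int> evens = await ReadEvenRowsAsync(new FakeTable());
        Console.WriteLine(string.Join(", ", evens)); // prints "2, 4, 6, 8, 10"
    }
}
```

The point of the sketch is only the shape of the loop: ask for a segment, process it, and keep going while a token comes back.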

Already you can see that there are several different scenarios to take into account. Do you need all the records together for your processing? Can you process each record individually? Do you need to scan a lot of records and keep only a few? Depending on what you need, I believe your choice of "asynchronous enumeration" will differ. But rest assured: I'll help you pick the right one for you! The options I will cover in the next few weeks are:

  • Get the whole enumeration asynchronously, a.k.a. Task<IEnumerable<T>>
  • Get each item asynchronously, a.k.a. IEnumerable<Task<T>>
  • Use reactive extensions, a.k.a. IObservable<T>
  • Get segments asynchronously, a.k.a. IEnumerable<Task<IEnumerable<T>>>
  • Create your own custom solution, a.k.a. MyEnumerationAsync
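To see the first four shapes side by side, here is how they might look as method signatures. This is only a sketch: the IRowSource<T> interface and its method names are hypothetical and exist purely to show the return types from the list above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical interface; only the return types come from the list above.
public interface IRowSource<T>
{
    // One task that completes when the whole enumeration is available.
    Task<IEnumerable<T>> GetAllAsync();

    // A sequence of tasks, one per item.
    IEnumerable<Task<T>> GetEachAsync();

    // A push-based stream of items (reactive extensions).
    IObservable<T> Observe();

    // A sequence of tasks, one per segment of items.
    IEnumerable<Task<IEnumerable<T>>> GetSegmentsAsync();
}
```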