Asynchronous Programming in C# 5.0 part two: Whence await?

Asynchronous Programming in C# 5.0 part two: Whence await?

Rate This

I want to start by being absolutely positively clear about two things, because our usability research has shown this to be confusing. Remember our little program from last time?

async void ArchiveDocuments(List<Url> urls)
{
  Task archive = null;
  for(int i = 0; i < urls.Count; ++i)
  {
    var document = await FetchAsync(urls[i]);
    if (archive != null)
      await archive;
    archive = ArchiveAsync(document);
  }
}

The two things are:

1) The “async” modifier on the method does not mean “this method is automatically scheduled to run on a worker thread asynchronously”. It means the opposite of that; it means “this method contains control flow that involves awaiting asynchronous operations and will therefore be rewritten by the compiler into continuation passing style to ensure that the asynchronous operations can resume this method at the right spot.” The whole point of async methods it that you stay on the current thread as much as possible. They’re like coroutines: async methods bring single-threaded cooperative multitasking to C#. (At a later date I’ll discuss the reasons behind requiring the async modifier rather than inferring it.)

2) The “await” operator used twice in that method does not mean “this method now blocks the current thread until the asynchronous operation returns”. That would be making the asynchronous operation back into a synchronous operation, which is precisely what we are attempting to avoid. Rather, it means the opposite of that; it means “if the task we are awaiting has not yet completed then sign up the rest of this method as the continuation of that task, and then return to your caller immediately; the task will invoke the continuation when it completes.

It is unfortunate that people’s intuition upon first exposure regarding what the “async” and “await” contextual keywords mean is frequently the opposite of their actual meanings. Many attempts to come up with better keywords failed to find anything better. If you have ideas for a keyword or combination of keywords that is short, snappy, and gets across the correct ideas, I am happy to hear them. Some ideas that we already had and rejected for various reasons were:

wait for FetchAsync(…)
yield with FetchAsync(…)
yield FetchAsync(…)
while away the time FetchAsync(…)
hearken unto FetchAsync(…)
for sooth Romeo wherefore art thou FetchAsync(…)

Moving on. We’ve got a lot of ground to cover. The next thing I want to talk about is “what exactly are those ‘thingies’ that I handwaved about last time?”

Last time I implied that the C# 5.0 expression

document = await FetchAsync(urls[i])

gets realized as:

state = State.AfterFetch;
fetchThingy = FetchAsync(urls[i]);
if (fetchThingy.SetContinuation(archiveDocuments))
  return;
AfterFetch: ;
document = fetchThingy.GetValue();

what’s the thingy?

In our model for asynchrony an asynchronous method typically returns a Task<T>; let’s assume for now that FetchAsync returns a Task<Document>. (Again, I’ll discuss the reasons behind this "Task-based Asynchrony Pattern" at a later date.) The actual code will be realized as:

fetchAwaiter = FetchAsync(urls[i]).GetAwaiter();
state = State.AfterFetch;
if (fetchAwaiter.BeginAwait(archiveDocuments))
  return;
AfterFetch: ;
document = fetchAwaiter.EndAwait();

The call to FetchAsync creates and returns a Task<Document> - that is, an object which represents a “hot” running task. Calling this method immediately returns a Task<Document> which is then somehow asynchronously fetches the desired document. Perhaps it runs on another thread, or perhaps it posts itself to some Windows message queue on this thread that some message loop is polling for information about work that needs to be done in idle time, or whatever. That’s its business. What we know is that we need something to happen when it completes. (Again, I’ll discuss single-threaded asynchrony at a later date.)

To make something happen when it completes, we ask the task for an Awaiter, which exposes two methods. BeginAwait signs up a continuation for this task; when the task completes, a miracle happens: somehow the continuation gets called. (Again, how exactly this is orchestrated is a subject for another day.) If BeginAwait returns true then the continuation will be called; if not, then that’s because the task has already completed and there is no need to use the continuation mechanism.

EndAwait extracts the result that was the result of the completed task.

We will provide implementations of BeginAwait and EndAwait on Task (for tasks that are logically void returning) and Task<T> (for tasks that return a value). But what about asynchronous methods that do not return a Task or Task<T> object? Here we’re going to use the same strategy we used for LINQ. In LINQ if you say

from c in customers where c.City == "London" blah blah blah

then that gets translated into

customers.Where(c=>c.City=="London") …

and overload resolution tries to find the best possible Where method by checking to see if customers implements such a method, or, if not, by going to extension methods. The GetAwaiter / BeginAwait / EndAwait pattern will be the same; we’ll just do overload resolution on the transformed expression and see what it comes up with. If we need to go to extension methods, we will.

Finally: why "Task"?

The insight here is that asynchrony does not require parallelism, but parallelism does require asynchrony, and many of the tools useful for parallelism can be used just as easily for non-parallel asynchrony. There is no inherent parallelism in Task; that the Task Parallel Library uses a task-based pattern to represent units of pending work that can be parallelized does not require multithreading.

As I've pointed out a few times, from the point of view of the code that is waiting for a result it really doesn't matter whether that result is being computed in idle time on this thread, in a worker thread in this process, in another process on this machine, on a storage device, or on a machine halfway around the world. What matters is that it's going to take time to compute the result, and this CPU could be doing something else while it is waiting, if only we let it.

The Task class from the TPL already has a lot of investment in it; it's got a cancellation mechanism and other useful features. Rather than invent some new thing, like some new "IFuture" type, we can just extend the existing task-based code to meet our asynchrony needs.

Next time: How to further compose asynchronous tasks.

  • @MH Lim

    Thank you, I found this

    foreach (await var x in xx) Console.WriteLine(x);

    in the slide, but it doesn't compile..

  • I think "yield while" or  "yield until" are the keywords that are the most appropriate to express the exact meaning of the intent, without suggesting any kind of  block operation.

    P.S. I would really like to replace "lock" with "synchronize" or "sync". I know it's impossible at this stage, but would be really helpful. It would avoid lots of missusages, by C# newcomers.

  • async definitely the best keyword in front of the method declaration. I'd go with async for the method call too as in:

    var document = async FetchAsync(urls[i]);

  • I would vote for instead of "await" these ideas:

    "when"

    "proceed when"

    "until"

    "continue until" (favourite because it uses current meaning of "continue")

    "continue while"   (then we would have the negative logical statement)

    async void ArchiveDocuments(List<Url> urls)

    {

     Task archive = null;

     for(int i = 0; i < urls.Count; ++i)

     {

       var document = continue until FetchAsync(urls[i]);

       if (archive != null)

         continue until archive;

       archive = ArchiveAsync(document);

     }

    }

  • I'm not confident that I fully understand all the particulars of the new functionality, but what about "branch" for the keyword? It seems to me that the point here is to divide the control flow into two paths; the "main path" that we're enumerating ourselves and the "support path" that's handling some operation off in the background and will complete at an indeterminate point later. The "branch" might not last long-- it might not happen at all, in fact, if the asynchronous operation completes immediately-- but I think "branch" gets the point across that the we're splitting something off.

    But, as I said, I could be totally off the mark.

  • I think async and await are fine. await is definitely better than anything that includes 'yield' in it. I understand the similarity of this construct to the IEnumerable yield but I don't think "yield" gives the semantics of what's going on.

    However, I'm looking forward to read the reasons behind requiring the async modifier rather than inferring it.

  • I would like to argue for the "await" keyword.

    If I understand correctly, some people object to "await" on the grounds that it isn't actually waiting, but is actually returning immediately to the caller. I don't think this is the correct way to look at it.

    Suppose two people go into a movie theater and decide to get some popcorn before watching the movie. If the line for the popcorn is short, then they will both stand in line and buy their popcorn, then go see the movie. If the line is long, then one will say to the other, "I'll await the popcorn. You go get us a couple of good seats, and I will join you in a few minutes."

    This is similar to what is going on here. The one who buys the popcorn (like the Async method) is going to wait for the task to complete in any case. The one who sometimes waits for the popcorn and sometimes goes straight to the seat (like the caller of the Async method) may or may not wait.

    In either case, at the location in the code in which the "await" keyword appears, waiting is going on.

    In the case where no waiting is necessary, because the called task has returned immediately (as in Jon Skeet's blog entry objecting to "yield"), then the await is analogous to a "foreach" called on an empty collection. In that case, there is simply nothing to await, so the method continues on.

    That's just my US$0.02.

  • I think that async/await is fine (possibly making the async keyword optional, since it's almost useless without await).

    However, I've never thought of "await" as just one word; in my head I've always translated it to "asynchronous wait", which is exactly what it is. (With the caveat that it may not *actually* be asynchronous - but this would likely be rare in practice).

    This opinion is coming from someone who's been doing async programming for >10 years, and is tremendously excited about this language change. :)

  • There's a video interview with the async team on Channel 9 where it was said they have a 2nd version of the compiler which doesn't need the async modifier. On another msdn blog there is a slide that says the async modifier is only needed for VB. From that you can draw your own conclusions why they chose to add it to C# anyway.

    From the suggestions so far, if C# has to have both then I'll settle for async + await, atleast they start with the same letter which makes them stand out.

  • i like async/async better than async/await.

  • I think another reason "async" modifier should exist is that the Task of async method is kicked off when the method returns. Normal methods that return Task would not necessarily kick it off when the they return.

  • WOW! Now all you need is to ask the Windows Azure team to guarantee the continuation is restored after machine fail-over.

    http://bit.ly/ae7Dih

  • The biggest compliant I have with the synax is that a keyword is embedded in an otherwise perfectly valid expression.  As a corollary to this it breaks away from the spirit of how iterator blocks were designed. For example, "return yield value" is not legal synax, but "yield return value" certainly is and fits the intuitive pattern of a keyword followed by a continuation statement.  Modelling the syntax of this new feature after the iterator syntax we might be better served with the following syntax.

    async void ArchiveDocuments(List<Url> urls)

    {

       Task archive = null;

       foreach (Url item in urls)

       {

           Document document;

           await FetchAsync(item) capture document;

           if (archive != null) await archive;

           archive = ArchiveAsync(document);

       }

    }

    Notice that I precede the continuation statement with the `await` keyword and use a new keyword 'capture' to capture the return value into a local variable.  I am envisioning several different variations for capturing that return value, but the point I want to drive home is that the keyword that gets everything "going" precedes the statements it acts upon instead of getting embedded in it.

    Also, I vote to abolish the 'async' keyword if it is deemed sufficient that it could be inferred by the compiler. I am, of course, prepared to retract my vote if there is some complusory reason for its existence.

  • How about callback for the keyword?

  • Why do keywords need to be actual English words? In this instance, that seems to be causing some problems, because most of the candidates have meanings or associations in regular usage that are irrelvant or misleading.

    Suggestion: use "cwith" instead of "await". It is pronounceable and reasonably memorable, short and easy to type, and relatively free of stray associations. (I created it by contracting "continue with", which is my favorite of the suggestions so far.)

    If you don't like that one, try other neologisms / contractions / invented words.

Page 9 of 11 (161 items) «7891011