Asynchronous Programming in C# 5.0 part two: Whence await?

I want to start by being absolutely positively clear about two things, because our usability research has shown this to be confusing. Remember our little program from last time?

async void ArchiveDocuments(List<Url> urls)
{
  Task archive = null;
  for(int i = 0; i < urls.Count; ++i)
  {
    var document = await FetchAsync(urls[i]);
    if (archive != null)
      await archive;
    archive = ArchiveAsync(document);
  }
}

The two things are:

1) The “async” modifier on the method does not mean “this method is automatically scheduled to run on a worker thread asynchronously”. It means the opposite of that; it means “this method contains control flow that involves awaiting asynchronous operations and will therefore be rewritten by the compiler into continuation passing style to ensure that the asynchronous operations can resume this method at the right spot.” The whole point of async methods is that you stay on the current thread as much as possible. They’re like coroutines: async methods bring single-threaded cooperative multitasking to C#. (At a later date I’ll discuss the reasons behind requiring the async modifier rather than inferring it.)

2) The “await” operator used twice in that method does not mean “this method now blocks the current thread until the asynchronous operation returns”. That would turn the asynchronous operation back into a synchronous operation, which is precisely what we are attempting to avoid. Rather, it means the opposite of that; it means “if the task we are awaiting has not yet completed then sign up the rest of this method as the continuation of that task, and then return to your caller immediately; the task will invoke the continuation when it completes.”
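A minimal sketch of point 2, assuming a console program with no synchronization context installed. The names AwaitDemo, WorkAsync, and Run are hypothetical and exist only for this illustration; the log records the order in which things happen.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitDemo
{
    // Records the order of events so the control flow can be inspected afterwards.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical async method that awaits a task which has not completed yet.
    public static async Task WorkAsync(Task gate)
    {
        Log.Add("before await");
        await gate;   // gate is incomplete: the rest of the method becomes its continuation
        Log.Add("after await");
    }

    public static List<string> Run()
    {
        Log.Clear();
        var gate = new TaskCompletionSource<bool>();
        Task work = WorkAsync(gate.Task);   // control comes back here at the await
        Log.Add("caller resumed, work completed: " + work.IsCompleted);
        gate.SetResult(true);               // completing the task triggers the continuation
        work.Wait();                        // assumption: no synchronization context, so this cannot deadlock
        return new List<string>(Log);
    }
}
```

Calling AwaitDemo.Run() should yield three entries in this order: "before await", then the caller's entry reporting that the work is not yet complete, then "after await". The caller gets control back at the await rather than being blocked by it.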

It is unfortunate that people’s intuition upon first exposure regarding what the “async” and “await” contextual keywords mean is frequently the opposite of their actual meanings. Many attempts to come up with better keywords have failed. If you have ideas for a keyword or combination of keywords that is short, snappy, and gets across the correct ideas, I am happy to hear them. Some ideas that we already had and rejected for various reasons were:

wait for FetchAsync(…)
yield with FetchAsync(…)
yield FetchAsync(…)
while away the time FetchAsync(…)
hearken unto FetchAsync(…)
for sooth Romeo wherefore art thou FetchAsync(…)

Moving on. We’ve got a lot of ground to cover. The next thing I want to talk about is “what exactly are those ‘thingies’ that I handwaved about last time?”

Last time I implied that the C# 5.0 expression

document = await FetchAsync(urls[i])

gets realized as:

state = State.AfterFetch;
fetchThingy = FetchAsync(urls[i]);
if (fetchThingy.SetContinuation(archiveDocuments))
  return;
AfterFetch: ;
document = fetchThingy.GetValue();

What’s the thingy?

In our model for asynchrony an asynchronous method typically returns a Task<T>; let’s assume for now that FetchAsync returns a Task<Document>. (Again, I’ll discuss the reasons behind this "Task-based Asynchrony Pattern" at a later date.) The actual code will be realized as:

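// Ask the task for its awaiter and record where to resume.
fetchAwaiter = FetchAsync(urls[i]).GetAwaiter();
state = State.AfterFetch;
// True means the continuation was signed up, so return to the caller.
if (fetchAwaiter.BeginAwait(archiveDocuments))
  return;
// Reached on resumption, or directly if the task had already completed.
AfterFetch: ;
document = fetchAwaiter.EndAwait();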

The call to FetchAsync creates and returns a Task<Document> - that is, an object which represents a “hot” running task. Calling this method immediately returns a Task<Document> which then somehow asynchronously fetches the desired document. Perhaps it runs on another thread, or perhaps it posts itself to some Windows message queue on this thread that some message loop is polling for information about work that needs to be done in idle time, or whatever. That’s its business. What we know is that we need something to happen when it completes. (Again, I’ll discuss single-threaded asynchrony at a later date.)

To make something happen when it completes, we ask the task for an Awaiter, which exposes two methods. BeginAwait signs up a continuation for this task; when the task completes, a miracle happens: somehow the continuation gets called. (Again, how exactly this is orchestrated is a subject for another day.) If BeginAwait returns true then the continuation will be called; if not, then that’s because the task has already completed and there is no need to use the continuation mechanism.

EndAwait extracts the result of the completed task.
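The BeginAwait and EndAwait names above belong to the design described in this post. For comparison, here is a hedged sketch of the same "sign up a continuation, or carry on if already complete" logic written by hand against the members that TaskAwaiter<T> exposes in the released .NET Framework 4.5: IsCompleted, OnCompleted, and GetResult. The ManualAwait class and AwaitByHand method are hypothetical names for this illustration, not what the compiler actually emits.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class ManualAwait
{
    // Hand-written stand-in for "var result = await task; onDone(result);".
    // Returns true if a continuation had to be signed up, false if the task
    // had already completed and its result was consumed immediately.
    public static bool AwaitByHand<T>(Task<T> task, Action<T> onDone)
    {
        TaskAwaiter<T> awaiter = task.GetAwaiter();
        if (!awaiter.IsCompleted)
        {
            // Not finished yet: register the rest of the work, then return to the caller.
            awaiter.OnCompleted(() => onDone(awaiter.GetResult()));
            return true;
        }
        // Already finished: no need to use the continuation mechanism.
        onDone(awaiter.GetResult());
        return false;
    }
}
```

With an already-completed task such as Task.FromResult(42), the callback runs before AwaitByHand returns; with a task that is still pending, it runs later, when the task completes.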

We will provide implementations of BeginAwait and EndAwait on Task (for tasks that are logically void returning) and Task<T> (for tasks that return a value). But what about asynchronous methods that do not return a Task or Task<T> object? Here we’re going to use the same strategy we used for LINQ. In LINQ if you say

from c in customers where c.City == "London" blah blah blah

then that gets translated into

customers.Where(c=>c.City=="London") …

and overload resolution tries to find the best possible Where method by checking to see if customers implements such a method, or, if not, by going to extension methods. The GetAwaiter / BeginAwait / EndAwait pattern will be the same; we’ll just do overload resolution on the transformed expression and see what it comes up with. If we need to go to extension methods, we will.
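To illustrate the extension-method route, here is a small sketch that makes a TimeSpan awaitable by supplying GetAwaiter as an extension method. It is written against the released .NET Framework 4.5 API (Task.Delay and TaskAwaiter). TimeSpanAwaitExtensions and PatternDemo are hypothetical names, and awaiting a TimeSpan is only an example of a type that has no GetAwaiter of its own.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class TimeSpanAwaitExtensions
{
    // TimeSpan has no GetAwaiter instance method, so overload resolution
    // falls back to this extension method when it sees "await someTimeSpan".
    public static TaskAwaiter GetAwaiter(this TimeSpan delay)
    {
        return Task.Delay(delay).GetAwaiter();
    }
}

public static class PatternDemo
{
    public static async Task<string> PauseThenReportAsync()
    {
        await TimeSpan.FromMilliseconds(10);   // resolved through the extension method above
        return "resumed";
    }
}
```

Nothing about TimeSpan itself changed; the compiler only needs to find a suitable GetAwaiter by ordinary lookup, just as it finds Where for a query expression.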

Finally: why "Task"?

The insight here is that asynchrony does not require parallelism, but parallelism does require asynchrony, and many of the tools useful for parallelism can be used just as easily for non-parallel asynchrony. There is no inherent parallelism in Task; the Task Parallel Library uses a task-based pattern to represent units of pending work that can be parallelized, but nothing about that pattern requires multithreading.
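A minimal sketch of a task with no parallelism behind it, assuming a toy work queue standing in for a message loop. SingleThreadedDemo, ComputeLaterAsync, and Run are hypothetical names. The task is completed by code that runs later on the calling thread; the sketch never starts a thread.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SingleThreadedDemo
{
    // A toy "message queue": pending work items that run later on the same thread.
    static readonly Queue<Action> Pending = new Queue<Action>();

    // Hypothetical asynchronous operation: returns a task immediately and
    // completes it only when the queue is pumped.
    public static Task<int> ComputeLaterAsync(int value)
    {
        var tcs = new TaskCompletionSource<int>();
        Pending.Enqueue(() => tcs.SetResult(value * 2));
        return tcs.Task;
    }

    public static string Run()
    {
        Task<int> task = ComputeLaterAsync(21);
        bool doneBeforePump = task.IsCompleted;   // nothing has run yet

        // The "message loop": drain the queue on the current thread.
        while (Pending.Count > 0)
            Pending.Dequeue()();

        // Reading Result is safe here because the task has already completed.
        return doneBeforePump + " -> " + task.IsCompleted + ", result " + task.Result;
    }
}
```

SingleThreadedDemo.Run() returns "False -> True, result 42": the task represents pending work, and the work happened without any worker thread being involved.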

As I've pointed out a few times, from the point of view of the code that is waiting for a result it really doesn't matter whether that result is being computed in idle time on this thread, in a worker thread in this process, in another process on this machine, on a storage device, or on a machine halfway around the world. What matters is that it's going to take time to compute the result, and this CPU could be doing something else while it is waiting, if only we let it.

The Task class from the TPL already has a lot of investment in it; it's got a cancellation mechanism and other useful features. Rather than invent some new thing, like some new "IFuture" type, we can just extend the existing task-based code to meet our asynchrony needs.

Next time: How to further compose asynchronous tasks.

  • I think that "resume after" better captures the intent than "await", although "yield until" has a nice ring to it.

  • I also find it very difficult to read that code snippet. Let me see if I have this right:

    The function calls FetchAsync on each element of urls, and arranges for ArchiveAsync to be called on each result whenever that appears.

    No matter how many times I try to read the code snippet, I can't convince myself that that's what it says. I encourage you to take the feedback seriously: "await" sounds like it does the opposite of what it actually does. It's like when someone breaks up with you and says "See you around."

    I think you're having problems with people understanding "async" because the word "asynchronous" is unfortunate in the first place. It doesn't really mean what it sounds like, because "synchronous" in this context means not "at the same time" but "sequentially." So what we call "asynchronous" in CS should really have been called "non-sequential." Anyway, it's a highly technical meaning and it doesn't deserve to be used outside of CS papers.

    I realize you have probably spent many hours on this, so what do I know. But since Every Programmer Has An Opinion (tm), here's my suggestion:

    someday void ArchiveDocuments(List<Url> urls) {
      Task archive = null;
      for(int i = 0; i < urls.Count; ++i) {
        when(var document = FetchAsync(urls[i])) {
          when(archive == null)
            archive = ArchiveAsync(document);
        }
      }
    }

    Here, "someday" clearly indicates that the function is to schedule doing something eventually, rather than actually doing those things now, but does not have the same thread-related baggage as "async."

    The "when" block clearly indicates the contents of the continuation, and it does not suggest (which the await syntax does) that the contents of the block will be executed before the when clause finishes. It also resembles using blocks, because it's a block that sets up a context in which a given variable is defined. The only difference is that unlike a using, it does not block on the clause finishing, but continues the program, and executes the block whenever the clause is done.

  • This:    after (var document = FetchAsync(urls[i])):

    .. would make no sense. FetchAsync is returning a Task<T>. await is an operation on that task. It's not that we are waiting (sync or async) on the assignment to take place.

    I think the lexical syntax is completely understandable, however, 'yield', 'yield for', or 'yield until' may have made clearer syntax. I like to think of 'await' as meaning 'asynchronous wait'. It's the mathematical dual of 'yield'! (ok, maybe not)

  • I agree with @M.E. that the "async" keyword could also be improved on, but it seems to me it rather depends on what you do with "await".

    "yielding" or "continuing" would be possible options, for example.

    I wonder whether you also considered requiring a keyword on calls to async methods even when they are not being immediately awaited?

    For example, ArchiveAsync returns void. So "archive = ArchiveAsync(document)" seems weird.

    How about "archive = begin ArchiveAsync(document);"?

    That way any call to an async method that *didn't* have either "begin" or "await" (or await's replacement) would be in error - I suspect that would avoid some misunderstandings or confusing code.

  • @Jason: "FetchAsync is returning a Task<T>."

    I thought it was declared as returning "async Document"?

  • I think 'async' and 'await' are fine. People are going to have to clear up their misconceptions about asynchronous programming and threads - it's not like Stream.BeginWrite/EndWrite were ever creating threads.

    That said, I do like the idea of using 'yield' in there, as that reminds me of cooperative multitasking. Maybe something like "yield wait"?

    @Stuart: you can't do that, there already are plenty of asynchronous methods in existing code. Remember that anything returning Task (or something else with a GetAwaiter() method) can be used with "await" - there's no need for an "async" modifier on the method, and callers of the method really shouldn't care how it was implemented - whether the compiler generated the task (using "async" keyword) or whether the programmer wrote the task-creating code manually (without using "async" keyword).

  • I doff my cap to the folks working on this model of async programming. I'm very impressed with how easy it is to interpret and understand this model and how well it composes with itself (which is typically one of the challenges of asynchronous development).

    Two things, however, I'm very curious to see are the debugging experience and the BCL enhancements relating to this model of asynchrony. Intuitive debugging in an async environment is hard to achieve - and with composable continuations, I suspect that we'll need some new debugging capabilities to make sure developers can wrap their heads around what is happening in the code ... especially with the amount of sophisticated compiler magic involved.

  • I love "hearken unto". I think it fits great.

  • Why does Task<T> contain the method GetValue() when a property seems more appropriate? (Regardless of what it compiles to.) Or are there parameter versions of it as well?

  • I think the choice of keyword possibly depends on what point you're trying to get across: what happens at the *start* of the operation ("yield back to the caller") or what happens at the *end* of the operation ("continue with the rest of the program flow").

    One subtlety which hasn't been brought up - at least in the comments - is that it *might not* yield back to the caller. If the call to awaiter.BeginAwait() returns false (IIRC - maybe true; I haven't got the spec in front of me) then the method will continue *immediately*.

    Likewise although Eric mentioned continuing "on the same thread" that's only necessary for some synchronization contexts - such as ones in a UI. If you're writing server-side code and this is executing on a thread-pool thread, I believe the default synchronization context will allow it to continue on any random thread pool thread later.

    I think "async" is fine, but I think I'd like something involving "continue" where we've currently got "await". For example:

       var result = continue after DownloadStringTaskAsync(url);

  • With respect to BCL enhancements, I am curious how this asynchrony model will work with existing constructs such as BeginInvoke() / EndInvoke() on delegates. Those are already a common construct used to invoke operations asynchronously - however they return IAsyncResult - not Task<>. Will the compiler allow for transparent conversions between the two? Will there be new versions of these methods? Will developers have to explicitly use extension methods (or the like) to bridge the two? It would be a shame to not cleanly integrate with existing async constructs in .NET.

    I also think that it would be useful to add some syntactic sugar for expressing cancellation - rather than requiring developers to manually pass cancellation tokens everywhere. In my experience, you typically want *all* async operations started from some point to be cancellable - it's rarely necessary to be able to have fine grained control over which specific async steps should be cancelled (when you start more than one, that is). Since the compiler is already orchestrating the scheduling and execution of async operations, I think it makes sense to support abstracting away the concept of cancellation as well. I think an "ambient cancellation token" similar to how TransactionScope works for transactions would be helpful.

  • @Stuart, declaring 'async Document' as the return type will compile as a method returning Task<Document>.

  • I think that a new concept deserves a new keyword. Granted, technically "async" is not currently a keyword, but I think that if it were used as shown, it would be widely misunderstood. Seeing something like "async" on a method declaration, at first inspection, I would think that it executes on a different thread.

    Consider using a different word:  How about "deferred"?

    Not sure about "await". I agree it feels somewhere between "yield" and "continue".

    Just throwing things together:  How about "continue when"?

    deferred void ArchiveDocuments(List<Url> urls)
    {
      Task archive = null;
      for(int i = 0; i < urls.Count; ++i)
      {
        var document = continue when FetchAsync(urls[i]);
        if (archive != null)
          continue when archive;
        archive = ArchiveAsync(document);
      }
    }

  • I suggest using 'threaded' instead of 'async' and 'execute' instead of 'await'. It would not make code much more readable, but IMHO it also wouldn't make it less readable. However, these words avoid the associations programmers already have with the words 'async' and 'await' (the wait part in await :), probably making them more intuitive. Research of course is needed. Anyway, just a suggestion.
