FAQ on Task.Start


Recently I’ve heard a number of folks asking about Task.Start: when and when not to use it, how it behaves, and so forth.  I thought I’d answer some of those questions here in an attempt to clarify and put to rest any misconceptions about what it is and what it does.

1. Question: When can I use Task.Start?

The Start instance method may be used if and only if the Task is in the Created state (i.e. Task.Status returns TaskStatus.Created).  And the only way a Task can be in the Created state is if the Task was instantiated using one of Task’s public constructors, e.g. “var t = new Task(someDelegate);”.
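For example, a minimal sketch (the delegate here is just illustrative):

var t = new Task(() => Console.WriteLine("hello"));
Console.WriteLine(t.Status);  // Created
t.Start();                    // valid: the task is still in the Created state
t.Wait();
Console.WriteLine(t.Status);  // RanToCompletion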

2. Question: Should I call Start on a Task created by Task.Run / Task.ContinueWith / Task.Factory.StartNew / TaskCompletionSource / async methods / …?

No.  Not only shouldn’t you, but you simply can’t… it would fail with an exception. See question #1: the Start method is only applicable to a Task in the Created state.  Tasks created by any of those means are already beyond the Created state, such that their Task.Status will not return TaskStatus.Created, but something else, like TaskStatus.WaitingForActivation, TaskStatus.Running, or TaskStatus.RanToCompletion.
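To make that concrete, a quick sketch (the delegate is just illustrative):

var t = Task.Run(() => { });       // already queued; its Status is not Created
try
{
    t.Start();                     // not allowed: the task has already been started
}
catch (InvalidOperationException e)
{
    Console.WriteLine(e.Message);
}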

3. Question: What does Start actually do?

It queues the Task to the target TaskScheduler (the parameterless overload of Start targets TaskScheduler.Current).  When you construct a Task with one of Task’s constructors, the Task is inactive: it has not been given to any scheduler yet, and thus there’s nothing to actually execute it.  If you never Start a Task, it’ll never be queued, and so it’ll never complete.  To get the Task to execute, it needs to be queued to a scheduler, so that the scheduler can execute it when and where the scheduler sees fit to do so.  The act of calling Start on a Task will twiddle some bits in the Task (e.g. changing its state from Created to WaitingToRun) and will then pass the Task to the target scheduler via the TaskScheduler’s QueueTask method.  At that point, the task’s future execution is in the hands of the scheduler, which should eventually execute the Task via the TaskScheduler’s TryExecuteTask method.
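To see that hand-off in action, here’s a rough sketch (the LoggingScheduler type is purely illustrative, not a production scheduler) of a trivial TaskScheduler that logs QueueTask calls and then runs queued tasks on the ThreadPool via TryExecuteTask:

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class LoggingScheduler : TaskScheduler
{
    protected override void QueueTask(Task task)
    {
        Console.WriteLine("QueueTask called for Task {0}", task.Id);
        // Hand the task to the ThreadPool, which will run it via TryExecuteTask.
        ThreadPool.UnsafeQueueUserWorkItem(_ => TryExecuteTask(task), null);
    }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        // Allow inlining: run the task on the thread asking for it.
        return TryExecuteTask(task);
    }

    protected override IEnumerable<Task> GetScheduledTasks() { return new Task[0]; }
}

class Program
{
    static void Main()
    {
        var t = new Task(() => Console.WriteLine("running"));
        t.Start(new LoggingScheduler());  // state changes to WaitingToRun, then QueueTask is called
        t.Wait();
    }
}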

4. Question: Can I call Start more than once on the same Task?

No.  A Task may only transition out of the Created state once, and Start transitions a Task out of the Created state: therefore, Start may only be used once.  Any attempts to call Start on a Task not in the Created state will result in an exception.  The Start method employs synchronization to ensure that the Task object remains in a consistent state even if Start is called multiple times concurrently… only one of those calls may succeed.
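In code (again, just an illustrative sketch):

var t = new Task(() => { });
t.Start();      // fine: transitions the task out of the Created state
// t.Start();   // a second call would throw an InvalidOperationException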

5. Question: What’s the difference between using Task.Start and Task.Factory.StartNew?

Task.Factory.StartNew is shorthand for new’ing up a Task and Start’ing it.  So, the following code:

var t = Task.Factory.StartNew(someDelegate);

is functionally equivalent to:

var t = new Task(someDelegate);
t.Start();

Performance-wise, the former is slightly more efficient.  As mentioned in response to question #4, Start employs synchronization to ensure that the Task instance on which Start is being called hasn’t already been started, or isn’t concurrently being started.  In contrast, the implementation of StartNew knows that no one else could be starting the Task concurrently, as it hasn’t given out a reference to the Task to anyone… so StartNew doesn’t need to employ that synchronization.

6. Question: I’ve heard that Task.Result may also start the Task.  True?

False.  There are only two ways that a Task in the Created state may transition out of that state:

  1. A CancellationToken was passed into the Task’s constructor, and cancellation was (or is later) requested on that token.  If the Task is still in the Created state when that happens, it transitions into the Canceled state.
  2. Start is called on the Task.

That’s it, and notice that Result is not one of those two.  If you use .Wait() or .Result on a Task in the Created state, the call will block; someone else would need to Start the Task so that it could then be queued to a scheduler, so that the scheduler could eventually execute it, and so that the Task could complete… the blocking call could then complete as well and wake up.
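A quick sketch of that blocking behavior (the delegates here are just illustrative):

var t = new Task(() => Console.WriteLine("work"));
// t.Wait();               // would block indefinitely: nothing is ever going to execute the task
Task.Run(() => t.Start()); // some other code starts it, queueing it to the default scheduler...
t.Wait();                  // ...so this wait can now complete once the task has run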

What you might be thinking of isn’t that .Result could start the task, but that it could potentially “inline” the task’s execution.  If a Task has already been queued to a TaskScheduler, then that Task might still be sitting in whatever data structure the scheduler is using to store queued tasks.  When you call .Result on a Task that’s been queued, the runtime can attempt to inline the Task’s execution (meaning to run the Task on the calling thread) rather than purely blocking and waiting for some other thread used by the scheduler to execute the Task at some time in the future.  To do this, the call to .Result may end up calling the TaskScheduler’s TryExecuteTaskInline method, and it’s up to the TaskScheduler how it wants to handle the request.

7. Question: Should I return unstarted Tasks from public APIs?

The proper question is “Should I return Tasks in the Created state from public APIs?”  And the answer is “No.”  (I draw the distinction in the question here due to questions #1 and #2 above… the majority of mechanisms for creating a Task don’t permit Start to be called, and I don’t want folks to get the impression that you must call Start on a Task in order for it to be returned from a public API… that is not the case.)

The fundamental idea here is this.  When you call a normal synchronous method, the invocation of that method begins as soon as you’ve invoked it.  For a method that returns a Task, you can think of that Task as representing the eventual asynchronous completion of the method.  But that doesn’t change the fact that invoking the method begins the relevant operation.  Therefore, it would be quite odd if the Task returned from the method was in the Created state, which would mean it represents an operation that hasn’t yet begun.

So, if you have a public method that returns a Task, and if you create that Task using one of Task’s constructors, make sure you Start the Task before returning it.  Otherwise, you’re likely to cause a deadlock or similar problem in the consuming application, as the consumer will expect the Task to eventually complete when the launched operation completes, and yet if such a Task hasn’t been started, it will never complete.  Some frameworks that let you plug in methods/delegates that return Tasks even validate the returned Task’s Status, throwing an exception if the Task is still Created.
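As a sketch of that guidance (SaveChangesAsync and SaveChangesCore are made-up names, purely for illustration):

public Task SaveChangesAsync()
{
    var t = new Task(() => SaveChangesCore());  // SaveChangesCore is a hypothetical synchronous helper
    t.Start(TaskScheduler.Default);             // make sure the task is started before returning it
    return t;                                   // the caller receives a task that will eventually complete
}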

8. Question: So, should I use Task’s ctor and Task.Start?

In the majority of cases, you’re better off using some other mechanism.  For example, if all you want to do is schedule a Task to run some delegate for you, you’re better off using Task.Run or Task.Factory.StartNew, rather than constructing the Task and then Start’ing it; not only will the former methods result in less code, but they’re also cheaper (see question #5 above), and you’re less likely to make a mistake with them, such as forgetting to Start the Task.

There are of course valid situations in which using the ctor + Start makes sense.  For example, if you choose to derive from Task for some reason, then you’d need to use the Start method to actually queue it.  A more advanced example is if you want the Task to get a reference to itself.  Consider the following (buggy) code:

Task theTask = null;
theTask = Task.Run(() => Console.WriteLine("My ID is {0}.", theTask.Id));

Spot the flaw?  There’s a race.  During the call to Task.Run, a new Task object is created and is queued to the ThreadPool scheduler.  If there’s not that much going on in the ThreadPool, a thread from the pool might pick it up almost instantly and start running it.  That thread is now racing to access the variable ‘theTask’ with the main thread that called Task.Run and that needs to store the created Task into that ‘theTask’ variable.  I can fix this race by separating the construction and scheduling:

Task theTask = null;
theTask = new Task(() => Console.WriteLine("My ID is {0}.", theTask.Id));
theTask.Start(TaskScheduler.Default);

Now I’m sure that the Task instance will have been stored into the ‘theTask’ variable before the ThreadPool processes the Task, because the ThreadPool won’t even get a reference to the Task until Start is called to queue it, and by that point, the reference has already been set (and for those of you familiar with memory models, the appropriate fences are put in place by Task to ensure this is safe).

Comments
  • Hi Stephen,

    Thanks for putting those together!

    Andrey

  • Happy to help.  I hope you found it useful.

  • Hi Stephen,

    Regarding the race condition in point 8… is there a similar way to get hold of a reference to a ContinueWith before actually scheduling it? I have a situation where there is a similar race condition (which I can get around in other ways, but it would be nice if there were a similar technique to {t = new Task(); t.Start()})

    BTW, I'd like to reiterate what Andrey said - thanks for this, and in fact all the blog posts on TPL and async, they're absolutely excellent. I've just started seriously using TPL, and have been digging into the back catalogue of posts!

    Pete

  • Hi Pete-

    It's great to hear you've found the posts helpful.  Thanks.

    Regarding ContinueWith, there's no equivalent Start solution available built-in for ContinueWith, though depending on what you need there may still be related solutions.  For example, if you have a ContinueWith off of another Task (which I'll call the "antecedent"), and you could use this Start approach on that antecedent, then you know that the continuation won't run until the antecedent has been started, e.g.

    var t1 = new Task(...);
    Task t2 = null;
    t2 = t1.ContinueWith(delegate
    {
       Console.WriteLine(t2.Id);
    });
    t1.Start();

    Or rather than using ContinueWith directly off of the antecedent, you could instead use ContinueWhenAll, passing in the real antecedent along with a TaskCompletionSource task that you can SetResult on in order to allow the continuation to run, e.g.

    var t1 = Task.Run(...);
    var t2starter = new TaskCompletionSource<bool>();
    Task t2 = null;
    t2 = Task.Factory.ContinueWhenAll(new [] { t1, t2starter.Task }, tasks =>
    {
       Console.WriteLine(t2.Id);
    });
    t2starter.SetResult(true);

    I hope that helps.

  • Thanks for the suggestions.

    The situation I had was a basic async producer/consumer type thing (if I understand the term correctly, in a kind of 'trampolining' style), where I never want messages processed concurrently. (This was before I noticed you have an AsyncCall method in the Extensions library, which sounds like it would be ideal for this purpose..)

    The crux of it is that whenever a message is sent to the queue, it calls 'Alert' to trigger more message processing:

    void Alert()
    {
        ...
        _currentMsgProcessingTask = _currentMsgProcessingTask.ContinueWith((antecedent) => { ProcessMessages(); });
        ...
    }

    I tried experimenting with some of the options, and when I tried ExecuteSynchronously, the following situation would sometimes happen:

    1) Suppose Alert was called after a message was sent, and _currentMsgProcessingTask was Completed. Therefore the continuation got called synchronously.

    2) Then, suppose part of the message processing caused a message to be sent to our own queue (which is possible in this system).

    This would mean Alert would again be called, on the same thread - it would get to the line shown above, and since _currentMsgProcessingTask had not yet been updated to be the continuation task which got called synchronously in (1), it would again add another continuation to the previously completed task, which would again execute synchronously.

    Anyway, I think your second suggestion would be close, but I think it would never execute synchronously. However it inspired me to come up with the following, which might solve it:

    var tcs = new TaskCompletionSource<bool>();
    Task prevTask = _currentMsgProcessingTask;
    _currentMsgProcessingTask = tcs.Task;
    prevTask.ContinueWith(_ => { ProcessMessages(); }).ContinueWith(_ => { tcs.SetResult(true); });

    I'm not even sure I would want to use ExecuteSynchronously anyway, especially as I would then need to make the call to Alert itself asynchronous to avoid blocking sending messages, which would defeat the purpose - but it was an interesting conundrum.

  • Hi Pete-

    It sounds like you understand it correctly. In this context, think of a "trampoline" as something you logically bounce off of in order to get back to a certain place on your stack, typically used to avoid "stack dives", which is the problem you describe where you continually execute synchronously without unwinding, thereby going deeper and deeper on the stack, until eventually you overflow the stack.

    This is why both the APM pattern and the new async/await support in the compiler have built-in support for synchronous completion that avoids stack dives.  In the case of the APM pattern, the IAsyncResult representing the async operation has a CompletedSynchronously flag on it.  If true, it means that the operation completed during the call to the BeginXx method, the AsyncCallback is invoked synchronously, and it should be the call site that then handles the completion of the operation (e.g. calling EndXx and launching any subsequent async operation) rather than the AsyncCallback; conversely, if CompletedSynchronously is false, it means that the AsyncCallback should handle the completion and that it will be invoked asynchronously.  In the new async/await support, the compiler-generated code checks the awaiter's IsCompleted property, and if it returns true, it continues running synchronously rather than calling the awaiter's OnCompleted method, which would schedule the continuation to run asynchronously.

    I've drafted a blog post that outlines various ways of getting the behavior you want, and I'll try to get that posted to the blog in the near future.  In the meantime, yes, the AsyncCall type in the Parallel Extensions Extras sounds like it's exactly what you want.  In .NET 4.5, there's a new System.Threading.Tasks.Dataflow.dll library that provides a type called ActionBlock<T>, which is a production implementation of a similar idea and design; there's also a preview version of that library available for .NET 4.

  • Thank you for some clarifications on a somewhat confusing thing.

    Would you agree that Task.Start(), in retrospect, can be seen as a design flaw? As Task is really mostly an interface to an asynchronous operation, would it not be better to subclass it with a StartableTask or offer some other mechanism, the way it's very neatly done with TaskCompletionSource / CancellationTokenSource, that would allow the starting of the operation to be separated out from the Task object itself?

  • Hi Freed-

    I don't know that I'd call it a design flaw, though with 20/20 hindsight and knowing how the system has evolved since, there might have been a better way to expose the same functionality.

  • Very helpful post, thank you. Translated into Chinese:) www.cnblogs.com/.../2338070.html

    Many Chinese developers wish to learn things like Async/Parallel Programming but there are few resources in Chinese. As I know, F# team has opened a blog in Chinese(blogs.msdn.com/.../visual-f-csdn.aspx). May I know whether you guys also have any plan like that? Thank you.
