Task Exception Handling in .NET 4.5

For the .NET Framework 4.5 Developer Preview, a lot of work has been done to improve the Task Parallel Library (TPL), in terms of functionality, in terms of performance, and in terms of integration with the rest of the .NET Framework.  With all of this work, we’ve strived for a very high compatibility bar, which means your applications that use TPL in .NET 4 should “just work” when they upgrade to run against .NET 4.5 (if they don’t, please let us know, as it’s something we’ll want to fix).  Even with such a high compatibility bar, however, there are a few interesting behaviors to be aware of when it comes to Tasks and exception handling.

Unobserved Exceptions

Those of you familiar with Tasks in .NET 4 will know that the TPL has the notion of “unobserved” exceptions.  This is a compromise between two competing design goals in TPL: to support marshaling unhandled exceptions from the asynchronous operation to the code that consumes its completion/output, and to follow standard .NET exception escalation policies for exceptions not handled by the application’s code.  Ever since .NET 2.0, exceptions that go unhandled on newly created threads, in ThreadPool work items, and the like all result in the default exception escalation behavior, which is for the process to crash.  This is typically desirable, as exceptions indicate something has gone wrong, and crashing helps developers to immediately identify that the application has entered an unreliable state.  Ideally, tasks would follow this same behavior.  However, tasks are used to represent asynchronous operations with which code later joins, and if those asynchronous operations incur exceptions, those exceptions should be marshaled over to where the joining code is running and consuming the results of the asynchronous operation.  That inherently means that TPL needs to backstop these exceptions and hold on to them until such time that they can be thrown again when the consuming code accesses the task.  Since that prevents the default escalation policy, .NET 4 applied the notion of “unobserved” exceptions to complement the notion of “unhandled” exceptions.  An “unobserved” exception is one that’s stored into the task but then never looked at in any way by the consuming code.  There are many ways of observing the exception, including Wait()’ing on the Task, accessing a Task<TResult>’s Result, looking at the Task’s Exception property, and so on.  If code never observes a Task’s exception, then when the Task is garbage collected and finalized, the TaskScheduler.UnobservedTaskException event is raised, giving the application one more opportunity to “observe” the exception.  And if the exception still remains unobserved, the exception escalation policy is then enabled by the exception going unhandled on the finalizer thread.
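As a rough illustration of the last-chance hook described above, the sketch below subscribes to TaskScheduler.UnobservedTaskException and marks each exception as observed. The class name, the InstallHandler and ObserveAfterCompletion helpers, and the log callback are hypothetical names made up for this example; only the event, SetObserved(), and Task.Exception come from the framework.

```csharp
using System;
using System.Threading.Tasks;

public static class UnobservedExceptionSketch
{
    // Last-chance hook: raised when a faulted Task is finalized without its
    // exception ever having been observed by the consuming code.
    public static void InstallHandler(Action<Exception> log)
    {
        TaskScheduler.UnobservedTaskException += (sender, e) =>
        {
            foreach (Exception inner in e.Exception.Flatten().InnerExceptions)
            {
                log(inner);
            }
            e.SetObserved(); // mark as observed so it is not escalated
        };
    }

    // Reading Task.Exception after completion is one way to observe a fault.
    // Returns null if the task did not fault.
    public static AggregateException ObserveAfterCompletion(Task task)
    {
        // Block until the task completes without rethrowing its exception.
        ((IAsyncResult)task).AsyncWaitHandle.WaitOne();
        return task.Exception;
    }
}
```

Because the event depends on garbage collection and finalization, when (or whether) the handler runs in a short-lived process is not something this sketch can promise.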

In .NET 4.5, Tasks have significantly more prominence than they did in .NET 4, as they’re baked in to the C# and Visual Basic languages as part of the new async features supported by the languages.  This in effect moves Tasks out of the domain of experienced developers into the realm of everyone.  As a result, it also leads to a new set of tradeoffs about how strict to be around exception handling.  An example of how this could affect developers is highlighted in the below code snippet:

Task op1 = FooAsync();
Task op2 = BarAsync();
await op1;
await op2;

In this code, the developer is launching two asynchronous operations to run in parallel, and is then asynchronously waiting for each using the new await language feature (for multiple reasons, it would be better if this code were written with a single “await Task.WhenAll(op1, op2)” statement, rather than individually awaiting each of op1 and op2, but that ideal doesn’t significantly decrease the likelihood that the above code will still be written).  If while testing this method neither op1 nor op2 faults, there’s no problem as there are no exceptions.  If just op1 faults but op2 completes successfully, there’s no problem: op1’s exception will propagate out of the await and everything will proceed as expected.  And if just op2 faults but op1 completes successfully, again, no problem.  However, consider what will happen if both op1 and op2 fault.  Awaiting op1 will propagate op1’s exception, and therefore op2 will never be awaited.  As a result, op2’s exception will not be observed, and, under the .NET 4 policy, the process would eventually crash.
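For comparison, here is a minimal sketch of the Task.WhenAll form mentioned in the parenthetical. FooAsync and BarAsync are hypothetical stand-ins that always fault, and the class and method names are invented for the example; the point is only that both faults end up collected on the combined task rather than one being abandoned.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllSketch
{
    // Hypothetical stand-ins for the FooAsync/BarAsync operations in the text.
    public static async Task FooAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("foo failed");
    }

    public static async Task BarAsync()
    {
        await Task.Yield();
        throw new TimeoutException("bar failed");
    }

    // Returns the number of exceptions stored on the combined task.
    public static async Task<int> RunBothAsync()
    {
        Task op1 = FooAsync();
        Task op2 = BarAsync();
        Task both = Task.WhenAll(op1, op2);
        try
        {
            await both; // rethrows only one exception if anything faulted
            return 0;
        }
        catch (Exception)
        {
            // Both faults were gathered by WhenAll and are stored on 'both'.
            return both.Exception.InnerExceptions.Count;
        }
    }
}
```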

To make it easier for developers to write asynchronous code based on Tasks, .NET 4.5 changes the default exception behavior for unobserved exceptions.  While unobserved exceptions will still cause the UnobservedTaskException event to be raised (not doing so would be a breaking change), the process will not crash by default.  Rather, the exception will end up getting eaten after the event is raised, regardless of whether an event handler observes the exception.  This behavior can be configured, though.  A new CLR configuration flag may be used to revert to the crashing behavior of .NET 4, e.g.

<configuration>
    <runtime>
        <ThrowUnobservedTaskExceptions enabled="true"/>
    </runtime>
</configuration>

Note that this change doesn’t mean developers should be careless about ignoring unhandled exceptions… it just means the runtime is a bit more forgiving than it used to be.  My recommendation is that any developer building library components should run their tests with this flag enabled, and should make sure that no exceptions are going unobserved in the components they build.  That way, the application developer consuming these components can make the best decision for their app as to whether to set this flag or not: it would be unfortunate if the application developer wanted to be strict about enforcing all exceptions to be observed, but couldn’t because the library they consumed was written by developers that weren’t careful enough.

“Task.Result” vs “await task”

When you use Task.Wait() or Task.Result on a task that faults, the exception that caused the Task to fault is propagated, but it’s not thrown directly… rather, it’s wrapped in an AggregateException object, which is then thrown. There were two primary motivations for wrapping all exceptions like this.  First, as of .NET 4, there was no good way in managed code to throw an exception that had previously been thrown without overwriting important information stored in that exception.  Namely, throwing an exception with “throw e;” overwrites the exception’s stack trace and its “Watson bucket” information (the data collected and uploaded to help application developers after deployment find the root cause of the most common crashes in their applications) with details about the throw site, leading to very poor debuggability when dealing with exceptions marshaled across threads.  For this reason, it’s been a .NET design guideline that in cases like this, where an exception’s propagation needs to be interrupted, it should be wrapped in another exception object.  You can see that with reflection, for example, where exceptions are wrapped in a TargetInvocationException.  For better or worse, this is a way to preserve the relevant information as part of an inner exception.  And thus for TPL, where marshaling exceptions across threads is the name of the game, we wrap exceptions before propagating them.

We could have picked various wrapper exception types, such as TargetInvocationException, but we chose AggregateException because of the second motivation: tasks may fault with more than one exception, and thus need to store multiple.  Whether because of child tasks that fault, or because of combinators like Task.WhenAll, a single task may represent multiple operations, and more than one of those may fault.  In such a case, and with the goal of not losing exception information (which can be important for post-mortem debugging), we want to be able to represent multiple exceptions, and thus for the wrapper type we chose AggregateException.
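To make the wrapping concrete, the following sketch blocks on a task with Wait() and lists what arrived inside the AggregateException. The class and method names are hypothetical; Flatten() and InnerExceptions are the framework members doing the work.

```csharp
using System;
using System.Threading.Tasks;

public static class WaitWrappingSketch
{
    // Blocks on the task and returns the type names of every exception that
    // Wait() delivered inside the AggregateException wrapper.
    public static string[] WaitAndDescribe(Task task)
    {
        try
        {
            task.Wait();
            return new string[0];
        }
        catch (AggregateException ae)
        {
            // Flatten() collapses nested aggregates, e.g. from attached child tasks.
            var inner = ae.Flatten().InnerExceptions;
            var names = new string[inner.Count];
            for (int i = 0; i < inner.Count; i++)
            {
                names[i] = inner[i].GetType().Name;
            }
            return names;
        }
    }
}
```

Even a task that faulted with a single exception takes the catch path here, which is the "always propagate an aggregate" choice in action.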

That explains why we chose the design we did for .NET 4. Now, let’s assume for a moment that we didn’t have to deal with the first issue above, that of overwriting an exception’s information.  In that case, we have a choice to make, since whether the task actually stores multiple exceptions in an aggregate is separate from whether multiple exceptions are propagated in aggregate when you wait on a task.  We have three primary options here: only propagate the first exception even if there are multiple, always propagate all exceptions in an aggregate even if there’s only one, and propagate a single exception if there’s one or an aggregate if there are multiple.  We very quickly ruled out the last option, as in many scenarios it leads to needing duplicate catch blocks: for cases when multiple exceptions occur, you need a handler for an aggregate exception (and that handler needs to be able to special case each of the inner exceptions), and for cases when there’s only one exception, you need a separate catch block for each of the specialized exceptions… as a result, you end up having the same logic duplicated in two places.  In our experience, it’s also just more difficult to reason about.  That leaves the options of always propagating the first (by some definition of “first”) or always propagating an aggregate.  When designing Task.Wait in .NET 4, we chose the latter.  That decision was influenced by the need to not overwrite details, but also by the primary use case for tasks at the time, that of fork/join parallelism, where the potential for multiple exceptions is quite common.

While similar to Task.Wait at a high level (i.e. forward progress isn’t made until the task completes), “await task” represents a very different primary set of scenarios.  Rather than being used for fork/join parallelism, the most common usage of “await task” is in taking a sequential, synchronous piece of code and turning it into a sequential, asynchronous piece of code.  In places in your code where you perform a synchronous operation, you replace it with an asynchronous operation represented by a task and “await” it.  As such, while you can certainly use await for fork/join operations (e.g. utilizing Task.WhenAll), it’s not the 80% case.  Further, .NET 4.5 sees the introduction of System.Runtime.ExceptionServices.ExceptionDispatchInfo, which solves the problem of allowing you to marshal exceptions across threads without losing exception details like stack trace and Watson buckets.  Given an exception object, you pass it to ExceptionDispatchInfo.Create (named ExceptionDispatchInfo.Capture in the released .NET 4.5), which returns an ExceptionDispatchInfo object that contains a reference to the Exception object and a copy of its details.  When it’s time to throw the exception, the ExceptionDispatchInfo’s Throw method is used to restore the contents of the exception and throw it without losing the original information (the current call stack information is appended to what’s already stored in the Exception).
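A small sketch of that capture-and-rethrow pattern follows. It uses the method name from the released .NET 4.5, ExceptionDispatchInfo.Capture, rather than the Developer Preview name used in the text; the DispatchInfoSketch class and RunAndRethrow helper are hypothetical names for illustration only.

```csharp
using System;
using System.Runtime.ExceptionServices;
using System.Threading;

public static class DispatchInfoSketch
{
    // Runs 'work' on another thread, then rethrows any exception on the
    // calling thread with its original details preserved.
    public static void RunAndRethrow(Action work)
    {
        ExceptionDispatchInfo captured = null;
        var thread = new Thread(() =>
        {
            try
            {
                work();
            }
            catch (Exception ex)
            {
                // Snapshot the exception together with its current stack trace.
                captured = ExceptionDispatchInfo.Capture(ex);
            }
        });
        thread.Start();
        thread.Join();

        if (captured != null)
        {
            // Unlike "throw ex;", this keeps the original stack trace and
            // appends the rethrow site to it.
            captured.Throw();
        }
    }
}
```

The caller can catch the original exception type directly, with no wrapper exception in between.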

Given that, and again having the choice of always throwing the first or always throwing an aggregate, for “await” we opt to always throw the first.  This doesn’t mean, though, that you don’t have access to the same details.  In all cases, the Task’s Exception property still returns an AggregateException that contains all of the exceptions, so you can catch whichever is thrown and go back to consult Task.Exception when needed.  Yes, this leads to a discrepancy between exception behavior when switching between “task.Wait()” and “await task”, but we’ve viewed that as significantly the lesser of two evils.
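The following sketch shows one way to catch what await throws and then consult Task.Exception for the rest. The class and method names are hypothetical, and the Tuple return type is just a convenient way to report both pieces of information.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitFirstSketch
{
    // Awaits a task and reports the exception that await threw together with
    // the number of exceptions the task itself stores.
    public static async Task<Tuple<Exception, int>> AwaitAndInspectAsync(Task task)
    {
        try
        {
            await task;
            return Tuple.Create<Exception, int>(null, 0);
        }
        catch (Exception first)
        {
            // 'first' arrives unwrapped; the full set is still on task.Exception
            // (which is null for a canceled task, hence the check).
            int stored = task.Exception != null ? task.Exception.InnerExceptions.Count : 0;
            return Tuple.Create(first, stored);
        }
    }
}
```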

As I mentioned previously, we have a very high compatibility bar, and thus we’ve avoided breaking changes. As such, Task.Wait retains its original behavior of always wrapping.  However, you may find yourself in some advanced situations where you want behavior similar to the synchronous blocking employed by Task.Wait, but where you want the original exception propagated unwrapped rather than it being encased in an AggregateException.  To achieve that, you can target the Task’s awaiter directly.  When you write “await task;”, the compiler translates that into usage of the Task.GetAwaiter() method, which returns an instance that has a GetResult() method.  When used on a faulted Task, GetResult() will propagate the original exception (this is how “await task;” gets its behavior).  You can thus use “task.GetAwaiter().GetResult()” if you want to directly invoke this propagation logic.
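A short side-by-side sketch of the two blocking styles is below. The GetResultSketch class and its two helpers are hypothetical; they simply report which exception type reaches the caller in each case.

```csharp
using System;
using System.Threading.Tasks;

public static class GetResultSketch
{
    // Blocks with Wait() and returns the type of exception thrown, or null.
    public static Type ThrownByWait(Task task)
    {
        try
        {
            task.Wait();
            return null;
        }
        catch (Exception ex)
        {
            return ex.GetType();
        }
    }

    // Blocks via the awaiter and returns the type of exception thrown, or null.
    public static Type ThrownByGetResult(Task task)
    {
        try
        {
            task.GetAwaiter().GetResult();
            return null;
        }
        catch (Exception ex)
        {
            return ex.GetType();
        }
    }
}
```

For a task faulted with an InvalidOperationException, the first helper reports AggregateException and the second reports InvalidOperationException.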

Comments
  • Hello, Stephen -

    I'm enjoying your posts,

    I opened Connect feedback a while ago with an objection to the ExceptionDispatchInfo.Throw method:

     connect.microsoft.com/.../exceptiondispatchinfo-api-modifications

    To me, a method whose semantics are to always throw is not a wise API design decision. This is simply because it interferes with static code analysis.

    Did the team consider this?

  • The UnObservedException is thrown only after the task is GC'ed and its finalizer runs - can you please confirm?

  • Could you confirm that these new behaviors are related to the framework version, not the language / compiler version? So if you use the C# 5 compiler in Visual Studio 11 to target .NET 4.0, you get the 4.0 behavior, not the 4.5 behavior.

  • The decision to swallow exceptions by default causes a nasty problem: You introduce a global setting for all components in your address space although these components depend on their specific setting because they have been developed and tested in isolation. You know how library developers are: They just won't disable this behavior.

    Most people also don't have the "break on all exceptions (ctrl-e)" behavior enabled. They will never notice even if half their tasks fail. Maybe you want to add a managed debugging assistant for this that is default-enabled.

  • Rohit Sharma, that's correct.  The only way we know the exception will never be observed by the user's code is when the Task object is no longer available for them to access, and we only know that to be the case during finalization.

    Remco Blok, that's correct.  For the unobserved exception behavior, in .NET 4 the finalization code in the Task's exception holder throws the exception on the finalizer thread.  In .NET 4.5, it doesn't (by default).  So it's entirely about the Framework version rather than the compiler version; further, the CLR configuration switch mentioned is only in .NET 4.5, not in .NET 4.  For the await behavior, the methods in question (e.g. Task.GetAwaiter and TaskAwaiter.GetResult) don't even exist in .NET 4, so it's also the Framework version that matters.

    tobi, thanks for the thoughtful feedback and for the suggestion.  This is definitely something we've considered, and trust me, we've debated it at length ;).  It is important to keep in mind that the .NET 4 crashing behavior wasn't perfect: if it was, we would have kept it enabled.  One of the key things about the .NET exception escalation policy as it normally plays out is that the process crashes at the time when the exception goes unhandled, so crash dumps will reflect the current state of the app at that time, if the debugger breaks in at that time the developer can explore the full state of the app at that time, etc.  By relying on garbage collection for Task unobserved exceptions, we are able to alert the developer to the fact that they didn't see an exception, but we end up alerting them well after the problem has already occurred.  It could be a few milliseconds, a few seconds, or even a few minutes later, depending on how much allocation is happening in the program and as a result how often the GC needs to run and the finalizer needs to kick in.  At that point, the state of the program that resulted in the exception may be long gone.  In any event, there are pros and cons to all known approaches here, and we'll continue to weigh those as we go forward with this release.  Please do keep the feedback coming.

    Stephen, thanks, and this is something that was considered.  I've forwarded your question to the exceptions team in the CLR for them to comment.

  • I am in favor of swallowing exceptions in production because at least in my experience "state corruption" rarely happens from this (at least not in an asp.net application where all request state is thrown away after it has executed). In any case, shutting down the service is oftentimes much worse than failing a few % of all requests.

    I am deeply opposed to not alerting developers during development. My dream change in VS11 would be to break by default on all exceptions passing through user code (ctrl-shift-e dialog). All developers need the slightest problem to be rubbed in their faces all the time. _That_ is how reliable software gets created by standard developers.

  • Many applications swallow exceptions all the time in production: They just log them and keep running. From the perspective of the running application, this is the same as just swallowing them. So I argue that swallowing exceptions is what 90% of all applications are doing right now (and rightly so).

  • @tobi: I disagree. I am a proponent of the "fail fast" approach. For desktop/client apps in particular, I recommend failing immediately so that WER can kick in and collect useful data. Critical applications can register to have the OS immediately restart them.

    Even for ASP.NET apps, an unexpected exception is indicative of a serious error, and there is AppDomain or even process-level state (including resources) that can be corrupted (including resource leaks). I still recommend failing immediately for these apps because it allows ASP.NET to follow its recycling logic - again, collecting useful data - while delaying incoming requests so that they execute correctly when the service comes back online.

    So, I disagree that 90% of applications swallow exceptions (e.g., no application I've ever written does this), and I certainly disagree that it is the correct behavior. In fact, it's not generally possible to do this for *all* exceptions in .NET (e.g., StackOverflowException, ThreadAbortException, OutOfMemoryException). See: msdn.microsoft.com/.../ms228970(v=VS.100).aspx

    Applications should catch vexing and exogenous exceptions, but nothing else. (Terminology is from http://tinyurl.com/364c5kh)

    In the async world, the only exceptions that get swallowed are for asynchronous operations that are abandoned. Asynchronous operations whose results are used (e.g., by await) always have their exceptions observed. I've thought a whole lot about it, and concluded that this is the correct (or at least the "least incorrect") behavior.

  • Wouldn't it be possible, instead of adding "unobserved" exceptions, to have some method on Task that allows it to swallow unhandled exceptions, and to call it in the code generated for the "await" keyword? Then we would still have the good rule of always crashing the application on an unhandled exception, along with the ability to use "await" multiple times.

  • @Whut: Sure, it would be technically possible.  And you could build such a method yourself, e.g.

    public static void IgnoreExceptions(this Task task)
    {
        // Reading t.Exception in the continuation marks the exception as observed.
        task.ContinueWith(t => { var e = t.Exception; }, TaskContinuationOptions.OnlyOnFaulted);
    }

    But I don't completely understand your question.  When you await a task that faults, its exception will automatically be propagated out of the await, so we're already doing something with exceptions in that case, and we wouldn't want to suppress them.

  • These topics are the biggest pain points right now for me. I do stuff like

    await Task.Run(() => { Parallel.ForEach(...XYZ...) });

    and it's almost impossible to debug: Hard to catch the correct exceptions and VS always breaks in Program.Main. I hope this will work auto-magically at some point in the future where VS breaks at XYZ and I can catch the exception in the Parallel.ForEach, in the Task.Run lambda or around the await.

  • Your blog says that

    // both t1 and t2 throw exceptions.

    Task t1 = createTask(); Task t2 = createTask();

    await Task.WhenAll(t1, t2);

    the result of the above code will be that the caller gets whichever exception was thrown first (coming from either t1 or t2). The Task's Exception property will then hold an AggregateException.

    This is an exact conflict with msdn article here msdn.microsoft.com/.../dd997415.aspx which applies to .NET 4.5. It says that an AggregateException is thrown.

    <quote>

    If a task is the parent of attached child tasks, or if you are waiting on multiple tasks, then multiple exceptions could be thrown. To propagate all the exceptions back to the calling thread, the Task infrastructure wraps them in an AggregateException instance.

    </quote>

    Whom should we believe? Especially given that you are talking about CTP of 4.5

  • Nitin: I'm not seeing what conflict you're referring to.

    - Task.WhenAll is very different from parent/child tasks, which refers specifically to the ability for a task to attach itself to whatever task is currently running via the TaskCreationOptions.AttachedToParent flag.

    - The Task returned from WhenAll will contain all of the exceptions from all of the faulted tasks provided, and that Task's Exception property will return an AggregateException containing all of them.

    - If you have a Task with multiple child tasks (e.g. while the task was running, AttachedToParent tasks were created) that fault, all of their exceptions will propagate up to the parent task, such that the parent task will store all of the exceptions.  That parent Task's Exception property will return an AggregateException containing all of the children's exceptions.

    - If you .Wait() on a faulted Task, regardless of how many exceptions it contains, an AggregateException will be thrown.

    - If you await a faulted Task, regardless of how many exceptions it contains, only the first of those exceptions will be thrown.

  • Thanks. I see now that msdn article is referring only to Task.Wait(). While we are on the subject can you also explain how TaskCompletionSource(TResult).TrySetException Method (IEnumerable(Exception)) works?

    msdn does not say anything about this subject. msdn.microsoft.com/.../dd783450.aspx.

    I am hoping it's useful when dealing with TaskCompletionSource and some other Task's Task.Exception.InnerExceptions property.

  • Nitin: It's just like TrySetException(Exception), except that instead of storing just one exception, it iterates through the IEnumerable<Exception> and stores all of the exceptions from it.  As you suggest, this is useful for transferring all of the inner exceptions of one task to another task.  It's also useful for combinators like WhenAll, which need to aggregate the exceptions from multiple places into one task.
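The transfer scenario from this last exchange might look roughly like the sketch below, which copies every inner exception from one task onto a TaskCompletionSource-backed task. TransferSketch and Mirror are hypothetical names; TrySetException(IEnumerable<Exception>), TrySetCanceled, and TrySetResult are the framework methods under discussion.

```csharp
using System;
using System.Threading.Tasks;

public static class TransferSketch
{
    // Mirrors the outcome of 'source' onto a new task, carrying over every
    // stored exception rather than just the first.
    public static Task<T> Mirror<T>(Task<T> source)
    {
        var tcs = new TaskCompletionSource<T>();
        source.ContinueWith(t =>
        {
            if (t.IsFaulted)
            {
                // InnerExceptions is an IEnumerable<Exception>, so all of the
                // exceptions are stored on the new task.
                tcs.TrySetException(t.Exception.InnerExceptions);
            }
            else if (t.IsCanceled)
            {
                tcs.TrySetCanceled();
            }
            else
            {
                tcs.TrySetResult(t.Result);
            }
        }, TaskContinuationOptions.ExecuteSynchronously);
        return tcs.Task;
    }
}
```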
