A first hand look from the .NET engineering teams
You’ve seen the advice before—it’s not a good programming practice to catch System.Exception. Because managed exceptions are hierarchical, catching the top of the hierarchy—e.g., catch(Exception)—is an easy way to make sure that you catch all exceptions. But do you really want to catch all exceptions?
Empty catch statements can be just as bad, depending on the MSIL code that your language generates. C# turns an empty catch statement into catch(System.Object) which means you end up catching all exceptions—even non-CLS compliant exceptions. VB is better-behaved, turning an empty catch statement into catch e as System.Exception which limits you to catching CLS compliant exceptions.
Instead, you should catch the exact exception types that you expect because these are the types your code is prepared to handle. Obviously, any code that throws an exception is a good candidate for corresponding catch. MSDN documents all exceptions that can be raised from BCL functions. For example, System.IO.StreamReader () can throw ArgumentException or ArgumentNullException. If you use a StreamReader you should catch—and handle—these two exceptions.
We sometimes see advice that suggests you should catch all exceptions. In Eric Brechner’s Hard Code column he said that letting an application crash and invoke Windows Error Reporting (commonly called “Watson” from the pre-Vista days) was “completely idiotic and irresponsible.” He suggests that instead you catch all exceptions and “try to correct” them.
There’s one small problem with Eric’s advice: the word “try”. When you catch an exception you MUST handle the exceptional condition. Catching an exception is a statement that you can handle the exception and restore the program to a non-exceptional state. Trying to handle an exception isn’t sufficient. Follow the advice of someone older and wiser: “Do, or do not. There is no ‘try’.”
One key problem is that our exception hierarchy is a single mechanism representing multiple exceptional conditions. Krzysztof Cwalina explains this pretty well, breaking exceptional conditions into three sets: usage errors, logical errors and system failures. Usage errors are a condition that should always be fixed by the developer—for example, if an API cannot take null, the developer should ensure that she doesn’t pass in null. Logical errors can’t be detected at development time but should be handled at program runtime by the program itself. For example, a FileNotFoundException should be handled by creating the file or prompting the user for a new filename. System failures, however, cannot be handled by code. System failures indicate that something has gone wrong outside of your program’s scope. Your program can not be expected to correct a system failure. Thus the suggestion that you “try to correct” the error condition cannot reasonably be applied to system failures.
Let’s say your program causes an access violation and you catch the AccessViolationException. What can you do now? Find the pointer which has the bad value and set it to point to the right memory? If you could do that, you wouldn’t have AV’d in the first place. Catching the exception is the wrong thing to do if you can’t correct the error. In the case of a system failure you can’t correct the error.
We’re making a change in Visual Studio 2010’s CLR to make it harder to make this mistake. When an exception could indicate a corrupted process state—what Krzysztof calls a system failure—you’ll have to use a new attribute to indicate your intention explicitly. You can read about the details in the CLR Inside Out column in February’s MSDN Magazine. Even with our Corrupted State Exceptions work, catching all exceptions is not a good idea. If you can’t correct the condition that resulted in an exception being raised you’re always much better off Watsoning up (that is, letting your program exit and invoke Windows Error Reporting.) Watson will report back from the user’s machine so that you can fix the error in your next servicing or release.
Program Manager, CLR
That seems just a little short-sighted to me. There are plenty of instances where catching an unexpected exception is a good idea. Just one example:
Suppose I have a service that processes various jobs throughout the day, sometimes multi-threaded. On an individual job, if I do not catch all exceptions, then any unhandled exception will kill the entire application, instead of just the specific job. I have a catch (Exception) clause that marks this specific job as errored without stopping the service.
Another example: If you have a GUI app, and it throws some kind of unexpected exception - you want to log it, give the user a friendly error, then decide whether or not to let the application die. Letting the exception go unhandled gives the user an incredibly unfriendly error.
I agree that most exceptions can be caught and handled appropriately, but developers aren't perfect, so one or two logical exceptions may slip through. Combined with funky exceptions like OutOfMemoryException that you really can't plan for, and I'd say that having an overall catcher is pretty much a necessity in a lot of cases.
An unexpected exception indicates that your thread has encountered an unexpected state. You can't handle something programmatically if you didn't prepare for it in your code.
On your first example, you say that you can mark that specific job as errored without stopping the service. This presumes that you know the exception's impact is limited to that specific job. What if the exception came from the "service" code (as opposed to the "job" code)? Your code doesn't know whether it came from the service or the job. How do you know that the code running the service is still in the correct state?
It doesn't change with the GUI app. If you don't know what the exception is, what kind of friendly error are you going to give the user? "Something went wrong" isn't very friendly, but you can't honestly be more specific. You can hide the error, put up a "nice" message and try to keep running the program but if you've just overwritten data persisted in the user's files then you've created a situation much worse than an application crash.
I don't want any application I run to say "something went wrong, I don't know what it was, so I'm just going to keep running." If you want your application to continue on, do what Office does: let the process die, log the exception (using Windows' unhandled exception mechanism) and restart the process. The OS offers process isolation so you know that anything that went wrong--whatever it was--in the excepting process has been cleaned up. If you get an SEH exception on your thread--such as an access violation--you have no idea what your program code will do. You cannot reasonably decide to let it continue to run.
Exceptions don't just slip through: some code on your thread has to raise the exception to indicate that something is wrong. An exception means there is a problem. If you can't fix the problem you shouldn't pretend that everything will just be ok.
I wonder if you are aware of a series of posts on the FxCop blog by Nick Guerrera from 2006, in which he spends three posts arguing that catch(Exception) is always wrong. By the end of the comment chain on the third post (http://blogs.msdn.com/fxcop/archive/2006/06/19/FAQ_3A00_-Why-does-FxCop-warn-against-catch_2800_Exception_29003F00_-_2D00_-Part-3-_5B00_Nick-Guerrera_5D00_.aspx), he actually (gives up and?) admits that in a stateless and/or transactional system, catch(Exception) CAN be a good idea. In his words:
"...if your architecture is fault-tolerant using stateless components, immutable shared state, transactional state, etc., then it is possible to introduce general catch handlers without falling in to the pitfalls that I discussed in my blog entries.
So, to be absolutely clear, if you've thought hard about the potential impact on your application state and taken the right architectural steps to preserve its consistency, then I do not dispute that as a valid reason to exclude the FxCop warnings [against catching Exception]."
I used to preach the same hard line, "Never catch Exception!" However, that comment chain converted me. I am now of the opinion that, because most applications are NOT adequately hardened against unknown exceptions, the best thing to do is let the process (or at least the AppDomain) die. But given the proper preparation, catching Exception CAN result in a better user experience / better functioning system, without sacrificing correctness.
Applying the thoughts from that thread to your statements:
"There’s one small problem with Eric’s advice: the word “try”. When you catch an exception you MUST handle the exceptional condition."
This is not actually true. All you have to do is make sure that your program is in a consistent state. Proper use of finally blocks (and constrained execution regions if necessary) can ensure that consistency, at least to the same degree that the CLR gives you in the first place.
"Let’s say your program causes an access violation and you catch the AccessViolationException. What can you do now? Find the pointer which has the bad value and set it to point to the right memory?"
Obviously you cannot "fix" the AV, but you don't have to. All you have to do is ensure that, once the exception is thrown, the program remains in a consistent state as the exception is propagated up the call stack.
"How do you know that the code running the service is still in the correct state?"
If you are in a stateless environment, or an environment where the state has been hardened (probably using a transactional system), against unknown exceptions, then you can safely continue.
"An exception means there is a problem."
Again, not technically correct, and I think this is actually the heart of the matter. An exception means there WAS a problem. As long as you make sure that you are in a consistent state from the point where the exception is thrown to the point where it is caught, the fact that a problem existed in the past does not have to mean letting the process die. Even if the problem is ongoing, as long as the proper steps are taken to ensure consistency, the worst that can happen is that the exception (or another one) gets thrown again (the issue of repeatedly thrown exceptions does have to be accounted for).
One final thing: letting Watson do the final error reporting IS irresponsible, given that I as the developer will never see that report and can't do anything about the error. Some steps have to be taken to actually notify someone who can do something about the problem.
I think you are making two mistakes here.
First, you say "don't ever do that and if you do, accept whatever punishment results from that"
but the reality is that in real life, code such as Joe Enos' often works rather fine, and doing this the pure and right way would be much harder. Assume a service that processes jobs: what's the chance of any exception being a tough one that invalidates everything? compared to the chance that there was just something specific about that job that you did not account for, but that doesn't affect the rest of the service?
then, how hard would it be to do it right? drain out all threads of the service or just stop it cold? can you log a reasonable message before you quit? make sure that parallel jobs will be left in a consistent state?
there will always be people who go for the catch all. and while it's technically not right, in the scope of a project (considering time, budget and skill constraints) it might not be a bad decision.
instead of just pointing out mistakes, I believe you should think hard about giving some pragmatic advice about how to deal with various exception types in this regard. once you start doing it Joe's way, you can still improve your code a lot by differing between catastrophic exceptions (access violation, runtime engine exception etc.) and exceptions that will in many cases be recoverable (like an uncaught FileNotFoundException in a specific job).
The other problem I see in your posting is this: you're suggesting that if a method, according to the docs, throws an ArgumentException or an ArgumentNullException, we "should catch—and handle—these two exceptions"
These are typically programming errors. I'd say let them happen and fix them. Having unit tests is even better. But catching your own programming errors seems like very bad advice to me. It clutters the code and does not solve the problem. It might even hide bugs!
Catching a FileNotFoundException seems evil to me too. If the file name is an input parameter, I'd explicitly want to check that it exists first, and no exception would happen. (Not true for situations where race conditions might occur.) Just like any input parameter needs to be checked before it's passed on.
If I'm assuming that the file is there because my code has already done something to ensure that (like creating it), a FileNotFoundException indicates a violation of some internal precondition -> a bug. I'd say just let the exception happen, find the reason and fix it.
So in reality, aside from Joe's scenario, there should be very few catch blocks at all in any good code.
I just checked our own repository:
426 catch statements in > 500K lines of code
compare that to 1872 using and finally statements for cleanup after exceptions (no, not counting the namespace usings)
- conditions where checking up-front is impossible or too hard
- rare conditions that would be too slow or otherwise expensive to check otherwise
- exceptional conditions that might occur in race conditions, where checking first would be pointless unless you add synchronization
- exceptions that are wrapped in other, more meaningful exceptions and rethrown
- anything else I can't think of right now ;-)
I've seen this programming advice many times before. However, every time I see it, no one mentions the exceptions.
A lot of my programming centers around developing asynchronous components. Guess what we have to do: catch all exceptions and deliver them as a parameter to the caller (AsyncCompletedEventArgs.Error).
This pattern is also useful at top-level thread procedures.
At the risk of reiterating what most people are saying, I would also like to disagree with this blog post.
I make typical desktop applications with GUIs. If anything unexpected goes wrong we raise our own fault report dialog, we harvest as much information as possible, and we post it back to HQ for support. This means that the user is almost always presented with meaningful information (to users, not software engineers) and automatically nudged along the path to resolution. To do this we must at least protected every thread start point with a catch-all. Note that I am not advocating a catch-all any lower than that.
If we don't do this, how are we supposed to get the details of the error from Watson back to us? My reading of this post suggests that the recommended method is this:
1. Let Windows catch unforeseen errors with Watson.
2. The user then contacts us to tell us that the application crashed, assuming they have our contact details.
3. We ask the user what the error dialog said.
4. The user digs out the Windows Application Log (assuming they have admin access), pastes this into an email, finds our logs, adds those logs to the email, sends the email to us. This assumes they have email on the machine where the error happened, which is often not the case.
This is unacceptably more complex, and for the record Watson is ugly and users ignore it.
Also: C# is right to catch all exceptions. If I catch Exception it is precisely because I am anticipating the unexpected and I intend to do something about it, even if it just apologise to the user. For instance, System.IO.StreamReader can also throw ThreadAbortException, but you forgot that one didn't you? And you guys wrote the CLR! Imagine how difficult it is for us mere mortals to write perfect code.
No disrespect to Andrew but I get a bit fed up reading "holier than thou" rules I must abide by or I am a very naughty programmer.
If I've learnt one thing in my (far too many) years of programming is that there are no rules, only guidelines. And really don't believe everything you read - especially on t'internet.
A lot of people who post these sort of things obviously only work on ones sort of project like database centric enterprise stuff with fixed schemas or maybe low level coding like the MS teams and espouse patterns that work in their own area. It's not that simple!!
To add the list of examples what about calling an external 3rd party component written by someone who hasn't published every exception they throw. And what about if they don't catch all their errors so you get the framework throwing things you can't possibly anticipate ...?
One of the previous comments was spot on, sometimes you have a list of things you want to perform some process on and you don't want to fail the whole 'transaction' if individual steps fail.
If that process involves (often slightly dodgy) third party code or interop or office, then the only safe way to handle that is to catch Exception. As long as each procedure has a defined start and end state you can handle it quite happily and inform the user in a nice way.
I think Andrew fails clarify that catching Exception in a top-level exception handler before letting the application/AppDomain terminate is not what he's talking about. He's talking about catching Exception and attempting to continue running the application.
Even in a "stateless" application, you still don't know if you can continue runnig when catching Exception. The application may not have it's own state; but it certainly has state: both framework and system. And you don't know if that state is still valid when catching Exception. Many times you don't know the side-effects of a library or framework method call; catching Exception may have left the framework, a library, or the system in a bad state and continuing may lead to further data loss.
Great post. It's always frustrating to see people chime in with "advice" on why catch(Exception) is a good idea. The usual argument is that it somehow magically prevents crashes (and therefore, perceived bugs) in the software. It's programming by FUD. "catch(Exception)" is basically throwing your hands up into the air and saying "I give up" as a programmer.
I really hope a few new L3 or L4 warnings are added for the C# 4.0 compiler in VS2010:
* catch and catch(Exception)
* throw new Exception() (ie/ throwing the exception base class)
Actually, Andrew specifically stated the opposite:
"If you can’t correct the condition that resulted in an exception being raised you’re always much better off Watsoning up (that is, letting your program exit and invoke Windows Error Reporting.)"
That is, he is specifically arguing against catch(Exception), EVEN in a top level handler.
Would you care to provide an example of a "stateless" application which suffers data corruption due to a general catch handler? I have seen this argument many times, but I have yet to see a concrete example. If I am taking the appropriate steps in my application to control my own state (which, granted, is difficult, but not impossible), why do you still assume that I am vulnerable to data corruption?
Perhaps you have seen others claiming that catch(Exception) magically prevents bugs, but that is not what is being argued here. You accuse others of FUD, and yet you mis-characterize the arguments of anyone who doesn't agree with you. That kind of tactic doesn't do much for your credibility.
@David. I took Andrew's comment about "Watsoning up" as not meaning instead of a top-level handler. If you subscribe the the Windows Error Reporting for your app, it's much easier to debug problems that way than to develop a top-level handler.
The framework and libraries often have state. An exception could have been thrown by each of these, effectively leaving the application's state in a bad way--regardless of whether it has no state of its own. If you don't call the framework, or use any libraries; sure, you might be able to catch Exception whenever you want; but then what's the point of using .NET?
Another problem with catch(Exception) is that it catches all future exceptions too. Let's say that calling MethodX today could throw AException, BException or CException. If I catch(Exception) I'll catch those three exceptions. But, what if, in the future, MethodX now throws DException? catch(Exception) will now catch that too.
And what if higher level code *does* handle DException and deals with it logically because it has more context than your code wrapped with catch(Exception). The higher level code will never see that exception because it was swallowed by lower-level code thinking it's making the user's experience better--when in fact it's made the experience worse.
The lower-level code has effectively coupled itself to what higher-level code does (or doesn't do). If I *did* want that higher level code to handle that exception then I'd have to modify the lower-level code to let that exception pass through. And what would that code look like? (besides messy) The fact that the code now let's that exception pass through means it's more coupled to the higher-level code when they should be abstracted from one another.
To be clear, I'm not saying don't ever catch(Exception); just do it at the top-level (entry point for the application, for an AppDomain or for a thread, for example) and do something useful with the exception before terminating the application/AppDomain/thread. And there *are* exceptions to this too; but they're far fewer than exception to "only catch what you know you can handle".
@all: I'm not surprised this is a popular topic. Let me respond to these comments generally instead of point-by-point. In composing a point-by-point response I found myself saying the same things over and over. If there's a salient point that I do not address please feel free to remind me of it.
I agree that if your architecture is fault-tolerant and stateless then catching general exceptions is not necessarily a bad thing. I would like you to consider that the whole .NET Framework is not itself sufficiently fault-tolerant or stateless for this statement to be applied to it. In normal scenarios (including all non-hosted scenarios), if your CLR process has AV'd then I recommend shutting down the CLR process. You do NOT know the state of the CLR itself so how can you trust it to keep executing your program properly?
Consider, however, the SQL scenario. SQL catches general exceptions in managed code without shutting down the process. How do they do this safely? They use a subset of the .NET Framework which is more fault-tolerant and they carefully manage state to be transactional. The CLR is sufficiently robust to be considered a component of a fault-tolerant system but as a general purpose programming platform being fault-tolerant to this level is not the goal of the CLR. (Read about Constrained Execution Regions for some of the ugly details.)
Consider an application with a plugin framework that lets plugins run unfamiliar code. Provided that the framework limits what the plugin can do (for example, no unsafe code) then it may be reasonable for the application to catch random exceptions coming out of a plugin. If you've guaranteed that your code won't cause a condition that is unrecoverable then you've limited the scope of potential exceptions. You've "pre-handled" exceptions by preventing the ones you cannot handle. This is a big "if": in order to do this correctly you need to make sure that your plugins cannot cause exceptions that your application cannot safely ignore.
Therefore, I won't change my statement that you MUST handle the exceptional condition. Read that sentence in context of the next sentence: "Catching an exception is a statement that you can handle the exception and restore the program to a non-exceptional state." If you've guaranteed that the program will not enter an unknown state then you have handled the exception.
In reality, yes, code like Joe's often works rather fine. And doing this the pure and right way *is* much harder. The Corrupted State Exceptions feature actually helps you to differ between catastrophic errors and possibly recoverable exceptions. In fact, it allows you to write catch(exception e) with the knowledge that you will not accidentally catch an AV. In discussing this feature discussed that "catching all exceptions is bad" but we didn't make the next version of the CLR prevent you from catching all exceptions. We provided two ways to let you catch all exceptions--even AVs. We've written real programs. We're just trying to help make real programs better.
When I was on the C++ team I modified a particularly buggy tool by adding a catch(...) around main. The tool was incredibly difficult to maintain (it had been ported by a tool from Fortran to C) and it AV'd frequently. It was used in our nightly test runs and invoking Watson meant that the batch test run would hang all night. I still think adding catch(...) was the right solution given the situation because as soon as the exception was caught (and ignored) the process exited. I passed an errorlevel back to the script to mark this particular test as having run incorrectly. In my situation Windows process isolation guaranteed that I didn't mess anything up by ignoring an AV because although I ignored the AV I immediately exited the process where the AV was raised.
I'm not going to change my advice: blindly catching all exceptions is bad. If this puts me into the "Go To Statement Considered Harmful" category then so be it. Face it: the CLR is full of things that are inconsistent at a superficial level. We tell you to let the garbage collector handle all your memory, and then we give you APIs like GC.Collect(). We claim that you should only write typesafe code and then we give you the "unsafe" keyword (ok, actually, that's actually C#, but we provide the facilities for unsafe code to execute). The CLR is an amazingly flexible, general-purpose runtime. It's not possible to give any advice that works for every user. We give advice that works for most users and expect that those who listen to the advice know how to apply it properly to their situation. GC.Collect() and unsafe code should be used wisely but we provide these facilities for you to use.
So I will say it again, slightly rephrased: Do not let your program execute in an unknown state. For most programmers this statement implies "do not catch exceptions your program is not prepared to handle". I don't mind saying "catch (Exception) Considered Harmful" because I expect the advice is applicable to an overwhelmingly majority of codebases.
We are trying to make things better for programmers. One example is code contracts, coming in the next version of the .NET Framework. When I read stefan.wenig's comment, it sounded like he was asking precisely for code contracts. The BCL team has a blog post about them at http://blogs.msdn.com/bclteam/archive/2008/11/11/introduction-to-code-contracts-melitta-andersen.aspx.
Another way we're trying to make things better is Watson. The procedure Rupert Rawnsley outlined breaks at step 2 (the user contacts us, has to dig out the Windows Application Log and emails it to us...) Windows Error Reporting is there so that you don't have to have the user contact you. Developers can sign up for WER to receive dumps from crashes on users' machines. WER exists because there's often not a good path between the users' box and the developer. Also, WER frees up the code from having to take dumps of all the information needed to debug an unknown crash. (It sounds like we need a post on WER as well.) If you have the kind of relationship with your user that you can catch unknown errors, take your own crash dumps and get them back to your box for debugging, then use that relationship instead of WER. WER is there for applications that don't have robust and detailed logging facilities and a mechanism for getting error logs back to the developer.
I tried to be extremely precise in my post and my MSDN article. And I believe that my advice is consistent with most advice coming out of Microsoft (even Eric's advice...in his first "aside" he notes that relying on process isolation is a good way to handle unknown errors.) Error handling is a difficult topic to speak about in absolute terms. If you can say anything absolutely about exception handling it's this: exception handling is a hard problem that no one has solved perfectly. That includes the CLR team.
I'm not going to go back on anything I said in the post or the article. I will say this: take the advice as advice and apply it to your particular situation. My advice is meant for you to think about when you code. You don't have to do what I say. If you know that your code is executing in a fault-tolerant and stateless system that can ignore any unexpected errors then my advice really does not apply to you. Most users of the CLR don't have code that executes in a fault-tolerant and stateless system so I will continue to "preach": catch(Exception) Considered Harmful.
How are code contracts the answer? I spoke out against catching ArgumentExceptions, because that's just an example of catching your own programming errors, which is a bad thing. Static analysis might reduce the need for ArgumentExceptions, but the message remains the same: those that are still left should not be caught, but avoided.
As for the catch all: Of course you're right, catching _everything_ is not a good idea. But the other extreme, catch only the specific exceptions you expect, is just as useless, only on the other end of the scale.
There are many exceptions that can usually be caught and ignored as long as the part of the call stack that threw them does not modify data structures that outlive it. As long as this is true, I don't really care whether the exception is a FileNotFound, NullReference, IndexOutOfBounds or whatever.
I was processing message no. 27309 and encountered a NullReferenceException? There's probably something in the message that my code doesn't anticipate. Do I want to shut down the process therefore? Not if I can avoid it. Can I make 100% sure that the program still works after the catch? Probably not. Will I get a reasonably good program if I catch everything except RuntimeEngineException & Co, log everything and then proceed with the next message? Probably yes. Now what I'd expect from the CLR team is advice on which exceptions to handle, and which not. Or how avoiding global data or using immutable data could help. Be free to attach a warning, but please do help getting pragmatic solutions done. In the real world, this stuff often works. You're technically right, but we're not all coding stuff like the CLR or SQL server.
View it from another perspective: if we should only catch exceptions that we expect from looking at the documentation of the methods we're calling, then
- why does .net not have declarative exceptions? (not that I think they are such a good idea)
- why are we so obsessed with releasing stuff in the event of an exception (using, finally)? only for the occasional expected exception that travels down 10 stack frames? there's very few of those.