This is the third installment in a three-part series on why FxCop warns against catch(Exception):
FAQ: Why does FxCop warn against catch(Exception)? - Part 1
FAQ: Why does FxCop warn against catch(Exception)? - Part 2
FAQ: Why does FxCop warn against catch(Exception)? - Part 3
I said from the beginning that this issue is controversial, and some of your feedback certainly confirms that. I want to make it clear that I respect everyone’s right to disagree with me, to exclude or disable the FxCop warning, and to implement a less rigid exception policy in their application code as they see fit. Furthermore, all of the opinions I’ve expressed are entirely my own and they represent a stricter interpretation of the Framework Design Guidelines than absolutely required. I believe strongly that the approach that I’ve outlined helps to find and fix defects sooner during development and testing and therefore helps to deliver more robust production software. Nevertheless, the choice to adopt a different strategy for your application is entirely yours.
On the other hand, as I mentioned in the previous post, this FxCop rule is of paramount importance for class library code. When class libraries call back to application code (via virtual methods, interface methods, events, or delegates), it is disastrous if they swallow or disguise the arbitrary exceptions that may be raised. (To see just how aggravating this can be for library consumers, witness the frustration with the WebBrowser control that readers expressed earlier.) Furthermore, class library code cannot make any assumptions about the implications of arbitrary exceptions for application code further up the call stack, so it must let them go so that the application can decide how to handle them.
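To make the library case concrete, here is a minimal sketch. The Downloader class, its Completed event, and its IsBusy flag are hypothetical names invented for this illustration; they do not come from any real library. The point is only that a library can restore its own invariants in a finally block and still let the application's exception travel up the stack untouched.

```csharp
using System;

// Hypothetical library type, used only to illustrate the point above.
public sealed class Downloader
{
    // Raised by the library; the handlers are application code.
    public event EventHandler Completed;

    private bool _busy;

    public bool IsBusy { get { return _busy; } }

    public void Run()
    {
        _busy = true;
        try
        {
            // ... the library's own work would go here ...

            // Call back into application code. Whatever the handler throws
            // belongs to the application, so it is deliberately not caught.
            EventHandler handler = Completed;
            if (handler != null)
            {
                handler(this, EventArgs.Empty);
            }
        }
        finally
        {
            // Restore the library's invariants without swallowing or
            // disguising the exception.
            _busy = false;
        }
    }
}
```

If a Completed handler throws, the caller of Run sees that exact exception with its original type, and the Downloader is still left in a usable state.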
With that said, let me now address the most recent comments with more of my own personal opinions. :)
Jeremy: “… I hope you do realize that a number of the recommendations you have given to people, myself included, basically amount to ‘let your app crash, your customers and business partners be damned.’ …”
No, I most certainly do not realize that, because it’s simply not true. Applications that catch only the precise exceptions they are prepared to handle gracefully are easier to develop and maintain over the long run. When this strategy is combined with strong testing, the application will ship with fewer defects and there will be very few unhandled exceptions in the wild. The truth is that once an unexpected exception has been raised in your application, customers and partners are likely to suffer one way or another. The impact of corrupt program state can range from security vulnerabilities to plainly incorrect output, which can end up costing customers more time and money than dealing with a crash.
Please don’t misconstrue this to mean that it’s ok to write sloppy software which crashes regularly in common cases. Clearly that’s unacceptable and I never said otherwise. However, the right way to avoid this fate is to design your software carefully, implement it methodically, and test it thoroughly.
Jeremy: “I think all of your readers know full well that applications need to be designed to fail fast and do so clearly in order to ease the diagnosis and correction of software defects.”
I’m glad that we agree that applications should be designed to “fail fast.” From my point of view, once an exception is raised in your application and you haven’t decided exactly how to handle it on its own terms, then the application has already “failed” and process termination represents the “fast” part of the equation. :)
Jeremy: “We do this, however, by modularizing software and testing it at an appropriate level of granularity, not by letting the whole app fall down in production.”
I do agree that it’s possible to sandbox components (for example, by using separate app-domains and ensuring that all shared data is immutable) such that the failure of a single component does not imply failure for the entire process. This then becomes, as you say, a question of granularity. You can think of each component as a mini-process, and then apply all of my arguments on a component-by-component basis. That is to say that when an isolated component throws an unexpected exception, its state must be discarded and it must be reloaded before it can reliably be consumed again. If your application is hardened in this way, then you would in fact need to catch (Exception) in all of the locations that you mentioned in your first comment in order to save the host process from termination. I missed that in my original reply and I apologize.
However, I’d like to point out that it’s difficult and costly to implement this strategy correctly and I still maintain that the approach I've been recommending is perfectly viable for most applications.
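As a rough sketch of the discard-and-reload policy described above: the IComponent interface, the ComponentFactory delegate, and the ComponentHost class are all hypothetical names made up for this example. The sketch also assumes that real isolation (separate app-domains, immutable shared data) is already provided elsewhere. The code shows only the boundary policy, not the isolation itself, which is the difficult and costly part.

```csharp
using System;

// Hypothetical component contract for this sketch.
public interface IComponent
{
    string Process(string input);
}

// Hypothetical factory that builds a fresh component with clean state.
public delegate IComponent ComponentFactory();

public sealed class ComponentHost
{
    private readonly ComponentFactory _factory;
    private IComponent _current;
    private int _reloads;

    public ComponentHost(ComponentFactory factory)
    {
        _factory = factory;
        _current = factory();
    }

    public int Reloads { get { return _reloads; } }

    // Returns false if the component failed and was replaced.
    public bool TryProcess(string input, out string result)
    {
        try
        {
            result = _current.Process(input);
            return true;
        }
        catch (Exception)
        {
            // Catching everything is defensible here only because the failed
            // instance and all of its state are thrown away. A real host
            // would also log or report the exception at this point.
            result = null;
            _current = _factory();
            _reloads++;
            return false;
        }
    }
}
```

Note that the host never retries the call on the old instance; after an unexpected exception, the only thing it does with the failed component is drop it.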
Jeremy: “Your recommendations will only work well for the people replying to your posts if you are interested in personally supporting their products when deployed in the field.”
Given the choice between an application that abuses catch (Exception) and one that catches only the exceptions it can handle gracefully, I'd choose to support the second application in a heartbeat.
Jeremy: “Their customers and partners don't care about what FxCop might have recommended to the developer.”
Not directly, but they do care about software quality, reliability, and overall correctness. Let’s take two familiar examples: Microsoft Excel and the C# compiler. I don’t want Excel to crash, but it would be much worse if it ever produced an incorrect value in a spreadsheet that I use to manage my personal finances. Similarly, I would rather see the compiler ICE (Internal Compiler Error) than emit bad code. Of course, in an ideal world, Excel would never crash and the compiler would never ICE, but I’ve seen both happen and I’m sure it was in my best interest even if it upset me at the time.
Niall: “I think you have to realise that exception handling is not just for the developer's benefit. The application is there to serve the user, not the developer, so simply bombing out because you can't display a new form is going to require the user to restart the application more than is really required.”
Yes, the application is there to serve the user, not the developer; I couldn’t agree more. In that situation, you have to ask yourself why you couldn’t display the form. If the only reason you can provide is that your catch (Exception) block was hit, then you really have no idea whether it’s in the best interest of the user to continue executing. For example, what if one of your assemblies just failed to load, or you’ve run out of memory, or the root cause is actually a corruption in shared data? What will go wrong next? On the other hand, if you catch (IOException) while trying to read the CSV file that will ultimately be used to populate the new form, then you know beyond a shadow of a doubt that the correct application behavior is to inform the user that the file could not be opened and allow them to continue to use the application, which remains in a consistent state.
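The CSV case might look something like the following sketch. CsvLoader and TryLoad are hypothetical names, and the comma split is deliberately naive; only File.ReadAllLines and IOException are real .NET Framework APIs.

```csharp
using System;
using System.IO;

public static class CsvLoader
{
    // Hypothetical helper: returns the rows of the CSV file, or null if the
    // file could not be read. Only I/O failures are treated as recoverable.
    public static string[][] TryLoad(string path, out string error)
    {
        error = null;
        try
        {
            string[] lines = File.ReadAllLines(path);
            string[][] rows = new string[lines.Length][];
            for (int i = 0; i < lines.Length; i++)
            {
                // Naive split; quoted fields are out of scope for this sketch.
                rows[i] = lines[i].Split(',');
            }
            return rows;
        }
        catch (IOException ex)
        {
            // An expected, well-understood failure such as a missing file.
            // No program state has been modified, so the application can
            // tell the user and keep running.
            error = "The file could not be opened: " + ex.Message;
            return null;
        }
        // There is no catch (Exception) here. OutOfMemoryException, assembly
        // load failures, and bugs such as NullReferenceException propagate.
    }
}
```

The caller shows the error message to the user when the result is null. UnauthorizedAccessException does not derive from IOException, so if a permissions failure is also something the application can handle gracefully, it needs its own explicit catch block rather than a wider net.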
Niall: “In a previous project I worked on, we had a lot of places where exceptions were caught and error reports were sent back to the dev team. Therefore, the exceptions weren't just disappearing quietly, they were known to the developers. In many circumstances, the application would then continue. This is because a lot of our handling of exceptions would result in the form the exception came from closing or not appearing. Basically that element of the workflow did not occur. For almost all cases, this left the application in a state that allowed it to continue without problems. Of course, it wasn't 100% perfect. We did have a threshold of a certain number of exception reports in a certain amount of time would cause the handler to exit the app, though.”
Both exception logging and an exception count threshold are popular approaches for applications that catch all exceptions in some places. In fact, this is exactly how the current version of FxCop works. Our experience with FxCop’s current exception handling policy has not been positive at all, so I would not suggest this approach universally. Nevertheless, as the application developer, it’s up to you to choose the appropriate policy.
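For readers who haven't seen a threshold policy of the kind Niall describes, here is a rough sketch. It is not FxCop's actual implementation; the ExceptionThreshold class, its members, and the limits used are all invented for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical policy object; the names and limits are illustrative.
public sealed class ExceptionThreshold
{
    private readonly int _maxReports;
    private readonly TimeSpan _window;
    private readonly Queue<DateTime> _timestamps = new Queue<DateTime>();

    public ExceptionThreshold(int maxReports, TimeSpan window)
    {
        _maxReports = maxReports;
        _window = window;
    }

    // Records one caught exception at time 'now'. Returns true when the
    // number of reports inside the window exceeds the limit, meaning the
    // application should stop trying to continue and exit.
    public bool Report(Exception ex, DateTime now)
    {
        // A real handler would log 'ex' or send it to the dev team here.
        _timestamps.Enqueue(now);

        // Forget reports that have aged out of the window.
        while (_timestamps.Count > 0 && now - _timestamps.Peek() > _window)
        {
            _timestamps.Dequeue();
        }
        return _timestamps.Count > _maxReports;
    }
}
```

Passing the timestamp in explicitly keeps the sketch easy to test. Keep in mind that a counter like this only measures how often something went wrong; it says nothing about whether program state was still consistent after each caught exception.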
Niall: “Especially when the environment is such that you cannot deploy a fix at a day's notice, having the users restarting the application frequently when you could provide better is basically developing to suit the developers instead of the users.”
If it turns out that your application is crashing “frequently” enough that users have to restart it all the time, then it wasn’t ready to ship in the first place. By the time you’re ready to ship, you should have uncovered nearly all of the exceptions that you need to handle or prevent, and those that remain should be extremely uncommon. Besides careful testing on your end during the development cycle, it’s important to provide regular preview releases for your customers and partners to pound on. During that time, it will be invaluable that exception-related bugs are easy to diagnose and fix quickly. By the time you ship, you will have software which neither crashes nor swallows bugs!
Niall: “Of course, it can make diagnosing problems more difficult, but the role of the software is to make the user's job easier, not the developer's. Sometimes the developer just has to work a bit harder because otherwise the work goes on the user's plate.”
I never meant to imply that ease of diagnosis is an end unto itself. It is a means through which you can deliver better software with fewer defects, which most definitely serves to make the user’s life easier. Indeed, the developer often does have to work harder: uncovering the specific exceptions that need to be handled and the additional code that needs to be in place to prevent the others is often harder than slapping in an additional catch (Exception) block, but it’s the right thing to do…