Delay's Blog is the blog of David Anson, a Microsoft developer who works with C#, XAML, HTML, and Azure.
Previous posts introduced the Delay.FxCop custom code analysis assembly and demonstrated the benefits of automated code analysis for easily identifying problem areas in an assembly. The Delay.FxCop project included two rules, DF1000: Check spelling of all string literals and DF1001: Resources should be referenced - today I'm introducing another! The new rule follows in the footsteps of DF1001 by identifying unused parts of an assembly that can be removed to save space and reduce complexity. But while DF1001 operated on resources, today's DF1002: Uncalled methods should be removed analyzes the methods and properties of an assembly to help find those stale bits of code that aren't being used any more.
DF1000: Check spelling of all string literals
DF1001: Resources should be referenced
DF1002: Uncalled methods should be removed
Note: If this functionality seems familiar, it's because CA1811: Avoid uncalled private code is one of the standard FxCop rules. I've always been a big fan of CA1811, but frequently wished it could look beyond just private code to consider all code. Of course, limiting the scope of the "in-box" rule makes perfect sense from an FxCop point of view: you don't want the default rules to be noisy or else they'll get turned off and ignored. But the Delay.FxCop assembly isn't subject to the same restrictions, so I thought it would be neat to experiment with an implementation that analyzed all of an assembly's code.
Further note: One of the downsides of this increased scope is that DF1002 can't distinguish between methods that are part of a library's public API and those that are accidentally unused. As far as DF1002 is concerned, they're both examples of code that's not called from within the assembly. Therefore, running this rule on a library involves some extra overhead to suppress the warnings for public APIs. If it's just a little extra work, maybe it's still worthwhile - but if it's overwhelming, you can always disable DF1002 for library assemblies and restrict it to applications where it's more relevant.
Implementation-wise, DF1002: Uncalled methods should be removed isn't all that different from its predecessors - in fact, it extends and reuses the same assembly node enumeration helper introduced with DF1001. During analysis, every method of the assembly is visited and if it isn't "used" (more on this in a moment), a code analysis warning is output:
DF1002 : Performance : The method 'SilverlightApplication.MainPage.UnusedPublicMethod' does not appear to be used in code.
Of course, these warnings can be suppressed in the usual manner:
[assembly: SuppressMessage("Usage", "DF1001:ResourcesShouldBeReferenced",
MessageId = "app.xaml", Scope = "resource", Target = "SilverlightApplication.g.resources",
Justification = "Loaded by Silverlight for App.xaml.")]
It's interesting to consider what it means for a method or a property to be "used"... (Internally, properties are implemented as a pair of get/set methods.) Clearly, a direct call to a method means it's used - but that logic alone results in a lot of false positives! For example, a class implementing an interface must define all the relevant interface methods in order to compile successfully. Therefore, explicit and implicit interface method implementations (even if uncalled) do not result in a DF1002 warning. Similarly, a method override may not be directly called within an assembly, but can still be executed and should not trigger a warning. Other kinds of "unused" methods that do not result in a warning include: static constructors, assembly entry-points, and methods passed as parameters (ex: to a delegate for use by an event).
With all those special cases, you might think nothing would ever be misdiagnosed. :) But there's a particular scenario that leads to many DF1002 warnings in a perfectly correct application: reflection-based access to properties and methods. Granted, reflection is rare at the application level - but at the framework level, it forms the very foundation of data binding as implemented by WPF and Silverlight! Therefore, running DF1002 against a XAML application with data binding can result in warnings for the property getters on all model classes...
To avoid that problem, I've considered whether it would make sense to suppress DF1002 for classes that implement INotifyPropertyChanged (which most model classes do), but it seems like that would also mask a bunch of legitimate errors. The same reasoning applies to subclasses of DependencyObject or implementations of DependencyProperty (though the latter might turn out to be a decent heuristic with a bit more work). Another approach might be for the rule to also parse the XAML in an assembly and identify the various forms of data binding within. That seems promising, but goes way beyond the initial scope of DF1002! :)
Of course, there may be other common patterns which generate false positives - please let me know if you find one and I'll look at whether I can improve things for the next release.
[Click here to download the Delay.FxCop rule assembly, associated .ruleset files, samples, and the complete source code.]
For directions about running Delay.FxCop on a standalone assembly or integrating it into a project, please refer to the steps in my original post.
Unused code is an unnecessary tax on the development process. It's a distraction when reading, incurs additional costs during coding (ex: when refactoring), and it can mislead others about how an application really works. That's why there's DF1002: Uncalled methods should be removed - to help you easily identify unused methods. Try running it on your favorite .NET application; you might be surprised by what you find! :)
If you use .NET's Debug.Assert method, you probably write code to validate assumptions like so:
Debug.Assert(args == null);
If the expression above evaluates to false at run-time, .NET immediately halts the program and informs you of the problem:
The stack trace can be really helpful (and it's cool you can attach a debugger right then!), but it would be even more helpful if the message told you something about the faulty assumption... Fortunately, there's an overload of Debug.Assert that lets you provide a message:
Debug.Assert(args == null, "args should be null.");
The failure dialog now looks like this:
That's a much nicer experience - especially when there are lots of calls to Assert and you're comfortable ignoring some of them from time to time (for example, if a network request failed and you know there's no connection). Some code analysis tools (notably StyleCop) go as far as to flag a warning for any call to Assert that doesn't provide a custom message.
At first, adding a message to every Assert seems like it ought to be pure goodness, but there turn out to be some drawbacks in practice:
The comment is often a direct restatement of the code - especially for very simple conditions. Redundant redundancy is redundant, and when I see messages like that, I'm reminded of code comments like this:
i++; // Increment i
It takes time and energy to type out all those custom messages, and the irony is that most of them will never be seen at all!
Because the code for the condition is often expressive enough as-is, it would be nice if Assert automatically used the code as the message!
Aside: This is hardly a new idea; C developers have been doing this for years by leveraging macro magic in the preprocessor to create a string from the text of the condition.
Further aside: This isn't even a new idea for .NET; I found a couple places on the web where people ask how to do this. And though nobody I saw seemed to have done quite what I show here, I'm sure there are other examples of this technique "in the wild".
As it happens, displaying the code for a condition can be accomplished fairly easily in .NET without introducing a preprocessor! However, it requires that calls to Assert be made slightly differently so as to defer execution of the condition. In a normal call to Assert, the expression passed to the condition parameter is completely evaluated before being checked. But by changing the type of the condition parameter from bool to Func<bool> and then wrapping it in the magic Expression<Func<bool>>, we're able to pass nearly complete information about the expression into the Assert method where it can be used to recreate the original source code at run-time!
To make this a little more concrete, the original "message-less" call I showed at the beginning of the post can be trivially changed to:
DebugEx.Assert(() => args == null);
And the DebugEx.Assert method I've written will automatically provide a meaningful message (by calling the real Debug.Assert and passing the condition and a message):
The message above is identical to the original code - but maybe that's because it's so simple... Let's try something more complex:
DebugEx.Assert(() => args.Select(a => a.Length).Sum() == 10);
Assertion Failed: (args.Select(a => a.Length).Sum() == 10)
Wow, amazing! So is it always perfect? Unfortunately, no:
DebugEx.Assert(() => args.Length == 5);
Assertion Failed: (ArrayLength(args) == 5)
The translation of the code to an expression tree and back seems to have lost a little fidelity along the way; the compiler translated the Length access into an expression tree that doesn't map back to code exactly the same. Similarly:
DebugEx.Assert(() => 5 + 3 + 2 >= 100);
Assertion Failed: False
In this case, the compiler evaluated the constant expression at compile time (it's constant, after all!), and the information about which numbers were used in the computation was lost.
Yep, the loss of fidelity in some cases is a bit of a shame, but I'll assert (ha ha!) that nearly all the original intent is preserved and that it's still quite easy to determine the nature of the failing code without having to provide a message. And of course, you can always switch an ambiguous DebugEx.Assert back to a normal Assert and provide a message parameter whenever you want. :)
[Click here to download the source code for DebugEx.Assert and the sample application used for the examples above.]
DebugEx.Assert was a fun experiment and a great introduction to .NET's powerful expression infrastructure. DebugEx.Assert is a nearly-direct replacement for Debug.Assert and (similarly) applies only when DEBUG is defined, so it costs nothing in release builds. It's worth noting there will be a bit of extra overhead due to the lambda, but it should be negligible - especially when compared to the time you'll save by not having to type out a bunch of unnecessary messages!
If you're getting tired of typing the same code twice, maybe DebugEx.Assert can help! :)
The code for DebugEx.Assert turned out to be simple because nearly all the work is done by the Expression(T) class. The one bit of trickiness stems from the fact that in order to create a lambda to pass as the Func(T), the compiler creates a closure which introduces an additional class (though they're never exposed to the developer). Therefore, even simple statements like the original example become kind of hard to read: Assertion Failed: (value(Program+<>c__DisplayClass0).args == null).
Assertion Failed: (value(Program+<>c__DisplayClass0).args == null)
To avoid that problem, I created an ExpressionVisitor subclass to rewrite the expression tree on the fly, getting rid of the references to such extra classes along the way. What I've done with SimplifyingExpressionVisitor is simple, but seems to work nicely for the things I've tried. However, if you find scenarios where it doesn't work as well, I'd love to know so I can handle them too!