Reflections on coding style

Reflections on coding style

  • Comments 15

I've been monitoring my coding style for quite a while and I noticed that I often prefer chaining method calls to normal control flow. Now I think it might not always be such a good idea.

To give you an example, today I had a task of reflecting over constants in a class and stuffing them into a dictionary. Here's a sample to demonstrate my original approach. In this sample I replaced the actual domain model with a fake colors domain and the task is to implement two methods: FillColors and OutputColors.

using System;
using System.Collections.Generic;

static class Program
{
    public const string Red = "#FF0000";
    public const string Green = "#00FF00";
    public const string Blue = "#0000FF";

    static void Main(string[] args)
    {
        var dictionary = new Dictionary<string, string>();
        FillColors(dictionary.Add);
        OutputColors(dictionary);
    }

    static void FillColors(Action<string, string> add)
    {
        var fields = typeof(Program).GetFields();
        fields.ForEach(f => add(f.Name, f.GetRawConstantValue().ToString()));
    }

    static void OutputColors(Dictionary<string, string> dictionary)
    {
        dictionary.ForEach(p => Console.WriteLine(p.Key + " = " + p.Value));
    }

    static void ForEach<T>(this IEnumerable<T> list, Action<T> action)
    {
        foreach (var item in list)
        {
            action(item);
        }
    }
}

First, several interesting comments about the code.

  • constants are public because otherwise Type.GetFields won't find them. And for the life of me I couldn't figure out how the overload of GetFields works that accepts BindingFlags. I didn't want to spend more than 2 minutes on that so I just made the constants public.
  • there are no more access modifiers, because members are private by default and top-level types are internal by default if the access modifier is missing. Jon Skeet wrote in his C# in Depth book that he prefers omitting the modifier if it corresponds with the default visibility and I really like this style. Makes the programs shorter. Short programs are especially useful for us testers, where we need to find a minimal repro for bugs that we report.
  • FillColors accepts a delegate "add" which I like calling "the collector". It could have accepted the entire dictionary, but I only use the add functionality so to minimize coupling I only require what I actually consume. An alternative here would be to implement an iterator using yield return, but this would require returning a list of pairs of values, and Tuples were not in our TFS branch yet at the moment of writing :) Another reason for "the collector" pattern is that yield return doesn't compose well - flattening lists like Comega does is still not in C# as of 4.0. Contrary to that, passing a delegate (Continuation-Passing Style) composes nicely and is fully supported starting from C# 1.0.
  • OutputColors doesn't accept IEnumerable<KeyValuePair<string, string>> because I was to lazy to spell that out and it is not relevant to the current discussion :)
  • Also OutputColors doesn't accept Action<string> and hardcodes Console.WriteLine because of the KISS/YAGNI principle.
  • I don't know why KISS/YAGNI didn't apply to the FillColors method. It just didn't.

But I digress. Let's go back to the actual topic of me writing one long line:

fields.ForEach(f => add(f.Name, f.GetRawConstantValue().ToString()));

instead of:

foreach (var f in fields)
{
    add(f.Name, f.GetRawConstantValue().ToString());
}

Here are several observations that favor foreach against ForEach:

ForEach is not in the library

As of .NET 3.5 SP1, ForEach is not in the library so I had to implement it myself. For .NET 4.0 I've logged a suggestion with Mads to add it to the BCL, but I'm not sure if we're adding it for 4.0. A lot of people seem to add it to their libraries so I think it'll be useful to have regardless of the coding style it suggests. I'm sure it is useful at least sometimes.

ForEach hides state mutation

Eric Lippert told me that he's not a great fan of ForEach because it conceals state mutation - it takes an eager (non-lazy) control flow statement that mutates state and hides it behind a functionally-looking call.

You can't put a breakpoint inside ForEach "body"

If you use foreach, you can put a breakpoint on the add call, whereas you can't put a breakpoint on the add call in the first case. This is not a language problem, but rather a tooling issue, but still I don't see tools (yes, debugger, I'm looking at you!) becoming comfortable enough in this scenario anytime soon.

Stack is polluted

In the first case (ForEach), two additional frames are on the stack which looks kinda ugly during debugging:

image

vs. the original:

image

This may not have a good impact on the debugging story.

Performance is worse

Although I know that micro-managing performance is a bad thing to do, we still create an anonymous method (metadata bloat) and have two additional calls on the stack (I know that JIT will probably optimize them, but who knows). So the danger is that in critical codepaths (bottlenecks) the performance will degrade compared to using control flow (which the JIT will compile into a couple of nice fast JMP instructions).

Readability suffers

Readability is probably worse because ForEach is not as familiar as foreach. Also, the code lines tend to grow long with the first approach and I like my lines of code less than 80 characters.

ForEach is NOT functional style

I'd like to stress that I'm still a huge fan of functional style, immutability, composability, first-class functions, etc. The ForEach style described in this blog-post is NOT functional style, although it definitely looks like it.

  1. It is NOT immutable
  2. It is NOT idempotent
  3. It does not chain as Kevin mentions
  4. It is NOT lazy and doesn't have deferred execution like many LINQ operators which are truly functional

We had a hallway conversation with Chris Smith from F# team about this coding style topic today, and even he (the functional programming guy and a huge F# fan) admitted that sometimes C# should better use C# native constructs instead of borrowing from other (albeit awesome) languages like F#.

Summary

So I guess I'm going back to more using good old control flow statements again as they deserve it. They have been here for 30 years for a good reason and I don't think they're going away anytime soon.

  • I have to say, Eric has pretty much convinced me that being explicit about access modifiers is a good idea. I'm in two minds about the whole thing. I almost wish you *had* to be explicit...

  • I really like the ForEach() extension, even though I have to agree with your negative points above with one exception: readability. With everything else factored into lambdas, I find reading a ForEach at the end quite easy to read. Perhaps that's just me.

    I also wonder whether we can't revise the ForEach to implement the ExpressionVisitor and modify the underlying expression, which would then allow lazy load, immutability, etc. This would certainly work for LINQ to SQL and Entity Framework, as the ForEach is really just a batch UPDATE. For objects, well, I'm not sure what that would become other than a "foreach"-yielded monad. And now my head hurts.

    I'm planning to try this out when I get back to that extension in my current demo app. We'll see how that goes.

  • Some of these notes come from my own taste, some from best practices. I thought I'd share:

    > "I couldn't figure out how the overload of GetFields works that accepts BindingFlags"

    Not a good excuse if you're writing a library, but if you're writing code for your own uses it doesn't really matter.

    I would use a dictionary, though, since there's no reason for these to be constants (they're never referred to explicitly).

    > "there are no more access modifiers"

    Tch, I don't think this is a good thing. You should be as explicit as possible, since not everyone knows the defaults (and defaults may, at times, change).

    > "FillColors accepts a delegate "add" [...]"

    I would just return a dictionary and be done with it. That would coincide with your general view of immutability and chainability.

    > "You can't put a breakpoint inside ForEach "body""

    You can. Simply stand on the lambda body and press F9.

  • No access modifiers = nice. Like unnecesary type annotations, these just clutter things up.

    I don't see why the foreach loop construct is better than a function, except for debugging reasons. When I use C#, we end up having all sorts of addons that should have been in the library, such as tuples. (And, List<T> has ForEach already...)

    I don't see how it's hiding anything - it's pretty obvious that ForEach is eager. Things like "Enumerable.GroupBy" seem much scarier in that sense: Eager (although it's possible to have a non-eager one), but fits right in alongside all the lazy functions.

    ForEach is non-functional? Not purely functional, no. And perhaps not even technically functional, since "void" has no value. But practically, it acts as a higher order "function".

    Anyways, the real problem here is that you're doing this whole thing via mutation. Instead, the "Collecting" method should return an enumerable, and you can go put it into a dictionary. The real problem is that there's no AddRange of constructor that takes an IEnumerable.

    Even after we resign ourselves to making repeated Add calls, the next problem, as you noted, is that C# can't represent multiple values. Maybe C# 5 will fix that.

    Even so, IDictionary has the proper Add method(takes a KeyValuePair (much better name than Tuple, right?)) -- Dictionary just chooses to implement it explicitly (no idea why). Thus, simply return your enumerable of KeyValuePair and move your mutation one level up, right next to the constructor.

    If you want to avoid all mutation, use Enumerable.ToDictionary on the enumerable of KeyValuePair:

    IEnumerable<KeyValuePair<A,B>> getColors() { return typeof(X).GetFields().Select(f => new KVP<A,B>(...)); }

    var mydic = getcolors().ToDictionary(kvp => kvp.Key, kvp => kvp.Value);

    There, no mutation :).

    As far as "hey stick to C# when in C#" -- that's good advice. Not because it's going to be better code, but because C# can't swing it when it comes to what would otherwise be perfectly succint code. For instance, check the number of type parameters we had to manually specify and provide, due to C#'s minimal type inference, and lack of automatic generalization.

  • Nice summary. I was reading the post to see if you had the reason I personally prefer foreach and indeed you did: Easier Debugging (breakpoint + call stack)

  • Just to note that it is perfectly possible to set a breakpoint on the body of the ForEach if you rewrite it like this :

    fields.ForEach(f =>

               {

                   add(f.Name, f.GetRawConstantValue().ToString());

               });

  • Do you only use ForEach for simple statements?  By this I mean do you have 'continue' and 'break' implemented in your ForEach implementation or do you switch to foreach if the logic gets that "advanced".

  • Jon:

    ====

    if you convinced me that it's good to be implicit, and Eric convinced you that it's good to be explicit, the next logical step would be for me to convince Eric that it's good to be implicit ;-)

    Seriously, it's mostly a matter of taste and personal preference.

    Ryan:

    =====

    I agree, ForEach can be quite readable if used sensibly. I'm not saying I will get rid of ForEach completely. Moreover, it's ideal for short code like

    myStringList.ForEach(Console.WriteLine);

    And of course, you're totally free to use whatever you find appropriate.

    Omer:

    =====

    1. Agreed, in serious projects it is worth spending time to figure things out.

    2. Yes, if constants are not used explicitly, a Dictionary is better. In my case they are, but I didn't mention all the requirements :)

    3. It looks like both Eric and you are fans of explicit access modifiers. That's cool.

    4. Just returning a Dictionary works too, of course.

    5. I'm not sure if that's going to work if a lambda consists of one expression without the curlies. On an unrelated note, in VS 2008 SP1 we introduced a new feature called "Step Into Specific", but, again, I'd need to check if it's applicable here.

    MichaelGG:

    ==========

    I agree that ForEach is in some way functional (higher-order function). Sometimes this is really beneficial, I was just warning people to use it sensibly.

    You can add AddRange to any ICollection using extension methods.

    .NET 4.0 will have Tuples.

    Petar:

    ======

    You're right, thanks.

    Mark:

    =====

    Yes, I'm only using ForEach in the way it's implemented above. Making it any more complex will be a clear indication to preserve control flow inline. Otherwise the whole thing just grows unwieldly. Also, most of the existing ForEach implementations I've seen out there are exactly like this - just a wrapper around a foreach stament. It would be counter-intuitive to provide any different implementation.

    Thanks everyone for your thoughtful comments, this experience exchange is really helpful for me!

    Kirill

  • Welcome to the 48th Community Convergence. The C# team continues to work hard to get out the next version

  • Hey, Kirill -- long time, no see

    MichaelGG already stated what I was going to say -- stick as close to the IEnumerable<T>/ICollection<T> model as possible, and allow T to be, in this case, KVP<TKey, TValue>.

    And for what it's worth, var and implicit access modifiers are my game.  I'd be perfectly happy if we just get rid of the private keyword altogether.  But then, I'd also be happy to be rid of public fields, or fields as a whole, and move them to within the property:

    int Foo

    {

      storage _foo;

      get { return _foo; }

      set { _foo = value; }

    }

    .. this avoids cluttering up the private API, reducing the number of paths to mutate _foo.

    Of course, there might be a way to hide the private member from Intellisense... I just don't recall.

  • public constants are a TERRIBLE idea and should NEVER be used in code. I would really prefer if the compiler would throw an error if you tried to delcare a const public. The reason why they can't be found with GetFields if they are not public is becuase they DON'T exist, they are constants so the compiler does a replace as expected and when the assembly is generated the const's aren't there.

    These types of fields should be exposed as public readonly this way all derived assemblies do not need to be recompiled once the assembly with the const is updated.

    Oh but it's a const it will NEVER change... that's why we call is a const. In theory this could be correct, but in practice it is not.

  • In my recent post about coding styles one particular thing provoked the majority of feedback and discussions:

  • BindingFlags, in my recollection, were tricky.  I once wrote an object model in 1.1 with custom attributes for serialization to/from a fixed-length host mainframe message set (reflection cached singleton for perf reasons).  But that's not why I'm here.  Very interesting points you made a good few of which I'll take to heart.

    What are your thoughts on the performance implications of generics such as List<T>?  Should there be a concern with these when iterating or in other scenarios, I just have a slight uneasy feeling about the amount of reflection under the hood...

    TIA

  • Jim, List<T>, Dictionary<K,V> etc. are tightly optimized and I wouldn't worry about their performance. As far as I'm concerned, using these doesn't go through reflection at all.

    Have you used Reflector on List<T>? It's just a simple array underneath.

  • I appreciate the response.  Yes, now I see.  Thank you.

Page 1 of 1 (15 items)
Leave a Comment
  • Please add 3 and 2 and type the answer here:
  • Post