“foreach” vs “ForEach”

“foreach” vs “ForEach”

Rate This
  • Comments 45

A number of people have asked me why there is no Microsoft-provided “ForEach” sequence operator extension method. The List<T> class has such a method already of course, but there’s no reason why such a method could not be created as an extension method for all sequences. It’s practically a one-liner:

public static void ForEach<T>(this IEnumerable<T> sequence, Action<T> action)
{
  // argument null checking omitted
  foreach(T item in sequence) action(item);
}

My usual response to “why is feature X not implemented?” is that of course all features are unimplemented until someone designs, implements, tests, documents and ships the feature, and no one has yet spent the money to do so. And yes, though I have famously pointed out that even small features can have large costs, this one really is dead easy, obviously correct, easy to test, and easy to document. Cost is always a factor of course, but the costs for this one really are quite small.

Of course, that cuts the other way too. If it is so cheap and easy, then you can do it yourself if you need it. And really what matters is not the low cost, but rather the net benefit. As we’ll see, I think the benefits are also very small, and therefore the net benefit might in fact be negative. But we can go a bit deeper here. I am philosophically opposed to providing such a method, for two reasons.

The first reason is that doing so violates the functional programming principles that all the other sequence operators are based upon. Clearly the sole purpose of a call to this method is to cause side effects. The purpose of an expression is to compute a value, not to cause a side effect. The purpose of a statement is to cause a side effect. The call site of this thing would look an awful lot like an expression (though, admittedly, since the method is void-returning, the expression could only be used in a “statement expression” context.) It does not sit well with me to make the one and only sequence operator that is only useful for its side effects.

The second reason is that doing so adds zero new representational power to the language. Doing this lets you rewrite this perfectly clear code:

foreach(Foo foo in foos){ statement involving foo; }

into this code:

foos.ForEach((Foo foo)=>{ statement involving foo; });

which uses almost exactly the same characters in slightly different order. And yet the second version is harder to understand, harder to debug, and introduces closure semantics, thereby potentially changing object lifetimes in subtle ways.

When we provide two subtly different ways to do exactly the same thing, we produce confusion in the industry, we make it harder for people to read each other’s code, and so on. Sometimes the benefit added by having two different textual representations for one operation (like query expressions versus their underlying method call form, or + versus String.Concat) is so huge that it’s worth the potential confusion. But the compelling benefit of query expressions is their readability; this new form of “foreach” is certainly no more readable than the “normal” form and is arguably worse.

If you don’t agree with these philosophical objections and find practical value in this pattern, by all means, go ahead and write this trivial one-liner yourself.

UPDATE: The method in question has been removed from the "Metro" version of the base class library.

  • Hey Eric,

    Eoin here from Stackoverflow, cheers for posting the article.

    Just going back to my original question though, phrased slightly differently...

    Since the ForEach<T> construct has been implemented for List<T> and Array, does that mean that (from the philosophical point of view) that these DataStructures are considered/treated differently to the IEnumerable/IQueryable Collections in your mind

    Well, the VS Languages team does not have any influence on what goes into List<T>. I personally find the "ForEach" method on List<T> philosophically troubling for all the same reasons that I would find an extension method on IEnumerable<T> troubling. (And the VSL team does control that.) The one mitigating factor is that List<T> is clearly designed to be a mutable, not-side-effect-free data structure, so using expressions that mutate it seems slightly less bad. -- Eric

    I suppose my original query stemmed from the fact that is _already_ implemented in two places and not in the others. So perhaps the philosophical debate was had and these two were somehow considered different enough to deserve the out-of-the-box implementation...

    (That said, I have gone ahead and implemented them myself ;-) )

  • We've coded this up, and we like it. I think it leads to cleaner code if you often use these long sequences. Eg (off the top of my head);

       // take a code file, removes 
       // comments and blank lines
       // and prints them
       codeContent
        .Split('\n')
        .Where(line => !line.StartsWith('#'))
        .Where(line => !Regex.IsMatch("^ *$", line))
        .ForEach(Console.Writeline);

    I think that because it's always at the end of one of these sequences, and you don't assign the result to anything, it's clear what's going on.

    It also helps avoid the pattern where you just do this;

       sequence.ToList().ForEach(fnc);

    Which is what you're tempted to do without this procedure. That would force the evaluation of the entire sequence before any action is committed, and for some longer tasks, this is a significant delay. This makes your app 'choppier.'

    It can also use up significant memory. I saw this a lot in the 'oxford comma' code samples, where people were converting sequences to lists to count them, without concern for the size of the list in memory. A lack of ForEach leads to these kinds of shortcuts.

  • On the other hand, AsParallel().ForEach is extremely useful :)

  • Are there any performance differences between foreach and .ForEach?

    Hussein

  • Hussein: Of course. In ForEach there is the cost of an extra delegate call for each item. The order is the same - O(n), but the net cost of the ForEach is always higher than that of foreach. That cost is very small though - on almost all cases it would be insignificant compared to the action being called.

    Also, and this is purely hypothetical, foreach can allow for some compiler optimizations that ForEach would not allow, so the difference can be a bit bigger.

    That said, micro optimizations here are useless - the idea of ForEach in my mind is simply wrong for normal scenarios. For the example Steve Cooper gave though, I find that it is usually a much needed 'Expression'.

  • List<T>.ForEach is actually not the same as foreach-statement internally (you can check that with Reflector). Instead, it's a plain for-statement over list indices.

    The difference? When you use List<T>.Enumerator (as foreach-statement does), it performs a check to see if collection is modified on every iteration on the loop. A plain indexing for-statement won't do that, so it is slightly faster.

    In practice, though, the overhead of a virtual call via a delegate is higher than that of version checking.

  • This might be the first extension method I wrote. I don't think the closure piece is important; all extension methods have that. As for side effects, that it's void should make it pretty clear what it's doing. In general, if basically everyone wants it, it should just be part of the platform, for the same reason that so many other friendly, generally useful methods are.

  • Excellent post. As my response here suggests, I agree with you completely on the extension method issue:

    http://stackoverflow.com/questions/317874/existing-linq-extension-method-similar-to-parallel-for/318493#318493

  • Nice, I asked this question on Charlie Calvert's blog a while back.

    In between times I worked out an answer for myself, which was that the ForEach implicitly 'finalises' (I don't know what the correct term is?) the IEnumerable, and so breaks the functional model.

    I have preferred this solution, for logging/debug purposes (the name is clearer too):

           public static IEnumerable<T> SideEffects<T>(this IEnumerable<T> items, Action<T> pPerfomAction)

           {

               foreach (T item in items)

               {

                   pPerfomAction(item);

                   yield return item;

               }

           }

  • > When we provide two subtly different ways to do exactly the same thing, we produce confusion in the industry, we make it harder for people to read each other’s code, and so on.

    Of course, the same disadvantage appears when you leave it to everyone to create their own (almost?) identical method. Having trivial extension methods included in the BCL is not about the cost of creating them, it's all about creating a common language for all .NET developers, something that you can read without looking things up or guessing.

    (Generally speaking - I understand your specfic reasons for not implementing ForEach. But at the same time I follow Steve Cooper here - ForEach might be a nice thing to have in some situations, from a pragmatic PoV. Whether it should be used in a specific situation is a difficult question, but providing it is only a really bad idea if you believe in the concept of protecting developers from themselves.)

  • Ok, but when are you going to implement the pipe as seen in shell scripting and F#? I'm serious, that would be freaking useful! :-)

    Using extension method chaining is kinda like using the pipe which is one reason many people miss the ForEach() method I guess. I even implemented a Tee() method that does what ForEach() does except it also passes the values on to the next step in the chain.

    Regards

  • Yesterday, I converted bunch of code from using foreach to ForEach.

    Here's what I had originally:

    public class MetaDataFixer

    {

           public List<Mp3File> Mp3FileList;

           private void CleanupTitle()

           {

               foreach(Mp3File file in Mp3FileList)

                   //do clean up of title

           }

           private void CleanupComments()

           {

               foreach(Mp3File file in Mp3FileList)

                   //do clean up of comments

           }

           //Other similar methods:

           private void CleanupAlbum();

           private void CleanupArtist();

           public void Cleanup()

           {

               if(_options.CleanupTitle)

                   CleanupTitle();

               if(_options.CleanupComments)

                   CleanupComments();

           }

    }

    I converted that in to:

    public class MetaDataFixer

    {

           public List<Mp3File> Mp3FileList;

           private readonly Action<Mp3File> cleanupAlbum =

               delegate(Mp3File file) { file.Tag.Album = StringHelper.Cleanup(file.Tag.Album); };

           private readonly Action<Mp3File> cleanupTitle =

               delegate(Mp3File file) { file.Tag.Title = StringHelper.Cleanup(file.Tag.Title); };

           //more clean up methods.

           public void Cleanup()

           {

               List<Action<Mp3File>> actionsToPerform = GetActionsToPerform();

               Mp3FileList.ForEach(delegate(string fileName)

                           {

                               Mp3File file;

                               file = Mp3File.Create(fileName);

                               actionsToPerform.ForEach(delegate(Action<Mp3File> action) { action(file); });

                               file.Save();

                            }

           }

           private List<Action<Mp3File>> GetActionsToPerform()

           {

               List<Action<Mp3File>> actionsToPerform = new List<Action<Mp3File>>();

               if (_options.CleanupAlbum)

                   actionsToPerform.Add(cleanupAlbum);

               if (_options.CleanupArtist)

                   actionsToPerform.Add(cleanupArtist);

               //more options

           }

    }

    I feel that this code is cleaner than the earlier version.

    I understand your point about closure being a potential issue with ForEach but if my delegate code is simple, the ForEach looks cleaner.

    Eric, I would be interested in any comments you may have about this re factoring.

    RJ

  • @RJ

    Create actions that work on a single file list this;

       public static CleanupAlbum(Mp3File file)

       {

         ...

       }

    Then you can execute them like

       Mp3FileList.ForEach(CleanupAlbum);

    As an idea, you might also consider an 'action combiner' -- something like this (not compiled or tested, just for illustration)

       public Action<T> Combine<T>(params Action<T>[] actions)

       {

         return item =>

         {

           foreach(var actOn in actions)

           {

             actOn(item);

           }

         };

       }

    Then you could write;

       allCleanupActions = Combine(CleanupAlbum, CleanupArtist, CleanupComments);

    and

       Mp3FileList.ForEach(allCleanupActions);

  • I use an extension method like this fairly infrequently, but it's called "Update" and is used specifically for doing database updates using Linq to SQL, and also returns the number of entities (records) updated as a secondary effect.  It's a useful method but I think it's completely fair that I had to write it myself,

    I'm not particularly taken with the "code printing" example a few comments above.  First of all, I think it's more difficult to read than the vanilla non-Linq version, and secondly the chained "wheres" will cause the entire array to be iterated twice, unless the compiler performs some kind of optimization on those particular extension methods (I doubt it).  A simple foreach with a nested if statement would have been easier to write, easier to read, way easier to debug, and would perform noticeably better for large input.  I know it was an off-the-cuff, contrived example ("off the top of my head"), but it's nevertheless a perfect example of the inappropriate use of Linq - and having a "ForEach" extension would probably encourage this type of use.

  • ForEach((x,index) => ...) is useful.

Page 1 of 3 (45 items) 123