I am sure that most people who use C# regularly use the foreach construct. Intimately understanding this construct is not usually required for proper usage. The syntax is straightforward and it would seem that the semantics would also be straightforward; however, beginning in C# 2.0 there is a subtle gotcha that can cause seemingly odd behavior. Take this code for example:
using
You would expect that this program would just list the numbers zero through five on sequential lines. This is exactly what the program does; however, what will this slight modification do to the output:
So what is the output to this program. It just repeats the number five on six lines. Why? Well, there are two important concepts to grasp. First, how the foreach statement works and second how anonymous methods "capture" local variables. A quick glance at the ECMA spec tells us that the foreach statement is rewritten like this:
foreach
becomes (if collection implements IEnumerable<T> other situations are similar)...
IEnumerator
Note that the declaration of element is outside of the loop. It could have been in the loop and in most cases the placement does not matter, but the spec says that it is outside of the loop. Now we move onto anonymous methods.
Anonymous methods capture local variables by putting them on the heap instead of on the stack. A private nested class is generated by the compiler that contains the local variables that are captured. This class is then instantiated when the local variables come into scope. Since the CLR does not provide native support for anonymous methods, these methods are rewritten in constructs that do exist. Here is what the code looks like when the anonymous method and foreach statements are rewritten.
Since the locals are created outside of the loop, the same instance of locals1 is being used by all of the delegates. This leads to them all displaying the same value. How can we fix this? Here is how:
using System;class Program { private delegate void Proc(); static void Main(string[] args) { int[] numbers = { 0, 1, 2, 3, 4, 5 }; Proc[] actions = new Proc[numbers.Length]; int current = 0; foreach (int number in numbers) { int temp = number; actions[current++] = delegate { Console.WriteLine(temp); }; } foreach (Proc action in actions) { action(); } }}
With our newfound knowledge, we can implement a parallelized version of the Map and ForEach methods. First, lets start with the serial versions.
This displays the squares of the integer values zero through five inclusively. Now we can parallelize them.
Just for kicks, we might now write this code with the new C# 3.0 syntax.
using System;using System.Query;using System.Threading;using System.Collections.Generic;static class Extensions { class ThreadAdapter<T,U> { Thread thread; T input; U output; Func<T,U> func; public ThreadAdapter(Func<T,U> func, T input) { this.input = input; this.func = func; thread = new Thread(new ThreadStart(Execute)); } void Execute() { output = func(input); } public Thread Thread { get { return thread; } } public U Output { get { return output; } } } public static IEnumerable<U> Map<T,U>(this IEnumerable<T> list, Func<T,U> func) { var threads = new List<ThreadAdapter<T,U>>(); foreach (var item in list) { var thread = new ThreadAdapter<T,U>(func, item); threads.Add(thread); thread.Thread.Start(); } foreach (var thread in threads) { thread.Thread.Join(); yield return thread.Output; } } public static void ForEach<T>(this IEnumerable<T> list, Action<T> action) { var threads = new List<Thread>(); foreach (var item in list) { var temp = item; var thread = new Thread(new ThreadStart(() => { action(temp); })); threads.Add(thread); thread.Start(); } foreach (var thread in threads) { thread.Join(); } }}class Program { static void Main() { var numbers = new [] { 0, 1, 2, 3, 4, 5 }; numbers.Map(x => x * x).ForEach(x => { Console.WriteLine(x); }); }}
Enjoy!