All about Async/Await, System.Threading.Tasks, System.Collections.Concurrent, System.Linq, and more…
Back in the October 2007 issue of MSDN Magazine, we published an article on the beginning stages of what has become the Task Parallel Library (TPL) that's part of the Parallel Extensions to the .NET Framework. While the core of the library and the principles behind it have remained the same, as with any piece of software in the early stages of its lifecycle, the design changes frequently. In fact, one of the reasons we've put out an early community technology preview (CTP) of Parallel Extensions is to solicit your feedback on the APIs so that we know if we're on the right track, how we may need to change them further, and so forth. Aspects of the API have already changed since the article was published, and here we'll go through the differences.
First, the article refers to System.Concurrency.dll and the System.Concurrency namespace. We've since changed the name of the DLL to System.Threading.dll, and TPL is now contained in two different namespaces, System.Threading and System.Threading.Tasks. AggregateException, the higher-level Parallel class, and Parallel's supporting types (ParallelState and ParallelState<TLocal>) are contained in System.Threading, while all of the lower-level task parallelism types are in System.Threading.Tasks.
Next, the second page of the article states "if any exception is thrown in any of the iterations, ... the first thrown exception is rethrown in the calling thread." The semantics here have changed such that we now have a common exception handling model across all of the Parallel Extensions, including PLINQ. If an exception is thrown, we still cancel all unstarted iterations, but rather than just rethrowing one exception, we bundle all thrown exceptions into an AggregateException container exception and throw that new exception instance. This allows developers to see all errors that occurred, which can be important for reliability. It also preserves the stack traces of the original exceptions.
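The exception model just described can be sketched roughly as follows. This is only an illustration, and it assumes the API as it later shipped in .NET 4 (where Parallel lives in System.Threading.Tasks and AggregateException in System) rather than the CTP namespaces described in this post. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class AggregateExceptionDemo
{
    // Hypothetical helper: runs a loop in which two iterations are written to
    // fail, and returns the messages of whichever exceptions were collected.
    public static string[] RunAndCollectErrors()
    {
        try
        {
            Parallel.For(0, 100, i =>
            {
                if (i == 3 || i == 97)
                    throw new InvalidOperationException("iteration " + i + " failed");
            });
            return new string[0];
        }
        catch (AggregateException ae)
        {
            // Every exception thrown by an iteration that actually ran is
            // available, with its own stack trace, through InnerExceptions.
            return ae.InnerExceptions
                     .Select(e => e.Message)
                     .OrderBy(m => m)
                     .ToArray();
        }
    }

    static void Main()
    {
        foreach (string message in RunAndCollectErrors())
            Console.WriteLine(message);
    }
}
```

Because unstarted iterations are cancelled once a failure is observed, the loop may report one failure or both, depending on scheduling; the catch block should not assume a particular count.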
In the section on aggregation, the article talks about the Parallel.Aggregate API. Since the article was published, we've dropped this method from the Parallel class. Why? Because a) PLINQ already supports parallel aggregation through the ParallelEnumerable.Aggregate extension method, and b) because aggregation can be implemented with little additional effort on top of Parallel.For if PLINQ's support for aggregation isn't enough. Consider the example shown in the article:
int sum = Parallel.Aggregate(0, 100, 0, delegate(int i) { return isPrime(i) ? i : 0; }, delegate(int x, int y) { return x + y; });
We can implement this with PLINQ as follows:
int sum = (from i in ParallelEnumerable.Range(0, 100) where isPrime(i) select i).Sum();
Of course, this benefits from LINQ and PLINQ already supporting a Sum method. We can do the same thing using the Aggregate method for general reduction support:
int sum = (from i in ParallelEnumerable.Range(0, 100) where isPrime(i) select i).Aggregate((x,y) => x+y);
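For readers who want to run the two PLINQ queries above, here is a self-contained sketch. It assumes the PLINQ API as it shipped in .NET 4 (System.Linq.ParallelEnumerable). The IsPrime helper is a hypothetical stand-in for the article's isPrime, which is not shown in this post. Note that the second argument of ParallelEnumerable.Range is a count, so 100 is used here to cover the values 0 through 99.

```csharp
using System;
using System.Linq;

static class PlinqSumDemo
{
    // Hypothetical stand-in for the article's isPrime: simple trial division.
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    // Sum of the primes in [0, 100) using PLINQ's built-in Sum.
    public static int SumWithSum()
    {
        return (from i in ParallelEnumerable.Range(0, 100)
                where IsPrime(i)
                select i).Sum();
    }

    // The same reduction expressed through the general Aggregate operator.
    public static int SumWithAggregate()
    {
        return (from i in ParallelEnumerable.Range(0, 100)
                where IsPrime(i)
                select i).Aggregate((x, y) => x + y);
    }

    static void Main()
    {
        Console.WriteLine(SumWithSum());       // prints 1060
        Console.WriteLine(SumWithAggregate()); // prints 1060
    }
}
```

The unseeded Aggregate overload throws if the filtered sequence is empty, so a seeded overload may be the safer choice when the input range might contain no matches.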
If we prefer not to use PLINQ, we can do a similar operation using Parallel.For (in fact, this is very similar to how Parallel.Aggregate was implemented internally):
int sum = 0;
Parallel.For(0, 100, () => 0, (i, state) =>
{
    if (isPrime(i)) state.ThreadLocalState += i;
},
partialSum => Interlocked.Add(ref sum, partialSum));
Here, we're taking advantage of the overload of Parallel.For that supports thread-local state. On each thread involved in the Parallel.For loop, the thread-local state is initialized to 0 and is then incremented by the value of every prime number processed by that thread. After the thread has completed processing all iterations it's assigned, it uses an Interlocked.Add call to store the partial sum into the total sum. In fact, if you found yourself needing Aggregate functionality a lot, you could generalize this into your own ParallelAggregate method, something like the following:
static T ParallelAggregate<T>(
    int fromInclusive, int toExclusive, T seed,
    Func<int, T> selector, Func<T, T, T> aggregator)
{
    T result = seed;
    object aggLock = new object();
    Parallel.For(fromInclusive, toExclusive, () => seed,
        (i, state) =>
        {
            // Fold this iteration's value into the thread-local partial result.
            state.ThreadLocalState = aggregator(state.ThreadLocalState, selector(i));
        },
        partial =>
        {
            // Merge each thread's partial result into the total under a lock.
            lock (aggLock) result = aggregator(partial, result);
        });
    return result;
}
With this, you can use ParallelAggregate just as is done with Parallel.Aggregate in the article:
int sum = ParallelAggregate(0, 100, 0, delegate(int i) { return isPrime(i) ? i : 0; }, delegate(int x, int y) { return x + y; });
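As a rough sketch of how the same helper might look against the released library, the version below assumes the Parallel.For overload with thread-local state as it shipped in .NET 4. There, the body receives the local value as a parameter and returns the updated value, instead of assigning state.ThreadLocalState as in the CTP. The IsPrime helper and the class name are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

static class ParallelAggregateDemo
{
    // Hypothetical stand-in for the article's isPrime: simple trial division.
    static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    // The seed initializes both the final result and each worker's partial
    // result, so it should be an identity value for the aggregator
    // (for example, 0 for addition).
    public static T ParallelAggregate<T>(
        int fromInclusive, int toExclusive, T seed,
        Func<int, T> selector, Func<T, T, T> aggregator)
    {
        T result = seed;
        object aggLock = new object();
        Parallel.For(fromInclusive, toExclusive,
            () => seed,                                              // per-worker initial value
            (i, loopState, local) => aggregator(local, selector(i)), // fold one iteration
            partial =>
            {
                // Merge each worker's partial result into the total under a lock.
                lock (aggLock) result = aggregator(partial, result);
            });
        return result;
    }

    public static int SumOfPrimesBelow100()
    {
        return ParallelAggregate(0, 100, 0,
            i => IsPrime(i) ? i : 0,
            (x, y) => x + y);
    }

    static void Main()
    {
        Console.WriteLine(SumOfPrimesBelow100()); // prints 1060
    }
}
```

The aggregator should be associative and commutative, since the order in which partial results are merged is not deterministic.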
Moving on to the Task class, we've made some fairly substantial changes to the public facing API. Here's a summary of the differences from what's described in the article:
In addition to changes to the Task class, there have also been changes to the TaskManager class described in the article:
That sums up the changes we've made. Even with these changes, the article should still provide you with a good overview of the library and its intended usage. For more information, check out the documentation that's included with the CTP, and stay tuned to this blog! As mentioned before, we're very interested in your feedback on the API, so please let us know what you think (the good and the bad).
I've read some articles about parallel programming with C# in a Spanish magazine "Solo Programadores", but they are in Spanish :-)
I am from Spain. However, I found a very interesting article on Packt Publishing's website:
http://www.packtpub.com/article/simplifying-parallelism-complexity-c-sharp
The author is a well-known Spanish writer. It seems he is beginning to work in English as well.
The article is a chapter of his "C# 2008 and 2005 Threaded Programming" book. Sounds very interesting.
I am currently reading Joe Duffy's "Concurrent Programming on Windows". Highly recommended.
To complement it, I bought "C# 2008 and 2005 Threaded Programming"; some reviewers said it had funny examples.
I love exploiting multicore CPUs. It is incredible to see tasks finishing in less time using many cores.
Rednael,
Great post! Your article is an amazing piece of work! Highly recommended. I'll be working with your framework to make some tests.
Omar,
The article you are recommending is also very useful. I am also reading Joe Duffy's book.
I found information about Hillar's book on www.multicoreinfo.com. It seems that he is a very important author in the Spanish-speaking world. I know just one word in Spanish: "Hola" means "Hello". :-)
I've read the book's table of contents and it seems like good work. Worth reading too.
I think we must read a lot to understand multicore programming. Books are great resources, and posts like Rednael's are, too.