What’s new in Beta 1 for the Task Parallel Library? (Part 1/3)


The Task Parallel Library (TPL) has come a long way since its inception.  Over the course of several CTPs, it has evolved to be an important and central new component in the .NET Framework.  Most recently, TPL was released as part of mscorlib.dll in the Visual Studio 2010 and .NET Framework 4.0 CTP around the October 2008 timeframe.   However, due to the nature of large software projects, that release was actually based on code from back in July!  Needless to say, since last summer, we’ve invested a lot of effort into making sure that the right functionality is exposed through the right set of APIs.  And as a result, TPL has changed considerably.  In this set of posts, we’ll walk through some of the changes so you’ll be ready for the next preview release of .NET 4.0 (no guarantees at this time regarding when that will be).  Of course, as with any software project, TPL may change even more between now and when it’s released, so we’re very interested in any feedback that you may have!

In this first post, we’ll talk about some changes under the covers, some redesigns in System.Threading.Parallel, and some new cancellation features (Tasks and Tokens).

Under the Covers

TPL now uses the .NET ThreadPool as its default scheduler.  As part of this effort, the ThreadPool has undergone a number of significant functional improvements:

• Work-stealing queues were introduced internally for use by TPL (see Daniel Moth's post on the new CLR 4 ThreadPool engine).

• Hill-climbing algorithms were introduced to quickly determine, and adjust to, the optimal number of threads for the current workload.

• Coordination and synchronization types such as SpinWait and SpinLock are now used internally.

Also, the whole of Parallel Extensions just emerged from a performance push.  In TPL, this included work to decrease the overheads of loops and the launching of Tasks.  We’re by no means done where performance is concerned, but you should notice improved performance for a variety of scenarios.

System.Threading.Parallel

The TPL feature crew spent many hours in design meetings, and this has resulted in quite a few changes for our Parallel APIs.  Here are the most significant ones.

ParallelOptions

A common request/question we’ve gotten based on previous CTPs is the ability to limit the concurrency level of parallel loops.  Folks would create a new TaskManager (specifying the number of processors and/or the number of threads per processor) just to achieve this scenario, and many were still unsuccessful.  We now provide a better, more intuitive solution.

The new ParallelOptions class contains properties relevant to the APIs in System.Threading.Parallel.  One of these properties is MaxDegreeOfParallelism, which does what it sounds like it does.  The default value, -1, causes a Parallel API to attempt to use all available cores, but this can be overridden.  For example, the following loop will run on no more than two cores regardless of how many exist in the machine:

var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };

Parallel.For(0, 1000, options, i =>
{
    ...
});

 

By consolidating options into the ParallelOptions class, we were able to eliminate quite a few existing overloads and prevent exploding the number of overloads when adding new options.  Some new properties include a CancellationToken that will be monitored to determine whether a Parallel call should exit early, and a TaskScheduler that can be used to specify the scheduler on which to execute.  Both of these options are explored more in later sections.
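As a sketch of how these options compose (based on the property names described above; exact behavior may still change before release), a single loop can be given both a concurrency cap and a token to observe:

```csharp
var tokenSource = new CancellationTokenSource();

// Cap the loop at two workers and ask it to observe the token.
var options = new ParallelOptions
{
    MaxDegreeOfParallelism = 2,
    CancellationToken = tokenSource.Token
};

try
{
    Parallel.For(0, 1000, options, i =>
    {
        // ... loop body; the loop checks the token between iterations
        // and stops launching new ones once cancellation is requested.
    });
}
catch (OperationCanceledException)
{
    // Reached only if tokenSource.Cancel() was called while the loop ran.
}
```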

Thread-local State

In previous releases, we supported thread-local state via a ParallelState<TLocal> class.  For example, to get the sum of 0-99:

int sum = 0;

Parallel.For(0, 100,
    // Initialize all thread-local states to 0.
    () => { return 0; },

    // Accumulate the iteration count.
    (int i, ParallelState<int> loopState) =>
    {
        loopState.ThreadLocalState += i;
    },

    // Accumulate all the final thread-local states.
    (int finalThreadLocalState) =>
    {
        Interlocked.Add(ref sum, finalThreadLocalState);
    });

In the above, loopState (a ParallelState<int> instance) stores each of the thread-local states in a property.  However, loopState would also be used to prematurely break out of the loop (using the Break or Stop methods).  For a cleaner design, we decided to separate these two functionalities by:

• Renaming ParallelState to ParallelLoopState (used to break out of loops prematurely, check whether a loop has been stopped by another iteration, etc.)

• Removing ParallelState&lt;TLocal&gt; and baking thread-local state into the signatures of the Parallel.For and ForEach overloads

To achieve the above scenario now, the body delegate would be:

    // Accumulate the iteration count.
    (int i, ParallelLoopState loopState, int threadLocalState) =>
    {
        return threadLocalState + i;
    },

Note that the body delegate is now a Func that returns a TLocal – in this case, an int.  Each iteration of the body is passed the current thread-local state and must return the possibly-updated state.
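Putting the three delegates together, the sum-of-0-to-99 example looks roughly like this under the new overload shape (a sketch; the exact overload signatures may still shift before release):

```csharp
int sum = 0;

Parallel.For(0, 100,
    // localInit: each worker thread starts with a local state of 0.
    () => 0,

    // body: fold the current index into the worker-local state and return it.
    (i, loopState, threadLocalState) => threadLocalState + i,

    // localFinally: merge each worker's final state into the shared total.
    finalThreadLocalState => Interlocked.Add(ref sum, finalThreadLocalState));

// sum == 4950
```

Only the localFinally delegate touches shared state, so the Interlocked.Add is the one synchronization point per worker rather than one per iteration.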

Tasks and Tokens

In the previous section, we saw that a CancellationToken may be used to cancel a Parallel call.  This token structure is actually part of a new unified cancellation model that is intended for eventual use throughout the .NET Framework.  As of Beta 1, it is supported by a few TPL APIs such as Wait:

Task t = ...
CancellationTokenSource tokenSource = new CancellationTokenSource();
CancellationToken token = tokenSource.Token;

try { t.Wait(token); }
catch (OperationCanceledException) { }

// Elsewhere...
tokenSource.Cancel();

 

The new cancellation model centers around two types: CancellationTokenSource and CancellationToken.  A CancellationTokenSource is used to cancel its member CancellationToken (accessible via the Token property).  A CancellationToken can only be used to check whether cancellation has been requested; separating the ability to cancel and the ability to check for cancellation requests is a key point in the new model.  In the above example, the Wait operation is passed a CancellationToken, and the associated CancellationTokenSource is used to cancel the operation; note that it is the Wait operation that gets canceled, not the Task.
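The separation is easy to see in a few lines (a sketch using just the two types named above):

```csharp
var source = new CancellationTokenSource();
CancellationToken token = source.Token;

// A token can only observe cancellation...
Console.WriteLine(token.IsCancellationRequested);  // False

// ...while only its source can request it.
source.Cancel();
Console.WriteLine(token.IsCancellationRequested);  // True
```

Because worker code receives only the token, nothing downstream can accidentally cancel work it doesn't own.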

As mentioned in the “What’s new in CDS” post, cancellation merits a dedicated post, so look for that one soon.

Check back soon for “What’s new in Beta 1 for TPL (Part 2/3)”!

Comments

  • Hi, thank you for all the posts, as it gives us a chance to contribute before the final version is out!

    Most of the information published about the library is geared towards loading all available cores to 100% with CPU intensive tasks. Can you post a bit about how I/O intensive tasks can be handled by the Parallel library?

    For example, processing of a large number of files in parallel and crawling a website with many concurrent connections could in theory benefit from a cleaner, task-oriented design, yet do not appear to be very suitable as it seems the library might not correctly guesstimate the optimal number of threads.

    Thanks!

  • I hope that we also get a way of specifying _minimum_ parallelism, because I want to use this library not only for CPU-intensive tasks but also for disk I/O or network calls, which have different optimal degrees of parallelism. When crawling a website, I found 32 threads to be optimal on my quad-core machine; just 4 threads would not saturate the network.

    I know that is not your core scenario, but I really hate rewriting the same code to spin off exactly N threads over and over.


  • As Danny mentioned in this article, TPL now uses the CLR ThreadPool as its default work scheduler.  One of the main functions of the ThreadPool is to dynamically optimize the number of threads, automatically adapting to your particular workload.  In the CLR 4.0 we are working on an improved algorithm for determining the optimal thread count for a given workload, which is the "hill climbing" stuff mentioned here.  I hope to write something more about this on my blog soon.

    So in principle, I/O-heavy TPL workloads should "just work."  However, we are still sorting out exactly how the ThreadPool's concurrency optimization should interact with the Degree of Parallelism concepts in TPL.  Suggestions are welcome!

  • What is the latest available version of this library?

  • The latest available version is in the Visual Studio 2010 and .NET Framework 4.0 CTP, which installs as a VPC.

    If you would rather play with a standalone DLL, try the Parallel Extensions June 2008 CTP.

  • I wish you would do something about the performance of fields with the ThreadStatic attribute. My measurements show that accessing a field with the ThreadStatic attribute is comparable to acquiring and releasing a monitor.

    A high-performance version of ThreadStatic would make parallel programming easier.
