PLINQ Queries That Run in Parallel in .NET 4.5

PLINQ Queries That Run in Parallel in .NET 4.5

Rate This
  • Comments 5

One interesting thing to know about PLINQ is that not all queries are guaranteed to execute in parallel (See PLINQ Queries That Run Sequentially for reference). You can think of the AsParallel method as a hint to run in parallel for query shapes that it believes will be faster. By default, PLINQ prefers to use a simple sequential algorithm over a potentially expensive parallel algorithm, and so some queries will execute sequentially, especially queries where the chain of operators do not lend themselves to parallelism. The bottom line is that PLINQ’s intent is to run your queries as fast as possible, so if there’s a good likelihood that the complexities of parallelism will slow it down, PLINQ will perform the query sequentially rather than in parallel.

In .NET 4.5, PLINQ’s algorithms have improved over .NET 4 such that its set of parallelizable queries has broadened and substantially fewer queries fall back to sequential execution. For example, the following query would execute sequentially in .NET 4 but now executes in parallel in .NET 4.5:

intArray.AsParallel()
    .Select(x => Foo(x))
    .TakeWhile(x => Filter(x))
    .ToArray();

Since the query executes sequentially in .NET 4, the query would not get any speedup even if the Foo() method is expensive. For some cases, the choice to fallback to a sequential implementation might be too conservative, and as a developer you might choose to override it. In order to force PLINQ to use a parallel algorithm in .NET 4, the user can call the WithExecutionMode operator:

intArray.AsParallel()
    .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
    .Select(x => Foo(x))
    .TakeWhile(x => Filter(x))
    .ToArray();

In .NET 4.5, the behavior of the TakeWhile operator has changed, and so the query executes in parallel even without the WithExecutionMode hint.

Sequential Fallback in .NET 4 and .NET 4.5

One way to illustrate the improvements to the sequential fallback is to look at the list of operators that may cause the query to fallback to sequential. Whether or not the operator actually causes a sequential fallback depends on other operators in the query as well.

Operators that may cause sequential fallback in both .NET 4 and .NET 4.5 are marked in blue, and operators that may cause fallback in .NET 4 but no longer in .NET 4.5 are marked in orange.

image

So, in .NET 4.5, as long as the query does not contain positional operators or the ElementAt operator, it will not fall back to sequential.

The positional operators mentioned above are the operator overloads that pass the position of each element into the user delegate. So, while the ‘ordinary’ Select operator accepts a delegate of type Func<TInput, TOutput>, the positional Select accepts a delegate of type Func<TInput, int, TOutput>. The position of each element is passed into the delegate as the input second parameter.

The ForceParallelism execution mode is still available for those items called out above.

Potential Risks of the Change

The fact that some queries used to execute sequentially and now execute in parallel opens up the possibility of exposing more latent race conditions in users’ code.

Here is an example of a query that happens to work in .NET 4, but will not work in .NET 4.5:

List<int> list = new List<int>();
var q = src.AsParallel()
    .Select(x => { list.Add(x); return x; })
    .Where(x => true)
    .Take(100);

There is a bug in the query above: the delegate passed into the Select operator is not thread-safe. However, the query will still happen to work in .NET 4, because as an implementation detail PLINQ decides to execute the query sequentially. In .NET 4.5, the query will run in parallel, and so list.Add(x) could be called from multiple threads concurrently, potentially resulting in a corrupted List<T> instance.

However, the idea that user delegates may be called from multiple cores is a core premise of PLINQ, so hopefully no user code in the real world will be broken by the change.

Related Resources:

PLINQ Queries That Run Sequentially
MSDN Documentation for ParallelExecutionMode Enumeration
How to: Specify the Execution Mode in PLINQ
Understanding Speedup in PLINQ

Leave a Comment
  • Please add 3 and 1 and type the answer here:
  • Post
  • I was wondering: what are the semantics of First (or FirstOrDefault or Last, etc)? For example, can the following query return 2 instead of 1 (in a better example, 1 and 2 would represent some expensive operation):

    (new[] {1, 2}).AsParallel().First();

    Or is First order-preserving somehow (and if it is, what's the use of parallelizing it since you simply want to get the first element as fast as possible).

  • How we acheive Parallel programming with Database?.

    For Example  We have 10000 records to insert into Database  ,We are writng dotnet code to insert records .

    From dotnet perspective we are able to acheive .but for data base server will eats up more memory.

    Can some suggest how to acheive this.?

  • So you won't fix bugs like [1] and [2] in case they break some hypothetical code you've never seen, but you'll change the fundamental behaviour of PLINQ without adding an "old behaviour" switch and just hope it doesn't break anything?

    [1] connect.microsoft.com/.../colortranslator-fromhtml-throws-a-system-exception

    [2] connect.microsoft.com/.../orderbydescending-fails-in-linq-to-objects-when-a-comparer-returns-int-minvalue

  • Richard, thank you for the feedback.  Are you suggesting we not fix PLINQ's performance, or are you suggesting that those two bugs you link to should also be fixed?  It's not clear to me what you're hoping for.

    Part of the point with PLINQ is that its up to the heuristics used in PLINQ internally to determine how to execute a particular query, and those heuristics may evolve over time.  A developer explicitly opting-in to using PLINQ is doing so because they want their query to run in parallel, and has as such already affirmed it's ok for the system to do so (expected, in fact).  That PLINQ then was in some limited circumstances not running those queries in parallel is what was addressed here.

  • I was suggesting that these bugs, and probably many others, should be fixed. If you're willing to risk breaking backwards compatibility to improve performance, I don't understand why you won't risk it to fix something that's broken! :)

    The second bug is particularly apposite:

    "while well-behaved implementations ... should handle this fine, there may be some implementations ... that are not expecting us to reverse the order in which we supply these arguments for a given list"

    A developer writing an IComparer implementation which depends on the precise implementation of the sorting algorithm used internally is protected from breaking changes, whereas someone using PLINQ incorrectly isn't.

Page 1 of 1 (5 items)