New Feature? :: ThreadLocal<T>.Values


We’ve been considering adding a Values property to System.Threading.ThreadLocal<T>.  Values would return a collection of the current values from all threads (i.e. what you’d get if you evaluated Value from each thread).  This would allow for easy aggregations, and in fact in our Parallel Extensions Extras we have a wrapper around ThreadLocal<T> called ReductionVariable<T> that exists purely to provide this functionality.  For example:

 

var localResult = new ThreadLocal<int>(() => 0);
Parallel.For(0, N, i =>
{
    localResult.Value += Compute(i);
});

int result = localResult.Values.Sum();
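
For anyone who wants to try the pattern end to end, here is a minimal, self-contained sketch.  It assumes the API shape that later shipped in .NET 4.5, where value tracking is opt-in through a trackAllValues constructor argument.  The class name, the body of Compute, and the SumSequentially check are placeholders made up for illustration.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadLocalValuesSketch
{
    // Hypothetical stand-in for the post's Compute(i).
    static int Compute(int i) => i % 7;

    public static int SumWithValues(int n)
    {
        // trackAllValues: true opts in to the Values property (.NET 4.5+);
        // without it, reading Values throws InvalidOperationException.
        using (var localResult = new ThreadLocal<int>(() => 0, trackAllValues: true))
        {
            Parallel.For(0, n, i =>
            {
                // Each thread accumulates into its own slot, so no locking is needed here.
                localResult.Value += Compute(i);
            });

            // Combine the per-thread subtotals after the loop has finished.
            return localResult.Values.Sum();
        }
    }

    // Sequential reference used only to check the parallel result.
    public static int SumSequentially(int n) =>
        Enumerable.Range(0, n).Sum(i => Compute(i));

    public static void Main()
    {
        const int N = 100000;
        Console.WriteLine(SumWithValues(N) == SumSequentially(N));
    }
}
```

Disposing the ThreadLocal<int> once the aggregation is done releases the tracked per-thread values, which is why the sketch reads Values inside the using block.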


If you’re familiar with the Parallel Patterns Library (PPL) in Visual C++ 2010, this feature would make ThreadLocal<T> very similar in capability to the combinable class.

 

In .NET 4, it’s already possible to do aggregations using the thread-local data support that’s built in to the parallel loops.

 

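int result = 0;
Parallel.For(0, N,
    () => 0,                                  // localInit: each task starts its subtotal at 0
    (i, loopState, localResult) =>            // body: fold this iteration into the task's subtotal
    {
        return localResult + Compute(i);
    },
    localResult => Interlocked.Add(ref result, localResult)); // localFinally: merge the subtotal into result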
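
The same loop can be wrapped into a small runnable program.  This is only a sketch: the class name, the body of Compute, and the sequential check are assumptions added for illustration, while the Parallel.For overload is the one used in the snippet above.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LoopLocalStateSketch
{
    // Hypothetical stand-in for the post's Compute(i).
    static int Compute(int i) => i % 7;

    public static int SumWithLoopLocals(int n)
    {
        int result = 0;

        Parallel.For(0, n,
            // localInit: runs once per task that services the loop.
            () => 0,
            // body: returns the updated local subtotal, which is passed to the next iteration.
            (i, loopState, localResult) => localResult + Compute(i),
            // localFinally: runs once per task; may run concurrently, hence Interlocked.
            localResult => Interlocked.Add(ref result, localResult));

        return result;
    }

    // Sequential reference used only to check the parallel result.
    public static int SumSequentially(int n) =>
        Enumerable.Range(0, n).Sum(i => Compute(i));

    public static void Main()
    {
        const int N = 100000;
        Console.WriteLine(SumWithLoopLocals(N) == SumSequentially(N));
    }
}
```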


This approach of using Parallel.For has less overhead than accessing the ThreadLocal<T> instance on each iteration, which is one of the reasons Parallel.For has the support built-in.  However, there are some advantages to also having the ThreadLocal<T> approach available:

  1. Fewer delegates to understand.  Wrapping your head around three different delegates (and how data is passed between them) in a single method call can be tough.  It may also be unintuitive that an interlocked operation is required for the final step (though this approach has performance benefits, as each thread gets to perform its final reduction in parallel).
  2. Certain scenarios may enjoy less overhead.  There is potentially a subtle performance issue with the Parallel.For approach, depending on why the local support is being used.  In an effort to be fair to other users of the ThreadPool, the Tasks that service a parallel loop will periodically (every few hundred milliseconds) retire and reschedule replicas of themselves.  In this way, the threads that were processing the loop’s tasks get a breather to optionally process work in other AppDomains if the ThreadPool deems it necessary.  The ThreadPool may also choose to remove the thread from active duty if it believes the active thread count is too high.  Consequently, the number of Tasks created to service the loop may be greater than the number of threads.  In turn, the delegates that initialize and finalize/aggregate the local states will be executed more often, because they run once for each new Task rather than once for each new Thread.  Of course, this is only an issue if the initializer and finalizer delegates are very expensive, but it’s worth noting that the ThreadLocal<T> approach does not suffer from this.
  3. Usable in places where built-in local support isn’t available.  You can do aggregations where we do not have built-in thread-local data support.  For example, Parallel.Invoke does not provide local support.
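
Points 2 and 3 can be explored with a short sketch.  The first method counts how often the loop's initializer delegate runs and how many distinct threads executed iterations; whether the two numbers actually differ depends on the machine, the length of the loop, and what the ThreadPool decides, so a short loop may well report them as equal.  The second method aggregates across Parallel.Invoke using the Values property as it later shipped in .NET 4.5.  All class and method names here are made up for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LocalStateTradeoffsSketch
{
    // Point 2: returns the number of localInit calls; threadsSeen receives the
    // number of distinct threads that executed at least one iteration.
    public static int CountLocalInits(int n, out int threadsSeen)
    {
        int initCalls = 0;
        var threadIds = new ConcurrentDictionary<int, bool>();

        Parallel.For(0, n,
            () => { Interlocked.Increment(ref initCalls); return 0; },
            (i, loopState, local) =>
            {
                threadIds.TryAdd(Thread.CurrentThread.ManagedThreadId, true);
                return local + 1;
            },
            local => { });

        threadsSeen = threadIds.Count;
        return initCalls;
    }

    // Point 3: Parallel.Invoke has no local-state overloads, so each action
    // accumulates into a ThreadLocal<int> and the subtotals are summed afterwards.
    public static int SumAcrossInvoke(int[][] chunks)
    {
        using (var localSum = new ThreadLocal<int>(() => 0, trackAllValues: true))
        {
            Action[] actions = chunks
                .Select(chunk => (Action)(() => localSum.Value += chunk.Sum()))
                .ToArray();

            Parallel.Invoke(actions);
            return localSum.Values.Sum();
        }
    }

    public static void Main()
    {
        int threads;
        int inits = CountLocalInits(10000000, out threads);
        Console.WriteLine("localInit calls: " + inits + ", distinct threads: " + threads);

        var chunks = new[] { new[] { 1, 2 }, new[] { 3 }, new[] { 4, 5, 6 } };
        Console.WriteLine("Sum across Parallel.Invoke: " + SumAcrossInvoke(chunks));
    }
}
```

Several actions may land on the same thread, which is fine here because each one adds to that thread's running subtotal rather than overwriting it.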

Your input could help!  If you’ve got a minute, feel free to answer the following questions and/or provide other thoughts:

  1. Would you find writing aggregations easier with ThreadLocal<T>.Values compared to using the support built in to the parallel loops?  If so, does the convenience make the feature worthwhile?
  2. Do you need/want support for writing aggregations outside of parallel loops?
  3. When you do aggregations, are the routines for initializing and finalizing local state expensive?  Examples would be great.

Thanks!

  • I do think this would be a very worthwhile addition. The first point would make code easier to understand, which would be welcome in a kind of code that is already hard to understand by default (parallel code), so providing easier constructs would help.

  • This seems like an incredibly useful addition to me.  I can think of multiple places where this would have made life simpler.

    1. At times.  I'm used to the 3 delegate approach, but I think using ThreadLocal<T> explicitly would be nicer (i.e. more readable/maintainable) when the setup or reduction steps are complex.

    2. Yes.  There are times when Task/Task<T> is more appropriate, especially in order to have conditional continuations.  This would be very useful to handle those situations.  Also, there are times when I've had subclasses of a functor that are being used in a parallel loop, but where only certain (rare) subclasses require the aggregation portion.  Right now, it forces me to design with the 3 delegate approach in mind always, even though it's really only occasionally useful.  This would eliminate that need.

    3. Not typically, but it does happen.  For example, I have one parallel loop where I'm using thread local state to maintain a thread local cache, but the cache initialization is relatively expensive.

  • I really don't have any use for the parallel loop construct – it is far too high level for what we do. However I am looking forward to replacing our own implementation of ThreadLocal<T> with the .NET 4.0 version once we migrate to .NET 4.0.

    The ThreadLocal<T>.Values property would be a very welcome addition to the class. In addition, I would like a method that loops through all thread local values and replaces each old value with a new one in a thread safe way. It doesn’t need to be fast; it just needs to be safe.

    Here are two (relatively similar) ways of using ThreadLocal that I believe are not directly covered by the .NET Framework:

    1. Object pools with one large global pool and a lot of thread local pools to reduce contention on the global pool. We have a periodic timer that loops through all of the local pools to find local pools that have not been used for some time. Those pools are merged into the global pool.

    2. High performance statistics collection – when collecting statistics at very high frequency, then we put the collected data into thread local buckets. At regular intervals we gather the data from the ThreadLocal buckets and aggregate it into a global bucket. If a local bucket has not been used for some time, it is retired.

    Here is our code for it:

        /// <summary>
        /// Replaces all thread local values in a thread safe manner.
        /// </summary>
        /// <param name="replacer">Delegate that takes the current values and should return the new value.</param>
        public void ReplaceAllObjects(Func<T, T> replacer)
        {
            lock (_dataLock)
            {
                for (int n = 0; n < _data.Length; n++)
                {
                    InstanceData instanceData = _data[n];
                    if (instanceData == null)
                        continue;
                    instanceData.Value = replacer(instanceData.Value);
                }
            }
        }

    (and yes I know that there is a lock in there ... the lock is not acquired for "normal" thread local access)

  • I think the proposed name would be very confusing. You have something called "ThreadLocal", but then you've got a property that, in a sense, is *not* thread-local -- it's reaching across *all* threads, and is no longer "local".

    The functionality might be useful, but I think the property name should be much more explicit in calling out the non-localness / globalness / all-possible-threadedness of its operation.

  • 1. Yes, I think writing aggregations is easier with ThreadLocal<T>.Values than with the support built in to the parallel loops. Yes, the convenience makes the feature worthwhile.

    2. Yes, I would like to write aggregations outside of parallel loops. It makes the code easier to understand.

    3. In fact, if the number of iterations is large, the cost of initialization is negligible even when it is expensive.

    To conclude, I sometimes use PPL in C++ and I love the combinable class for aggregation.

  • This turned out really ugly, just crap, complete garbage, I mean it.

  • I agree with Joe: the name should be more obvious that it is not thread-local. Perhaps something like AllValues or AllThreadValues, or even just making ThreadLocal enumerable so we can use it in a foreach loop without explicit access to the values collection, though that might be hard to do in a thread-safe way.

  • that'd be very useful i think :)

  • I for one couldn't use ThreadLocal for my needs since it lacks this capability (or to put it in more positive terms, yes please add this feature!).

  • Based on the positive demand we received, this has indeed been added to .NET 4.5.  For more information, see blogs.msdn.com/.../10236000.aspx.
