More on parallel queries across containers in SSDS

Our current implementation of cross-Container queries follows a very common pattern and roughly looks like this (no exception handling shown for simplicity):

public List<T> CrossContainerSearch( string[] containerIds, 
                                     string query, 
                                     SearchDelegate searchDelegate )
{
    // One event per container; each worker signals its own event when done.
    ManualResetEvent[] events = new ManualResetEvent[ containerIds.Length ];
    Results<T> results = new Results<T>();
    for( int x = 0; x < containerIds.Length; x++ )
    {
        State s = new State();
        events[ x ] = s.Event = new ManualResetEvent( false );
        s.Query = query;
        s.Results = results;
        s.ContainerId = containerIds[ x ];
        s.SearchDelegate = searchDelegate;
        ThreadPool.QueueUserWorkItem( new WaitCallback( Worker ), s );
    }

    // Block until every worker has signaled its event.
    WaitHandle.WaitAll( events );
    return results;
}

The State object is used to pass parameters to the worker (e.g. the query to execute, the delegate that the worker method will call to actually perform the search, the event to signal when the query is done, etc.).

The search workers are queued in the ThreadPool; eventually a thread picks each one up and executes it. WaitHandle.WaitAll simply blocks until every scheduled callback has signaled its event. When it unblocks, all queries have completed. This is equivalent to a WaitForMultipleObjects API call that waits for all handles.

The worker method looks like this:

void Worker( object s ) 
{ 
   State state = (State) s; 
   // Results.AddRange must be safe to call from several threads at once.
   state.Results.AddRange( state.SearchDelegate( state.ContainerId, state.Query ) ); 
   state.Event.Set(); 
} 

First we recover the state. Then the search delegate is called with the container ID and the query, the results are stored, and the event is set to signal completion.
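One thing to keep in mind with this pattern: WaitHandle.WaitAll accepts at most 64 handles on Windows, so it stops working once a query spans more containers than that. As a rough sketch of one way around it (not what our implementation does), a single CountdownEvent, available from .NET 4 onward, can replace the array of events. The class name CountdownSearchSketch, the generic SearchDelegate<T> declaration, and the use of a plain List<T> guarded by a lock are assumptions made here so that the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class CountdownSearchSketch
{
    // Assumed delegate shape: one container ID and one query in, a list of hits out.
    public delegate List<T> SearchDelegate<T>(string containerId, string query);

    public static List<T> CrossContainerSearch<T>(string[] containerIds,
                                                  string query,
                                                  SearchDelegate<T> searchDelegate)
    {
        var results = new List<T>();
        if (containerIds.Length == 0) return results;

        var gate = new object(); // guards 'results', since List<T> is not thread-safe

        // One counter for all workers instead of one event per worker.
        using (var pending = new CountdownEvent(containerIds.Length))
        {
            foreach (string containerId in containerIds)
            {
                string id = containerId; // private copy for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    // As in the post, exception handling is left out.
                    List<T> partial = searchDelegate(id, query);
                    lock (gate) { results.AddRange(partial); }
                    pending.Signal(); // one fewer worker outstanding
                });
            }

            pending.Wait(); // blocks until the count reaches zero
        }

        return results;
    }
}
```

The order of the merged results depends on which worker finishes first, so callers that need a stable order would have to sort afterwards.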

My friend Arvindra suggested using CCR (Concurrency and Coordination Runtime) for this. I'll definitely take a look at that.

In the meantime, I re-implemented the same code using the .NET Parallel Extensions, which simplify things greatly. A lot of plumbing goes away, and the result is much easier to read. It roughly looks like this:

 

public List<T> CrossContainerSearch(string[] containerIds, 
                                    string query, 
                                    SearchDelegate searchDelegate )
{
   Results<T> results = new Results<T>();   
   Parallel.For(0, containerIds.Length, i =>
   {
      // Runs concurrently, so Results.AddRange must be safe to call from several threads.
      results.AddRange( searchDelegate( containerIds[ i ], query ) );
   }
   );
            
   return results;
}

 

Notice that the scheduling, the join, and the state management are all handled by Parallel.For. Hard to beat in simplicity, isn't it? If you still want deeper control over task scheduling, then the Task & TaskCoordinator types are your friends.
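The loop body above adds to a shared collection from several threads, which is only safe if that collection synchronizes internally. As a hedged sketch, assuming the results live in a plain List<T>, the Parallel.For overload with thread-local state (as it exists in the released System.Threading.Tasks API) lets each worker fill its own list and merge it under a lock once at the end. The class name ParallelSearchSketch and the generic SearchDelegate<T> declaration are assumptions made so that the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelSearchSketch
{
    // Assumed delegate shape: one container ID and one query in, a list of hits out.
    public delegate List<T> SearchDelegate<T>(string containerId, string query);

    public static List<T> CrossContainerSearch<T>(string[] containerIds,
                                                  string query,
                                                  SearchDelegate<T> searchDelegate)
    {
        var results = new List<T>();
        var gate = new object(); // guards 'results' during the final merges

        Parallel.For<List<T>>(
            0, containerIds.Length,
            // localInit: each worker starts with its own empty list.
            () => new List<T>(),
            // body: no locking needed, 'local' is private to the worker.
            (i, loopState, local) =>
            {
                local.AddRange(searchDelegate(containerIds[i], query));
                return local;
            },
            // localFinally: merge each worker's list into the shared one.
            local =>
            {
                lock (gate) { results.AddRange(local); }
            });

        return results;
    }
}
```

A call would look like ParallelSearchSketch.CrossContainerSearch<string>(ids, "q", mySearch), where mySearch is any method matching the delegate. The merged order is not deterministic, and the lock is taken once per worker rather than once per container.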

I will continue researching this and publish my findings. I'm curious to get more performance data, so I'll be working on more formal tests. All I need now is a machine with 16 cores ;-).

The Parallel Programming blog is here.

Published Tuesday, April 22, 2008 1:27 PM by eugeniop

Comments

# Microsoft news and tips » More on parallel queries across containers in SSDS

# You liked LitwareHR v1, You loved LitwareHR v2, You are going to die for LitwareHR 'cloud storage' edition :)

More seriously... Eugenio just announced and released on Codeplex the latest drop of LitwareHR. Although

Tuesday, May 06, 2008 6:19 PM by Gianpaolo's blog

# LitwareHR Sample Application May 2008 Just Released

Microsoft Architecture Strategy Team has just shipped a new version of their LitwareHR Sample Application

Tuesday, May 06, 2008 8:24 PM by Pablo Damiani: Lito's Blog
