The latest release of the Dynamics CRM 2011 SDK (v5.0.12) provides new guidance on improving service channel allocation performance. This guidance builds upon a previous recommendation to cache your references to service proxy objects such as OrganizationServiceProxy. While this recommendation still holds true, it presents an issue for another best-practice, implementing multi-threaded processes, because the proxy objects are not thread-safe.
Enter IServiceManagement<TService>, the ServiceConfigurationFactory.CreateManagement method, and new constructor overloads for the service proxy types. These capabilities allow you to minimize expensive operations such as downloading endpoint metadata (WSDL) and authenticating (in claims/IFD/Online authentication scenarios) by decoupling them from the creation of the actual service proxy instances. Those operations are minimized because the references obtained can be safely cached and shared across threads. Below is an example, based on the SDK documentation, of how to use these new capabilities:
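This sketch follows the SDK documentation's pattern; the organization URL and credentials are placeholders, and the types come from Microsoft.Xrm.Sdk and Microsoft.Xrm.Sdk.Client:

```csharp
using System;
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Client;

// Cache this reference (e.g., in a static field) - it is thread-safe, and
// creating it performs the expensive endpoint metadata (WSDL) download.
IServiceManagement<IOrganizationService> serviceManagement =
    ServiceConfigurationFactory.CreateManagement<IOrganizationService>(
        new Uri("https://myorg.crm.dynamics.com/XRMServices/2011/Organization.svc"));

// Authenticate once (claims/IFD/Online) and cache the issued token as well.
var authCredentials = new AuthenticationCredentials();
authCredentials.ClientCredentials.UserName.UserName = "user@myorg.onmicrosoft.com";
authCredentials.ClientCredentials.UserName.Password = "password";
AuthenticationCredentials tokenCredentials =
    serviceManagement.Authenticate(authCredentials);

// Proxy creation is now cheap: no metadata download, no re-authentication.
using (var proxy = new OrganizationServiceProxy(
           serviceManagement, tokenCredentials.SecurityTokenResponse))
{
    proxy.EnableProxyTypes(); // optional: early-bound type support
    // ... issue service requests here ...
}
```

For Active Directory (on-premises) authentication, you would instead pass the ClientCredentials to the proxy constructor rather than a SecurityTokenResponse.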
Read more about the latest CRM 2011 Development best-practices here: http://msdn.microsoft.com/en-us/library/gg509027#Performance
In addition to highlighting these exciting capabilities, I’d like to focus the remainder of this post on providing a few practical patterns for combining the above technique with implementing multi-threaded operations that interact with the CRM 2011 services.
These patterns will utilize both the Data and Task Parallelism methods provided by the .NET 4.0 Task Parallel Library (TPL). The beauty of TPL is that the .NET Framework provides a fluent means of adding parallelism/concurrency to applications while abstracting away the low-level implementation details. If you haven’t checked out this library and still implement your own ThreadPool mechanism, you’re definitely missing out!
Read more about Parallel Programming in the .NET Framework here: http://msdn.microsoft.com/en-us/library/dd460693(v=vs.100)
To differentiate, Data Parallelism refers to performing the same operation concurrently against a partitioned source collection or array, whereas Task Parallelism refers to the queuing and dispatching of independent tasks that can be executed concurrently. Start to think about how typical operations that involve communication with the CRM 2011 services could fall into one of these two categories. When I think Data Parallelism, operations that immediately come to mind involve executing the same query over multiple paged result sets, or Create/Update/Assign/Delete operations on a batch of entity records. For Task Parallelism, I think about scenarios that require retrieval of multiple entity types, such as a cross-entity search, or retrieving the true equivalent of a left-outer join query. These two scenarios both involve unique queries and unique handling of the result sets.
So, being able to leverage TPL, how do we keep our CRM service proxy objects aligned with individual threads when implementing concurrent operations? At first glance, it appears that service proxies would be instantiated and disposed of within the delegate method being executed. That isn’t exactly an optimal scenario, since the same thread will likely perform multiple iterations, thus creating and disposing of an unnecessary number of proxy objects. Luckily, certain of TPL’s Data and Task Parallelism methods provide the means to manage thread-local variables.
A quick aside, the patterns below use lambda expressions to define delegates in TPL. If you are not familiar with lambda expressions in C# or VB, review the following article before moving forward: http://msdn.microsoft.com/en-us/library/dd460699(v=vs.100)
Data Parallelism: Parallel.For<TLocal> and Parallel.ForEach<TSource, TLocal> with thread-local OrganizationServiceProxy variable
Our first two concurrent patterns will involve the Data Parallelism methods Parallel.For and Parallel.ForEach. These methods are similar in nature to standard for and foreach iteration statements respectively. The major difference is that the iterations are partitioned and executed concurrently in a manner that makes optimal use of available system resources.
Both methods offer generic method overloads that allow you to specify a thread-local type, define a delegate (localInit) to instantiate your thread-local variable, and define another delegate (localFinally) to handle any final operations that should be performed after a thread completes all of its iterations. To properly align CRM service proxy objects with threads, we’ll incorporate proxy instantiation and disposal into these delegates. (Note that the term “thread-local” is slightly inaccurate, because in some cases two partitions may be executed on the same thread. TPL makes these optimization decisions on our behalf, and that’s an acceptable trade-off given that we’re still minimizing the number of service proxy objects created.)
Let’s start with the Parallel.For<OrganizationServiceProxy> loop pattern. The code sample below illustrates this pattern in the context of concurrently retrieving paged query results. The first delegate (localInit, a Func<TLocal>) returns a new OrganizationServiceProxy, instantiated using our cached IServiceManagement<IOrganizationService> reference and, potentially, pre-authenticated credentials.
The second delegate (the loop body, a Func<int, ParallelLoopState, TLocal, TLocal>) contains the primary operation to be executed and is supplied with three parameters: the iteration index, the loop state, and our thread-local proxy reference. In our example scenario we’re handling paged queries, so the index parameter supplies the necessary query PagingInfo page number and the proxy reference allows us to execute the RetrieveMultiple request. The proxy reference must be returned at the conclusion of the operation so that it can be passed to the next iteration. This ensures any changes to the thread-local variable are carried forward to subsequent iterations; it isn’t directly relevant in our service proxy scenario beyond satisfying the delegate signature.
Lastly, cleanup of the service proxy object is performed in the third delegate (localFinally, an Action<TLocal>). The name ‘localFinally’ suggests a try-finally statement, where resources are assured release in the finally block. It does behave similarly: unhandled exceptions in the body delegate are wrapped in an AggregateException, and the localFinally delegate is still executed for each thread-local value. Thus, we’ll use this delegate to dispose of our service proxy reference after the thread has completed all of its assigned work.
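A sketch of this pattern, assuming the cached serviceManagement and tokenCredentials references described earlier, a pageCount and pageSize determined beforehand, and a hypothetical BuildQuery() helper that returns a fresh QueryExpression for each iteration:

```csharp
var results = new ConcurrentBag<EntityCollection>();

Parallel.For(1, pageCount + 1,
    // localInit: one proxy per partition, built from the cached reference.
    () => new OrganizationServiceProxy(
              serviceManagement, tokenCredentials.SecurityTokenResponse),
    // body: the loop index doubles as the page number for PagingInfo.
    (pageNumber, loopState, proxy) =>
    {
        QueryExpression pagedQuery = BuildQuery(); // hypothetical helper
        pagedQuery.PageInfo = new PagingInfo
        {
            PageNumber = pageNumber,
            Count = pageSize
        };
        results.Add(proxy.RetrieveMultiple(pagedQuery));
        return proxy; // carried into this partition's next iteration
    },
    // localFinally: dispose the proxy once the partition finishes its work.
    proxy => proxy.Dispose());
```

Note that each iteration builds its own QueryExpression; sharing and mutating a single query object across partitions would not be thread-safe.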
Read more about how to write a Parallel.For loop with thread-local variables here: http://msdn.microsoft.com/en-us/library/dd460703(v=vs.100).aspx
For our Parallel.ForEach<TSource, OrganizationServiceProxy> pattern, the code sample below illustrates using a thread-local proxy reference to submit Create requests for a list of new CRM Entity objects. Notice that the structure of the Parallel.ForEach loop is very similar to that of the Parallel.For loop with respect to handling a thread-local variable and defining the primary operation. The main difference is that we instead supply an IEnumerable<TSource> where each item becomes an iteration, rather than performing a specified number of loops. We also have the ability to update the current enumerated item within the loop body delegate. In the example below, we assign the Entity.Id property from the System.Guid value returned in the Create response.
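A sketch of the pattern, again assuming the cached serviceManagement and tokenCredentials references, plus an 'entities' collection (IEnumerable<Entity>) of records to create:

```csharp
Parallel.ForEach(entities,
    // localInit: one proxy per partition.
    () => new OrganizationServiceProxy(
              serviceManagement, tokenCredentials.SecurityTokenResponse),
    // body: create the record and capture the generated id on the entity.
    (entity, loopState, proxy) =>
    {
        entity.Id = proxy.Create(entity);
        return proxy;
    },
    // localFinally: dispose the proxy when the partition completes.
    proxy => proxy.Dispose());
```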
Read more about how to write a Parallel.ForEach loop with thread-local variables here: http://msdn.microsoft.com/en-us/library/dd460703(v=vs.100)
Task Parallelism: Parallel.Invoke with ThreadLocal<T> OrganizationServiceProxy
Our third pattern involves Task Parallelism, meaning it’s appropriate when you need to execute multiple, independent operations concurrently. In this pattern we’ll focus on the Parallel.Invoke(params Action[] actions) method. Note that TPL also provides the ability to create and run Tasks explicitly, offering more control over nesting and chaining Tasks.
The Parallel.Invoke method only accepts an array of delegate actions, so we can utilize a ThreadLocal<T> instance to align our OrganizationServiceProxy with each concurrent thread. Be sure to instantiate the ThreadLocal<T> instance inside a using statement or explicitly call .Dispose() to release resources held by this reference. Since we must also dispose of our service proxy objects, we’ll track each instance created in a ConcurrentBag<T> that we can reference after executing our concurrent operations.
Within each delegate action, we can safely access ThreadLocal<T>.Value to obtain the reference to our thread-specific service proxy object. Again, each of these actions should be independent of the others; there is no guarantee as to the execution order, or even whether they execute in parallel.
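A sketch of the pattern, assuming the cached serviceManagement and tokenCredentials references and three illustrative QueryExpression variables (accountQuery, contactQuery, leadQuery):

```csharp
EntityCollection accounts = null, contacts = null, leads = null;
var proxyBag = new ConcurrentBag<OrganizationServiceProxy>();

try
{
    using (var localProxy = new ThreadLocal<OrganizationServiceProxy>(() =>
    {
        var proxy = new OrganizationServiceProxy(
            serviceManagement, tokenCredentials.SecurityTokenResponse);
        proxyBag.Add(proxy); // track each instance for later disposal
        return proxy;
    }))
    {
        // Independent operations - no guaranteed order, and they may or
        // may not actually run in parallel.
        Parallel.Invoke(
            () => accounts = localProxy.Value.RetrieveMultiple(accountQuery),
            () => contacts = localProxy.Value.RetrieveMultiple(contactQuery),
            () => leads = localProxy.Value.RetrieveMultiple(leadQuery));
    }
}
finally
{
    // Dispose every proxy created across the participating threads.
    foreach (var proxy in proxyBag)
        proxy.Dispose();
}
```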
Once all of the concurrent operations are complete, we revisit our proxyBag tracking object in the finally block to clean up each of the service proxy objects that were created. A more sophisticated approach, involving a wrapper class around ThreadLocal<T>, is provided here: http://stackoverflow.com/questions/7669666/what-is-the-correct-way-to-dispose-elements-held-inside-a-threadlocalidisposabl
Note that in .NET 4.5, a new collection property, ThreadLocal<T>.Values, has been added. If a ThreadLocal<T> is constructed with ‘trackAllValues’ set to true, this collection can be accessed to obtain references to each thread-specific instance created over the life of the ThreadLocal<T> object.
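On .NET 4.5, that removes the need for the separate tracking bag; a sketch under the same assumptions as above:

```csharp
// .NET 4.5 only: trackAllValues exposes every thread-specific instance
// via the Values property.
using (var localProxy = new ThreadLocal<OrganizationServiceProxy>(
    () => new OrganizationServiceProxy(
              serviceManagement, tokenCredentials.SecurityTokenResponse),
    trackAllValues: true))
{
    try
    {
        Parallel.Invoke(/* actions using localProxy.Value, as above */);
    }
    finally
    {
        // Dispose every proxy created over the life of the ThreadLocal<T>.
        foreach (var proxy in localProxy.Values)
            proxy.Dispose();
    }
}
```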
Read more about how to use Parallel.Invoke to execute parallel operations here: http://msdn.microsoft.com/en-us/library/dd460705
Remember that any reference types accessed or assigned within the body of each parallel operation should be thread-safe. As noted in each of the patterns, the thread-safe collections designed for parallel programming, such as ConcurrentBag<T>, can prove helpful.
Hopefully these patterns have provided some practical guidance for combining documented SDK best-practices and lead to a positive impact on general performance within your custom solutions/extensions for Dynamics CRM. I also hope that the scenarios presented with each pattern provide a starting point to identify ways that these patterns can be applied. If you have any questions or suggestions, feel free to post in the comments below.
Microsoft Premier Field Engineer
Let me ask: have you tested PLINQ, for example using AsParallel?
Jorge Gonzalez Segura
@Jorge - No, I haven't tested PLINQ, primarily because it applies to in-memory objects (i.e. LINQ to Objects, LINQ to XML) rather than physical data access (i.e. LINQ to SQL, Entity Framework). In the context of Dynamics CRM, you'd be applying it to the CRM SDK's LINQ provider, which is a translation mechanism to QueryExpression and RetrieveMultiple requests for physical data. That falls into the same category, where the PLINQ engine has no ability to parallelize query execution. Nevertheless, PLINQ could come into play if you plan to process/join the results of multiple, heterogeneous queries in-memory.
Hi Austin, great article! I'm running into the same problem in a project. We're trying to kick off a number of custom transactions from a workflow. This is done by looping over a list and creating a custom entity record for each item. The creation of the custom entity record triggers a plugin, which transactionally performs the required business logic (multiple creates and updates that should transactionally complete as a unit of work).
Since we were not too happy with the throughput we achieved using this architecture, we've had a MS Premier Field Engineer from the Netherlands try to pinpoint the bottleneck. The verdict: it's neither the hardware nor the configuration, but the workflow/plugin architecture. Since we're creating each of the custom entity records sequentially, we're not even close to fully utilizing resources.
I'm trying to figure out how I can increase the performance in the workflow. One of the options is to use the new ExecuteMultipleRequest (since Rollup 12), but we're still on RU11 and updating will take some time... I'll test this on an updated dev VM.
The other option would be to use TPL and that is how I landed here. Call me a happy browser already! Now the question arises, can I use the same technique (multiple thread local/safe organization services) in a workflow step? For instance, would it be safe to call IOrganizationServiceFactory.CreateOrganizationService on each thread to instantiate a thread local IOrganizationService? Or alternatively call it multiple times on the main thread and distribute the resulting OrganizationServices over the worker threads manually... I can't find conclusive "evidence" that this method will return a new OrganizationService each time.
Any feedback or ideas are highly appreciated!
I have been using the Task Parallel Library as suggested by you to get data into CRM. The one problem I noticed was with deletes. When I delete contacts concurrently, the messages fail due to locking. The deletes go through fine if I keep the max thread count at 1.
This is just the way SQL Server works - you can try to stagger the deletes across entities. What you'll find is that it will scale until you queue up deletes, at which point you'll reach diminishing returns. Generally, if the deletes can keep up, it will scale nicely. With this type of backlog, it most likely won't hurt to queue them up if you're in a maintenance window - if you have users, pull back the number of threads to balance them against user requests.
@Thijs Kuipers: The ExecuteMultipleRequest likely won't benefit you based on the scenario you describe. Its primary purpose is to submit multiple requests in a single message. The requests themselves are still processed sequentially by the platform, so the real savings is in minimizing SOAP chatter over the wire, which primarily benefits high-latency client-server scenarios.
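For reference, the shape of that batched message looks roughly like this (a sketch; 'service' and 'entitiesToCreate' are assumed to exist):

```csharp
var batch = new ExecuteMultipleRequest
{
    // Controls fail/continue behavior and whether per-request responses
    // are returned in the ExecuteMultipleResponse.
    Settings = new ExecuteMultipleSettings
    {
        ContinueOnError = true,
        ReturnResponses = false
    },
    Requests = new OrganizationRequestCollection()
};

foreach (Entity entity in entitiesToCreate)
    batch.Requests.Add(new CreateRequest { Target = entity });

// One SOAP round trip; the requests still execute sequentially server-side.
service.Execute(batch);
```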
So far as I can tell, CreateOrganizationService() will return a new instance of an internal derivative of OrganizationServiceProxy. Note that when the internal instances are created, a CorrelationToken is supplied to help the platform handle transactional support across multiple event pipelines and for infinite loop detection. My suspicion is unintended consequences could arise from multiple, dispatched threads making concurrent, internal service requests sharing the same correlation details. Also, you may be further limited in your ability to dispatch threads and/or monitored more closely against certain thresholds if your activity/plug-in is being registered in isolation.
If you plan to continue invoking your sub-operations (custom entity) via workflow, I'd instead seek a design that spreads the primary operation across multiple workflow job instances (batches) and take advantage of the native multi-threaded/distributed design of the CRMAsyncService. Otherwise, you may move the primary operation that initiates the sub-operations external from CRM where you'll have more control over parallelization.
Hope this helps! If you do find that you're able to incorporate the above pattern into either a plug-in or workflow activity, please report back and update this conversation thread. Thanks!
I've been looking into a blocking issue on a Case Pre-Create plugin in a production system. The SQL Profiler log shows that retrieving multiple records from a custom entity blocks the (out-of-box) auto case number generation. There are no regular updates to this custom entity, since it's used for configuration settings.
The server is on RU14 and the code is currently using Query Expression. Would rewriting code with LINQ or using TPL help with the situation?
Any thoughts would be appreciated!
I am looking at the first example using the Parallel.For() retrieving paged results.
I am struggling to see how to deal with an "unknown" number of pages.
In a single-threaded world, we usually page by checking the MoreRecords property on the result set, incrementing the PageNumber, and fetching the next set.
I struggle to see how to do this if executing the page retrieval in parallel.
Am I missing something?