One question that comes up quite often is whether HMC Provisioning is "thread safe", i.e. does it support multiple simultaneous provisioning actions? The full answer can be complicated; the simple answer is "yes, but there are caveats."
To understand the caveats better we have to understand the basic architecture of HMC Provisioning. For this discussion we will break the architecture into 3 blocks.
HMC and MPF Namespaces
Layered business/service logic defined as XML workflow descriptions.
MPF Engine
COM service that executes requests based on workflows defined in the namespaces. This is the component that provides transaction based compensation/rollback support.
Providers and Underlying Product APIs
DLLs that run under the context of the MPF Engine. These DLLs wrap the product-specific APIs that are used to perform provisioning actions.
Now to understand the impact each of these components can have on the HMC Concurrency story we will start from the bottom and work our way up the stack.
The core provisioning capabilities of HMC are generally defined at, and limited by, the underlying Product APIs, and specifically by the way in which the Provider DLLs expose those APIs. This is particularly true when it comes to the concurrency characteristics of HMC. The easiest way to explain this is to look at two core Providers and their concurrency characteristics.
These kinds of variations in concurrency characteristics exist across most of the MPF providers, though the SQL provider is unique in that it is the only provider that supports transactional scoping all the way down to the underlying system. It is generally these variations that result in concurrency-related failures within HMC.
The MPF Engine was actually designed for high levels of concurrency. Each incoming request is processed on a separate "process controller" thread. Each process controller thread is fully isolated from other process controllers within the context of the MPF engine. Each process controller thread is also an MPF transaction: all actions performed within the context of a transaction are persisted, and if a failure occurs these actions will be rolled back. So while this component in and of itself enables high-concurrency request processing, as a developer you must take into account that the other two blocks in the architecture have a significant impact on the concurrency behavior of the overall system. Those blocks are Providers and Underlying Product APIs (discussed above) and HMC and MPF Namespaces (discussed below).
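To make the compensation/rollback idea concrete, here is a minimal sketch in Python. It is not MPF code and does not use any MPF API; run_transaction and the step pairs are hypothetical names made up for illustration. Each step is an action paired with its compensation, and if a step fails, the compensations for the steps that already completed run in reverse order.

```python
from typing import Callable, List, Tuple

# Hypothetical shape of a step: (action, compensation). Not an MPF type.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_transaction(steps: List[Step]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    undo_log: List[Callable[[], None]] = []
    try:
        for action, compensate in steps:
            action()
            # Only record the compensation once the action has succeeded.
            undo_log.append(compensate)
    except Exception:
        for compensate in reversed(undo_log):
            compensate()
        return False
    return True


created: List[str] = []


def fail() -> None:
    # Simulates a provider call that fails part-way through a request.
    raise RuntimeError("simulated provider failure")


ok = run_transaction([
    (lambda: created.append("organization"), lambda: created.remove("organization")),
    (lambda: created.append("mailbox"), lambda: created.remove("mailbox")),
    (fail, lambda: None),
])
```

In this sketch the third step fails, so the two earlier steps are undone and nothing is left behind. The more steps a single request bundles, the more work there is to undo when something fails late in the sequence.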
This is where the majority of the business or service logic is defined. Namespace logic is defined as multiple layers of named procedures that either call other named procedures or execute provider methods. The layering and orchestration capabilities of MPF are extensive and very powerful. Unfortunately, this is also the root of almost all concurrency-related failures in HMC. Some are easily avoided; others require careful design consideration or, in some cases, external throttling and/or retry mechanisms. The following are some high-level examples of concurrency-related failures in HMC and MPF Namespaces.
MPF Requests that operate on global, or organization wide objects can cause failures under high concurrency
While this may seem fairly straightforward, there are some corner scenarios where this can cause concurrency-related problems. Take, for example, the scenario where a SharePoint site is being created at the exact same time as a separate SharePoint site belonging to the same organization is being deleted. One might not expect these two requests to have any impact on each other; however, both rely on the organization-wide SharePointSites service pointer for tracking purposes. If the delete request removes one of the site pointers at the same time that the create request is enumerating the list of SharePoint sites, the create request might fail because the service pointer is overwritten by the delete operation. This issue was alluded to above in the discussion about the Active Directory Provider. It is important to note, though, that any provider that interacts with Active Directory or other similar systems (Resource Manager has similar characteristics) is susceptible to this kind of failure.
MPF Requests that bundle multiple "locking" SQL Provider requests
This issue typically occurs when an MPF Named Procedure bundles two or more HMC Named Procedures, from the Hosted Namespace layer, that write to or read from the PlanManager Database. The root of this issue is that the Managed Plans Namespace API utilizes the SQL Provider to manipulate the PlanManager Database. Since the SQL Provider establishes and holds locks for the duration of a transaction, transactions or requests that bundle multiple Managed Plans named procedures introduce an increased risk of SQL deadlocks. SQL deadlocks result in one or more of the MPF Requests failing.
Steps have been taken within the PlanManager database to try to prevent these deadlocks in the most common scenarios where a customer might want to bundle requests for efficiency. However, there are still some scenarios where bundling of requests will result in this failure. For example, a transaction that attempts to add or modify a customer plan and then subsequently assign the plan to a customer will fail under concurrency. As a general rule, one must take into account the cost of SQL transactions when designing highly orchestrated MPF named procedures. This goes along with considering the cost of rollback when bundling large numbers of procedure calls into a transaction.
OK, now on to how you design a custom namespace or process to facilitate successful bulk import or creation of organizations and users/mailboxes.
As a general rule, the HMC Provisioning System was designed to operate under high levels of concurrency. However, in a high-scale, high-volume HMC environment it may not be possible to avoid concurrency-related issues completely. There are cases where a simple retry is the best solution to the problem. There are also cases, however, where there are known failure scenarios, and in these cases we strongly suggest that you take steps or put mechanisms in place to avoid concurrency. The most common of these cases are listed below.
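For the "simple retry" case, a client-side wrapper along the following lines is one option. This is only a sketch in Python under stated assumptions: submit_with_retry and TransientProvisioningError are hypothetical names, not part of HMC or MPF, and your client code has to decide which failures (a SQL deadlock victim, for instance) actually count as retryable. The delay doubles on each attempt and a little random jitter is added so that colliding requests do not all retry at the same moment.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientProvisioningError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a deadlock victim)."""


def submit_with_retry(
    request: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(); retry transient failures with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except TransientProvisioningError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            delay = base_delay * 2 ** (attempt - 1)
            # Jitter spreads out retries from requests that collided.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

Only retry requests that failed cleanly, meaning the transaction was rolled back. Anything that is not a known transient failure should be passed straight through to the caller rather than retried blindly.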
In general it is OK to bulk load multiple organizations in parallel; however, you should avoid bulk provisioning objects within a single organization in parallel. In other words, requests to bulk load objects within a single organization boundary should be serialized.
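One way to get that behavior from a bulk import tool is to group the requests by organization, hand each organization to its own worker, and have that worker submit the organization's requests one at a time. The Python sketch below shows the idea. The names bulk_load and provision are hypothetical; provision stands in for whatever call your client uses to submit a single MPF request.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Tuple


def bulk_load(
    requests: Iterable[Tuple[str, str]],
    provision: Callable[[str, str], None],
    max_parallel_orgs: int = 4,
) -> None:
    """Load organizations in parallel, but objects within one organization serially.

    requests is a sequence of (organization, object) pairs; provision is a
    hypothetical callable that submits one provisioning request.
    """
    # Group by organization, preserving the original order within each one.
    by_org: Dict[str, List[str]] = defaultdict(list)
    for org, obj in requests:
        by_org[org].append(obj)

    def load_one_org(org: str) -> None:
        # A single worker owns this organization, so its requests never overlap.
        for obj in by_org[org]:
            provision(org, obj)

    with ThreadPoolExecutor(max_workers=max_parallel_orgs) as pool:
        # list() waits for every organization and re-raises the first failure.
        list(pool.map(load_one_org, by_org))
```

The max_parallel_orgs setting doubles as a crude throttle: lowering it reduces the overall load on the provisioning system without changing the per-organization ordering.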
Finally
While this covers some of the most common scenarios we see in support, I am sure there are many others out there. Do you have a scenario not covered above that you are not sure is impacted by this discussion? Post a comment and I will be happy to expand the discussion.
Until next time (I promise it will not be 2 ½ years)
Mike