The Microsoft Dynamics CRM Blog
News and views from the Microsoft Dynamics CRM Team

When do asynchronous jobs fail, suspend or retry?

When the CRM Asynchronous Processing Service gets an error, there are three possible actions it will take depending on the type of error:

  • Fail: Job cannot be resumed.
  • Retry: Job will be paused and retried after a period of time.
  • Suspend: Job will be suspended until it is manually resumed.

The full error-handling mechanism is rather complex, but the following general rules help predict which action results from a given error inside an asynchronous job:

| Scenario | Async job result action | Workflow result action | Error code |
|---|---|---|---|
| An SDK call fails: Infinite loop detected | Fail | Fail | 80044182 |
| An SDK call fails: Organization disabled | Retry | Retry | 8004A104 / 8004A107 |
| An SDK call fails: Server is busy | Retry | Retry | 8004A001 |
| An SDK call fails: Other | Fail | Suspend | |
| SQL exception is thrown | Retry | Retry | 80040216 |
| Workflow system is paused | N/A | Suspend | 80045017 |
| Network error | Retry | Retry | 80044306 |
| Record associated with workflow cannot be found | N/A | Suspend | 80045031 |
| The HTTP response fails with code HttpStatusCode.Unauthorized | Suspend | Suspend | 80044306 |
| The HTTP response fails | Retry | Retry | 80044306 |
| Plugin or workflow activity throws an InvalidPluginExecutionException | Fail | Fail | 80040265 |
| Anything else | Fail | Fail | |

Please note that the table above is a general guide only. It might not cover every scenario, some exceptions to these actions might apply, and it might become outdated in the future.
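As a rough illustration, the table above can be encoded as a small lookup. This is only a sketch: the `Scenario` and `ResultAction` enums and the `AsyncErrorGuide.Expected` method are hypothetical names, not part of the CRM SDK, and the sketch assumes that a row listing a single action applies to both workflows and other asynchronous jobs. It keys on the scenario rather than the error code because 80044306 appears in three rows with different outcomes.

```csharp
using System;

// Hypothetical types for illustration only; none of these exist in the CRM SDK.
public enum ResultAction { Fail, Retry, Suspend, NotApplicable }

public enum Scenario
{
    SdkInfiniteLoop, SdkOrganizationDisabled, SdkServerBusy, SdkOther,
    SqlException, WorkflowSystemPaused, NetworkError, WorkflowRecordNotFound,
    HttpUnauthorized, HttpOtherFailure, InvalidPluginExecutionException, AnythingElse
}

public static class AsyncErrorGuide
{
    // Returns the action the table lists for a scenario.
    // Assumption: rows with a single action apply to both job types.
    public static ResultAction Expected(Scenario scenario, bool isWorkflow)
    {
        switch (scenario)
        {
            case Scenario.SdkOrganizationDisabled:
            case Scenario.SdkServerBusy:
            case Scenario.SqlException:
            case Scenario.NetworkError:
            case Scenario.HttpOtherFailure:
                return ResultAction.Retry;

            case Scenario.HttpUnauthorized:
                return ResultAction.Suspend;

            // The one row where workflows and other jobs differ.
            case Scenario.SdkOther:
                return isWorkflow ? ResultAction.Suspend : ResultAction.Fail;

            // Workflow-only rows; the table lists N/A for other jobs.
            case Scenario.WorkflowSystemPaused:
            case Scenario.WorkflowRecordNotFound:
                return isWorkflow ? ResultAction.Suspend : ResultAction.NotApplicable;

            // Infinite loop, InvalidPluginExecutionException, anything else.
            default:
                return ResultAction.Fail;
        }
    }
}
```

For example, `AsyncErrorGuide.Expected(Scenario.SdkOther, true)` returns `ResultAction.Suspend`, while the same scenario with `isWorkflow` set to `false` returns `ResultAction.Fail`.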

Why do workflows have a different behaviour than other asynchronous jobs when an SDK call fails?

Because the user might be able to fix the problem and resume the workflow. For example, suppose a workflow step sends an email to an account. If the account has no email address, the workflow suspends with the error message "This message cannot be sent to all selected recipients. The e-mail address for one or more recipients is either blank or not a valid e-mail address".

The user can then add the email address to the account and resume the workflow. Other asynchronous jobs fail instead because, while workflows are sometimes handled by end users, other asynchronous jobs are aimed more at the system administrator or customizer.

When the result action is Retry, for how long will the job pause before automatically retrying and how much time is there between retries?

The "PostponeUntil" attribute of the asynchronous operation holds the next time at which the operation will be retried, and it can be retrieved using the SDK. The wait until the next retry is calculated from some deployment settings and grows exponentially with the number of retries. The calculation uses a complex algorithm, but these are some default outputs as a function of RetryCount (the number of times the operation has already been retried):

| RetryCount | Time to wait (seconds) |
|---|---|
| 0 | 36 |
| 1 | 43 |
| 2 | 52 |
| 3 | 62 |
| 4 | 75 |
| >= 5 | Suspend |

Note that by default, any asynchronous operation retrying 5 or more times will be suspended.
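The default waits listed above happen to match 30 × 1.2^(RetryCount + 1) rounded to the nearest second. The sketch below reproduces them on that basis. Note that the constants 30 and 1.2 are a curve fit to the listed values, not documented deployment settings, and the real algorithm may behave differently. `RetryDelayEstimate` and `SecondsToWait` are hypothetical names.

```csharp
using System;

// Hypothetical helper; a fit to the default values listed in the post,
// not the actual algorithm used by the CRM Asynchronous Processing Service.
public static class RetryDelayEstimate
{
    // Default from the post: operations retrying 5 or more times are suspended.
    public const int MaxRetries = 5;

    // Returns the approximate wait in seconds before the next retry,
    // or null when the operation would be suspended instead.
    public static int? SecondsToWait(int retryCount)
    {
        if (retryCount < 0)
            throw new ArgumentOutOfRangeException(nameof(retryCount));
        if (retryCount >= MaxRetries)
            return null;

        // Assumed base of 30 seconds growing by 20% per retry (curve fit).
        return (int)Math.Round(30.0 * Math.Pow(1.2, retryCount + 1));
    }
}
```

Under these assumptions, the waits for RetryCount 0 through 4 add up to 268 seconds, so a job that keeps hitting a retryable error would be suspended roughly four and a half minutes after its first failure, plus the time the attempts themselves take.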

When a job has statusreason "Waiting" and statecode "Suspended", how do I know if it will retry or if it is suspended until it is manually resumed?

You can check the "PostponeUntil" attribute of the asynchronous operation to see the date and time at which it will be automatically resumed. If this value is equal to 9999-12-30 23:59:59 (maximum DateTime value), the job is waiting to be manually resumed.
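A minimal sketch of that check, once the value has been retrieved, might look like the following. `PostponeUntilCheck` and its members are hypothetical names, and the sketch assumes the value is compared in the same time zone in which the sentinel 9999-12-30 23:59:59 is expressed. It uses `>=` rather than strict equality so that any later value is also treated as "manual resume".

```csharp
using System;

// Hypothetical helper for interpreting a retrieved PostponeUntil value.
public static class PostponeUntilCheck
{
    // Sentinel quoted in the post for jobs waiting on a manual resume.
    public static readonly DateTime ManualResumeSentinel =
        new DateTime(9999, 12, 30, 23, 59, 59, DateTimeKind.Utc);

    // True when the job will not resume on its own.
    public static bool IsWaitingForManualResume(DateTime postponeUntil)
    {
        return postponeUntil >= ManualResumeSentinel;
    }

    // Human-readable summary of what will happen to the job.
    public static string Describe(DateTime postponeUntil)
    {
        if (IsWaitingForManualResume(postponeUntil))
            return "Waiting for manual resume";

        // "u" is the universal sortable format: yyyy-MM-dd HH:mm:ssZ
        return "Will retry automatically at " + postponeUntil.ToString("u");
    }
}
```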

How can I retrieve the “PostponeUntil” attribute of the asynchronous operations?

Because the “PostponeUntil” attribute is not available from the entity form or advanced find, you will need to use the SDK to retrieve this value. The following code sample prints the date and time at which each suspended asynchronous job will resume and the number of times it has been retried.

    // Assumes the CRM 4.0 SDK assemblies are referenced (not a web reference).
    using System;
    using Microsoft.Crm.Sdk;
    using Microsoft.Crm.Sdk.Query;
    using Microsoft.Crm.SdkTypeProxy;

    static void Main(string[] args)
    {
        // Authenticate against the organization using Active Directory (type 0).
        CrmAuthenticationToken token = new CrmAuthenticationToken();
        token.AuthenticationType = 0;
        token.OrganizationName = "AdventureWorksCycle";

        CrmService service = new CrmService();
        service.Url = "http://crmserver/mscrmservices/2007/crmservice.asmx";
        service.CrmAuthenticationTokenValue = token;
        service.Credentials = System.Net.CredentialCache.DefaultCredentials;

        // Query all asynchronous operations whose statecode is Suspended.
        QueryByAttribute query = new QueryByAttribute();
        query.Attributes = new string[] { "statecode" };
        query.ColumnSet = new ColumnSet(new string[] { "postponeuntil", "retrycount" });
        query.EntityName = EntityName.asyncoperation.ToString();
        query.Values = new object[] { (int)AsyncOperationState.Suspended };

        // Print the resume time and retry count of each suspended operation.
        BusinessEntityCollection bec = service.RetrieveMultiple(query);
        foreach (BusinessEntity be in bec.BusinessEntities)
        {
            asyncoperation op = (asyncoperation)be;
            string result = String.Format("Operation id={0} PostponeUntil={1} RetryCount={2}",
                op.asyncoperationid.Value,
                op.postponeuntil.UniversalTime,
                op.retrycount.Value);
            Console.WriteLine(result);
        }
    }

Cheers,

Gonzalo Ruiz

  • Thanks for taking the Asynchronous thought to the next level ;)

  • Gonzalo,

    We have some Workflows that fail with the SQL Exception yet we do not have the resume option (via System Jobs). Is there any approach to allow this as we have some workflows that have 10 stages/tasks within them.

    PS...We have tried splitting the workflows down BUT within sub-workflow  this does not retain the context to the entity that started the parent workflow.

    Thanks

    Tom

  • Any way to control from inside a plug-in (running async) whether it goes into complete, fail, suspend or retry?  Would love to benefit from retry logic when an external system is temporarily down.

  • You can control whether the async plugin fails (throw InvalidPluginExecutionException) or succeeds. Unfortunately, you can't force it to suspend/retry from within the plugin unless maybe you are doing some SQL operation that throws a SQL exception.

  • Workflows should not fail because of SQL exception. Look at the error code of the operation, maybe there is an additional failure that cannot be re-tried.

  • Could you please provide the code to cancel suspended jobs? We have 20k+ suspended jobs. We need to cancel these and then delete them to clear the asyncoperationbase table.

    Thanks in advance

    Regards,

    Ashwini

  • I want to programmatically resume a system job whose status is Waiting. Can anyone help me with which SDK function to use?

    Thank you,

    Josh
