AsiaTech: Microsoft APGC Internet Developer Support Team


Unable to suspend or terminate an active service instance in BizTalk


 

Symptom

 

When you try to suspend or terminate an active service instance, the instance is not suspended or terminated for a long time. It stays Active, with its Pending Job set to Suspend or Terminate.

 

Analysis

 

A service instance in Active status is still running in a host instance and has not yet reached its next persistence point. Suspend and Terminate are designed to execute only at a persistence point, so when the target service instance is Active, the requested operation is placed in the pending operation table instead. Note that BizTalk holds only one pending operation per instance. The instance stays Active, with the operation queued, until it reaches the next persistence point. Let’s use a simple orchestration example to demonstrate the behavior.
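Before the orchestration example, the queuing rule just described can be pictured with a small toy model. This is only a sketch in Python: the class ServiceInstance and its methods are hypothetical names made up for illustration and are not BizTalk types or APIs. It models only the Active case, and it assumes that a second request simply overwrites the first, which the description above does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceInstance:
    """Toy stand-in for an orchestration instance (not a BizTalk type)."""
    status: str = "Active"
    pending_op: Optional[str] = None  # a single slot: one pending operation

    def request(self, op: str) -> None:
        """Queue Suspend or Terminate; nothing happens to the status yet."""
        if op not in ("Suspend", "Terminate"):
            raise ValueError(f"unsupported operation: {op}")
        # Assumption: a later request overwrites the earlier one.
        self.pending_op = op

    def reach_persistence_point(self) -> None:
        """Apply the queued operation, if any, at a persistence point."""
        if self.pending_op == "Suspend":
            self.status = "Suspended"
        elif self.pending_op == "Terminate":
            self.status = "Terminated"
        self.pending_op = None


instance = ServiceInstance()
instance.request("Suspend")
print(instance.status, instance.pending_op)  # Active Suspend
instance.reach_persistence_point()
print(instance.status, instance.pending_op)  # Suspended None
```

The point of the sketch is that request() never changes the status by itself. Until reach_persistence_point() is called, the instance stays Active with the operation queued, which is the symptom described above.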

biztalk1.JPG

biztalk2.jpg

 

The code in Expression_1 is shown below. It simulates long-running processing, or a hang, in a host instance.

 

System.Threading.Thread.Sleep(5*60000);

 

When a test message is dropped to activate this simple orchestration, the instance stays Active for about 5 minutes and then sends an output message. If you try to suspend the instance while it is sleeping in the host instance, it stays Active for quite a while with Pending Job set to Suspend, until the next persistence point (the Send shape) is reached. One interesting detail: if the running host instance is restarted, an orchestration instance resumes execution from its last persistence point. Because our simple orchestration has not reached a persistence point before the Send shape, a host instance restart during System.Threading.Thread.Sleep() makes the whole orchestration run again from the beginning, and the instance stays Active with the pending Suspend operation for another 5 minutes. The real problem appears if the code in the Expression shape is changed as shown below to simulate a genuine hang.

 

while(1==1){System.Threading.Thread.Sleep(5*60000);}

 

Now the Active instance cannot be suspended. In BTS2K6 and BTS2K6R2, the orchestration instance stays Active with the pending Suspend operation forever, even if the running host instance is restarted.
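The restart behavior described above (resume from the last persistence point, or from the beginning if none has been reached) can be illustrated with a rough simulation. This is a sketch only: the function run and the shape list are hypothetical, and treating the Send shape as the only persistence point is an assumption taken from the example orchestration, not a general rule.

```python
from typing import List, Optional, Tuple

Shape = Tuple[str, bool]  # (shape name, is it a persistence point?)


def run(shapes: List[Shape], resume_from: int = 0,
        restart_during: Optional[int] = None) -> Tuple[List[str], int]:
    """Execute shapes starting at resume_from.

    Returns the shapes that completed and the index to resume from after a
    host restart (the position just after the last persistence point).
    """
    completed: List[str] = []
    checkpoint = resume_from
    for index in range(resume_from, len(shapes)):
        if index == restart_during:
            break  # host restarted mid-shape; in-memory progress is lost
        name, persists = shapes[index]
        completed.append(name)
        if persists:
            checkpoint = index + 1
    return completed, checkpoint


# Assumed layout of the example orchestration; only Send persists here.
shapes = [("Receive", False), ("Expression_1", False), ("Send", True)]

first_run, checkpoint = run(shapes, restart_during=1)  # restart during Sleep
second_run, _ = run(shapes, resume_from=checkpoint)
print(first_run, checkpoint)  # ['Receive'] 0
print(second_run)             # ['Receive', 'Expression_1', 'Send']
```

With the infinite loop in Expression_1, every run is interrupted at index 1, so the checkpoint never moves past 0 and the Send shape is never reached. That matches the instance staying Active across restarts.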

 

Ways to handle a long-running or hung Active instance like the one above

 

1. The formal way is to find out where the instance is busy or hung. HAT debugging, or a hang dump of the running host instance, can be used to find where the instance processing is blocked. If the blocking or hang can be fixed, the instance can move quickly to its next persistence point, where the pending operation, or any other operation, gets a chance to execute.

 

2. If you don’t want to spend time working out where the blocking is and simply want to clean the instance out of the MessageBox, use the Terminator tool to hard-terminate the instance. You can download Terminator from the link below. The screenshot that follows shows Terminate Instance (Hard) for reference.

 

http://blogs.msdn.com/biztalkcpr/pages/biztalk-terminator-download-install-info.aspx

biztalk3.JPG

3. As above, if you only want to suspend or terminate the instance, you can also, instead of using Terminator, call the internal stored procedure int_AdminSuspendInstance_<host> or int_AdminTerminateInstance_<host> directly in the MessageBox database. The following two sample SQL scripts show how to use these stored procedures.

 

Hard Suspend:

 

DECLARE @ApplicationName nvarchar(128)
DECLARE @uidInstanceID uniqueidentifier
DECLARE @uidServiceID uniqueidentifier
DECLARE @fKnownInstance int
DECLARE @nvcErrorString nvarchar(512)
DECLARE @dtTimeStamp datetime
DECLARE @spname nvarchar(512)

BEGIN TRAN

-- Replace with the BizTalk host name and the IDs of the target service instance
SET @ApplicationName = 'testhost'
SET @uidInstanceID = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
SET @uidServiceID = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'

SET @dtTimeStamp = GETUTCDATE()
SELECT @nvcErrorString = nvcError FROM dbo.LocalizedErrorStrings WHERE nID = 4

-- The internal stored procedure name is built per host: int_AdminSuspendInstance_<host>
SET @spname = 'int_AdminSuspendInstance_' + @ApplicationName
EXEC @spname @uidInstanceID, @uidServiceID, N'0xC0C01B50', -1, @nvcErrorString, 1, null, @dtTimeStamp, null, null, @fKnownInstance OUTPUT

-- Remove the queued pending operation for this instance
DELETE FROM InstancesPendingOperations WITH (ROWLOCK) WHERE uidInstanceID = @uidInstanceID OPTION (KEEPFIXED PLAN)

COMMIT TRAN

 

Hard Terminate:

 

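DECLARE @ApplicationName nvarchar(128)
DECLARE @uidInstanceID uniqueidentifier
DECLARE @uidServiceID uniqueidentifier
DECLARE @fKnownInstance int
DECLARE @spname nvarchar(512)

BEGIN TRAN

-- Replace with the BizTalk host name and the IDs of the target service instance
-- (no spaces inside the quotes, so the value converts to uniqueidentifier)
SET @ApplicationName = 'testhost'
SET @uidInstanceID = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
SET @uidServiceID = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'

-- The internal stored procedure name is built per host: int_AdminTerminateInstance_<host>
SET @spname = 'int_AdminTerminateInstance_' + @ApplicationName
EXEC @spname @uidInstanceID, @uidServiceID, @fKnownInstance OUTPUT

-- Remove the queued pending operation for this instance
DELETE FROM InstancesPendingOperations WITH (ROWLOCK) WHERE uidInstanceID = @uidInstanceID OPTION (KEEPFIXED PLAN)

COMMIT TRAN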

 

*note*

(1) These two stored procedures are used internally by BizTalk to suspend or terminate an instance, so their implementation and interface could change in the future. The sample scripts above are based on BTS2K6 and BTS2K6R2.

(2) Before executing the script, you must modify it to assign the BizTalk host name and the uidInstanceID and uidServiceID of the target service instance to the variables @ApplicationName, @uidInstanceID and @uidServiceID.

(3) Stop the running host instance before executing the script, to avoid any inconsistency between the state of the instance in the host process memory and the state of the instance persisted in the database.

*note* In a multi-server BizTalk environment, you can find the host instance in which an active instance is running by looking at the “Processing Server” and “Host” columns of the service instance in the BizTalk admin console. These columns are not shown in the BizTalk group hub UI by default, but they can be added manually.

(4) As Terminator indicates for Terminate Instance (Hard), the scripts above should also be used only as a last resort. Be very clear about what you want to do and what the business impact could be before deciding to use them.

 

4. In BTS2K6 R2 + KB969987 (http://support.microsoft.com/kb/969987) and BTS2K9, it is easier to terminate or suspend such an active instance. If the instance stays Active with the Pending Operation set after you terminate or suspend it in BTS2K6 R2 + KB969987 or BTS2K9, simply restart the host instance in which the target service instance is running. The pending operation (Terminate or Suspend) is executed immediately after the host instance restarts. This is because, when a host instance is restarted in BTS2K6 R2 + KB969987 or BTS2K9, it checks for a Pending Operation on every service instance it previously owned, and executes any Pending Operation it finds immediately.
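The difference between the two restart behaviors can be shown with a small toy simulation. This is only a sketch: restart_host, the dictionary layout and the checks_pending_ops flag are hypothetical names for illustration, not BizTalk APIs. The flag stands in for "BTS2K6 R2 + KB969987 or BTS2K9" (True) versus the earlier behavior (False).

```python
from typing import Dict, List

# Status an instance ends up in once its pending operation is executed.
FINAL_STATUS = {"Suspend": "Suspended", "Terminate": "Terminated"}


def restart_host(instances: List[Dict], host: str,
                 checks_pending_ops: bool) -> None:
    """Simulate restarting one host instance (toy model).

    With checks_pending_ops=True, every instance previously owned by the
    host has its pending operation executed right away. With False, the
    instance is simply picked up again and stays Active.
    """
    for inst in instances:
        if inst["host"] != host:
            continue  # owned by a different host instance
        op = inst.get("pending_op")
        if checks_pending_ops and op is not None:
            inst["status"] = FINAL_STATUS[op]
            inst["pending_op"] = None


instances = [
    {"id": "A", "host": "testhost", "status": "Active", "pending_op": "Suspend"},
]
restart_host(instances, "testhost", checks_pending_ops=False)
print(instances[0]["status"])  # Active
restart_host(instances, "testhost", checks_pending_ops=True)
print(instances[0]["status"])  # Suspended
```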

 

Regards,

 

XiaoDong Zhu

 

Comments
  • Hi,

    I agree with the last step, as the same thing is happening to me: if we restart the host instance, the active instance completes the suspend. But my orchestration hangs in Active mode every time, and I want the instance to be processed. Can anyone tell me what the root cause is and how I can resolve it?

  • I have the same problem: some service instances keep processing messages without ever completing. Because the send port is set to "ordered delivery", all messages stay pending until all "in process" messages are handled.

    Any idea of the root cause and how to solve it?

    Thanks.
