All About Automatically Monitoring Replication Agent Failures

Vijay Rodrigues
Microsoft SQL Server Support

SQL Server Management Studio provides a way to monitor events, such as replication agent events, using alerts. SQL Server includes a set of predefined alerts for replication agents. If you would like to start the monitoring for replication agent failures, please follow below steps. The below is configured on the distribution instance (for both push or pull subscriptions). The below example is for Transactional Replication.

1. In SQL Server Management Studio, expand SQL Server Agent and then expand Alerts

image

2. Right click ‘Replication: agent failure’ and click on Properties.

clip_image004

3. Note in the above screen shot that this alert is not enabled by default. Check the box to Enable the alert then click OK (no other changes necessary on this screen unless you would like to customize). The type “SQL Server event alert” tell SQL Server to log the message in the Microsoft Windows event log.

4. You should now see the alert enabled (without the red arrow) in SQL Server Management Studio indicating SQL Server will write next agent failure to event log

image 

Example of such a failure in application event log:

clip_image008

About SCOM monitoring for SQL replication: If the only requirement is to monitor replication alert, you don’t need the SQL management pack (which is available in SCOM catalog). The replication alert mentioned can be monitored using custom SCOM's management pack. Using SCOM and below email, we can get an alert generated for this event recorded in the event log.

Here is the TechNet link to create a Unit monitor to check when an event raised in the Windows Application Event log.

http://technet.microsoft.com/en-us/library/bb309597.aspx

About replication agent message IDs: The above section mentions how to enable alerts for replication agent failures. The message ID for such a failure is 14151.

image

The following alerts are installed when a computer is configured as a Distributor:

Message ID

Predefined alert

Condition causing the alert to fire

Enters additional information in sysreplicationalerts

14150

Replication: agent success

Agent shuts down successfully.

Yes

14151

Replication: agent failure

Agent shuts down with an error.

Yes

14152

Replication: agent retry

Agent shuts down after unsuccessfully retrying an operation (agent encounters an error such as server not available, deadlock, connection failure, or time-out failure).

Yes

14157

Replication: expired subscription dropped

Expired subscription was dropped.

No

20572

Replication: Subscription reinitialized after validation failure

Response job 'Reinitialize subscriptions on data validation failure' reinitializes a subscription successfully.

No

20574

Replication: Subscriber has failed data validation

Distribution or Merge Agent fails data validation.

Yes

20575

Replication: Subscriber has passed data validation

Distribution or Merge Agent passes data validation.

Yes

20578

Replication: agent custom shutdown

   
 
Example of an ALERT recorded in Windows Application Event Log

clip_image010

More information:

Using Alerts for Replication Agent Events
http://msdn.microsoft.com/en-us/library/ms151752(SQL.90).aspx

How to: Configure Predefined Replication Alerts (SQL Server Management Studio)
http://msdn.microsoft.com/en-us/library/ms152496(v=SQL.90).aspx