Holger Linke Microsoft SQL Server Escalation Services
Provides details and explanations why SQL Server Merge Replication does not support central subscriber topologies. This fact was originally stated in SQL Server Books Online and further documented in KB article KB 2750005, but is still causing some misconceptions and confusion.
This blog article will provide additional thoughts and background information about why configuring central subscribers with Merge Replication is not a good idea.
The Central Subscriber model
In a central subscriber model, the subscriber database synchronizes with two or more publishing sources. The subscriber’s publishing sources will consist of one of the following combinations:
Practical examples for these topologies are:
SQL Server will allow you to configure such a topology, because there is no good way to check for the conditions when creating the subscriptions. In most cases, it will work directly after setting it up, and you actually might never have a problem later on.
The inherent problems with this kind of topology usually start to surface if you want to change some aspects of it, like removing and re-adding articles, subscriptions, or publications. This might become necessary after some system failure or accident, or if you need to change the publications or subscriptions to address application design improvements.
Merge Replication Design Considerations
Merge Replication had been designed to deliver the following feature set:
The typical scenario to be addressed by Merge replication was the disconnected sales person, who would be synchronizing on-demand without a specific schedule. The Central Subscribers feature was never on the design list, and therefore wasn’t reflected in the initial implementation. It was also never properly tested
The first version of SQL Server Merge replication had been released with SQL Server 7.0 back in 1998. It has received major feature upgrades in SQL Server 2000 and 2005, and has further evolved to SQL Server 2012. The core metadata tables have remained practically unchanged though throughout the versions.
Synchronization of metadata
The Merge Replication metadata of each replica database is stored in a single set of metadata tables. If several subscriptions are created in the same database, the metadata of all subscriptions will be stored in the same set of system tables. The replication metadata is therefore shared by multiple merge agents, all reading and writing from/to the same system tables.
Each time a merge agent synchronizes the publisher with the subscriber database, it will synchronize publication and subscription information between their configured synchronization partners. For example, rows from the subscriber’s sysmergesubscriptions system table are uploaded to the publisher, the publisher’s rows are downloaded to the subscriber. This is necessary for the conflict handling in Republishing topologies: the main publisher and each republisher needs to know about each node in the topology, in order to resolve conflicts according to the replica priorities you had configured. If a new republisher or subscriber is added on one branch of the topology, the metadata for it needs to be uploaded to the main publisher first before it can be downloaded into the other branches.
And it has to be done on each synchronization: if for example you have to restore a node from an older backup, the node will always receive current information from their synchronization partners.
So what’s the problem then?
If you have configured a central subscriber topology, you will have two or more merge agents synchronizing the subscriber metadata with its publisher partners. The potential disaster comes from the fact that each merge agent exchanges not only the metadata for its own publication and subscription, but also for the unrelated other publications and subscriptions.
If e.g. you drop a subscription, the metadata information will be removed immediately from the publisher and the subscriber, but not from the other replicas in the topology. In a central subscriber scenario, if the other merge agents at that subscriber synchronize with their partners, they will re-introduce the outdated subscription information back into replicas, and may remove the information about the deleted subscription at their partner. Because of the disconnected nature of the subscribers, the incorrect metadata might start to oscillate through the topology.
Another aspect is related to running several merge agents in parallel, servicing the same subscriber database at the same time. The potential issues usually start to show if the topology reaches a certain size and complexity. Contributing factors could be the number of publishers and subscribers, the number of data partitions for dynamic filters, the amount of metadata stored in the change tracking system tables, and the complexity of the dynamic filter design.
We have seen support cases with specific symptoms, where the only common denominator were the central subscribers. It included scalability problems due to blocking, or due to the way how the change evaluation is handled, or unexpected conflicts and retries. These issues are typically impossible to reproduce in a test environment, because in most cases you can’t duplicate the complexity and timing of your production environment. So you usually can’t scientifically prove that the central subscriber is the root cause. But you still may see that tweaking the agent schedules or removing the central subscribers from the topology canlessen or avoid those issues.
The resulting issues might be unpredictable, intermittent, and unexpected. They may appear at a much later time than the step that actually caused the symptom.
The following examples are the simplest and most obvious ways to demonstrate metadata issues resulting from a central subscriber topology. The first would also cause direct errors, as they are documented in article KB 2750005. Variations of these steps may lead to other symptoms or subtle differences.
Example 1: Publications in separate publication databases are synchronizing into the same subscriber database
Consider the following configuration and steps:
After reaching this point, your topology has been compromised. The incorrect PUB1/GUID1 combination will continue to oscillate through the system, and there is no practical way to get this subscriber back into the topology.
Example 2: Two or more publications in the same publisher database are synchronizing into the same subscriber database
After reaching this point, your topology maintains information about PUB1/GUID1 and PUB1/GUID3. This doesn’t present an immediate issue, as the orphaned old metadata and the current new metadata appear to be able coexist. If the steps are repeated in large topologies, the metadata tables will be bloated unnecessarily. The effects on Republishing topologies haven’t been fully understood because of the increasing complexity of the agent interactions.
A central subscriber might work for you immediately after configuring the topology. It might continue to work, as long as you can avoid applying changes to your topology, like removing and re-adding articles, subscriptions, or publications.
But if you get into issues that are related to the central subscriber topology, there is nothing that may help you except for dropping the entire topology and starting over from scratch. This usually implies a major administrative challenge, especially if you need to recover from a failure in an emergency situation.
Regarding enhancing Merge replication to make central subscribers a supported scenario: This would require a major redesign of the core of Merge replication and the way how the metadata is stored and handled. It would effectively mean the development of a completely new type of replication. If you take into account that Merge replication has evolved over several versions of SQL Servers, starting with SQL Server 7.0 backin 1998, you may see that this just wouldn’t be a simple step in the next Cumulative Update or Service Pack.
So the strong recommendation is to avoid central subscriber topologies in Merge Replication, and choose a different configuration instead. An alternate solution could be a single central merge publication that provides data partitions based on a parameterized row filter for the individual subscriber databases. Or use transactional replication instead of merge replication, if you really have to rely on this type of topology.
Article KB 2750005 in the KnowledgeBase: Merge replication does not support centralized subscriber topologies
Books Online for SQL Server 2012: Publish Data and Database Objectshttp://msdn.microsoft.com/en-us/library/ms152559.aspx(Note: Refer to the topic “Considerations for Publishing” - "Publishing Tables in More Than One Publication" in this article.
Books Online for SQL Server 2008 R2: Integrating Data from Multiple Sites (Client)http://msdn.microsoft.com/en-us/library/ms151790(v=sql.105).aspx (Note: Refer to the topic “The type of replication to use for this scenario” in this article)