If you have researched about replication enhancement, for sure you would have heard about this Parameter which is applicable for both snapshot agent and distribution agent.
I'm going to share some useful information regarding this parameter which you can leverage in your replication environment.
What is MaxBCPThread all about?
When used with the Snapshot Agent, MaxBCPThreads affects the time it takes to generate a snapshot.
When used with the Distribution Agent, MaxBCPThreads affects the time it takes to apply the snapshot at the Subscriber.
So this parameter will be useful to expedite the snapshot delivery process for a huge publication.
You might be aware that first step in replication is creating a snapshot (though there are ways to initialize replication without a snapshot). Snapshot process exports the schema, index, triggers and BCP Out the data from the articles (tables) part of the publication as flat files in the snapshot folder
If you look at the snapshot folder, you will see one file for an article for each of these categories (not limited to):
Tip: T-SQL to get the snapshot folder for the distributor:
DECLARE @distributorsrv sysname, @snapshotfolder nvarchar(255)EXEC sp_helpdistributor @distributor = @distributorsrv OUTPUT, @directory = @snapshotfolder OUTPUTSELECT @@SERVERNAME as 'Publisher', @distributorsrv AS 'Distributor' , @snapshotfolder AS 'Snapshot Folder for the distributor'
Now when you add -MaxBCPThreads 5 to your snapshot agent and run snapshot agent, there will be 10 BCP files created in the snapshot folder.
If you query sysprocesses during the snapshot generation, you will notice that all the threads spawned will show program name as Snapshot BCP thread #n for publication xxx
The value 5 here is just an example. Actual value should not exceed the number of processors in the publisher server that gives you a chance to try out multiple value to find the optimal value for your publication.
The number of BCP files generated is always 2 * MaxBCPThread value.
Now that we have mulitple BCP files, distribution agent can INSERT the data from these files in parallel, if you specify the MaxBCPThread value for the Distribution agent equal to the number of BCP files created by the snapshot agent. Specifying a high number for MaxBCPThreads can overburden the system, because the system must spend too much time managing threads. So make sure that this value does not exceed the number of processors in the Distributor and Subscriber while configuring for Distribution agent.
MaxBcpThreads must have a value greater than 0 and has no hard-coded upper limit.
The default is 2 times the number of processors, up to a maximum value of 8 which is calculated internally but the parameter passed will -MaxBCPThreads 1
Using more threads than the total number of articles provides no additional benefit.
When applying a snapshot that was generated at the Publisher using the concurrent snapshot option, one thread is used, regardless of the number you specify for MaxBcpThreads.
Some more suggestions to improve snapshot delivery process:
1. Pre-configure subscriber database with proper initial size so that we will not encounter autogrowth penalty (Remember Instant File Initiation is not applicable for LDF files). Rule of thumb is to set Database and Log files to the same size as of Publication database
2. Configure faster throughput disks where snapshot folder and subscriber database files are stored
3. Using a separate disk drive for the transaction log for all databases involved in replication
4. Running the Snapshot Agent only when necessary and at off-peak times
5. Placing the snapshot folder on a drive not used to store database or log files
6. Reducing the verbose level of replication agents to 0, except during initial testing, monitoring, or debugging
7. Specify UseInprocLoader for the distribution agent - Dist Agent will use the in process BULK INSERT operation, decreasing the amount of time taken to apply the snapshot. It is recommended that you test the performance gain using your publication because this parameter depends on the line quality & speed, available memory, number of articles in your publication etc.
This article is written with inputs from http://blogs.msdn.com/b/repltalk
SQL Server Support, Microsoft
Thanks for the tip. While upgrading a distributor databases from SQL2K to SQL2K8, I observed while delivering snapshots for a new publication that the distribution agent kept delivering the snapshot multiple times until it failed with an error complaining about not finding one of the replication stored procedures even though they were on the subscriber. The publications were removed and recreated on the SQL2K8 server with scripts and I only got this to work by manually re-creating the publication and adding articles.
Any idea what might be the cause of this?
By any chance, did you change the collation of the databases involved?
I didn't. We've always used SQL_LATIN1_GENERAL_CP1_CI_AS. It's the first time I've observed this behaviour even though I've done several upgrades similar to this.
Just to give a background, we upgraded from
SQL2K(Publisher) => SQL2K (Distributor) => SQL2K5 (subscriber)
SQL2K (Publisher) => SQL2K8R2 (Distributor) => SQL2K5 (Subscriber)