We are getting some questions on this through this blog and our codeplex site and I thought that this subject needs a detailed blog post, so here is some information on running RBS maintainer.

 

Connection Strings

 

RBS maintainer takes connection strings from a CLR config file (Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config) that is present in the same directory as the maintainer executable. You will need to add your connection strings to this file. You will need to add one connection string per database that you want to run RBS maintainer on. It is recommended that the connection strings be encrypted in case you are using SQL authentication (since the password is part of the connection string). If you are using Windows Authentication, you dont need to encrypt your connection strings.

 

The config file created by RBS installer contains the connection string in encrypted form. CLR config files have the limitation that all the connection strings need to be either encrypted or plaintext - you cannot have a combination of some connection strings encrypted and some plaintext. So, if you want to add more connection strings to the config file already created by RBS installer, you will need to either write a program to encrypt them or use a utility (aspnet_regiis.exe) to do it for you. See [1] for more details on how to do this. If you do not want to encrypt your connection strings, feel free to delete the encrypted connection string already present in the config file and add your connection strings in plaintext.

 

Command Line Parameters

 

Once you have all the connection strings in the config file, you can run maintainer executable with command line parameters telling it which database to run on and what set of tasks to do. Here is a brief description of the command-line parameters that you can get by running maintainer without any parameters:

Usage (Available Options):

ConnectionStringName    - This parameter is the name of the connection string.

                                It takes a single argument: <Connection String Name>.

                                The connection string needs to be present in the CLR xml config file.

Operation               - This parameter is the operation(s) to perform.

                                The parameter takes between 1 and 4 arguments: <Operation1 [Operation2 [Operation3 [Operation4]]]>.

                                Operations must be <ConsistencyCheck, GarbageCollection, Maintenance, ConsistencyCheckForStores, ForceFinalize>.

                                ForceFinalize cannot be specified in combination with any other operation.

GarbageCollectionPhases - This parameter is the phase(s) of garbage collection to run or finalize.

                                The parameter takes a single argument: <Phase(s) of garbage collection to run or finalize>.

                                This argument needs to be 1 or more letters from <r, d, o>.

                                r: Reference Scan, d: Delete Propagation, o: Orphan Cleanup.

ConsistencyCheckMode    - This parameter is the mode for consistency checks.

                                The parameter takes a single argument: <Consistency Check Mode>.

                                This argument needs to be <c, r, b>.

                                c: Check only, r: Check and attempt to repair any issues found,

                                b: Check, repair and re-build RBS internal information without losing customizations.

ConsistencyCheckExtent  - This parameter is the extent for consistency checks.

                                The parameter takes a single argument: <Consistency Check Extent>.

                                This argument needs to be <m, c>.

                                m: Metadata only, c: Complete check.

TimeLimit               - This parameter is the limit on the amount of time to run specified operation(s).

                                The parameter takes a single argument: <Time Limit in Minutes>.

                                This argument needs to be a positive integer.

ConsistencyCheckForStores - This parameter enables selecting blob stores on which to run store specific consistency check operations.

                                This parameter has a default value of All, and can take the following blob store arguments: <BlobStoreName1 [BlobStoreName2 [...]] Default:All>

                                Takes the names of blob stores as arguments.

 

The different operations available in maintainer are:

1.       GarbageCollection. See [2] for details on the different phases of garbage collection. Pool Slicing is a new feature added in RBS 2008 R2 that allows GC of pools to happen incrementally, i.e. one slice at a time instead of the whole pool at one go. The slicing is done based on the create timestamp of the blob. This allows making incremental progress on garbage collecting huge pools that contain hundreds of millions of blobs without requiring a long maintenance window. Slicing applies only to the Orphan Cleanup phase and is used only if the provider implements time filtered enumeration (EnumerationOptimizationLevel is higher than Basic). The RBS Filestream provider implements this. Config keys that can be used to tune Garbage Collection operation are: delete_scan_period, orphan_scan_period, garbage_collection_time_window. If you want to completely remove all deleted blobs immediately, set all the 3 config items above to 'time 00:00:00' and run maintainer for GC phases RS and DP (rd). This is useful if you want to unregister a blob store / uninstall a provider.

The command line switch GarbageCollectionPhases is required for this operation. Examples (these examples assume that you have a connection string in your config file named RBSMaintainerConnection, whch is the default string added by the RBS installer):

·         Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection   -Operation GarbageCollection -GarbageCollectionPhases rdo

o   This runs the 3 phases of garbage collection.

 

2.       ConsistencyCheck. This does a consistency check on the RBS internal tables and can optionally try to repair any issues found. If you specify this operation, you also need to specify ConsistencyCheckMode. Optionally, you can also specify ConsistencyCheckExtent to choose whether to check metadata only or check for the validity of each BlobID (default is metadata). Config keys that can be used to tune it are: max_consistency_issues_found, max_consistency_issues_returned. Examples:

·         Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection -Operation ConsistencyCheck -ConsistencyCheckMode c

o   This does consistency check only (no repair) on the RBS metadata.

·         Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection -Operation ConsistencyCheck -ConsistencyCheckMode r -ConsistencyCheckExtent c

o   This does consistency check and attempts repair on RBS metadata as well as each BlobId.

 

3.       ConsistencyCheckForStores. This does consistency check on blob stores associated with the RBS database. For this option to work, the provider needs to implement consistency checks (its ConsistencyCheckLevel must not be None). This is a new feature added in RBS 2008 R2 and has been implemented by the RBS Filestream Provider. Config keys that can be used to tune it are: . Examples:

·         Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection -Operation ConsistencyCheckForStores

o   This runs consistency check on all the registered stores if their providers support it.

·         Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection -Operation ConsistencyCheckForStores -ConsistencyCheckForStores MyBlobStore1 MyBlobStore2

o   This runs consistency check on the two specified blob stores (if their providers support it).

 

4.       Maintenance. This does some maintenance on RBS internal tables, which mainly consists of reorganizing indexes. It is a good idea to run maintainer with this operation once in a while to ensure good performance. Examples:

·         Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection -Operation Maintenance

o   This does maintenance of RBS indexes.

 

5.       ForceFinalize. RBS Maintainer is designed to make incremental progress on garbage collection operations. If a phase of GC is not completed before the allotted time is up, the progress is saved and it is picked up again the next time RBS Maintainer is run for that operation. This mechanism is explained in [2] below as well. Sometimes it may be desired to “forget” this saved information about in-progress garbage collections and start fresh. The ForceFinalize operation serves that purpose: it removes saved information about one or more GC phases. The next time maintainer is run with those GC phases, they start from the beginning. The command line switch GarbageCollectionPhases is required for this operation. Examples:

·         Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection -Operation ForceFinalize -GarbageCollectionPhases rd

o   This does ForceFinalize for the RS and DP phases of garbage collection.

 

Options:

TimeLimit. Currently this is the only option available. If this is specified, RBS maintainer will try to stop after the specified duration even if all its tasks are not complete. If it is not specified, maintainer will continue to run until all the specified tasks (operations) are completed. This option allows scheduling RBS maintainer to run during regular maintenance windows (e.g. from midnight to 2 AM) and stop at a predictable time. If maintainer had to stop before the tasks were completed, they will be picked up the next time maintainer is run with those operations specified. [2] talks about how this option interacts with some config keys.

 

You can combine multiple operations from above in one command line. You can specify upto 3 operations at one time (the usage message pasted above says upto 4, but there is a bug currently which limits it to 3) Examples:

·         Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection   -Operation GarbageCollection ConsistencyCheck  -GarbageCollectionPhases rdo -ConsistencyCheckMode r

o   This does the 3 phases of garbage collection and consistency check on RBS metadata only with attempt to repair.

·         Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection   -Operation GarbageCollection ConsistencyCheck ConsistencyCheckForStores  -GarbageCollectionPhases rdo -ConsistencyCheckMode r

o   This does the 3 phases of garbage collection, consistency check on RBS metadata only with attempt to repair and consistency check on all registered stores whose providers support consistency check.

·         Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection   -Operation GarbageCollection ConsistencyCheck Maintenance  -GarbageCollectionPhases rdo -ConsistencyCheckMode r

o   This does the 3 phases of garbage collection, consistency check on RBS metadata only with attempt to repair and maintenance of RBS indexes.

·         Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection   -Operation GarbageCollection ConsistencyCheck Maintenance  -GarbageCollectionPhases rdo -ConsistencyCheckMode r -TimeLimit 120

o   This does the 3 phases of garbage collection, consistency check on RBS metadata only with attempt to repair and maintenance of RBS indexes and puts a time limit of 2 hours for processing all these things. If they don’t complete in approximately 2 hours, maintainer saves its progress and stops.

 

 

 

Monitoring Progress

 

The view mssqlrbs.rbs_blob_details shows a list of all the blobs that RBS thinks are currently in use by the application. Running the Reference Scan (r) phase of GC moves blobs out of these and marks them internally for deletion during the Delete Propagation (d) phase after the backup/restore SLA time window (garbage_collection_time_window) has elapsed. The view mssqlrbs.rbs_consistency_issues shows the list of consistency issues found by the maintainer. For measuring performance, troubleshooting or just for curiosity, you can look at views mssqlrbs.rbs_history and mssqlrbs.rbs_counters to see the progress of maintainer tasks.

 

Scheduling RBS Maintainer

 

RBS does not come with its own way of scheduling maintainer. RBS maintainer is a standalone executable and you will need to schedule it yourself. One way to do that is to use Window Task Scheduler - this is what the RBS installer brings up if you select that option while installing RBS. [1] has some sample steps to schedule it using Windows Task Scheduler. Another way to schedule it is to have a job in SQL Server Agent. You are also free to use your own mechanisms to do the scheduling.

 

References

 

[1] http://sqlrbs.codeplex.com/Thread/View.aspx?ThreadId=204627 - this thread has sample steps to add encrypted connection strings to the maintainer config file and to schedule maintainer task using Windows Task Scheduler. Running maintainer once will only perform the tasks on one database. So if you have multiple databases, you will need to run maintainer multiple times (once for each database).

 

[2] http://blogs.msdn.com/sqlrbs/archive/2008/08/08/rbs-garbage-collection-settings-and-rationale.aspx has more information on the different phases of garbage collection and the TimeLimit option.

 

 

As always, feel free to ask questions if you have any – either through comments on this blog or by creating a new discussion thread on the discussions page on our codeplex site.

 

- Pradeep.