Hi all, this topic is an area that has caused me much pain and work. My goal for this was to follow the recommended SQL guidelines while minimizing the impact that these maintenance jobs have on Crawling and Queries. We know from the SQL Monitoring an I/O post that Search is extremely I/O intensive . As it turns out so is all of the regular maintenance that SQL recommends, so finding the right balance between the two is an interesting scheduling task.
As a starting point much information about SQL maintenance and MOSS is covered in the following paper:
There are some key areas from the above paper that I would like to augment here.
DBCC CHECKDB
DBCC CHECKDB is a command used to check the logical and physical integrity of all the objects in a database. SQL Best practices recommend that you run DBCC CHECKDB periodically. For a Search deployment we would recommend that you run DBCC CHECKDB WITH PHYSICAL_ONLY on a regular basis. The PHYSICAL_ONLY option will reduce the overhead of the command. However, due to the cost of running this you should schedule it during off-peak times. The frequency of execution depends on your business needs, but a good place to start is once a week just prior to your back-up. You still need to run DBCC CHECKDB, but less frequently also based on business needs. Perhaps every other or every third back-up.
When running these commands make sure that you have a monitoring process in-place. DBCC only reports errors, it does not fix them unless explicitly specified by other options. You either want to archive the output of the DBCC command for post processing or make sure you have event log monitoring set-up (for example MOM) to check for DBCC errors.
In very large environments you can run DBCC on an off-line (sandbox) copy of the database. This will be less intrusive to end-users and the crawl. In this scenario you would restore your back-up to a separate sandbox and run DBCC CHECKDB in the restored environment.
Fragmentation and index statistics freshness
We started with the proc_DefragIndexes script mentioned above. After running it became obvious that the script was just too expensive to run on a regular basis. To reduce the load placed on the I/O system we took a look at all of our indexes in the Search DB and defragged them one-by-one to measuring query performance along the way. Doing this we were able to identify the indices that provided a performance benefit to the system when they were defragmented. These indexes are listed below:
Optionally there are two additional indexes that you may want to include in your defrag maintenance plan. These indexes do not see much use in typical out of box situations and are commented out in the script. But if your environment is built on a custom UI or makes extensive use of the Advanced Search UI you will see improvements in query latencies if you defrag them.
Once we knew which indexes to defrag we looked at the duration it took for the index to reach a 10% defragmentation rate. From this we adjusted the FILLFACTOR so we could maintain a longer period of time between actually needing a defrag. At this point we are seeing a duration somewhere around 2+ weeks between defrags. Do note that by increasing the FILLFACTOR we did grow the size of the database slightly, the growth rate on SearchBeta was not that large.
We then looked at the cost/benefit of doing a Reorganize versus a Rebuild. This was a interesting discovery for us. Initially we had a script in place similar to proc_DefragIndexes that would choose to Reorganize or Rebuild based on percent fragmentation with 30% being the decision point (IE greater than 30% would do a Rebuild). What we found was a Reorganize was taking over 8 hours with a 10% fragmentation rate and during this time end-user queries suffered dramatically. Out of curiosity and desperation we tried a Rebuild which is supposed to be the more expensive of the two operations. The Rebuild operation is completing in approximately 1 hour while the Reorganize takes as long as 8 hours. The Rebuild operation is more expensive in the sense that you will see some failed queries during the hour that it runs, where as the Reorganize doesn't have as drastic of an effect on the queries, but the overall cost is much higher since you have an 8 hour window where the query performance is degraded. UPDATE STATISTICS: In the experiments we ran we found that simply doing the rebuild (which also updates statistics) that it was not necessary to regularly use this command.
Finally we deployed the script into an environment that utilized SQL Mirroring. Unfortunately this didn't work out very well. The mirror got so far behind that we eventually had to disconnect the mirror and stop the defrag. Going through an analysis of this it became clear that the root cause was that the environment was heavily I/O bound and the defrag script generated more I/O than the system could keep up with. While the mirror was behind end-user query latencies suffered dramatically. To recover from this we ultimately had to improve the hardware by increasing the number of spindles.
To mitigate this we have added a parameter to the script that allows you to reduce the MAXDOP used in the index rebuild. Setting this parameter to 1 on a SQL box that is minimally I/O bound helps, but it may not be enough depending on how constrained the system is. If you are in an environment that is I/O bound (with or without SQL Mirroring) we strongly recommend that you go through a test of the defrag before you go live with the deployment. The easiest thing to try is the following SQL statement:
ALTER INDEX IX_MSSDocProps ON [dbo].[MSSDocProps] REBUILD WITH (MAXDOP = 1, FILLFACTOR = 80, ONLINE = OFF)
The statement above rebuilds the largest index using the lowest possible MAXDOP, this index must be rebuilt OFFLINE so you will need to run this on a test system or during a maintenance window. While this command is running keep an eye on the state of your mirroring with:
Bottom line we feel the rebuild is a much better operation to run and recommend that you:
Stored Procedure syntax:
exec proc_DefragSearchIndexes [MAXDOP value], [fragmentation percent]
-Thanks
Dan Blood Senior Test Engineer Microsoft Corp