
SharePoint Portal Server 2003 Crawl Performance Part 1

From time to time, I get asked how to determine why the SPS 2003 crawler/gatherer takes a long time to crawl data and build the index. There are many possible reasons, and it takes some digging to find the exact cause. In this series I will explain steps to isolate performance problems in the gatherer and improve its performance.

Hot Fixes 

Let's start with the low-hanging fruit: make sure your farm is in a good state before you performance tune the gatherer. The first step is to confirm that you are running a current hotfix level for SPS and WSS 2003. These hotfixes will not fix every problem, but they will improve the overall health of the gatherer in your farm. I recommend that you apply at least the following service packs and hotfix packages:

  1. Windows SharePoint Services Service Pack 2
  2. SharePoint Portal Server Service Pack 2
  3. Post SP2 Hotfix Roll-up Package KB900929
    • This fix corrects a problem where the server that is running Microsoft SQL Server uses 100 percent of the CPU time, so users and the gatherer cannot access the server.
  4. Post SP2 Hotfix Roll-up Package KB925380 
    • This fix corrects a problem where pages can take 30 to 50 seconds to display. Users noticed it mainly as slow page rendering, but it also slowed the gatherer.
  5. Post SP2 Hotfix Roll-up Package KB924934
    • This fix corrects a problem in tquery.dll that caused mssearch.exe to crash. The hotfix is localized, and the package referenced here is for English. If you are not running English, you can contact Microsoft to request the corresponding hotfix for your language.

Registry Key Changes

There are some registry key changes that you may want to include as part of the environment configuration. They are not required, but they make the collected data easier to review. If you choose to make these changes, make them on every server that acts as an Index server in the SPS farm. Since we are talking about registry keys, it is important to repeat Microsoft's warning about editing the registry.

Warning Serious problems might occur if you modify the registry incorrectly by using Registry Editor or by using another method. These problems might require that you reinstall your operating system. Microsoft cannot guarantee that these problems can be solved. Modify the registry at your own risk.

  1. This registry key provides the default thousands separator in the data where appropriate.
  2. This registry key displays the Process ID with the Process Names where appropriate.
  3. This registry key displays the Thread ID associated with Thread counters where appropriate.

Setting up Perfmon

When troubleshooting performance problems with a crawl, it is essential that you collect the right performance counters on the Index server(s). Here is a list of counters that you should collect for these types of problems.

  • \ASP.NET\*
  • \ASP.NET Applications(*)\*
  • \Cache\*
  • \LogicalDisk(*)\*
  • \Memory\*
  • \NBT Connection(*)\*
  • \Network Interface(*)\*
  • \Objects\*
  • \Paging File(*)\*
  • \PhysicalDisk(*)\*
  • \Process(*)\*
  • \Processor(*)\*
  • \Redirector\*
  • \Server\*
  • \Server Work Queues(*)\*
  • \Search\*
  • \Search Archival Plugin(*)\*
  • \Search Catalogs(*)\*
  • \Search Gatherer\*
  • \Search Gatherer Projects(*)\*
  • \Search Indexer Catalogs(*)\*
  • \Search Schema Plugin(*)\*
  • \Search Topic Assistant(*)\*
  • \SharePoint Portal Alerts Notification Service(*)\*
  • \SharePoint Portal Server Alert Manager(*)\*
  • \SharePoint Portal Server Alerts Plug In(*)\*
  • \System\*
  • \Thread(*)\*
  • \Web Service(*)\*
  • \Web Service Cache\*

If you are not familiar with the (*)\* and \* designations, they mean the following:

  • \*   all counters
  • (*)  all instances
  • (*)\*  all counters and all instances
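The counter list above can also be kept as a plain text file, one counter path per line, so that the same set is used on every Index server. The following is a minimal sketch of that idea, not part of the original procedure: the function names are hypothetical, and it assumes you want to reuse the file with whatever tool you use to create the counter log.

```python
# Counter paths copied from the list above.
# "\*" selects all counters of an object; "(*)" selects all of its instances.
COUNTER_PATHS = [
    r"\ASP.NET\*",
    r"\ASP.NET Applications(*)\*",
    r"\Cache\*",
    r"\LogicalDisk(*)\*",
    r"\Memory\*",
    r"\NBT Connection(*)\*",
    r"\Network Interface(*)\*",
    r"\Objects\*",
    r"\Paging File(*)\*",
    r"\PhysicalDisk(*)\*",
    r"\Process(*)\*",
    r"\Processor(*)\*",
    r"\Redirector\*",
    r"\Server\*",
    r"\Server Work Queues(*)\*",
    r"\Search\*",
    r"\Search Archival Plugin(*)\*",
    r"\Search Catalogs(*)\*",
    r"\Search Gatherer\*",
    r"\Search Gatherer Projects(*)\*",
    r"\Search Indexer Catalogs(*)\*",
    r"\Search Schema Plugin(*)\*",
    r"\Search Topic Assistant(*)\*",
    r"\SharePoint Portal Alerts Notification Service(*)\*",
    r"\SharePoint Portal Server Alert Manager(*)\*",
    r"\SharePoint Portal Server Alerts Plug In(*)\*",
    r"\System\*",
    r"\Thread(*)\*",
    r"\Web Service(*)\*",
    r"\Web Service Cache\*",
]


def selects_all_instances(path):
    """Return True when the path contains the (*) instance wildcard."""
    return "(*)" in path


def selects_all_counters(path):
    """Return True when the path ends with the \\* counter wildcard."""
    return path.endswith("\\*")


def render_counter_file(paths):
    """Return the text of a counter file: one counter path per line."""
    return "\n".join(paths) + "\n"


if __name__ == "__main__":
    # Print the file contents; redirect the output to a file of your choice.
    print(render_counter_file(COUNTER_PATHS), end="")
```

Objects without (*) in the list, such as \Memory, are listed without an instance wildcard, so only the counter wildcard applies to them.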

These counters are specific to SharePoint Portal Server 2003 and do not include any performance counters for SQL Server. The right sample interval depends on how long it takes to reproduce the problem. For a long-running crawl, you might set it to 15 seconds per sample. You could argue for a higher or lower value, but this should give you a reasonable picture of what is happening during the crawl.
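To get a feel for what a given sample interval means for log size, it helps to work out how many samples a crawl will produce. The sketch below does that arithmetic and also assembles a logman command line for a counter log. It is an illustration under assumptions, not the procedure from this post: the collection name SPS_Crawl, the counter file name, and the output path are all hypothetical, and you should confirm the logman options (create counter, -cf, -si, -o) on your own server before relying on them.

```python
def samples_collected(crawl_hours, interval_seconds=15):
    """Number of samples taken during a crawl of the given length."""
    return int(crawl_hours * 3600 // interval_seconds)


def logman_create_command(collection_name, counter_file, output_path,
                          interval_seconds=15):
    """Build (but do not run) a logman command for a counter log.

    All argument values are supplied by the caller; the names used in the
    example below are hypothetical placeholders.
    """
    return [
        "logman", "create", "counter", collection_name,
        "-cf", counter_file,            # file with one counter path per line
        "-si", str(interval_seconds),   # sample interval in seconds
        "-o", output_path,              # where the log is written
    ]


if __name__ == "__main__":
    # An 8-hour crawl sampled every 15 seconds.
    print(samples_collected(8))
    print(" ".join(logman_create_command(
        "SPS_Crawl",                    # hypothetical collection name
        "sps_crawl_counters.txt",       # hypothetical counter file
        r"D:\PerfLogs\sps_crawl",       # hypothetical output path
    )))
```

Halving the interval doubles the number of samples, and with every instance of \Process and \Thread in the counter set, the log grows quickly, which is one reason to avoid very short intervals on a long crawl.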

In a later post we will discuss the counters and what they indicate.

 

Published Tuesday, October 24, 2006 9:07 PM by tonymcin

Comments

Friday, October 27, 2006 8:20 PM by Keith Richie

# Welcome Tony McIntyre to the blogging community!

Tony McIntyre from my old team has started blogging!!!! Check out his first post titled SharePoint Portal
