SharePoint Portal Server 2003 Crawl Performance Part 3
To continue the discussion of Crawl Performance, we will look at some additional counters that help with further diagnosing the state of the crawler.
Here we will expand our subset of counters a little and see what information can be collected from the new ones.
What are the counters trying to convey?
In the Search Gatherer object some additional counters offer a glimpse into the bigger picture.
- Search Gatherer\Filter Process Created
- This counter shows the number of filtering processes or daemons (MSSDmn.exe) that have been created since MSSearch.exe was last recycled or the machine was last rebooted. It is an interesting number to watch because it can indicate the health of the IFilters on your system. For example, if you have an IFilter that is prone to crashing, it will take the filter process down with it and MSSearch will need to start a new one. This is bad when you have a lot of content to crawl and the daemons keep crashing. For this counter, I consider anything over 300-400 daemons created in a 24-hour period to be high. It is possible that you have many different crawl schedules that cause daemons to start and stop, and if that is the case then you don't necessarily have a problem. If you feel the number is too high, check the application event log. If the daemons are crashing, they will typically leave an event ID 1000 in the event log. If you have the Dr. Watson reporting features enabled, you will also see a corresponding event ID 1001 logged after each event ID 1000. These application event IDs 1000 and 1001 are helpful in determining which IFilter is crashing.
- Search Gatherer\Filter Processes
- This counter will show you the number of filtering processes or daemons (MSSDmn.exe) that are currently running on your machine. This is helpful to reference when you are trying to get a feel for the amount of effort the MSSearch process believes that your machine should exert to perform the crawl. This number is somewhat tied to the Search Gatherer\Performance Level counter that we will discuss later in some depth.
- Search Gatherer\Filter Processes Max
- This counter will show you the maximum number of filtering processes or daemons (MSSDmn.exe) that have been started since the last reboot or the last recycle of the MSSearch process. I really only use this counter as a reference.
- Search Gatherer\Filtering Threads
- This counter, together with the next one, directly affects the performance of your crawl. Each thread listed here is equivalent to one IFilter and one main document to be crawled. If this number is low (2-5), your crawl will take longer than if it were higher (14-30+). Think of each of these threads as a checkout line at the grocery store. If there are thousands of people in the store and only 2 people who can take money at the checkout, then all of the shoppers will stack up and wait their turn. If there are 30+ people who can take money at the checkout, then the lines will clear out much more quickly.
- Search Gatherer\Idle Threads
- This counter indicates the number of filtering threads that are idle. So if you have 32 Filtering Threads and 0-2 of them are idle, then you are still filtering 30-32 documents at one time. If all 32 are idle and you have documents to be crawled, it is possible that you have a problem that needs to be investigated further. This investigation would start in the application event log and then move to the diagnostic logging provided by the product. These diagnostic logs are accessed via the Central Administration page and are under the Diagnostic Logging link at the bottom.
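To make the rules of thumb above concrete, here is a minimal sketch in Python of how two readings of these counters could be compared. It does not read the counters itself: the values are assumed to have been copied by hand from Performance Monitor, and the names GathererSample, busy_threads, daemons_created_per_day and review are hypothetical helpers made up for this illustration, not part of the product. The threshold of 300 per day is taken from the low end of the 300-400 range mentioned above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GathererSample:
    """One hand-copied reading of the Search Gatherer counters (hypothetical type)."""
    filter_process_created: int  # cumulative since MSSearch last started
    filtering_threads: int
    idle_threads: int


# Low end of the 300-400 per 24 hours rule of thumb from the text above.
HIGH_DAEMONS_PER_DAY = 300


def busy_threads(sample: GathererSample) -> int:
    """Threads that are actually filtering a document right now."""
    return sample.filtering_threads - sample.idle_threads


def daemons_created_per_day(earlier: GathererSample,
                            later: GathererSample,
                            hours_between: float) -> float:
    """Scale the growth of Filter Process Created to a 24-hour rate."""
    created = later.filter_process_created - earlier.filter_process_created
    if created < 0:
        # The counter restarts when MSSearch is recycled, so assume a reset
        # and count only what has been created since then.
        created = later.filter_process_created
    return created * 24.0 / hours_between


def review(earlier: GathererSample, later: GathererSample,
           hours_between: float, documents_waiting: bool) -> List[str]:
    """Return plain-text notes about anything worth a closer look."""
    notes = []
    rate = daemons_created_per_day(earlier, later, hours_between)
    if rate > HIGH_DAEMONS_PER_DAY:
        notes.append("Daemon creation rate is high: check the application "
                     "event log for event IDs 1000 and 1001.")
    if documents_waiting and busy_threads(later) == 0:
        notes.append("All filtering threads are idle while documents are "
                     "waiting: check the event log and diagnostic logging.")
    return notes


if __name__ == "__main__":
    morning = GathererSample(filter_process_created=40,
                             filtering_threads=32, idle_threads=1)
    evening = GathererSample(filter_process_created=240,
                             filtering_threads=32, idle_threads=32)
    for note in review(morning, evening, hours_between=12,
                       documents_waiting=True):
        print(note)
```

A high creation rate on its own is not proof of a crashing IFilter, since multiple crawl schedules also start and stop daemons, so treat the first note as a prompt to look at the event log rather than as a diagnosis.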
In a later post we will discuss additional counters and what they indicate.