Hello again, Dan Blood here. As I lay out some of the lessons I have learned hosting SearchBeta, I thought it would be beneficial to let you all know what kind of hardware I am using to support this environment. Be aware that I am not using the most optimal hardware for the task, and in some instances I have more hardware than the job requires. You should not take the hardware listed below verbatim and implement your solution on top of it. If I were to rebuild SearchBeta from scratch with purpose-purchased hardware, I would do it differently; I've highlighted some of the changes I would make below. As these postings progress, it is my intent that you will be able to use all of them as a starting point for hardware and monitoring decisions.
SearchBeta is a three-box farm with one server for each of the three main roles: Indexer, SQL, and a machine with the Query and Web Front End roles combined. The first thing I would change about this configuration is the number of boxes. We should really have two, if not three, Query/Web Front End machines to allow for fail-over and high availability. As it stands now, the service is unavailable when I apply OS updates or perform other server maintenance. The second thing I would change is to mirror SQL with a second machine, allowing periodic maintenance and updates without any impact to the service.
This farm is currently running MOSS bits; however, it is only using Search functionality. There is no content on the farm, nor do I have the Usage Analysis or People import features enabled. As a result, the SQL box is optimized for the MSS feature set; below I have called out how I would change this if I were taking full advantage of the MOSS feature set.
The machines in the farm are defined below:
Query/Web Front End
*SearchBeta is running with a pre-release configuration that allows two SQL file groups to be used. One supports the Crawl tables, while the other contains the tables used during end-user queries. Do not try to do this on your own; wait for us to publish explicit guidance on how to do it.
How big is the Data on SearchBeta?
Content Crawled (~28 million documents total)
Even though I am running both the WFE and the Query role on this box, it still has excess CPU capacity. If I were to replace this box, I would go with the exact same config, except with only 4 cores. Because the sole purpose of this machine is to respond to Search queries, I am able to get away with only 8 GB of RAM; using the farm with the full MOSS feature set would require more RAM.
If I were to replace this box I would not change much. I could live with a little less RAM (12 GB), and I would like to see how it performs with 8 cores versus 4, but that is out of curiosity, not necessity. The majority of the time this box is under 70% CPU utilization. There are cases when the filter daemon (mssdmn.exe) consumes a lot of CPU and the box spikes at 100%, but this is rare. Adding more CPU capacity may improve crawl speeds.
The performance of this machine varies quite a bit based on the type of content you are indexing, specifically the file format and filter you are using. Deb Haldar covers a lot of detail about the different filters available on his blog (http://blogs.msdn.com/ifilter/default.aspx). I recommend reading through it if you are installing the Filter Pack or a third-party filter. You may want to consider going with 8 cores on the Indexer if you are using some of the more expensive filters, but you will want to investigate and validate how expensive the filter is with your content before making this decision.
Finally, the content that SearchBeta crawls is primarily English; we know that the Japanese, other non-whitespace-breaking language, and German word breakers consume a lot of additional CPU. Keep this in mind and consider 8 cores if the majority of your content is in one of these locales.
This is the machine that I would change quite a bit if I were to replace it. Regular operation reveals that our initial disk configuration could be improved. Both the Crawl and Query file groups are overly I/O bound, and we know that the bottleneck seen on the Crawl file group limits the I/O pressure on the Query file group. We want to bump the spindle count up to 10 for both the Crawl and the Query file groups, but there is a concern that by unblocking the Crawl spindles we may need to increase the spindle count even further for the Query drive.
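To make the spindle discussion concrete, here is a back-of-envelope sketch of how you might size a spindle count from a sustained I/O load. The per-spindle IOPS figure, the R1+0 write penalty, and the example load are illustrative assumptions, not measurements from SearchBeta:

```python
# Back-of-envelope spindle sizing -- all numbers here are assumptions.
# A 15k RPM drive sustains very roughly 180 random IOPS; your hardware
# and measured workload will differ.
PER_SPINDLE_IOPS = 180      # assumed random IOPS per spindle
RAID10_WRITE_PENALTY = 2    # each logical write costs two disk writes in R1+0

def spindles_needed(read_iops, write_iops,
                    per_spindle=PER_SPINDLE_IOPS,
                    write_penalty=RAID10_WRITE_PENALTY):
    """Minimum spindle count to absorb the given sustained load."""
    disk_iops = read_iops + write_iops * write_penalty
    return -(-disk_iops // per_spindle)   # ceiling division

# Hypothetical example: a file group pushing 1,200 reads/s and 300 writes/s.
print(spindles_needed(1200, 300))  # -> 10
```

Measure the actual reads/sec and writes/sec on your own drives (perfmon's PhysicalDisk counters) before trusting any sizing like this.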
Note also that the "other" drive mentioned above contains the SharedService_DB, Config, and Admin Content databases, along with the corresponding log files for these DBs. This is not an optimal config, and these databases will not perform well under even a reasonable I/O load. However, this works on SearchBeta because it is not hosting content, Usage Analysis, People Import, or other MOSS features. If I were to host the MOSS feature set on the site, I would need to run with 8 cores and 32 GB of RAM, and build out an additional R1+0 drive for the SharedService_DB.
Look for another post to detail SQL optimization, planning and maintenance in the future...
There is one final note around hardware and backup that I would like to call out. During a backup the crawls are paused; the backup for SearchBeta takes approximately 14 hours, which is a significant chunk of time out of your active crawl window. So anything you can do to speed up the backup and reduce its duration will directly benefit the freshness of the data in the index. SearchBeta is currently backed up to a remote file share, so both the databases and the Index need to be written across the network before the backup completes. A more optimal solution is to back up to a drive that is local to the SQL box, allowing the biggest chunk of data to bypass the network. Ultimately this reduces the duration of the backup, providing more time to crawl your content.
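As a rough illustration of why the backup destination matters, here is a back-of-envelope duration calculation. The data size and throughput figures are hypothetical assumptions (the actual SearchBeta sizes are not restated here); only the arithmetic is the point:

```python
def backup_hours(data_gb, throughput_mb_s):
    """Hours to move data_gb gigabytes at a sustained throughput in MB/s."""
    return data_gb * 1024 / throughput_mb_s / 3600

# Hypothetical figures: ~600 GB of databases plus Index, written either
# across a ~100 Mb network (roughly 12 MB/s sustained) or to a local
# disk array (assumed 150 MB/s sustained).
remote_share = backup_hours(data_gb=600, throughput_mb_s=12)
local_disk   = backup_hours(data_gb=600, throughput_mb_s=150)
print(f"remote: {remote_share:.1f} h, local: {local_disk:.1f} h")
# -> remote: 14.2 h, local: 1.1 h
```

The exact numbers are invented, but the shape of the result is the argument above: at network speeds the backup consumes most of a day, while a local destination collapses it to a small fraction of that.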
Network Gigabit versus Megabit
SearchBeta is running on a Megabit network. In general we do not see the network continuously bottlenecked. There are a small number of peak periods where the Indexer is bound by the network during a crawl (the network card's "Output Queue Length" performance counter shows a value greater than 2). We might see a slight improvement in our crawls if we upgraded to a Gigabit network, but the improvement would be very difficult to measure and would not justify re-cabling your lab. However, as mentioned above, reducing the backup duration is something you should pursue: SearchBeta is definitely network bound when it is backing up.
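If you want to check for the same condition in your own farm, the idea can be sketched as a small filter over samples of the "Output Queue Length" counter (for example, exported from perfmon). The counter name comes from the discussion above; the sampling mechanism and the "sustained" threshold here are assumptions for illustration:

```python
# Sketch: decide whether a NIC looks network bound from a series of
# "Output Queue Length" samples. A value persistently above 2 is the
# rule of thumb used above; min_consecutive is an assumed smoothing
# window so a single spike does not trigger a false alarm.
def is_network_bound(samples, threshold=2, min_consecutive=5):
    """True if the queue length exceeds threshold for a sustained run."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False

print(is_network_bound([0, 1, 3, 3, 4, 5, 3]))  # sustained backlog -> True
print(is_network_bound([0, 5, 0, 5, 0, 1, 0]))  # isolated spikes  -> False
```

Feeding this a few hours of samples taken during a crawl versus during a backup should reproduce the pattern described above: occasional peaks while crawling, but sustained saturation while backing up over the network.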
Thanks, and I look forward to speaking with you all soon. The next posting is targeted at building out crawl schedules and maximizing the crawling that you are doing. If there is a specific topic that you would like to see more information about, please post comments to the thread and we will look at getting it into a future article.
Dan Blood
Senior Tester
Microsoft Corp